MIT Scientists Develop New Machine-learning Model to Solve In-context Learning Mystery

By Yusuf Balogun

Scientists from MIT, Google Research, and Stanford University, led by Ekin Akyürek, have found that massive neural network models similar to large language models can contain smaller linear models inside their hidden layers, and that the large models can train these smaller models to complete a new task using simple learning algorithms.

The study demonstrates how powerful language models like GPT-3 can learn a new task from just a few examples, without any fresh training data. Large language models such as OpenAI’s GPT-3 are neural networks that can compose poetry and computer code in a manner that resembles human writing.

Trained on enormous amounts of internet data, these machine-learning models take a small piece of input text and predict the text that is likely to come next.
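
To make that next-word prediction concrete, here is a deliberately tiny, hypothetical sketch (not anything from the study): a bigram counter that guesses the most likely next word from a handful of sentences. GPT-3 does the same job at vastly larger scale, using a neural network trained on internet-scale text instead of simple counts.

```python
from collections import Counter, defaultdict

# Toy next-word predictor: count which word follows which in a tiny corpus.
# Real language models learn these statistics with neural networks
# trained on enormous amounts of internet text.
corpus = "the cat sat on the mat and the cat slept on the sofa".split()

following = defaultdict(Counter)
for current, nxt in zip(corpus, corpus[1:]):
    following[current][nxt] += 1

def predict_next(word):
    """Return the word most often observed after `word`."""
    return following[word].most_common(1)[0][0]

print(predict_next("the"))  # -> "cat", the most frequent follower in this corpus
```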

Generally, a machine-learning model like GPT-3 would need to be retrained on new data before it could perform a new task. During that training process, the model updates its parameters as it processes the new data. With in-context learning, however, the model’s parameters are not updated at all, giving the impression that the model has simply picked up a new skill on the spot.
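
In practice, in-context learning just means putting a handful of worked examples into the prompt itself. The sketch below is a hypothetical illustration, not the researchers’ code: it builds such a few-shot prompt in Python, and the closing comment stands in for whatever large language model (GPT-3, for instance) would complete the text.

```python
def build_few_shot_prompt(examples, query):
    """Pack labelled examples and a new query into a single prompt string."""
    lines = [f"Review: {text}\nSentiment: {label}" for text, label in examples]
    lines.append(f"Review: {query}\nSentiment:")
    return "\n\n".join(lines)

# A few worked examples define the task; no model weights are changed.
examples = [
    ("The movie was wonderful", "positive"),
    ("I want my money back", "negative"),
    ("A solid, enjoyable watch", "positive"),
]

prompt = build_few_shot_prompt(examples, "The plot made no sense")
print(prompt)
# Sending this prompt to a large language model and letting it continue
# the text would, ideally, yield "negative", the answer to the new task.
```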

Solving the Machine-learning Mystery

The researchers’ theoretical findings show that these enormous neural network models can contain smaller, simpler linear models inside them. Using only the information already present within it, the larger model can then apply a simple learning algorithm to train this smaller linear model to accomplish a new task.

According to lead author Ekin Akyürek, this research opens the door to further investigation into the learning algorithms these huge models can employ and represents an essential step toward understanding the mechanisms underlying in-context learning. With a better understanding of in-context learning, researchers could enable models to complete new tasks without the need for expensive retraining.

“Usually, if you want to fine-tune these models, you need to collect domain-specific data and do some complex engineering. But now we can just feed it an input – five examples – and it accomplishes what we want. In-context learning is a pretty exciting phenomenon,” Akyürek says.

Testing the Theory With a “Transformer” Model

Akyürek put forth the theory that in-context learners are truly learning to carry out new tasks rather than just matching patterns seen during training. To probe this idea, he and others ran experiments in which they prompted these models with synthetic data the models could not have seen anywhere before, and they found that the models could still learn from just a handful of examples.
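
What “data the models could not have seen anywhere before” might look like is sketched below. This is a hypothetical illustration rather than the paper’s actual setup: each episode draws a fresh random linear function and turns it into a short sequence of input/output pairs, so memorization is useless and the only way to predict a new output is to learn from the examples in the sequence itself.

```python
import numpy as np

def sample_episode(rng, dim=4, n_examples=5):
    """One synthetic in-context episode: (x, y) pairs from a freshly drawn linear function."""
    w = rng.normal(size=dim)                 # task-specific weights, new every episode
    xs = rng.normal(size=(n_examples, dim))  # random inputs
    ys = xs @ w                              # noiseless linear targets
    return xs, ys, w

rng = np.random.default_rng(0)
xs, ys, w = sample_episode(rng)
print(xs.shape, ys.shape)  # (5, 4) (5,)
# Because w changes every episode, a model cannot rely on memorized answers;
# it has to infer the underlying function from the in-context pairs.
```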

Akyürek and his collaborators hypothesized that these neural network models contain more compact machine-learning models inside them, which the larger models could train to carry out a new task.

To verify this notion, the researchers employed a transformer neural network model that has the same architecture as GPT-3 but was trained specifically for in-context learning. By examining the transformer’s architecture, they theoretically demonstrated that it can write a linear model within its hidden states. A neural network is made up of multiple layers of connected nodes that process data; the hidden states are the layers between the input and output layers.

Their mathematical analyses show that this linear model is encoded somewhere in the transformer’s earliest layers. The transformer can then update the linear model by applying simple learning algorithms. In essence, the model trains and simulates a smaller version of itself.
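
Under the assumption that the in-context task is linear regression, as in the synthetic-data sketch above, a “simple learning algorithm” could be something like an exact least-squares fit or a few steps of gradient descent on the in-context pairs. The toy code below shows those two algorithms on their own, outside any transformer; the paper’s claim is that the transformer’s forward pass can carry out computations of this kind within its hidden states.

```python
import numpy as np

rng = np.random.default_rng(1)
dim, n = 4, 8
w_true = rng.normal(size=dim)        # the unknown linear task
X = rng.normal(size=(n, dim))        # in-context inputs
y = X @ w_true                       # in-context targets

# Simple learner 1: exact least-squares fit on the in-context examples.
w_ls, *_ = np.linalg.lstsq(X, y, rcond=None)

# Simple learner 2: plain gradient descent on squared error, from zero weights.
w_gd = np.zeros(dim)
for _ in range(200):
    grad = X.T @ (X @ w_gd - y) / n
    w_gd -= 0.1 * grad

x_query = rng.normal(size=dim)       # a new query point
print("true answer     :", x_query @ w_true)
print("least squares   :", x_query @ w_ls)
print("gradient descent:", x_query @ w_gd)
```

Both learners arrive at roughly the same prediction here; the question the researchers address is whether, and how, a trained transformer implements such an update internally.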

Akyürek’s Future Plans

In the future, Akyürek intends to continue investigating in-context learning with functions that are more complex than the linear models examined in this work. These experiments could also be run on large language models themselves, to determine whether their behavior is likewise described by simple learning algorithms. He also wants to explore the kinds of pre-training data that make in-context learning possible.

“With this work, people can now visualize how these models can learn from examples. So, my hope is that it changes some people’s views about in-context learning,” Akyürek says. “These models are not as dumb as people think. They don’t just memorize these tasks. They can learn new tasks, and we have shown how that can be done.”

The research will be presented at the International Conference on Learning Representations. Joining Ekin Akyürek on the work are Dale Schuurmans, a research scientist at Google Brain and professor of computing science at the University of Alberta, as well as senior authors Jacob Andreas, the X Consortium Assistant Professor in the MIT Department of Electrical Engineering and Computer Science; Tengyu Ma, an assistant professor of computer science and statistics at Stanford; and Denny Zhou, principal scientist and research director at Google Brain.
