2024 Nobel Prize in Physics goes to the "godfathers" of AI
The 2024 Nobel Prize in Physics has been announced: John J. Hopfield and Geoffrey E. Hinton were awarded the prize "for foundational discoveries and inventions that enable machine learning with artificial neural networks."
Their research laid an important theoretical foundation for modern artificial intelligence, enabling computers to simulate human memory and learning processes.
This Nobel Prize in Physics marks the highest global academic recognition of the importance of AI research, particularly in the fields of machine learning and neural networks. Hopfield and Hinton's research not only significantly advanced modern computing technology, but also crossed the disciplinary boundaries between physics, computer science and neuroscience, with far-reaching implications.
John Hopfield is best known for proposing the "Hopfield network", a framework for storing and reconstructing information that became an important model for early artificial neural networks. Hopfield's work brought neural networks into the fields of memory and pattern recognition, inspiring later deep learning techniques. His research not only contributed to the early development of neural networks, but also offered a whole new perspective on how the brain works.
Geoffrey Hinton's contributions, on the other hand, have centered on the backpropagation algorithm, a key technique for training modern neural networks. Backpropagation allows artificial neural networks to learn autonomously and discover complex patterns in data by automatically adjusting their internal weights as they process the data. The technique is critical to today's deep learning field and is widely used in key AI applications such as speech recognition, image processing and natural language understanding. (Yuan Ning)
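As a rough illustration of the idea (not Hinton's original formulation), the minimal Python sketch below trains a tiny one-hidden-layer network with backpropagation: errors at the output are propagated backwards through the layers and used to adjust the weights. The toy XOR data, layer sizes and learning rate are all illustrative choices.

```python
# Minimal backpropagation sketch: a tiny network learns XOR by
# repeatedly adjusting its internal weights. All choices here
# (data, layer sizes, learning rate) are illustrative only.
import numpy as np

rng = np.random.default_rng(0)
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

# Toy data: XOR of two binary inputs.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

# One hidden layer of 8 units.
W1, b1 = rng.normal(size=(2, 8)), np.zeros(8)
W2, b2 = rng.normal(size=(8, 1)), np.zeros(1)
lr = 1.0

for step in range(10000):
    # Forward pass.
    h = sigmoid(X @ W1 + b1)
    out = sigmoid(h @ W2 + b2)

    # Backward pass: propagate the output error back through the
    # layers (chain rule) and nudge each weight downhill.
    err_out = (out - y) * out * (1 - out)
    err_hid = (err_out @ W2.T) * h * (1 - h)
    W2 -= lr * h.T @ err_out;  b2 -= lr * err_out.sum(axis=0)
    W1 -= lr * X.T @ err_hid;  b1 -= lr * err_hid.sum(axis=0)

print(np.round(out, 2))  # should end up close to [0, 1, 1, 0]
```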
The following is from the official Nobel Prize presentation:
Nobel Prize in Physics 2024 - Science Background
They used physics to find patterns in information
This year's Nobel Prize winners in Physics used the tools of physics to develop methods that laid the foundation for today's powerful machine learning. John Hopfield created a structure that can store and reconstruct information. Geoffrey Hinton invented a method that can independently discover properties in data, which has become important for the large artificial neural networks now in use.
Many people have experienced the ability of computers to translate between languages, interpret images, and even carry on reasonable conversations. What is perhaps less well known is that such techniques have long been important for research, including the classification and analysis of large amounts of data. The last fifteen to twenty years have seen an explosion in the development of machine learning, which utilizes a structure called an "artificial neural network". Today, when we talk about artificial intelligence, this is usually the technology we are referring to.
While computers cannot think, machines are now able to mimic functions such as memory and learning. This year's Nobel Prize winners in physics are the very people who helped make this possible. Using basic concepts and methods from physics, they have developed technologies that can process information using the structure of networks.
Mimicry of the brain
Artificial neural networks, which process information using the entire structure of the network, were originally inspired by the desire to understand how the brain works. In the 1940s, researchers began to think about the mathematics underlying the brain's network of neurons and synapses. Another part of the inspiration came from psychology, where the neuroscientist Donald Hebb hypothesized about how learning occurs, suggesting that connections between neurons are strengthened when they work together.
These ideas were then applied to building artificial neural networks as computer simulations. In these simulations, the brain's neurons are modeled as nodes that are given different values, and the synapses are represented by connections between the nodes that can be made stronger or weaker. Hebb's hypothesis is still used today as one of the basic rules for updating artificial networks in the process known as "training".
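As a minimal sketch of the idea (the node activities and learning rate below are made up for illustration), a Hebb-style update strengthens the connection between two nodes in proportion to how strongly they are active together:

```python
# Hebb-style update: connections between nodes that are active
# together are strengthened. All values here are illustrative only.
import numpy as np

activity = np.array([1.0, 0.0, 1.0, 1.0])   # toy node activities
weights = np.zeros((4, 4))                   # connection strengths
learning_rate = 0.1                          # arbitrary illustrative value

# Strengthen w_ij in proportion to the joint activity of nodes i and j.
weights += learning_rate * np.outer(activity, activity)
np.fill_diagonal(weights, 0.0)               # no self-connections

print(weights)
```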
Associative memory
Imagine you're trying to recall a word you rarely use, such as the term for the sloping floor common in movie theaters and lecture halls. As you search your memory, you might think of "slope"... maybe "gradient"? No, it's "rake", that's it!
This process of searching through similar words to find the right one resembles the associative memory that the physicist John Hopfield discovered in 1982. The Hopfield network can store patterns and has a method for finding the closest stored pattern when it is given an incomplete or slightly distorted one.
Hopfield had used his background in physics to explore theoretical problems in molecular biology. At a neuroscience conference, he encountered research on the structure of the brain and was inspired to begin thinking about the dynamics of simple neural networks: when neurons act together, they can give rise to new and powerful properties that are not apparent when the network's individual parts are viewed on their own.
The network saves images in a "landscape"
The nodes in the network constructed by Hopfield are connected to each other through connections of varying strengths. Each node can store an independent value - in Hopfield's original research, these values could be 0 or 1, similar to pixels in a black-and-white picture.
Hopfield describes the overall state of the network in terms of a property similar to the energy of a spin system in physics. The energy is calculated through a formula that involves all the values of the nodes and the strength of the connections between them. The network is programmed by inputting an image, and nodes are assigned a value of black (0) or white (1). The energy formula is then used to adjust the network's connections so that the saved image has a lower energy. When the network is fed a new pattern, it checks each node in turn and decides whether to change the node's value depending on whether the energy is lowered. If changing a black pixel to a white pixel lowers the energy, it will change. This process continues until no further improvements can be found. When this point is reached, the network usually reproduces the original image it was trained on.
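A minimal Python sketch of such a network is given below, with two caveats: it uses the common -1/+1 convention for node values rather than the 0/1 of Hopfield's original paper (the two are equivalent up to rescaling), and the Hebbian storage rule and one-node-at-a-time update are the standard textbook formulation rather than code from the article.

```python
# Minimal sketch of a Hopfield-style network (textbook formulation,
# -1/+1 node values instead of the original 0/1).
import numpy as np

def train(patterns):
    """Hebbian storage: sum of outer products of the stored patterns."""
    n = patterns.shape[1]
    W = np.zeros((n, n))
    for p in patterns:
        W += np.outer(p, p)
    np.fill_diagonal(W, 0.0)        # no self-connections
    return W / n

def energy(W, state):
    """Energy of a state; the update rule below never increases it."""
    return -0.5 * state @ W @ state

def recall(W, state, sweeps=5):
    """Check nodes one at a time, flipping a node whenever that lowers the energy."""
    state = state.copy()
    for _ in range(sweeps):
        for i in np.random.permutation(len(state)):
            state[i] = 1 if W[i] @ state >= 0 else -1
    return state
```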
If only one pattern is saved, this may not seem particularly remarkable. You might wonder why not simply save the image itself and compare it to the input image. What is special about Hopfield's approach is that several images can be saved at the same time, and the network can usually distinguish between them.
Hopfield compares the network's search for a saved state to rolling a small ball across a landscape of peaks and valleys, with friction slowing its movement. If the ball is released at a particular spot, it rolls into the nearest valley and stops there. Similarly, when the network is given a pattern that is close to one of the saved patterns, it keeps "rolling" until it reaches the bottom of a valley in the energy landscape, and so finds the closest stored pattern.
Hopfield networks are able to reconstruct data that contains noise or is partially lost.
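Continuing the sketch above (the patterns and noise level are made up for illustration), storing two patterns and feeding the network a corrupted copy of one of them shows this reconstruction in action:

```python
import numpy as np

rng = np.random.default_rng(1)

# Two made-up 25-"pixel" patterns with -1/+1 values.
p1 = rng.choice([-1, 1], size=25)
p2 = rng.choice([-1, 1], size=25)
W = train(np.stack([p1, p2]))

# Corrupt p1 by flipping 5 of its 25 values ("noise").
noisy = p1.copy()
noisy[rng.choice(25, size=5, replace=False)] *= -1

restored = recall(W, noisy)
print("energy before/after:", energy(W, noisy), energy(W, restored))
print("recovered original:", np.array_equal(restored, p1))  # usually True
```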
Classification using nineteenth-century physics
Memorizing an image is one thing, but interpreting what is depicted in the image requires more skill.
Even very young children can confidently point out different animals, such as dogs, cats or squirrels. Although sometimes they may get it wrong, they will soon be able to get it right almost every time. Children don't need to see any charts or explanations of species or mammals; by encountering several examples of animals, their minds naturally organize the categories.
Geoffrey Hinton was working at Carnegie Mellon University in Pittsburgh, USA, when Hopfield published his article on associative memory. He had earlier studied experimental psychology and artificial intelligence, and he wondered whether machines could learn to process patterns the way humans do, discovering categories and interpreting information for themselves. Together with his colleague Terrence Sejnowski, Hinton started from the Hopfield network and used ideas from statistical physics to build something new.
Statistical physics describes systems composed of many similar elements, such as the molecules in a gas. It is difficult or impossible to trace every individual molecule in the gas, but by considering them collectively it is possible to determine the gas's overall properties, such as pressure or temperature. Statistical physics makes it possible to analyze the states that the components of such a system can jointly occupy and to calculate the probability of each state occurring. Some states are more likely than others, depending on the amount of energy available, as described in an equation by the nineteenth-century physicist Ludwig Boltzmann. Hinton's network uses this equation, and the method was published in 1985 under the name "Boltzmann machine".
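In this distribution, a state with energy E occurs with probability proportional to exp(-E/T), so low-energy states are the most likely. The short check below uses made-up energies and an illustrative "temperature":

```python
# Boltzmann distribution: the probability of a state is proportional
# to exp(-E / T), so lower-energy states are more likely.
import numpy as np

energies = np.array([0.0, 1.0, 2.0, 5.0])   # made-up state energies
T = 1.0                                      # illustrative "temperature"

weights = np.exp(-energies / T)
probabilities = weights / weights.sum()
print(probabilities)   # largest value belongs to the lowest-energy state
```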
Identify new examples of the same type
The Boltzmann machine typically uses two different types of node. Information is fed into one group, the visible nodes. The other group forms the hidden nodes, whose values and connections also contribute to the energy of the network as a whole.
The machine is run by applying a rule that updates the values of the nodes one at a time. Eventually the machine enters a state in which the pattern of the nodes can still change, but the properties of the network as a whole remain the same. Each possible pattern then has a specific probability determined by the network's energy, according to Boltzmann's equation. When the machine is stopped, it has created a new pattern, which makes the Boltzmann machine an early example of a generative model.
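The sketch below shows this stochastic, one-node-at-a-time update for a small machine: each node is switched on with a probability that depends on the weighted input it receives from the other nodes. The network size is arbitrary and the weights are random placeholders; a real machine would use trained weights.

```python
# Running a Boltzmann machine: update nodes one at a time, each
# switched on with a probability that depends on its weighted input,
# until the statistics of the patterns stop changing.
import numpy as np

rng = np.random.default_rng(0)
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

n_visible, n_hidden = 6, 3
n = n_visible + n_hidden
W = rng.normal(scale=0.5, size=(n, n))   # placeholder weights
W = (W + W.T) / 2                        # connections are symmetric
np.fill_diagonal(W, 0.0)                 # no self-connections

state = rng.integers(0, 2, size=n).astype(float)

for sweep in range(1000):                # let the machine "run"
    for i in rng.permutation(n):
        p_on = sigmoid(W[i] @ state)     # probability that node i is on
        state[i] = 1.0 if rng.random() < p_on else 0.0

print("generated visible pattern:", state[:n_visible])
```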
A trained Boltzmann machine can recognize familiar features in information it has never seen before. It's like how you can immediately tell when you meet your friend's siblings for the first time that they're related. Similarly, a Boltzmann machine can recognize a completely new example as long as it belongs to one of the categories in the training data and distinguish it from dissimilar material.
Different types of networks
There are some important differences between Hopfield networks, Boltzmann machines, and restricted Boltzmann machines.
- Hopfield network: an associative memory network in which all nodes are connected to each other, and information is input and read out through all of the nodes.
- Boltzmann machine: usually consists of two layers, where information is input and read out through the layer of visible nodes. The layer of hidden nodes affects how the network as a whole operates.
- Restricted Boltzmann machine: has no connections between nodes within the same layer. These machines are often used in a chain, one after another. After the first restricted Boltzmann machine has been trained, the contents of its hidden nodes are used to train the next machine, and so on.
A Boltzmann machine learns from examples rather than from explicit instructions. It is trained by updating the values of the network's connections so that the example patterns fed into the visible nodes during training have the highest possible probability of occurring when the machine is run. If a pattern is repeated during training, its probability becomes even higher. Training also affects the probability of the machine outputting new patterns that resemble the ones it was trained on.
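A compact way to see this in code is the sketch below, which trains a small restricted Boltzmann machine so that two made-up training patterns become likely states of the visible nodes. It uses one-step contrastive divergence, a later practical approximation also introduced by Hinton, rather than the exact 1985 procedure, and it works with activation probabilities instead of sampled values to keep the example short.

```python
# Training sketch: nudge the connections so that the training patterns
# become high-probability states of the visible nodes. Simplified
# one-step contrastive divergence (a later approximation), no biases,
# probabilities instead of samples. All values are illustrative.
import numpy as np

rng = np.random.default_rng(0)
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

# Two made-up training patterns for 6 visible nodes.
data = np.array([[1, 1, 1, 0, 0, 0],
                 [0, 0, 0, 1, 1, 1]], dtype=float)

n_visible, n_hidden = 6, 2
W = rng.normal(scale=0.1, size=(n_visible, n_hidden))
lr = 0.1

for epoch in range(2000):
    v0 = data
    h0 = sigmoid(v0 @ W)          # hidden activity driven by the data
    v1 = sigmoid(h0 @ W.T)        # reconstruction of the visible nodes
    h1 = sigmoid(v1 @ W)          # hidden activity driven by the model
    # Strengthen data-driven correlations, weaken model-driven ones.
    W += lr * (v0.T @ h0 - v1.T @ h1) / len(data)

print(np.round(sigmoid(data @ W), 2))  # hidden activity for each training pattern
```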
Machine Learning - Today and Tomorrow
Thanks to their work from the 1980s onwards, John Hopfield and Geoffrey Hinton laid the groundwork for the machine learning revolution that took off around 2010.
The developments we are seeing now are due to the large amount of data used to train the networks and the huge increase in computational power. Today's artificial neural networks tend to be very large and consist of many layers. These are called deep neural networks and the training method is called deep learning.
A brief look back at Hopfield's 1982 article on associative memory gives some sense of this development. In it, he used a network containing 30 nodes. If all the nodes are connected to each other, there are 435 connections. The nodes have their values, the connections have different strengths, and in total there are just under 500 parameters to keep track of. He also tried a network with 100 nodes, but the calculations were too demanding for the computers of the time. Today's large language models are vastly larger by comparison; their networks can contain more than a trillion parameters (a million millions).
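The connection count quoted here is simply the number of node pairs in a fully connected network, n(n-1)/2; a one-line check (the 100-node figure is my own calculation for comparison, not a number from the article):

```python
# Number of pairwise connections in a fully connected network of n nodes.
def connections(n: int) -> int:
    return n * (n - 1) // 2

print(connections(30))    # 435, matching Hopfield's 30-node network
print(connections(100))   # 4950, for the 100-node experiment he also tried
```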
Many researchers are currently developing application areas for machine learning. It remains to be seen which area is ultimately the most viable, and there is also a broad ethical discussion surrounding the development and use of this technology.
Since physics has provided tools for the development of machine learning, it is interesting to note that physics as a field of study is also benefiting from artificial neural networks. Machine learning has long been used in areas familiar from earlier Nobel Prizes in Physics, for example to sift through and process the enormous amounts of data needed to discover the Higgs particle. Other applications include reducing noise in measurements of the gravitational waves from colliding black holes, and searching for exoplanets.
In recent years, the technique has also begun to be used to calculate and predict the properties of molecules and materials - for example, calculating the structure of protein molecules, which determines their function, or deducing which new materials might have properties best suited for use in more efficient solar cells.
John Hopfield.
Born in 1933 in Chicago, Illinois, U.S.A. He received his Ph.D. from Cornell University in 1958. He is currently a professor at Princeton University, USA.
Geoffrey Hinton.
Born in 1947 in London, England, he received his PhD from the University of Edinburgh, UK, in 1978. He is currently a professor at the University of Toronto, Canada.
The Royal Swedish Academy of Sciences has decided to award the 2024 Nobel Prize in Physics to John J. Hopfield and Geoffrey E. Hinton:
"for foundational discoveries and inventions that enable machine learning with artificial neural networks."