The winners of the Turing Award are worried about becoming the 'Oppenheimer' of the AI community.

DeepFlowTech

Author: Moonshot

In 1947, Alan Turing said in a lecture: "What we want is a machine that can learn from experience."

Seventy-eight years later, the award that bears his name, the Turing Award, often called the "Nobel Prize of computing," went to two scientists who spent their careers answering Turing's question.

Andrew Barto and Richard Sutton jointly won the 2024 Turing Award. Nine years apart in age, the two are mentor and student, pioneers of the technology behind AlphaGo and ChatGPT, and founding figures in the field of machine learning.

Turing Award winners Andrew Barto and Richard Sutton

Image source: Turing Award official website

Google's Chief Scientist Jeff Dean wrote in the award citation: "The reinforcement learning technology created by Barto and Sutton directly answers Turing's question. Their work has been a key factor in the progress of AI over the past few decades. The tools they developed remain a core pillar of the AI boom… Google is honored to sponsor the ACM A.M. Turing Award."

Google is the sole sponsor of the Turing Award's $1 million prize.

And after winning, the two scientists used the spotlight to point at the AI giants, turning their "acceptance remarks" to the media into a warning: today's AI companies are "driven by commercial incentives" rather than focused on research, building "an untested bridge" and letting society cross it as the test.

Coincidentally, the last time the Turing Award went to artificial intelligence researchers was 2018, when Yoshua Bengio, Geoffrey Hinton, and Yann LeCun were honored for their contributions to deep learning.

2018 Turing Award winners

Image source: eurekalert

Among them, Geoffrey Hinton, who went on to share the 2024 Nobel Prize in Physics, and Yoshua Bengio, two "fathers of artificial intelligence," have repeatedly called on global society and the scientific community to stay vigilant against the abuse of artificial intelligence by large companies in the current AI wave.

Hinton even resigned from Google in 2023 so that he could "speak freely" about AI's risks; this year's winner Richard Sutton, for his part, served as a research scientist at DeepMind from 2017 to 2023.

As the highest honor in the computer field is awarded time and time again to the founders of AI core technology, an intriguing phenomenon gradually emerges:

Why do these top scientists, standing in the spotlight, always turn around and sound the alarm on AI?

The “bridge builder” of artificial intelligence

If Alan Turing pointed the way for artificial intelligence, then Andrew Barto and Richard Sutton are the "bridge builders" along that road.

Now, with artificial intelligence surging ahead and the applause still ringing, they are reexamining the bridge they built: can it safely bear the people crossing it?

Perhaps the answer lies in their academic careers spanning half a century. Only by tracing how they constructed "machine learning" can we understand why they are wary of the technology getting out of control.

Image source: Carnegie Mellon University

In 1950, Alan Turing opened his famous paper "Computing Machinery and Intelligence" with a question at once philosophical and technological:

Can machines think?

Thus, Turing designed the ‘imitation game’, which is widely known in later generations as the ‘Turing test’.

Turing also proposed that machine intelligence could be acquired through learning rather than pre-programming alone. He envisioned a "child machine": one that, like a child, gradually learns through training and experience.

The core goal of artificial intelligence is to build agents that can perceive the world and act ever more effectively, and one measure of intelligence is an agent's ability to judge that "some actions are better than others."

The point of machine learning is to give a machine feedback after it acts and let it learn autonomously from that experience. In this sense, Turing's idea of learning from rewards and punishments is not unlike Pavlov training his dogs.

Getting better the more you play is itself a kind of "reinforcement learning"

Image source: zequance.ai

The road to machine learning that Turing pointed out was finally paved, some thirty years later, by a mentor and his student: reinforcement learning (RL).

In 1977, Andrew Barto, inspired by psychology and neuroscience, began exploring a new theory of human intelligence: neurons behave like "hedonists." Each of the brain's billions of neurons tries to maximize pleasure (reward) and minimize pain (punishment). Rather than mechanically receiving and relaying signals, a neuron whose activity pattern leads to positive feedback tends to repeat that pattern, and collectively these neurons drive human learning.

In the 1980s, Barto and his doctoral student Richard Sutton tried to apply this neural principle of "keep trying, adjust connections based on feedback, and find the optimal pattern of behavior" to artificial intelligence. Reinforcement learning was born.

"Reinforcement Learning: An Introduction" has become a classic textbook, cited nearly 80,000 times.

Image source: IEEE

Building on the mathematical foundation of Markov decision processes, the two developed many of reinforcement learning's core algorithms, systematically constructed its theoretical framework, and wrote the textbook "Reinforcement Learning: An Introduction," which has ushered tens of thousands of researchers into the field. Both can rightly be called fathers of reinforcement learning.

Their goal in studying reinforcement learning has been to find machine learning methods that are as efficient, accurate, and reward-maximizing as possible: in a word, optimal.
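The temporal-difference update at the heart of this framework can be sketched in a few lines of Python. The corridor environment below is a made-up toy for illustration, not an example from their textbook; only the tabular Q-learning update rule itself comes from the reinforcement learning literature.

```python
import random

random.seed(0)

# Toy corridor: states 0..4, reward 1.0 for reaching state 4.
# Tabular Q-learning with the textbook temporal-difference update:
#   Q(s,a) += alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))
N_STATES = 5
ACTIONS = [-1, +1]                     # step left or right
Q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
alpha, gamma, epsilon = 0.5, 0.9, 0.1

for episode in range(300):
    s = 0
    while s != N_STATES - 1:
        # epsilon-greedy: mostly exploit current estimates, sometimes explore
        if random.random() < epsilon:
            a = random.choice(ACTIONS)
        else:
            a = max(ACTIONS, key=lambda act: Q[(s, act)])
        s_next = min(max(s + a, 0), N_STATES - 1)
        r = 1.0 if s_next == N_STATES - 1 else 0.0
        best_next = max(Q[(s_next, act)] for act in ACTIONS)
        Q[(s, a)] += alpha * (r + gamma * best_next - Q[(s, a)])
        s = s_next

# The learned greedy policy walks right from every non-terminal state.
policy = [max(ACTIONS, key=lambda act: Q[(s, act)]) for s in range(N_STATES - 1)]
print(policy)  # expected: [1, 1, 1, 1]
```

The reward arrives only at the final step, yet the discount factor gamma propagates its value backward through the state sequence, which is what lets the agent value actions whose payoff lies far in the future.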

The ‘God’s Hand’ of Reinforcement Learning

If traditional machine learning is "spoon-fed" learning, then reinforcement learning is "free-range" learning.

Traditional machine learning feeds a model large amounts of well-annotated data to establish a fixed mapping between inputs and outputs. The classic scenario: show a computer a pile of photos labeled "cat" or "dog"; feed it enough images and it learns to tell the two apart.
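That fixed input-to-output mapping can be illustrated with a minimal nearest-neighbour classifier; the two numeric "features" per photo here are invented for the sketch, not a real image pipeline.

```python
# Minimal supervised learning: 1-nearest-neighbour on hand-labelled examples.
# Each "photo" is reduced to two made-up features (ear pointiness, snout length);
# the input-to-output mapping is fixed entirely by the annotated data.
labeled = [
    ((0.9, 0.2), "cat"), ((0.8, 0.3), "cat"), ((0.7, 0.1), "cat"),
    ((0.2, 0.9), "dog"), ((0.3, 0.8), "dog"), ((0.1, 0.7), "dog"),
]

def classify(x):
    # predict the label of the closest annotated example (squared distance)
    dist = lambda pair: (pair[0][0] - x[0]) ** 2 + (pair[0][1] - x[1]) ** 2
    return min(labeled, key=dist)[1]

print(classify((0.85, 0.25)))  # -> cat
print(classify((0.15, 0.85)))  # -> dog
```

The model never acts or receives rewards; it only interpolates the labels humans already supplied, which is exactly the contrast with reinforcement learning drawn below.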

Reinforcement learning, by contrast, lets a machine gradually adjust its behavior through trial and error and a system of rewards and punishments, without explicit guidance. It is like a robot learning to walk: no human keeps telling it "this step is right, that step is wrong." It simply tries, falls, adjusts, and eventually walks on its own, perhaps even developing a gait all its own.

Clearly, this principle is closer to human intelligence: every toddler learns to walk by falling, learns to grasp by exploring, and picks up syllables from babbling on the way to language.

The viral "roundhouse-kick robot" was also trained with reinforcement learning.

Image source: Unitree Robotics

Reinforcement learning's "highlight moment" was AlphaGo's "divine move" in 2016. In its match against Lee Sedol, AlphaGo played a stunning 37th move that decided the game and carried it to victory over Lee Sedol.

Top players and commentators in the Go world did not anticipate that AlphaGo would play at this position, because in the experience of human players, this move was “unfathomable.” After the game, Lee Sedol admitted that he had never considered this move at all.

AlphaGo did not rely on memorizing the moves to come up with the ‘divine move,’ but rather, it was discovered through numerous self-play games, trial and error, long-term planning, and strategy optimization, which is the essence of reinforcement learning.

Lee Sedol, whose rhythm was disrupted by AlphaGo’s ‘divine’ move

Image source: AP

Reinforcement learning has even reversed the direction of influence, feeding back into human intelligence. After AlphaGo revealed its "divine move," human Go players began studying the AI's play; scientists, in turn, use the algorithms and principles of reinforcement learning to probe the brain's own learning mechanisms. One of Barto and Sutton's research achievements is a computational model explaining the role of dopamine in human decision-making and learning.

Reinforcement learning is especially good at finding optimal solutions under complex rules and changing environments: Go, autonomous driving, robot control, and holding witty conversations with humans whose language is ambiguous.

These are also the most cutting-edge and popular AI application areas. In large language models in particular, almost all leading models use RLHF (Reinforcement Learning from Human Feedback): humans score the model's answers, and the model improves according to that feedback.
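The feedback loop can be caricatured in a few lines. The candidate answers, scores, and multiplicative update below are all invented for illustration; real RLHF systems train a reward model on human preference data and fine-tune the policy with gradient methods, not a lookup table.

```python
import random

random.seed(1)

# Toy RLHF-style loop: the "model" keeps a preference weight per candidate
# answer, and simulated human ratings push probability mass toward answers
# that people score highly.
candidates = ["helpful answer", "rambling answer", "wrong answer"]
human_score = {"helpful answer": 1.0, "rambling answer": 0.3, "wrong answer": 0.0}
weights = {c: 1.0 for c in candidates}

def sample_answer():
    # sample proportionally to current weights (a stand-in for the policy)
    total = sum(weights.values())
    r, acc = random.uniform(0, total), 0.0
    for c in candidates:
        acc += weights[c]
        if r <= acc:
            return c
    return candidates[-1]

for step in range(500):
    answer = sample_answer()
    feedback = human_score[answer]             # a human rates the answer
    weights[answer] *= (1.0 + 0.1 * feedback)  # reinforce well-rated answers

best = max(weights, key=weights.get)
print(best)  # the highly-rated answer comes to dominate
```

The loop captures the point of the paragraph above: nothing tells the model which answer is correct, only which answers humans happened to prefer.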

But this is precisely Barto's concern: the big companies build the bridge, then test its safety by letting people walk back and forth across it.

“Pushing software directly to millions of users without any security measures is not a responsible practice,” Barto said in an interview after winning the award.

“Technological development should be accompanied by the control and avoidance of potential negative impacts, but I have not seen these AI companies truly do this,” he added.

What are the top AI scientists actually worried about?

The AI threat theory never ends, because scientists are most afraid of the future they have created getting out of control.

Barto and Sutton's "acceptance remarks" were not a harsh critique of current AI technology so much as a bill of grievances against AI companies.

In interviews, both warned that AI development today rests on big companies racing to launch powerful but error-prone models, raising enormous sums on the back of them, and then pouring tens of billions of dollars into a contest of chips and data.

Major investment banks are revaluing the AI industry

Image source: Goldman Sachs

Indeed, according to research by Deutsche Bank, the total investment of technology giants in the field of AI is approximately $340 billion, a scale that has exceeded the annual GDP of Greece. Industry leader OpenAI, with a valuation of $260 billion, is preparing to launch a new round of $40 billion in financing.

In fact, many AI experts agree with Barto and Sutton’s views.

Former Microsoft executive Steven Sinofsky has argued that the AI industry is trapped by scale, relying on burning money for technical progress, which runs against the historical trend of computing costs falling rather than rising.

On March 7, former Google CEO Eric Schmidt, Scale AI founder Alexandr Wang, and Center for AI Safety director Dan Hendrycks jointly published a warning paper.

The three believe that today's frontier AI development resembles the race that produced the Manhattan Project: AI companies are quietly running their own "Manhattan Projects," with investment doubling every year over the past decade. Without intervention and regulation, AI could become the most destabilizing technology since the nuclear bomb.

"Superintelligence Strategy" and its co-authors

Image source: nationalsecurity.ai

Yoshua Bengio, who shared the 2018 Turing Award for deep learning, also published a long warning on his blog: the AI industry now dangles tens of trillions of dollars in value for capital to chase, with enough influence to seriously disrupt the current world order.

Many in the industry believe that AI has drifted away from research into the technology, scrutiny of intelligence, and vigilance against abuse, toward a capital-for-profit model of pouring money into chips.

"Building huge data centers, taking users' money and letting them use software that may not be secure: that is not a motivation I endorse," Barto said in an interview after receiving the award.

The first International Scientific Report on the Safety of Advanced AI, co-authored by 75 AI experts from 30 countries, states: "The methods for managing general artificial intelligence risks often rely on the assumption that AI developers and policymakers can accurately assess the capabilities and potential impacts of AGI models and systems. However, scientific understanding of the internal workings, capabilities, and social impacts of AGI is actually very limited."

Yoshua Bengio's long warning post

Image source: Yoshua Bengio

It is not hard to see that the "AI threat theory" has shifted its focus from the technology itself to the large companies.

Experts are warning the big companies: you burn money, pile up hardware, and scale up parameters, but do you really understand the products you are building? This is also why Barto and Sutton reached for the "bridge building" metaphor: technology belongs to all humanity, but capital belongs only to the big companies.

All the more so in Barto and Sutton's longtime field, reinforcement learning. Its principles hew closer to human intelligence, and it has a "black box" quality: in deep reinforcement learning especially, AI behavior patterns become complex and hard to explain.

This is also the concern of human scientists: they have contributed to and witnessed the growth of artificial intelligence, but find it difficult to interpret its intentions.

The Turing Award winners who pioneered deep learning and reinforcement learning are not worried about the development of AGI (Artificial General Intelligence) as such, but about the arms race among major companies, which could set off an "intelligence explosion" in AGI and accidentally create ASI (Artificial Superintelligence). The distinction between the two is not merely technical; it concerns the future of human civilization.

ASI, which surpasses human intelligence, will far exceed human understanding in terms of the amount of information mastered, decision speed, and level of self-evolution. Without extremely careful design and governance of ASI, it may become the last and most insurmountable technological singularity in human history.

In the current AI frenzy, these scientists may be the best qualified to pour cold water. Fifty years ago, when the computer was still a room-sized behemoth, they had already begun research in artificial intelligence. Having shaped the present out of the past, they have the standing to doubt the future.

Will AI leaders face an outcome like Oppenheimer?

Image source: The Economist

In a February interview with The Economist, the CEOs of DeepMind and Anthropic said:

They lose sleep worrying about becoming the next Oppenheimer.
