Pioneers of Reinforcement Learning Win the Turing Award

Last updated: March 5, 2025 10:28 am

By MT HANNACH

In the 1980s, Andrew Barto and Rich Sutton were considered eccentric devotees of an elegant but ultimately doomed idea: getting machines to learn, as humans and animals do, from experience.

Decades on, with the technique they pioneered now increasingly critical to modern artificial intelligence and programs like ChatGPT, Barto and Sutton have been awarded the Turing Award, the highest honor in the field of computer science.

Barto, a professor emeritus at the University of Massachusetts Amherst, and Sutton, a professor at the University of Alberta, pioneered a technique known as reinforcement learning, which involves coaxing a computer to perform tasks through experimentation combined with positive or negative feedback.

“When this work started for me, it was extremely unfashionable,” Barto recalls with a smile, speaking over Zoom from his home in Massachusetts. “It's been remarkable that [it has] achieved some influence and some attention,” Barto adds.

Reinforcement learning was perhaps most famously used by Google DeepMind in 2016 to build AlphaGo, a program that taught itself to play the incredibly complex and subtle board game Go at an expert level. This demonstration sparked new interest in the technique, which has since been used in advertising, optimizing data-center energy use, finance, and chip design. The approach also has a long history in robotics, where it can help machines learn to perform physical tasks through trial and error.

More recently, reinforcement learning has been crucial to guiding the output of large language models (LLMs) and producing extraordinarily capable chatbot programs. The same method is also being used to train AI models to mimic human reasoning and to build more capable AI agents.

Sutton notes, however, that the methods used to guide LLMs involve humans providing goals rather than an algorithm learning purely through its own exploration. He says that having machines learn entirely on their own may ultimately be more fruitful. “The big division is whether [AI is] learning from people or whether it's learning from its own experience,” he says.

“The work of Barto and Sutton has been a linchpin of progress in AI over the last several decades,” Jeff Dean, a senior vice president at Google, said in a statement issued by the Association for Computing Machinery (ACM), which awards the Turing Award. “The tools they developed remain a central pillar of the AI boom and have rendered major advances.”

Reinforcement learning has a long and checkered history within AI. It was there at the dawn of the field, when Alan Turing suggested that machines could learn through experience and feedback in his famous 1950 paper “Computing Machinery and Intelligence,” which examines the notion that a machine might someday think like a human. Arthur Samuel, an AI pioneer, used reinforcement learning to build one of the first machine-learning programs, a system capable of playing checkers, in 1955.