Teaching machines in the way that animal trainers shape the behavior of dogs or horses has been an important method for developing artificial intelligence, and it was recognized on Wednesday with the top award in computer science.
Two pioneers in the field of reinforcement learning, Andrew Barto and Richard Sutton, are the winners of this year’s A.M. Turing Award, the tech world’s equivalent of the Nobel Prize.
The research that Barto, 76, and Sutton, 67, began in the late 1970s paved the way for some of the AI advances of the last decade. At the heart of their work was channeling so-called “hedonistic” machines that could continually adapt their behavior in response to positive signals.
Reinforcement learning is what led a Google computer program to beat the world’s best human players of the ancient Chinese board game Go in 2016 and 2017. It has also been a key technique for improving popular AI tools such as ChatGPT, optimizing financial trading and helping a robotic hand solve a Rubik’s Cube.
But Barto said the field “was not fashionable” when he and his doctoral student, Sutton, began to create their theories and algorithms at the University of Massachusetts, Amherst.
“We were in the wilderness,” Barto said in an interview with The Associated Press. “That is why it is so gratifying to receive this award, to see this becoming more recognized as something relevant and interesting. In the early days, it was not.”
Google sponsors the annual $1 million prize, which was announced Wednesday by the Association for Computing Machinery.
Barto, now retired from the University of Massachusetts, and Sutton, a longtime professor at Canada’s University of Alberta, are not the first AI pioneers to win the prize, which is named after the British mathematician, codebreaker and early AI thinker Alan Turing. But their research has directly sought to answer Turing’s 1947 call for a machine that “can learn from experience,” which Sutton describes as “possibly the essential idea of reinforcement learning.”
In particular, they borrowed from ideas in psychology and neuroscience about the way in which pleasure-seeking neurons respond to rewards or punishment. In a landmark paper published in the early 1980s, Barto and Sutton applied their new approach to a specific task in a simulated world: balancing a pole on a moving cart to keep it from falling. The two computer scientists later co-authored a widely used textbook on reinforcement learning.
“The tools they developed remain a central pillar of the AI boom and have made important advances, attracted legions of young researchers and driven billions of dollars in investments,” Google chief scientist Jeff Dean said in a written statement.
In a joint interview with the AP, Barto and Sutton did not always agree on how to evaluate the risks of AI agents that constantly seek to improve themselves. They also distinguished their work from the branch of generative AI technology that is currently in fashion: the large language models behind the chatbots made by OpenAI, Google and other tech giants, which mimic human writing and other media.
“The big choice is, do you try to learn from people’s data, or do you try to learn from an (AI) agent’s own life and its own experience?” Sutton said.
Sutton has dismissed what he describes as overblown concerns about AI’s threat to humanity, while Barto disagreed and said: “You must be aware of possible unexpected consequences.”
Barto, retired for 14 years, describes himself as a Luddite, while Sutton is embracing a future that he expects to include beings of greater intelligence than current humans, an idea sometimes known as posthumanism.
“People are machines. They are amazing, wonderful machines,” but they are not the “end product” and they could work better, Sutton said.
“It is intrinsically part of the AI enterprise,” Sutton said. “We are trying to understand ourselves and, of course, to make things that can work even better. Maybe to become those things.”