AI Program Plays the Long Game to Solve Decades-Old Math Problems

A chess match demands that players think several moves ahead, a skill that computer programs have mastered over the decades. In 1997, an IBM supercomputer famously triumphed over the reigning world chess champion, Garry Kasparov. Then, in 2017, an AI program created by Google DeepMind, called AlphaZero, outperformed the most advanced computer chess engines of the time after teaching itself the game in a matter of hours.

Recently, some mathematicians have begun exploring whether AI algorithms can also help solve some of the world’s most challenging math problems. But while a typical chess match consists of around 30 to 40 moves, these advanced math problems demand solutions that involve a million or more moves.

In a preprint paper, a team led by Caltech’s Sergei Gukov, the John D. MacArthur Professor of Theoretical Physics and Mathematics, describes a new machine-learning algorithm that can solve math problems requiring extremely long sequences of steps. The researchers used their new algorithm to tackle families of problems connected to a long-standing math problem known as the Andrews–Curtis conjecture. In essence, the algorithm can look further ahead than even sophisticated programs such as AlphaZero.

“Our program seeks to discover long sequences of moves that are uncommon and challenging to identify,” states lead author Ali Shehper, a postdoctoral researcher at Rutgers University who will soon transition to Caltech as a research scientist. “It’s akin to navigating through a maze the size of the Earth. These routes are extensive and must be carefully tested, with only one correct path.”

The use of AI to solve math problems has attracted growing attention. Google DeepMind’s AlphaProof performed at the level of a silver medalist in the 2024 International Mathematical Olympiad, a prestigious high-school-level competition. More recently, OpenAI’s o3 program has worked out solutions to benchmark problems in math, science, and computer programming.

The Caltech mathematicians are focusing on the most formidable problems in their field, not on routine ones. In the new study, they used AI to tackle two families of problems related to the Andrews–Curtis conjecture, a problem in group theory first proposed about 60 years ago.

Although they did not resolve the main conjecture itself, they ruled out families of problems, known as potential counterexamples, that had remained open for nearly 25 years; they also made progress on another family of counterexamples that mathematicians have puzzled over for 44 years. Counterexamples are, in essence, mathematical scenarios that could disprove an original conjecture. If the potential counterexamples are ruled out, the original conjecture may still hold true.
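For context, the Andrews–Curtis conjecture concerns balanced presentations of the trivial group and whether they can always be simplified using a small set of allowed transformations, the so-called AC-moves: multiplying one relator by another, inverting a relator, and conjugating a relator by a generator. The sketch below, which is illustrative and not taken from the team's paper, shows these moves acting on words in a free group, using the common convention that an uppercase letter denotes the inverse of the corresponding lowercase generator:

```python
# Illustrative sketch of the Andrews-Curtis moves on a pair of relators.
# Words in the free group on {x, y} are strings; 'X' means x^-1, 'Y' means y^-1.

def reduce_word(w):
    """Freely reduce a word by cancelling adjacent inverse pairs like 'xX'."""
    out = []
    for c in w:
        if out and out[-1] == c.swapcase():
            out.pop()          # cancel g * g^-1
        else:
            out.append(c)
    return "".join(out)

def invert(w):
    """Inverse of a word: reverse it and invert each letter."""
    return w[::-1].swapcase()

# The AC-moves, applied to the first relator of the pair (r1, r2):
def concatenate(r1, r2):
    """r1 -> r1 * r2"""
    return reduce_word(r1 + r2), r2

def invert_relator(r1, r2):
    """r1 -> r1^-1"""
    return invert(r1), r2

def conjugate(r1, r2, g):
    """r1 -> g * r1 * g^-1 for a generator (or word) g"""
    return reduce_word(g + r1 + invert(g)), r2
```

Each move preserves the group the presentation defines, but the number of possible move sequences grows explosively with length, which is why paths requiring millions of steps are so hard to find.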

“Eliminating some counterexamples boosts our confidence in the integrity of the original conjecture and aids in evolving our intuition regarding the principal problem. It provides us with new perspectives,” Shehper remarks.

Gukov mentions that traversing these mathematical challenges is akin to “traveling from A to B” through intricate, convoluted paths that necessitate thousands, millions, or even billions of steps. He likens these challenges to unraveling an extremely complex Rubik’s Cube.

“Can you return this disordered, intricate Rubik’s Cube to its original configuration? You must explore these extensive sequences of moves, and you won’t ascertain if you’re on the right track until the conclusion,” explains Gukov, who also directs Caltech’s new Richard N. Merkin Center for Pure and Applied Mathematics.

The AI program developed by the team learned to generate lengthy sequences of moves—referred to by the researchers as “super moves”—that are unusual, or what they consider outliers. This operates in contrast to how AI systems like ChatGPT function.

“When you ask ChatGPT to compose a letter, it will produce something conventional. It’s improbable that it will generate anything truly original and distinctive. It’s an effective imitator,” Gukov notes. “Our system excels in producing outliers.”

To train their AI program, the researchers utilized a machine-learning approach known as reinforcement learning. Initially, the team presented the AI with straightforward problems to solve, progressively introducing more complex challenges. “It experiments with different moves and receives rewards for solving the problems,” Shehper clarifies. “We encourage the program to pursue more of the same while maintaining a degree of curiosity. Ultimately, it invents new strategies that surpass human capabilities. That’s the essence of reinforcement learning.”
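The training recipe Shehper describes, a reward for solving each problem combined with a curriculum that moves from easy instances to hard ones, can be sketched with a toy tabular Q-learning example. This is purely illustrative; the team's actual system, state space, and rewards are far more complex:

```python
# Toy curriculum-style reinforcement learning (illustrative only).
# States 0..19 sit on a line; the goal is state 0; the agent is rewarded
# only when it solves the task (reaches the goal).
import random

N_STATES = 20
ACTIONS = [-1, +1]                      # step left or right
Q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}

def step(state, action):
    nxt = max(0, min(N_STATES - 1, state + action))
    reward = 1.0 if nxt == 0 else 0.0   # reward only for solving
    return nxt, reward

def train_from(start, episodes=200, alpha=0.5, gamma=0.9, eps=0.2):
    """Epsilon-greedy Q-learning episodes that all begin at `start`."""
    for _ in range(episodes):
        s = start
        for _ in range(2 * N_STATES):   # cap the episode length
            if random.random() < eps:
                a = random.choice(ACTIONS)
            else:
                a = max(ACTIONS, key=lambda act: Q[(s, act)])
            nxt, r = step(s, a)
            best_next = max(Q[(nxt, b)] for b in ACTIONS)
            Q[(s, a)] += alpha * (r + gamma * best_next - Q[(s, a)])
            s = nxt
            if s == 0:
                break

random.seed(0)
# Curriculum: train on starting states close to the goal first, then farther.
for start in range(1, N_STATES):
    train_from(start)

# After training, the greedy policy steps toward the goal from every state.
policy = {s: max(ACTIONS, key=lambda act: Q[(s, act)])
          for s in range(1, N_STATES)}
```

The curriculum matters: states far from the goal are only reachable by the learner once nearer states already carry useful value estimates, mirroring the article's progression from straightforward problems to complex ones.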

Currently, AI systems generally lack proficiency in predicting rare, outlier events with significant repercussions, such as stock market crashes. The team’s new algorithm cannot make such predictions either, yet it may hold the foundational elements necessary for making intelligent forecasts of this kind. “Essentially, our program learns how to learn,” Gukov states. “It’s engaged in thinking beyond conventional boundaries.”

The new algorithm has already generated considerable excitement in the mathematics community. “We’ve achieved substantial advancements in an area of mathematics that has persisted for decades,” Gukov remarks. “Progress had been rather slow, but now it’s thriving and bustling.” Indeed, three new mathematicians have joined the effort—Lucas Fagan and Zhenghan Wang of UC Santa Barbara and Yang Qiu of Nankai University in Tianjin, China—and the team has posted another preprint detailing solutions to even more families of potential counterexamples related to the Andrews–Curtis conjecture.

Rather than scaling up its AI models, the group’s approach has been to find clever tricks and strategies that do not require vast amounts of computing resources. “We aim to showcase efficient performance on small-scale computers, readily available to a small academic collaboration, allowing our colleagues worldwide to easily replicate these outcomes.”

This endeavor was made feasible thanks to contributions from private benefactors. The donations also facilitated the establishment of a new math and AI group at Caltech, concentrating on the creation of AI systems capable of addressing challenging research-level mathematical issues.

Other contributors to the first preprint study, titled “What makes math problems hard for reinforcement learning: a case study,” include Anibal M. Medina-Mardones of Western University in Canada, Bartłomiej Lewandowski and Piotr Kucharski of the University of Warsaw in Poland, and Angus Gruen (PhD ’23) of Polygon Zero.

