Can AI create new mathematics? In a single second, AlphaEvolve can find a solution that scientists have spent years searching for.


AlphaEvolve has proven that machines can participate in fundamental science.

Google DeepMind has introduced AlphaEvolve, a system that uses large language models to search for and test new combinatorial structures. These structures advance research in theoretical computer science, particularly in the study of hardness of approximation.

Researchers note that modern language models have already shown strong results in mathematics and programming, yet they have played almost no part in the discovery of new theorems. The main obstacle is the absolute correctness that mathematics demands: any statement must be either formally proven or verified by an expert.

In the article ‘Reinforced Generation of Combinatorial Structures: Applications to Complexity Theory’, the authors describe how AlphaEvolve finds new structures, which are then automatically verified by a computer. The system works on the principle of evolution: it generates a population of program fragments, selects the most successful ones, and gradually improves them (see the sketch below). This approach has led to progress in two areas: the MAX-4-CUT problem and the study of the properties of random graphs.
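The paper describes the shape of this loop rather than any specific code, so the following minimal Python sketch is purely illustrative: `propose_mutation` stands in for the LLM and `score` for the automatic verifier, and all names are hypothetical.

```python
import random

def evolve(seed_programs, propose_mutation, score, generations=100, population=20):
    """Toy evolutionary loop: keep the best-scoring candidates,
    mutate a survivor, score the offspring, repeat."""
    pool = [(score(p), p) for p in seed_programs]
    for _ in range(generations):
        pool.sort(key=lambda sp: sp[0], reverse=True)
        pool = pool[:population]            # selection: keep the fittest
        parent = random.choice(pool)[1]     # pick a survivor to vary
        child = propose_mutation(parent)    # in AlphaEvolve, an LLM proposes the edit
        pool.append((score(child), child))  # automatic verification acts as fitness
    return max(pool, key=lambda sp: sp[0])[1]

# Toy usage: 'evolve' an integer towards 42 by random unit steps.
print(evolve([0], lambda x: x + random.choice([-1, 1]), lambda x: -abs(x - 42)))
```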

For the MAX-4-CUT problem, it was previously known that it cannot be approximated to within a factor better than 0.9883. AlphaEvolve found a new so-called ‘gadget’, a special construction with 19 variables and an intricate system of weights, which improved this threshold to 0.987. A lower threshold rules out a wider range of algorithms, so moving from 0.9883 down to 0.987 strengthens the hardness result; in approximation theory, such steps are considered significant achievements, since each new barrier is difficult to overcome.
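To make the numbers concrete: MAX-4-CUT asks for a partition of a graph's vertices into four groups that maximises the total weight of edges running between different groups, and an approximation factor of 0.987 means guaranteeing at least 98.7% of the optimum. A brute-force scorer for tiny instances might look like this (an illustrative sketch, not code from the paper):

```python
from itertools import product

def cut_value(edges, coloring):
    # Total weight of edges whose endpoints land in different groups.
    return sum(w for u, v, w in edges if coloring[u] != coloring[v])

def max_4cut_bruteforce(n, edges):
    # Try all 4**n assignments of n vertices to the four groups.
    return max(cut_value(edges, c) for c in product(range(4), repeat=n))

# A unit-weight triangle: any coloring using three distinct groups cuts all 3 edges.
print(max_4cut_bruteforce(3, [(0, 1, 1), (1, 2, 1), (0, 2, 1)]))  # -> 3
```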


In addition, the system investigated the average-case complexity of problems on random graphs, where Ramanujan graphs play a key role. AlphaEvolve managed to find such graphs with hundreds of vertices, significantly larger than had been possible before. This helped refine the bounds on computational difficulty, bringing the lower and upper estimates closer together.
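Ramanujan graphs are d-regular graphs whose non-trivial adjacency eigenvalues all have absolute value at most 2*sqrt(d-1), making them optimal spectral expanders. A direct numerical check of that defining condition could be written as follows (a sketch of the standard definition, not the paper's verification code):

```python
import numpy as np

def is_ramanujan(adj, d):
    """Check |lambda| <= 2*sqrt(d-1) for every non-trivial eigenvalue
    of a connected d-regular graph's adjacency matrix."""
    eigs = np.linalg.eigvalsh(np.asarray(adj, dtype=float))
    bound = 2 * np.sqrt(d - 1)
    nontrivial = [l for l in eigs if abs(abs(l) - d) > 1e-9]  # drop the trivial +/- d
    return all(abs(l) <= bound + 1e-9 for l in nontrivial)

# K_4 is 3-regular with eigenvalues {3, -1, -1, -1}; since 1 <= 2*sqrt(2), it is Ramanujan.
K4 = [[0, 1, 1, 1], [1, 0, 1, 1], [1, 1, 0, 1], [1, 1, 1, 0]]
print(is_ramanujan(K4, 3))  # -> True
```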

The main feature of the work is that every structure found was checked for correctness not only by accelerated methods but also by the original, slower ‘brute-force’ algorithm. This guaranteed the reliability of the results.
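That discipline amounts to cross-checking the fast verifier against the slow reference on every discovered structure before trusting it. In sketch form (function names here are illustrative, not from the paper):

```python
def cross_check(structures, fast_score, reference_score, tol=1e-9):
    # Accept the accelerated verifier only if it agrees with the
    # original reference algorithm on every discovered structure.
    for s in structures:
        if abs(fast_score(s) - reference_score(s)) > tol:
            raise AssertionError(f"verifier mismatch on structure {s!r}")
    return True
```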

The authors emphasise that this is not yet a matter of AI proving new theorems on its own. But such systems are already capable of producing components of proofs, which are then ‘lifted’ to more general, universal results. Going forward, the key problem will be verifying correctness, since the amount of computation this requires will grow along with the complexity of the tasks.