The Merging Was Always the Point
An internal OpenAI model disproved Erdős's 1946 unit-distance conjecture. Which is what Wiles did, what Langlands is doing, what every mathematical breakthrough has ever been. Does that mean AI can think?
Yesterday, an internal model at OpenAI disproved the Erdős unit-distance conjecture. The conjecture is from 1946. It is, depending on which discrete geometer you ask, either the best-known or the most-tried open problem in combinatorial geometry. Paul Erdős attached a $500 prize to it. The disproof is, according to the nine mathematicians who examined it line by line, correct.
Let me say what the problem actually is, because it took me forever to understand it (did I mention I am not a mathematician??).
Put n points in a plane, anywhere you want. Count the pairs of points that are exactly distance 1 from each other. Call that count the "unit distances." How big can the count get as n grows? Erdős showed in 1946 that you can get the count to grow a tiny bit faster than n itself. He conjectured that you cannot do much better than that, formally that the count is bounded by n raised to (1 + o(1)), where the o(1) shrinks toward zero as n grows. Eighty years of human effort have, broadly, been on the side of proving that ceiling. The model showed there is no such ceiling. You can construct point sets that beat the bound by a fixed exponent forever.
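If the counting itself feels abstract, here is a minimal sketch in Python of what "count the unit distances" means. It is a brute-force check over every pair, fine for a handful of points and hopeless for large n. The function name and the floating-point tolerance are my own illustrative choices, not anything from the paper.

```python
from itertools import combinations
import math


def count_unit_distances(points, tol=1e-9):
    """Count unordered pairs of points whose distance is 1 (within tol)."""
    return sum(
        1
        for (x1, y1), (x2, y2) in combinations(points, 2)
        if abs(math.hypot(x1 - x2, y1 - y2) - 1.0) <= tol
    )


# Four corners of a unit square: the four sides count, the diagonals do not.
square = [(0, 0), (1, 0), (1, 1), (0, 1)]

# Two equilateral triangles glued along an edge: five unit distances.
h = math.sqrt(3) / 2
rhombus = [(0, 0), (1, 0), (0.5, h), (1.5, h)]

print(count_unit_distances(square))   # 4
print(count_unit_distances(rhombus))  # 5
```

Same number of points, different arrangement, different count. The whole problem is about how far that trick can be pushed as n grows.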
The way it did this is, if you squint at it, the most ordinary thing mathematics has ever done.
The original Erdős construction was a square grid of points where the coordinates were ordinary integers, and the unit distances came from the algebraic structure of the Gaussian integers, the ring Z[i], the same object you saw the first time you took a course on complex numbers. The natural generalization is to swap the Gaussian integers for some other algebraic object of the same shape and see whether you get more unit distances. No matter what people tried, the bound would stubbornly recover Erdős's original number. Will Sawin's reflection in the companion paper explains why: when you actually compute it out, the natural generalization gives you the same answer as the original construction, so there is no apparent reason to try anything else.
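To get a feel for why the grid helps, here is a rough sketch in Python, under my own simplifying assumptions. Take an m-by-m integer grid and pick a squared distance d. Every way of writing d as a² + b² gives a direction in which grid points sit at exactly that distance, and the number of such ways is governed by how d factors in the Gaussian integers. Shrinking the whole grid by a factor of √d turns those pairs into unit distances. The function name and the values of m and d below are illustrative only, not the parameter choices from Erdős's argument.

```python
def grid_pairs_at_sq_distance(m, d):
    """Count unordered pairs in the m-by-m integer grid whose squared distance is d."""
    # Every integer offset (a, b) with a^2 + b^2 == d that fits inside the grid.
    offsets = [
        (a, b)
        for a in range(-m + 1, m)
        for b in range(-m + 1, m)
        if a * a + b * b == d
    ]
    # For a fixed offset, (m - |a|) * (m - |b|) starting points keep both ends in the grid.
    ordered = sum((m - abs(a)) * (m - abs(b)) for a, b in offsets)
    # Each unordered pair was counted twice, once per direction.
    return ordered // 2


# d = 1 can only be written as (+-1, 0) or (0, +-1): the plain grid edges.
print(grid_pairs_at_sq_distance(10, 1))   # 180

# d = 25 = 5^2 = 3^2 + 4^2 has twelve offsets, so more pairs land on that distance.
print(grid_pairs_at_sq_distance(10, 25))  # 268
```

Same 100 points, but choosing a distance with more sum-of-two-squares representations yields more pairs at that distance. That is the lever the grid construction pulls, and it is also where the "natural generalizations" kept landing back on the same number.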
So this is where we get to the novelty: the model tried something else.
Specifically, it took the construction and varied the wrong thing. Instead of fixing the field and varying which primes you use inside it, it fixed the primes and varied the field, letting the field's degree grow to infinity along a particular tower of fields known to algebraic number theorists since the 1960s. This regime is, according to Jacob Tsimerman, who briefly tried it himself, "very scary." It's hard to hold in your head, and the obvious calculations do not give you any signal that it is going to work. Humans who got this far typically wrote it off as a dead end and turned around.
The model did not turn around. It also did not need to have an intuition about whether the conjecture was true. As Arul Shankar observed in his reflection, a "significant majority" of the model's chain-of-thought was spent trying to construct a counterexample, not a proof. Erdős believed his conjecture for fifty years, until he died in 1996, and no one else had a reason to disbelieve one of the greatest mathematicians of all time. The model did not believe anything.
Every meaningful breakthrough in modern mathematics has, at its core, been a merging. Andrew Wiles proved Fermat's Last Theorem by realizing it was the same problem as a question about elliptic curves and modular forms, three areas that, before Wiles, were not visibly the same conversation. Guth and Katz almost-solved the distinct distances problem by importing the polynomial method from algebraic geometry into combinatorics. The Langlands program is one giant unfinished exercise in merging seemingly-separate domains into one structure. You can pull on this thread for a long time. The history of mathematics is the history of noticing that things you thought were separate are the same thing seen from different angles.
The OpenAI model did the merging. The model took a discrete geometry problem, recognized it as algebraic number theory wearing a hat, walked over to the algebraic number theory shelf, picked up a tool from 1964 (Golod-Shafarevich), combined it with a tool from 2007 (Ellenberg-Venkatesh), combined those with a tool from 2021 (Hajir-Maire-Ramakrishna), and used the combination to do something none of those tools had previously been used for. This is no different in kind from what humans have been doing in mathematics for four hundred years.
Then we get to the more fundamental question which is... is this what thinking is?
When you have an insight, the experience of insight is the experience of recognizing that something you knew from over there applies to something you are stuck on over here. It is not, despite what the romantic version says, the arrival of a new fact from nowhere. There is no atomic operation called "having an idea." There is pattern-matching across stored experience, and there is the moment when the pattern lights up and you see that two things are the same thing. That moment is what thought is.
If that is true, then "is AI really thinking?" becomes a less interesting question. Thinking is a specific operation that can be performed by anything that can go through that motion. The OpenAI model has access to a much larger stored library than any individual mathematician, runs the pattern-match faster, does not get tired, and crucially does not feel embarrassed when the pattern-match takes it into a domain where it is not formally credentialed. That last one matters more than people give it credit for, in my opinion. Mathematicians have careers. Careers have specialties. Specialties have social costs for stepping outside them. The model has no specialty and no social cost.
This isn't AGI though... not yet.
The model did not pick the problem, nor did it decide its output was worth listening to. Nine mathematicians spent serious unpaid weekend time turning the raw output into a paper that other mathematicians could read. Melanie Matchett Wood makes the sharpest version of this point in her reflection: if the same nine experts had been assembled a month ago to look for a counterexample, she thinks they would have found one. The reason no one assembled them is that no one knew to ask. The model's contribution was not just the proof; it was the act of producing a thing convincing enough that experts would spend a weekend taking it seriously. That convincing-enough threshold is a thing humans have spent centuries building social machinery to enforce.
So the part that is genuinely new is not "AI can think." We have known that since at least DeepMind's AlphaGo move 37, and probably longer. The genuinely new part is that the operation can now be aimed at problems human mathematicians had not given themselves permission to seriously work on, and produce output that crosses the threshold where humans agree to look at it.
The asking is still ours. The deciding-to-look is still ours. The "this is interesting enough that I will spend my Saturday on it" is still ours. Those turn out to be different operations from the merging, and we did not know that before. We do now.
So while the model disproved a conjecture, the mathematicians disproved a quieter one: that the part of mathematics that requires merging across distant fields is the part that requires a mathematician. We are going to have to find a new place to draw that line. The drawing of the line is also, of course, ours.