# The Neural Network Method Can Decipher the Mathematics Of AI

By | 16/09/2022

More than 70 years ago, researchers at the forefront of artificial intelligence research introduced neural networks as a revolutionary way to think about how the brain works. In the human brain, networks of billions of connected neurons make sense of sensory data, allowing us to learn from experience. Artificial neural networks can also filter huge amounts of data through connected layers to make predictions and recognize patterns, following rules they taught themselves.

By now, people treat neural networks as a kind of AI panacea, capable of solving tech challenges that can be restated as a problem of pattern recognition. They provide natural-sounding language translation. Photo apps use them to recognize and categorize recurring faces in your collection. And programs driven by neural nets have defeated the world’s best players at games including Go and chess.

However, neural networks have always lagged in one conspicuous area: solving difficult symbolic math problems. These include the hallmarks of calculus courses, like integrals or ordinary differential equations. The hurdles arise from the nature of mathematics itself, which demands precise solutions. Neural nets instead tend to excel at probability. They learn to recognize patterns (which Spanish translation sounds best, or what your face looks like) and can generate new ones.

The situation changed late last year when Guillaume Lample and François Charton, a pair of computer scientists working in Facebook’s AI research group in Paris, unveiled a successful first approach to solving symbolic math problems with neural networks. Their method didn’t involve number crunching or numerical approximations. Instead, they played to the strengths of neural nets, reframing the math problems in terms of a problem that’s practically solved: language translation.

“We both majored in math and statistics,” said Charton, who studies applications of AI to mathematics. “Math was our original language.”

As a result, Lample and Charton’s program could produce precise solutions to complicated integrals and differential equations, including some that stumped popular math software packages with explicit problem-solving rules built in.

The new program exploits one of the major advantages of neural networks: They develop their own implicit rules. As a result, “there’s no separation between the rules and the exceptions,” said Jay McClelland, a psychologist at Stanford University who uses neural nets to model how people learn math. In practice, this means that the program didn’t stumble over the hardest integrals. In theory, this kind of approach could derive unconventional “rules” that could make headway on problems that are currently unsolvable, by a person or a machine: mathematical problems like discovering new proofs, or understanding the nature of neural networks themselves.

Not that that’s happened yet, of course. But it’s clear that the team has answered the decades-old question of whether AI can do symbolic math in the affirmative. “Their models are well established. The algorithms are well established. They postulate the problem in a clever way,” said Wojciech Zaremba, co-founder of the AI research group OpenAI.

“They did succeed in coming up with neural networks that could solve problems that were beyond the scope of the rule-following machine system,” McClelland said. “Which is very exciting.”

## Teaching a Computer to Speak Math

Computers have always been good at crunching numbers. Computer algebra systems combine dozens or hundreds of algorithms hard-wired with preset instructions. They’re typically strict rule followers designed to perform a specific operation but unable to accommodate exceptions. For many symbolic problems, they produce numerical solutions that are close enough for engineering and physics applications.

Neural nets are different. They don’t have hard-wired rules. Instead, they train on large data sets (the larger the better) and use statistics to make very good approximations. In the process, they learn what produces the best outcomes. Language translation programs particularly shine: Instead of translating word by word, they translate phrases in the context of the whole text. The Facebook researchers saw that as an advantage to solving symbolic math problems, not a hindrance. It gives the program a kind of problem-solving freedom.

That freedom is especially useful for certain open-ended problems, like integration. There’s an old saying among mathematicians: “Differentiation is mechanics; integration is art.” It means that in order to find the derivative of a function, you just have to follow some well-defined steps. But to find an integral often requires something else, something that’s closer to intuition than calculation.

The Facebook group suspected that this intuition could be approximated using pattern recognition. “Integration is one of the most pattern recognition-like problems in math,” Charton said. So even though the neural net may not understand what functions do or what variables mean, it does develop a kind of instinct. The neural net begins to sense what works, even without knowing why.

For example, a mathematician asked to integrate an expression like $latex y y^{\prime}\left(y^{2}+1\right)^{-1 / 2}$ will intuitively suspect that the primitive (that is, the expression that was differentiated to give rise to the integral) contains something that looks like the square root of $latex y^{2}+1$.
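The suspicion can be checked by hand. As a short worked derivation (treating $latex y$ as a function of $latex x$, which is an assumption about the intended variable), differentiating the square root with the chain rule reproduces the integrand exactly:

```latex
\frac{d}{dx}\sqrt{y^{2}+1}
  = \frac{1}{2}\left(y^{2}+1\right)^{-1/2}\cdot 2\,y\,y^{\prime}
  = y\,y^{\prime}\left(y^{2}+1\right)^{-1/2}
\quad\Longrightarrow\quad
\int y\,y^{\prime}\left(y^{2}+1\right)^{-1/2}\,dx = \sqrt{y^{2}+1} + C
```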

To allow a neural net to process the symbols like a mathematician, Charton and Lample began by translating mathematical expressions into more useful forms. They ended up reinterpreting them as trees, a format similar in spirit to a diagrammed sentence. Mathematical operators such as addition, subtraction, multiplication and division became junctions on the tree. So did operations like raising to a power, or trigonometric functions. The arguments (variables and numbers) became leaves. The tree structure, with very few exceptions, captured the way operations can be nested within longer expressions.
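To make the tree idea concrete, here is a minimal Python sketch. The `Node` class, the `to_prefix` helper and the choice of prefix (operator-first) ordering are illustrative assumptions, not details taken from the article; they show one plausible way a tree could be flattened into the kind of token sequence a translation model reads.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class Node:
    """A junction of the tree: an operator or function with its arguments."""
    op: str
    children: Tuple["Expr", ...]


# Leaves are plain strings (variables) or ints (numbers).
Expr = Union[Node, str, int]


def to_prefix(expr: Expr) -> list:
    """Flatten a tree into a list of tokens, operator first (hypothetical encoding)."""
    if isinstance(expr, Node):
        tokens = [expr.op]
        for child in expr.children:
            tokens.extend(to_prefix(child))
        return tokens
    return [str(expr)]


# The tree for 3 * x + cos(2 * x): "+" is the root, "*" and "cos" are junctions.
tree = Node("+", (
    Node("*", (3, "x")),
    Node("cos", (Node("*", (2, "x")),)),
))

tokens = to_prefix(tree)
print(" ".join(tokens))  # + * 3 x cos * 2 x
```

Because each operator has a fixed number of arguments, the flat sequence needs no parentheses to be read back into the same tree.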

“When we see a large function, we can see that it’s composed of smaller functions and have some intuition about what the solution can be,” Lample said. “We think the model tries to find clues in the symbols about what the solution can be.” He said this process parallels how people solve integrals, and really all math problems, by reducing them to recognizable sub-problems they’ve solved before.

After coming up with this architecture, the researchers used a bank of elementary functions to generate several training data sets totaling about 200 million (tree-shaped) equations and solutions. They then “fed” that data to the neural network, so it could learn what solutions to these problems look like.
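The article does not describe how the equation-and-solution pairs were produced. One conceivable recipe, sketched below purely as an assumption, leans on the fact that differentiation is mechanical: build a random function from a small bank of elementary pieces, differentiate it by rule, and store the derivative as the problem and the original function as its answer. The names `random_expr`, `diff` and `make_pairs` are hypothetical, and the derivatives are left unsimplified.

```python
import random

UNARY = ("sin", "cos", "exp")   # a tiny, illustrative bank of functions
BINARY = ("+", "*")


def random_expr(rng, depth):
    """Build a random expression in x as nested tuples: (op, arg, ...)."""
    if depth == 0 or rng.random() < 0.3:
        return rng.choice(["x", rng.randint(1, 5)])
    if rng.random() < 0.5:
        return (rng.choice(UNARY), random_expr(rng, depth - 1))
    return (rng.choice(BINARY),
            random_expr(rng, depth - 1),
            random_expr(rng, depth - 1))


def diff(e):
    """Differentiate a tree with respect to x using textbook rules."""
    if e == "x":
        return 1
    if isinstance(e, int):
        return 0
    op = e[0]
    if op == "+":
        return ("+", diff(e[1]), diff(e[2]))
    if op == "*":  # product rule
        return ("+", ("*", diff(e[1]), e[2]), ("*", e[1], diff(e[2])))
    inner = e[1]   # remaining cases are unary: apply the chain rule
    if op == "sin":
        return ("*", ("cos", inner), diff(inner))
    if op == "cos":
        return ("*", ("*", -1, ("sin", inner)), diff(inner))
    if op == "exp":
        return ("*", e, diff(inner))
    raise ValueError(f"unknown operator: {op}")


def make_pairs(n, seed=0, depth=3):
    """Return n (integrand, antiderivative) pairs for an integration data set."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(n):
        f = random_expr(rng, depth)
        pairs.append((diff(f), f))  # problem first, known answer second
    return pairs


for problem, answer in make_pairs(3):
    print(problem, "->", answer)
```

A real pipeline would also simplify the expressions and discard duplicates; this sketch skips both.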

After the training, it was time to see what the net could do. The computer scientists gave it a test set of 5,000 equations, this time without the answers. (None of these test problems were classified as “unsolvable.”) The neural net passed with flying colors: It managed to get the right solutions, precision and all, to the vast majority of problems. It particularly excelled at integration, solving nearly 100% of the test problems, but it was slightly less successful at ordinary differential equations.
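How the answers were graded is not spelled out here. One natural check, given as a hedged sketch rather than the team’s actual procedure, is that a proposed integral is right if its derivative matches the original expression. The helper `is_antiderivative` below is hypothetical and tests this numerically at a few sample points with a central difference; the sample points and tolerances are arbitrary choices.

```python
import math


def is_antiderivative(candidate, integrand,
                      points=(0.3, 0.9, 1.7, 2.4), h=1e-5, tol=1e-6):
    """Return True if candidate'(x) matches integrand(x) at every sample point."""
    for x in points:
        # Central-difference estimate of the candidate's slope at x.
        slope = (candidate(x + h) - candidate(x - h)) / (2 * h)
        if not math.isclose(slope, integrand(x), rel_tol=tol, abs_tol=tol):
            return False
    return True


# Integrand x / sqrt(x^2 + 1), with one correct and one incorrect proposal.
integrand = lambda x: x / math.sqrt(x * x + 1)
good = lambda x: math.sqrt(x * x + 1)
bad = lambda x: math.log(x * x + 1)

print(is_antiderivative(good, integrand))  # True
print(is_antiderivative(bad, integrand))   # False
```

A check like this accepts any correct answer regardless of how it is written, which matters because the same integral can take many equivalent forms.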

For almost all the problems, the program took less than 1 second to generate correct solutions. And on the integration problems, it outperformed some solvers in the popular software packages Mathematica and Matlab in terms of speed and accuracy. The Facebook team reported that the neural net produced solutions to problems that neither of those commercial solvers could tackle.

## Into the Black Box

Despite the results, the mathematician Roger Germundsson, who heads research and development at Wolfram, which makes Mathematica, took issue with the direct comparison. The Facebook researchers compared their method to only a few of Mathematica’s functions (“integrate” for integrals and “DSolve” for differential equations), but Mathematica users can access hundreds of other solving tools.

Germundsson also noted that despite the enormous size of the training data set, it only included equations with one variable, and only those based on elementary functions. “It was a thin slice of possible expressions,” he said. The neural net wasn’t tested on messier functions often used in physics and finance, like error functions or Bessel functions. (The Facebook group said it could be, in future versions, with very simple modifications.)

And Frédéric Gibou, a mathematician at the University of California, Santa Barbara who has investigated ways to use neural nets to solve partial differential equations, wasn’t convinced that the Facebook group’s neural net was infallible. “You need to be confident that it’s going to work all the time, and not just on some chosen problems,” he said, “and that’s not the case here.” Other critics have noted that the Facebook group’s neural net doesn’t really understand the math; it’s more of an exceptional guesser.

Still, they agree that the new approach will prove useful. Germundsson and Gibou believe neural nets will have a seat at the table for next-generation symbolic math solvers, though it will be a big table. “I think that it will be one of many tools,” Germundsson said.

Besides solving this specific problem of symbolic math, the Facebook group’s work is an encouraging proof of principle and of the power of this kind of approach. “Mathematicians will in general be very impressed if these techniques allow them to solve problems that people could not solve before,” said Anders Hansen, a mathematician at the University of Cambridge.

Another possible direction for the neural net to explore is the development of automatic theorem generators. Mathematicians are increasingly investigating ways to use AI to generate new theorems and proofs, though “the state of the art has not made a lot of progress,” Lample said. “It’s something we’re looking at.”

Charton describes at least two ways their approach could move AI theorem finders forward. First, it could act as a kind of mathematician’s assistant, offering assistance on existing problems by identifying patterns in known conjectures. Second, the machine could generate a list of potentially provable results that mathematicians have missed. “We believe that if you can do integration, you should be able to do proving,” he said.

Offering help for proofs may ultimately be the killer app, even beyond what the Facebook team described. One common way to disprove a theorem is to come up with a counterexample that shows it can’t hold. And that’s a task that these kinds of neural nets may one day be uniquely suited for: finding an unexpected wrench to throw in the machine.

Another unsolved problem where this approach shows promise is one of the most disturbing aspects of neural nets: No one actually understands how they work. Training bits enter at one end and prediction bits emerge from the other, but what happens in between (the exact process that makes neural nets into such good guessers) remains a critical open question.

Symbolic math, on the other hand, is decidedly less mysterious. “We know how math works,” said Charton. “By using specific math problems as a test to see where machines succeed and where they fail, we can learn how neural nets work.”

Soon, he and Lample plan to feed mathematical expressions into their networks and trace the way the program responds to small changes in the expressions. Mapping how changes in the input trigger changes in the output might help expose how the neural nets operate.

Zaremba sees that kind of understanding as a potential step toward teaching neural nets to reason and to actually understand the questions they’re answering. “It’s easy in math to move the needle and see how well [the neural network] works if expressions are becoming different. We might truly learn the reasoning, instead of just the answer,” he said. “The results would be quite powerful.”