“This is a really amazing result,” said François Le Gall, a mathematician at Nagoya University in Japan who was not involved in the work. “Matrix multiplication is used everywhere in engineering,” he said. “Anything you want to solve numerically, you usually use matrices.”
Despite the ubiquity of the calculation, it is still poorly understood. A matrix is simply a grid of numbers, representing anything you want. Multiplying two matrices usually involves multiplying the rows of one matrix by the columns of the other. The basic technique is taught in high school. “It’s like the ABCs of computers,” said Pushmeet Kohli, AI team leader for Science at DeepMind.
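That textbook method can be sketched in a few lines of Python (a minimal illustration, not code from the research): each entry of the product is the dot product of a row of the first matrix with a column of the second, which for two n-by-n matrices takes n³ scalar multiplications.

```python
def matmul(A, B):
    """Multiply matrices A and B (lists of rows) the textbook way."""
    n, m, p = len(A), len(B), len(B[0])
    # The inner dimensions must agree: A is n x m, B is m x p.
    assert all(len(row) == m for row in A), "dimension mismatch"
    # Entry (i, j) is the dot product of row i of A and column j of B.
    return [[sum(A[i][k] * B[k][j] for k in range(m)) for j in range(p)]
            for i in range(n)]
```

For example, `matmul([[1, 2], [3, 4]], [[5, 6], [7, 8]])` returns `[[19, 22], [43, 50]]`. It is the number of scalar multiplications in this process that faster algorithms try to reduce.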
But things get complicated when you try to find a faster method. “No one knows the best algorithm to solve it,” said Le Gall. “It’s one of the biggest open problems in computer science.”
This is because there are more ways to multiply two matrices than there are atoms in the universe (more than 10³³, for some of the cases the researchers looked at). “The number of possible actions is almost limitless,” said Thomas Hubert, an engineer at DeepMind.
The trick is to turn the problem into a kind of three-dimensional board game, called TensorGame. The board represents the multiplication problem to be solved, and each move represents the next step in solving that problem. Thus, the series of moves made in a game represents an algorithm.
The researchers trained a new version of AlphaZero, called AlphaTensor, to play this game. Instead of learning a series of best moves to make in Go or chess, AlphaTensor learned a series of best moves to make when multiplying matrices. It was rewarded for winning the game in as few moves as possible.
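To see what a “shorter” algorithm looks like, consider Strassen’s classic 1969 trick, the kind of decomposition AlphaTensor searches for: it multiplies two 2×2 matrices with 7 scalar multiplications instead of the usual 8. The sketch below is a standard textbook rendering, not DeepMind’s code.

```python
def strassen_2x2(A, B):
    """Multiply two 2x2 matrices using only 7 multiplications (Strassen)."""
    (a, b), (c, d) = A
    (e, f), (g, h) = B
    # Seven products, each combining sums/differences of the entries.
    m1 = (a + d) * (e + h)
    m2 = (c + d) * e
    m3 = a * (f - h)
    m4 = d * (g - e)
    m5 = (a + b) * h
    m6 = (c - a) * (e + f)
    m7 = (b - d) * (g + h)
    # The four entries of the product are additions of those products.
    return [[m1 + m4 - m5 + m7, m3 + m5],
            [m2 + m4, m1 - m2 + m3 + m6]]
```

Applied recursively to large matrices, saving even one multiplication per block compounds into a substantially faster algorithm, which is why each saved move in TensorGame matters.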
“We turned it into a game, our favorite kind of framework,” said Hubert, one of the principal researchers on AlphaZero.