2 min read

Google’s DeepMind introduced AlphaZero last year as a reinforcement learning program that masters three different types of board games, Chess, Shogi and Go to beat world champions in each case. Yesterday, they announced that a full evaluation of AlphaZero has been published in the journal Science, which confirms and updates the preliminary results. The research paper describes how Deepmind’s AlphaZero learns each game from scratch, without any human intervention or no inbuilt domain knowledge but the basic rules of the game.

Unlike traditional game playing programs, Deepmind’s AlphaZero uses deep neural networks, a general-purpose reinforcement learning algorithm, and a general-purpose tree search algorithm. The first play by the program is completely random. Over-time the system uses RL algorithms to learn from wins, losses and draws to adjust the parameters of the neural network. The amount of training varies taking approximately 9 hours for chess, 12 hours for shogi, and 13 days for Go. For searching, it uses Monte-Carlo Tree Search (MCTS)  to select the most promising moves in games.

Testing and Evaluation

Deepmind’s AlphaZero was tested against the best engines for chess (Stockfish), shogi (Elmo), and Go (AlphaGo Zero). All matches were played for three hours per game, plus an additional 15 seconds for each move. AlphaZero was able to beat all its component in each evaluation.

Per Deepmind’s blog:

In chess, Deepmind’s AlphaZero defeated the 2016 TCEC (Season 9) world champion Stockfish, winning 155 games and losing just six games out of 1,000. To verify the robustness of AlphaZero, it was also played on a series of matches that started from common human openings. In each opening, AlphaZero defeated Stockfish.

It also played a match that started from the set of opening positions used in the 2016 TCEC world championship, along with a series of additional matches against the most recent development version of Stockfish, and a variant of Stockfish that uses a strong opening book. In all matches, AlphaZero won.

In shogi, AlphaZero defeated the 2017 CSA world champion version of Elmo, winning 91.2% of games.

In Go, AlphaZero defeated AlphaGo Zero, winning 61% of games.

AlphaZero’s ability to master three different complex games is an important progress towards building a single AI system that can solve a wide range of real-world problems and generalize to new situations.

People on the internet are also highly excited about this new achievement.

Read Next

Deepmind’s AlphaFold is successful in predicting the 3D structure of a protein making major inroads for AI use in healthcare.

Google makes major inroads into healthcare tech by absorbing DeepMind Health.

AlphaZero: The genesis of machine intuition

Content Marketing Editor at Packt Hub. I blog about new and upcoming tech trends ranging from Data science, Web development, Programming, Cloud & Networking, IoT, Security and Game development.