Saturday, October 1, 2016

AlphaToe

AlphaGo

Is an AI developed by Google Deepmind that recently became the first machine to beat a top level human Go player.

AlphaToe

Is an attempt to apply the same techniques used in AlphaGo to Tic-Tac-Toe. Why? I hear you ask. Tic-tac-toe is a very simple game and can be solved using basic min-max.

Because it's a good platform to experiment with some of the AlphaGo techniques which it turns out they work at this scale. Also the neural networks involved can also be trained on my laptop in under an hour as opposed too the weeks on an array of super computers that AlphaGo required.

The project is written in Python using TensorFlow, the Github is here https://github.com/DanielSlater/AlphaToe and contains code for each step that AlphaGo used in it's learning. It also contains code for Connect 4 and this ability to build games of Tic-Tac-Toe on larger boards.

Here is a sneak peak at how it did in the 3x3 game. In this graph it is training as first player and gets too an 85% win rate against a random opponent after 300000 games.




I will do a longer write up of this at some point, but in the mean time here is a talk I did about AlphaToe at a recent DataScienceFestival event in London. Which gives a broad overview of the project: