Two Minute Papers explores the paper “Mastering Atari, Go, Chess and Shogi by Planning with a Learned Model.”
Two Minute Papers explores the paper “AlphaStar: Grandmaster level in StarCraft II using multi-agent reinforcement learning” in this video.
Here’s a great explanation of Reinforcement Learning, AlphaGo Zero, and how it compares to other forms of machine learning.
For example, AlphaGo, in order to learn to play (the action) the game of Go (the environment), first learned to mimic human Go players from a large data set of historical games (apprentice learning). It then improved its play through trial and error (reinforcement learning), by playing large numbers of Go games against independent instances of itself.
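The trial-and-error phase described above can be illustrated with a toy sketch. This is not AlphaGo's method: it skips the imitation phase entirely, uses a lookup table instead of neural networks and tree search, and swaps Go for a tiny take-away game (remove 1 to 3 stones from a pile; whoever takes the last stone wins). The names `legal_moves`, `train_self_play`, and `best_move`, and all parameter values, are assumptions made up for this illustration.

```python
import random


def legal_moves(pile):
    """Moves allowed in the toy game: take 1, 2, or 3 stones."""
    return [m for m in (1, 2, 3) if m <= pile]


def train_self_play(episodes=20000, max_pile=10, epsilon=0.2, alpha=0.1, seed=0):
    """Learn move values by playing both sides of the game with one shared table.

    q maps (pile, move) to an estimate of the final outcome (+1 win, -1 loss)
    for the player who makes that move. All hyperparameters are illustrative.
    """
    rng = random.Random(seed)
    q = {}
    for _ in range(episodes):
        pile = rng.randint(1, max_pile)
        history = []  # (pile, move) pairs, players alternating
        while pile > 0:
            moves = legal_moves(pile)
            if rng.random() < epsilon:
                move = rng.choice(moves)  # explore: try a random move
            else:
                # exploit: pick the move currently believed to be best
                move = max(moves, key=lambda m: q.get((pile, m), 0.0))
            history.append((pile, move))
            pile -= move
        # The player who made the last move won. Walk backwards through the
        # game, flipping the sign of the outcome for each alternating player.
        outcome = 1.0
        for state_action in reversed(history):
            old = q.get(state_action, 0.0)
            q[state_action] = old + alpha * (outcome - old)
            outcome = -outcome
    return q


def best_move(q, pile):
    """Greedy move according to the learned table."""
    return max(legal_moves(pile), key=lambda m: q.get((pile, m), 0.0))


if __name__ == "__main__":
    table = train_self_play()
    for pile in range(1, 8):
        print(pile, best_move(table, pile))
```

In this game the known winning strategy is to leave the opponent a multiple of four stones, so the learned table can be checked against that rule. Both "players" share one table, which is the toy equivalent of a program improving by playing against copies of itself.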
If my post yesterday about DeepMind’s AlphaZero piqued your interest but answered too few questions, then check out this video from the 2017 NIPS conference where Dr. David Silver delivers the keynote.
Dr. David Silver leads the reinforcement learning research group at DeepMind and is lead researcher on AlphaGo.
First, AlphaGo beat the best human player in the world after learning from a large data set of human vs. human games and then improving through self-play.
Then AlphaGo Zero came along and taught itself to be even better without any human-generated data. By the way, it beat AlphaGo.
This is the power of Reinforcement Learning.