Two Minute Papers discusses the paper “Dota 2 with Large Scale Deep Reinforcement Learning” from OpenAI.
I always knew that reinforcement learning would teach us more about ourselves than any other kind of AI approach. This feeling was backed up in a paper published recently in Nature.
DeepMind, Alphabet’s AI subsidiary, has once again used lessons from reinforcement learning to propose a new theory about the reward mechanisms within our brains.
The hypothesis, supported by initial experimental findings, could not only improve our understanding of mental health and motivation. It could also validate the current direction of AI research toward building more human-like general intelligence.
It turns out the brain’s reward system works in much the same way—a discovery made in the 1990s, inspired by reinforcement-learning algorithms. When a human or animal is about to perform an action, its dopamine neurons make a prediction about the expected reward.
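The reward prediction described above is commonly modeled with temporal-difference (TD) learning, where the gap between the predicted and the received reward drives the update. The snippet below is a minimal sketch of a single-state TD(0) update, not the method from the Nature paper; the function name `td_update` and the values of `alpha` and `gamma` are assumptions chosen for illustration.

```python
def td_update(value, reward, next_value, alpha=0.1, gamma=0.9):
    """Return (new_value, prediction_error) for one TD(0) step.

    value      -- current prediction of the expected reward
    reward     -- reward actually received
    next_value -- prediction for the following state
    alpha      -- learning rate (illustrative choice)
    gamma      -- discount factor (illustrative choice)
    """
    # The prediction error is the difference between what happened
    # and what was predicted.
    error = reward + gamma * next_value - value
    return value + alpha * error, error


# Repeatedly receive a reward of 1.0 in a state with no successor value.
value = 0.0
for _ in range(200):
    value, error = td_update(value, reward=1.0, next_value=0.0)

# The prediction approaches the true reward and the error shrinks toward zero.
print(round(value, 6), round(error, 6))
```

As the prediction improves, the error term fades, which mirrors the idea that a fully expected reward produces little surprise.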
Machine Learning with Phil has got another interesting look at Deep Q Learning as part of a preview of his course.
The two biggest innovations in deep Q learning were the introduction of the target network and the replay memory. One would think that simply bolting a deep neural network to the Q learning algorithm would be enough for a robust deep Q learning agent, but that isn’t the case. In this video I’ll show you how this naive implementation of the deep q learning agent fails, and spectacularly at that.
This is an excerpt from my new course, Deep Q Learning From Paper to Code which you can get on sale with this link
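To make the two ideas mentioned above concrete, here is a minimal sketch of a replay memory and a target-value computation. It is not the code from the video or the course: the class `ReplayMemory`, the function `td_target`, and the use of a plain dictionary in place of a neural network are all assumptions made to keep the example self-contained.

```python
import copy
import random
from collections import deque


class ReplayMemory:
    """Fixed-size buffer of past transitions (hypothetical helper)."""

    def __init__(self, capacity):
        # Old transitions are dropped automatically once capacity is reached.
        self.buffer = deque(maxlen=capacity)

    def store(self, state, action, reward, next_state, done):
        self.buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        # Uniform random sampling breaks the correlation between
        # consecutive transitions.
        return random.sample(list(self.buffer), batch_size)

    def __len__(self):
        return len(self.buffer)


def td_target(reward, next_state, done, target_q, gamma=0.99):
    """Compute the learning target from the frozen target estimates."""
    if done:
        return reward
    return reward + gamma * max(target_q[next_state])


# A dictionary of action values stands in for the network here.
online_q = {0: [0.0, 2.0]}
target_q = copy.deepcopy(online_q)  # periodic copy of the online estimates

online_q[0][1] = 5.0  # the online estimates keep changing during training
target = td_target(reward=1.0, next_state=0, done=False,
                   target_q=target_q, gamma=0.5)

memory = ReplayMemory(capacity=3)
for step in range(5):
    memory.store(step, 0, 1.0, step + 1, False)
batch = memory.sample(2)
```

Because `target_q` is only refreshed occasionally, the target stays fixed while `online_q` is updated, which is the stabilizing effect usually attributed to the target network.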
Machine Learning with Phil posted this tutorial to apply experience replay to the actor critic algorithm.
It seems smart, but it turns out that it doesn’t work.
Despite the fact that the replay memory is critical to the success of the deep Q learning algorithm, it completely breaks the actor critic network.
edureka! covers the basics of reinforcement learning in this tutorial for beginners.
Samuel Arzt shows off a project where an AI learns to park a car in a parking lot in a 3D physics simulation.
The simulation was implemented using Unity’s ML-Agents framework (https://unity3d.com/machine-learning).
From the video description:
The AI consists of a deep Neural Network with 3 hidden layers of 128 neurons each. It is trained with the Proximal Policy Optimization (PPO) algorithm, which is a Reinforcement Learning approach.
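For a rough sense of scale, the sketch below counts the weights and biases in a fully connected network with three hidden layers of 128 neurons. The video description does not state the input or output sizes, so the values `n_inputs=10` and `n_outputs=2` are purely hypothetical, as are the helper names.

```python
def layer_shapes(n_inputs, n_outputs, hidden=(128, 128, 128)):
    """Return (fan_in, fan_out) pairs for each fully connected layer."""
    sizes = [n_inputs, *hidden, n_outputs]
    return list(zip(sizes[:-1], sizes[1:]))


def parameter_count(shapes):
    """Count weights plus one bias per output neuron for every layer."""
    return sum(fan_in * fan_out + fan_out for fan_in, fan_out in shapes)


# Hypothetical sizes: 10 sensor inputs, 2 control outputs.
shapes = layer_shapes(n_inputs=10, n_outputs=2)
total = parameter_count(shapes)
print(shapes)
print(total)
```

Most of the parameters sit in the two 128-by-128 connections between hidden layers, so the total changes only slightly with the assumed input and output sizes.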
Microsoft Research just posted this talk from Reinforcement Learning Day 2019: “Towards Using Batch Reinforcement Learning to Identify Treatment Options in Healthcare.”
Dani, a game developer, recently made a game and decided to train an AI to play it.
A couple of weeks ago I made a video “Making a Game in ONE Day (12 Hours)”, and today I’m trying to teach an A.I to play my game!
Basically I’m gonna use Neural Networks to make the A.I learn to play my game.
This is something I’ve always wanted to do, and I’m really happy I finally got around to do it. Some of the biggest inspirations for this is obviously carykh, Jabrils & Codebullet!
In this DataPoint, Frank runs into Matt Kirk on the expo floor at Strata NYC. Press the play button below to listen here or visit the show page at DataDriven.tv.