
Roboschool

We are releasing Roboschool: open-source software for robot simulation, integrated with OpenAI Gym.

Three control policies running on three different robots, racing each other in Roboschool. You can re-enact this scene by running agent_zoo/ Each time you run the script, a random set of robots appears.

Roboschool provides new OpenAI Gym environments for controlling robots in simulation. Eight of these environments serve as free alternatives to pre-existing MuJoCo implementations, re-tuned to produce more realistic motion. Roboschool also makes it easy to train multiple agents together in the same environment.

After we launched Gym, one issue we heard from many users was that the MuJoCo component required a paid license (though MuJoCo recently added free student licenses for personal and class work). For the existing MuJoCo environments, besides porting them to Bullet, we have modified them to be more realistic.

Two agents learning to play RoboschoolPong against each other.


Markov Chain Monte Carlo Without all the Bullshit

I have a little secret: I don’t like the terminology, notation, and style of writing in statistics. I find it unnecessarily complicated. This shows up when trying to read about Markov Chain Monte Carlo methods. Take, for example, the abstract to the Markov Chain Monte Carlo article in the Encyclopedia of Biostatistics.

Markov chain Monte Carlo (MCMC) is a technique for estimating by simulation the expectation of a statistic in a complex model. Successive random selections form a Markov chain, the stationary distribution of which is the target distribution.

I can only vaguely understand what the author is saying here (and really only because I know ahead of time what MCMC is).
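To make the quoted abstract a little more concrete, here is a minimal sketch of one MCMC method, a Metropolis random walk, written in Python. It is an illustration under stated assumptions, not the construction used in the article or the encyclopedia entry: the four-state ring, the `weights` list, and the function name `metropolis_sample` are all made up for this example. The walk only ever compares the weights of two neighbouring states, yet the fraction of time it spends in each state should approach the normalised weights, which is what "the stationary distribution is the target distribution" means in practice.

```python
import random
from collections import Counter


def metropolis_sample(weights, steps, seed=0):
    """Walk on states 0..n-1 arranged in a ring (hypothetical setup).

    The chain's stationary distribution is proportional to `weights`,
    which must all be positive. Returns the fraction of steps spent
    in each state.
    """
    rng = random.Random(seed)
    n = len(weights)
    state = 0
    visits = Counter()
    for _ in range(steps):
        # Propose the left or right neighbour with equal probability.
        proposal = (state + rng.choice((-1, 1))) % n
        # Accept with probability min(1, w[proposal] / w[state]);
        # otherwise stay where we are.
        if rng.random() < weights[proposal] / weights[state]:
            state = proposal
        visits[state] += 1
    return [visits[i] / steps for i in range(n)]


weights = [1, 2, 3, 4]  # unnormalised target: 0.1, 0.2, 0.3, 0.4
frequencies = metropolis_sample(weights, steps=200_000)
print(frequencies)
```

Note that the sampler never needs the normalising constant of the target, only ratios of weights, which is one reason this family of methods is useful for complicated distributions.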

So to counter, here’s my own explanation of Markov Chain Monte Carlo, inspired by the treatment of John Hopcroft and Ravi Kannan.

The Problem is Drawing from a Distribution

Markov Chain Monte Carlo is a technique to solve the problem of sampling from a complicated distribution. ... And use if the coin lands heads.

Google's DeepMind is trained to complete Montezuma's Revenge

Google's DeepMind has learned how to play yet another game, this time because it had been 'incentivised' to want to win. "Intrinsic rewards" meant the AI obtained "significantly improved exploration in a number of hard games, including the infamously difficult Montezuma's Revenge", wrote Google researchers in a paper. Intrinsic motivation (IM) algorithms typically use signals to make the AI more 'curious' and are inspired by classic, human-based psychological ideas. Montezuma's Revenge was a 1984 platform game for the Atari 2600 in which a character navigates a series of complex rooms in an underground Aztec temple.

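As a rough illustration of what an intrinsic reward can look like, the Python sketch below adds a bonus that shrinks the more often a state has been visited, so unfamiliar states pay more than familiar ones. This is a deliberately simple visit-count scheme and is not claimed to be the method in the researchers' paper; the class name `CountBonus`, the `scale` parameter, and the room labels are hypothetical.

```python
import math
from collections import defaultdict


class CountBonus:
    """Adds a novelty bonus of scale / sqrt(visit count) to the game reward."""

    def __init__(self, scale=1.0):
        self.scale = scale
        self.counts = defaultdict(int)

    def reward(self, state, extrinsic_reward):
        # Count this visit, then pay a bonus that decays with familiarity.
        self.counts[state] += 1
        bonus = self.scale / math.sqrt(self.counts[state])
        return extrinsic_reward + bonus


bonus = CountBonus(scale=0.5)
first = bonus.reward("room_1", 0.0)     # first visit to a room
second = bonus.reward("room_1", 0.0)    # same room again, smaller bonus
new_room = bonus.reward("room_2", 0.0)  # a new room pays the full bonus again
print(first, second, new_room)
```

Even when the game itself returns no reward, an agent trained on this combined signal has a reason to seek out rooms it has not seen before.
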
The model, which had inbuilt rewards, explored 15 rooms out of a potential 24; the old model, which was not incentivised, explored only two. DeepMind has already been trained to play Atari games, learning how to play 49 games by itself, and earlier this year beat the world Go champion 4-1.

Using Artificial Intelligence to Help Blind People ‘See’ Facebook

By Shaomei Wu, Software Engineer; Hermes Pique, Software Engineer on iOS; and Jeffrey Wieland, Head of Accessibility

Every day, people share more than 2 billion photos across Facebook, Instagram, Messenger and WhatsApp. While visual content provides a fun and expressive way for people to communicate online, consuming and creating it poses challenges for people who are blind or severely visually impaired. With more than 39 million people who are blind, and over 246 million who have a severe visual impairment, many people may feel excluded from the conversation around photos on Facebook. We want to build technology that helps the blind community experience Facebook the same way others enjoy it.

That’s why today we’re introducing automatic alternative text. Automatic alternative text, or automatic alt text, is a new development that generates a description of a photo using advancements in object recognition technology. Read more about the development of automatic alt text here and here.
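One plausible last step in such a pipeline is turning a recognizer's labels into a sentence a screen reader can speak. The Python sketch below shows that step only, and everything in it is an assumption for illustration: the function name `compose_alt_text`, the `(label, confidence)` input format, the 0.8 threshold, and the output wording are not taken from Facebook's system.

```python
def compose_alt_text(detections, threshold=0.8):
    """Build a short description from (label, confidence) pairs.

    Only labels at or above `threshold` are mentioned, most confident
    first, so the description errs on the side of saying less.
    """
    ranked = sorted(detections, key=lambda item: -item[1])
    confident = [label for label, score in ranked if score >= threshold]
    if not confident:
        # Nothing recognised with enough confidence: fall back to a generic word.
        return "Image"
    return "Image may contain: " + ", ".join(confident)


detections = [("tree", 0.95), ("dog", 0.60), ("sky", 0.90)]  # made-up recognizer output
alt_text = compose_alt_text(detections)
print(alt_text)
```

Filtering on confidence is a design choice in this sketch: for a reader who cannot check the photo, a wrong label is arguably worse than a missing one.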