OpenAI researchers have taught an artificial intelligence agent to play Minecraft. The neural network was trained on gameplay videos and, as a result, learned to gather the necessary resources and craft a diamond pickaxe.
To train the neural network, the engineers developed a method called Video PreTraining (VPT), which makes it possible to learn from large amounts of unlabeled data. The agent plays by emulating a standard mouse and keyboard, which, according to the researchers, could eventually help train neural networks to use a computer without any additional interfaces.
In the first stage of training, the engineers used labeled Minecraft gameplay videos with a total duration of 2,000 hours. The labels recorded which keys users pressed while playing. From this data, the researchers trained a neural network that could process a video on its own, infer the corresponding keystrokes, and record them. This network was then used to automatically label another 70,000 hours of gameplay gathered from open sources.
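The pseudo-labeling idea above can be sketched as follows. This is a minimal illustrative toy, not OpenAI's implementation: frames are stand-in feature tuples, and the "inverse dynamics model" is a trivial nearest-neighbor lookup rather than a deep network; the function names and toy data are assumptions.

```python
def train_idm(labeled):
    """'Train' a stand-in inverse dynamics model from (frame, action) pairs.
    Frames here are small feature tuples; a real model would be a deep net."""
    return list(labeled)

def predict_action(idm, frame):
    # Nearest-neighbor lookup by squared distance over the toy features.
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(idm, key=lambda pair: dist(pair[0], frame))[1]

def pseudo_label(idm, unlabeled_frames):
    """Turn unlabeled video frames into (frame, action) training pairs."""
    return [(f, predict_action(idm, f)) for f in unlabeled_frames]

# Small labeled set (stands in for the 2,000 hours of labeled video).
labeled = [((0.0, 0.0), "forward"), ((1.0, 1.0), "jump"), ((0.0, 1.0), "attack")]
idm = train_idm(labeled)

# Larger unlabeled pool (stands in for the 70,000 hours of web video).
unlabeled = [(0.1, 0.1), (0.9, 1.1)]
print(pseudo_label(idm, unlabeled))
# → [((0.1, 0.1), 'forward'), ((0.9, 1.1), 'jump')]
```

The labeled pairs produced this way become training data for the main behavior-cloning model, which is why a small labeled set can unlock a far larger unlabeled corpus.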
As a result, the neural network learned to perform not only basic actions in Minecraft but also fairly complex ones that require sequential decision making. For example, it can gather resources and craft items from them, run, swim, avoid obstacles, hunt animals for food, and eat to refill the hunger bar. The agent also learned to place blocks beneath its character while jumping, which lets it climb upward.
After that, the researchers fine-tuned the network. They asked users participating in the project to create a new world in the game, collect the resources needed to get started, and craft essential early-game items from them, recording their gameplay as they went; the recordings were then used for training. After fine-tuning, the network no longer wandered aimlessly at the start of a game but worked toward crafting a workbench, which unlocks further items. Some users also built basic shelters while recording, and the agent picked up that skill as well.
Next, the engineers fine-tuned the network with reinforcement learning, which ultimately allowed the agent to craft a diamond pickaxe on its own. The full path to the tool was broken into stages, and the network was rewarded for each completed stage. With this shaping, the agent managed to finish the task.
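The staged reward described above can be sketched like this. The milestone list and payoff values are assumptions chosen for illustration; the actual reward schedule is defined in the researchers' paper.

```python
# Assumed, simplified milestone list along the path to a diamond pickaxe.
MILESTONES = ["log", "planks", "crafting_table", "wooden_pickaxe",
              "cobblestone", "stone_pickaxe", "iron_ore", "furnace",
              "iron_ingot", "iron_pickaxe", "diamond", "diamond_pickaxe"]

def staged_reward(achieved, new_item):
    """Give a one-time reward the first time each milestone is reached.
    `achieved` is the set of milestones already rewarded this episode."""
    if new_item in MILESTONES and new_item not in achieved:
        achieved.add(new_item)
        # Later milestones pay more, encouraging progress down the tree
        # (an assumed shaping choice, not necessarily the paper's values).
        return 1.0 + MILESTONES.index(new_item)
    return 0.0

achieved = set()
print(staged_reward(achieved, "log"))              # → 1.0 (first log is rewarded)
print(staged_reward(achieved, "log"))              # → 0.0 (repeats earn nothing)
print(staged_reward(achieved, "diamond_pickaxe"))  # → 12.0
```

Rewarding each stage once per episode keeps the signal dense enough for RL to make progress on a long-horizon task while preventing the agent from farming the same milestone repeatedly.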
OpenAI engineers believe the method can be used for fast, high-quality training of neural networks from video. The video pretraining stage provides a solid base model that can then be refined with other available methods. So far the researchers have tested the approach only in Minecraft, but they believe it could help train artificial intelligence to use other computer programs via keyboard and mouse.
The scientists have published a detailed paper and released the project's source code. The OpenAI team has also announced the MineRL NeurIPS competition, in which participants can use the neural network to solve more complex problems in Minecraft.
Sources: Python.Engineering, OpenAI.com