Thread by Sharon Zhou
- Jun 29, 2022
- #ArtificialIntelligence
The OpenAI Minecraft paper is a great push to getting AI to work in Photoshop, Figma, or any software product — using just the keyboard & mouse, like a person would.
Steps in the paper, explained 🧵
1/
openai.com/blog/vpt/
1. First, hire people to play Minecraft, who are OK at it. Record their screen and keyboard & mouse strokes. This costs $2k for 2k hrs of video in total.
This is your small dataset.
2/
2. Train a model on this small dataset. Let the model look a little bit into the past and a little bit into the future in the videos. Let it predict the key & mouse strokes the person used, aligned to the video.
This is your small model.
2/
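A rough sketch of the "look a little into the past and a little into the future" idea from step 2. The frame names, the action labels, and the `radius` parameter are all made up for illustration; the real model works on video pixels, not strings.

```python
from typing import List, Sequence, Tuple


def make_windows(
    frames: Sequence[str],
    actions: Sequence[str],
    radius: int = 2,
) -> List[Tuple[List[str], str]]:
    """Pair each recorded action with frames from just before AND just after it.

    `radius` (hypothetical) is how many frames to include on each side.
    Windows are clipped at the start and end of the recording.
    """
    examples = []
    for t in range(len(frames)):
        lo = max(0, t - radius)
        hi = min(len(frames), t + radius + 1)
        examples.append((list(frames[lo:hi]), actions[t]))
    return examples


# Toy recording: one action label per frame (placeholder values).
frames = ["f0", "f1", "f2", "f3", "f4"]
actions = ["W", "W", "jump", "click", "W"]
windows = make_windows(frames, actions, radius=1)
```

Each training example is then (frames around time t, action at time t). Seeing what happens next is what makes guessing the key or mouse stroke easier than predicting it from the past alone.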
3. After it's trained, use your small model to predict key & mouse strokes on 70k hrs of video, which you scraped from the internet. You didn't hire anyone for these and don't have recorded key & mouse strokes.
This is your large dataset.
3/
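Step 3 is essentially pseudo-labeling. A minimal sketch, assuming the trained small model can be called like a function on a window of frames; `toy_model` is a hypothetical stand-in, not the paper's model.

```python
from typing import Callable, List, Sequence, Tuple


def pseudo_label(
    frames: Sequence[str],
    predict_action: Callable[[List[str]], str],
    radius: int = 1,
) -> List[Tuple[str, str]]:
    """Attach a predicted action to every frame of an unlabeled video."""
    labeled = []
    for t in range(len(frames)):
        # Same past-and-future window the small model was trained on.
        window = list(frames[max(0, t - radius): t + radius + 1])
        labeled.append((frames[t], predict_action(window)))
    return labeled


def toy_model(window: List[str]) -> str:
    """Placeholder for the small model: jump whenever a cliff is in view."""
    return "jump" if any("cliff" in frame for frame in window) else "forward"


# Scraped video with no recorded keys or mouse (placeholder frames).
scraped = ["grass", "cliff", "grass", "grass"]
large_dataset = pseudo_label(scraped, toy_model)
```

The output has the same shape as the hired-player data (frame, action), except the actions are predictions rather than recordings.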
4. Train another model on this large dataset. Train it to learn to press the right keys/mouse.
This is your large model.
4/
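Step 4 amounts to imitating the labeled actions. The toy below swaps the neural network for a lookup table that remembers the most common action per frame, purely to show the shape of the training data; the frame and action names are invented. Note that when playing, the model has no future frames to peek at, so it maps only what it has seen so far to an action.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, Tuple


def train_policy(pairs: Iterable[Tuple[str, str]]) -> Dict[str, str]:
    """Return, for each frame, the action most often paired with it.

    This is a stand-in for training a real model, not the paper's method.
    """
    counts: Dict[str, Counter] = defaultdict(Counter)
    for frame, action in pairs:
        counts[frame][action] += 1
    return {frame: c.most_common(1)[0][0] for frame, c in counts.items()}


# (frame, predicted action) pairs, as produced by labeling scraped video.
pairs = [
    ("tree", "click"),
    ("tree", "click"),
    ("tree", "forward"),
    ("water", "jump"),
]
policy = train_policy(pairs)
```

Here `policy["tree"]` comes out as `"click"` because that label appears most often for that frame.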
5. Then, watch your large model play on its own, and it does some pretty nifty things.
All of this is impressive because:
- The different moves you can make are much more open-ended than anything AI has been able to do before.
...
5/
- It's only $2k to get enough data for the smaller dataset to successfully get us here.
- The AI acts like a person playing Minecraft — there isn't some special set of keys we gave it to make this whole thing a little easier.
6/
OK wow this blew up. Let me acknowledge @jeffclune and his team for their amazing work on this! Jeff had also worked on helping wildlife in camera trap photos before. He's a really kind and humble soul, and I'm so v excited to see the work he'll be brewing up next.
7/
Additional great authors, tagged to the best of my Twitter search abilities:
@bobabowen (great username)
Ilge Akkaya
Peter Zhokhov
@Joost_Huizinga
Jie Tang
@AdreinLE
Brandon Houghton
Raul Sampedro
8/
The AI Minecraft never ends. At nearly the same time, NVIDIA dropped some amazing work: Imagine if you could go from writing a sentence or two to getting the Minecraft scene you described before you.
Implication: This could be the future of how we control software platforms.
9/
To get there, NVIDIA has released a gigantic dataset of millions of written comments and 300k hours of narrated gameplay.
Wow, people play a lot of video games!
My incredibly brilliant friends @DrJimFan & @AnimaAnandkumar are behind this effort.
More: