Research Scientist at FAIR / @MetaAI
Nice work on grounding LLMs using RL. Particularly liked the comparison between RL and BC, demonstrating the importance of learning from feedback and interaction for robust decision-making.
Great work on improving sample efficiency in RL via resets, which allow more updates and thus better scaling with compute. Love the simplicity of the method and the many insights in the paper.