Post
  • From Twitter

Nice work on grounding LLMs using RL. Particularly liked the comparison between RL and BC, which demonstrates the importance of learning from feedback and interaction for robust decision-making.
