Mechanistic Interpretability research @DeepMind. Formerly @AnthropicAI, independent. In this to reduce AI X-risk. Neural networks can be understood, let's do it!
Paper
May 4, 2023
Article
Apr 19, 2023