Miles Turpin
Community-curated profile
Language model alignment @nyuniversity, @CohereAI
Featured content
Language Models Don't Always Say What They Think: Unfaithful Explanations in Chain-of-Thought Prompting
by Miles Turpin
⚡️New paper!⚡️ It’s tempting to interpret chain-of-thought explanations as the LLM's process for solving a task. In this new work, we show that CoT explanations can systematically misrepresent the true reason for model predictions. arxiv.org/abs/2305.
by Miles Turpin