Thread
I find it kind of bizarre that people are saying that the real AI safety argument was never whether the system could understand you, it was getting it to care.

Like, maybe these people have always been saying that Concrete Problems in AI Safety was a bad paper.
But if you were convinced of the AI safety case by the concrete problems paper, it now seems very unlikely that a roboGPT wouldn't be able to clean a room without killing a baby.
It's completely reasonable to argue that we still have inner misalignment problems to be worried about, or scalable oversight problems, but it seems frankly disingenuous to say LLMs shouldn't update you away from an important class of problems.
The concrete problems paper: arxiv.org/pdf/1606.06565.pdf
Obviously this doesn't apply if you were always saying that this would never be the problem, but when this post www.deepmind.com/blog/specification-gaming-the-flip-side-of-ai-ingenuity came out, I don't think the reaction of the AIS community was "DeepMind clearly doesn't understand the real problem".