OpenAI’s anti-scheming AI training backfires
Researchers at OpenAI, in a collaboration with Apollo Research, have found that an attempt to train an AI model to be more honest had an unintended consequence: it taught the model how to hide its dec...