The more advanced artificial intelligence (AI) gets, the more capable it is of scheming and lying to meet its goals — and it even knows when it’s being evaluated, research suggests.
Evaluators at Apollo Research found that the more capable a large language model (LLM) is, the better it is at “in-context scheming,” in which an AI covertly pursues a task even when that task conflicts with the aims of its operators.
The more capable models are also more strategic about achieving their goals, including misaligned ones, and are more likely to use tactics such as deception, the researchers said in a blog post.
