Scheming AIs | Joe Carlsmith | EA Global Bay Area 2024
This talk examines whether advanced AIs that perform well in training will be doing so in order to gain power later, a behavior Joe Carlsmith calls "scheming" (also often called "deceptive alignment"). It gives an overview of his recent report on the topic, available on arXiv: https://arxiv.org/abs/2311.08379. Joe Carlsmith is a senior research analyst at Open Philanthropy, where he focuses on existential risk from advanced artificial intelligence. He also writes independently about various topics in philosophy and futurism, and he has a doctorate in philosophy from the University of Oxford.
Watch on YouTube: https://www.youtube.com/watch?v=AxUTiGS6BHM