Non-classic stories about scheming (Section 2.3.2 of "Scheming AIs")
This is section 2.3.2 of my report “Scheming AIs: Will AIs fake alignment during training in order to get power?”
Text of the report here: https://arxiv.org/abs/2311.08379
Summary of the report here: https://joecarlsmith.com/2023/11/15/new-report-scheming-ais-will-ais-fake-alignment-during-training-in-order-to-get-power
Audio summary here: https://joecarlsmithaudio.buzzsprout.com/2034731/13969977-introduction-and-summary-of-scheming-ais-will-ais-fake-alignment-during-training-in-order-to-get-power
Chapters
1. Non-classic stories about scheming (Section 2.3.2 of "Scheming AIs") (00:00:00)
2. 2.3.2 Non-classic stories (00:00:36)
3. 2.3.2.1 AI coordination (00:00:55)
4. 2.3.2.2 AIs with similar values by default (00:05:57)
5. 2.3.2.3 Terminal values that happen to favor escape/takeover (00:07:51)
6. 2.3.2.4 Models with false beliefs about whether scheming is a good strategy (00:11:59)
7. 2.3.2.5 Self-deception (00:13:33)
8. 2.3.2.6 Goal-uncertainty and haziness (00:15:46)
9. 2.3.2.7 Overall assessment of the non-classic stories (00:18:19)
10. 2.4 Take-aways re: the requirements of scheming (00:20:08)
11. 2.5 Path dependence (00:20:51)