Does scheming lead to adequate future empowerment? (Section 2.3.1.2 of "Scheming AIs")

Joe Carlsmith Audio

11M ago 22:54

This is section 2.3.1.2 of my report “Scheming AIs: Will AIs fake alignment during training in order to get power?”

Text of the report here: https://arxiv.org/abs/2311.08379
Summary of the report here: https://joecarlsmith.com/2023/11/15/new-report-scheming-ais-will-ais-fake-alignment-during-training-in-order-to-get-power
Audio summary here: https://joecarlsmithaudio.buzzsprout.com/2034731/13969977-introduction-and-summary-of-scheming-ais-will-ais-fake-alignment-during-training-in-order-to-get-power

Chapters

1. Does scheming lead to adequate future empowerment? (Section 2.3.1.2 of "Scheming AIs") (00:00:00)

2. 2.3.1.2 Adequate future empowerment (00:00:33)

3. 2.3.1.2.1 When is the “pay off” supposed to happen? (00:01:22)

4. 2.3.1.2.2 Even if the model’s values survive this generation of training, will they survive long (00:04:20)

5. 2.3.1.2.3 Will escape/take-over be suitably likely to succeed? (00:08:16)

6. 2.3.1.2.4 Will the time horizon of the model’s goals extend to cover escape/take-over? (00:10:05)

7. 2.3.1.2.5 Will the model’s values get enough power after escape/takeover? (00:11:56)

8. 2.3.1.2.6 How much does the model stand to gain from not training-gaming? (00:13:23)

9. 2.3.1.2.7 How “ambitious” is the model? (00:16:38)

10. 2.3.1.3 Overall assessment of the classic goal-guarding story (00:21:43)

56 episodes
