Content provided by Joe Carlsmith. All podcast content including episodes, graphics, and podcast descriptions are uploaded and provided directly by Joe Carlsmith or their podcast platform partner. If you believe someone is using your copyrighted work without your permission, you can follow the process outlined here https://player.fm/legal.
Two sources of beyond-episode goals (Section 2.2.2 of "Scheming AIs")

21:25
 

Chapters

1. Two sources of beyond-episode goals (Section 2.2.2 of "Scheming AIs") (00:00:00)

2. 2.2.2 Two sources of beyond-episode goals (00:00:28)

3. 2.2.2.1 Training-game-independent beyond-episode goals (00:01:32)

4. 2.2.2.1.1 Are beyond-episode goals the default? (00:03:26)

5. 2.2.2.1.2 How will models think about time? (00:05:02)

6. 2.2.2.1.3 The role of “reflection” (00:08:09)

7. 2.2.2.1.4 Pushing back on beyond-episode goals using adversarial training (00:10:56)

8. 2.2.2.2 Training-game-dependent beyond-episode goals (00:12:45)

9. 2.2.2.2.1 Can gradient descent “notice” the benefits of turning a non-schemer into a schemer? (00:14:47)

10. 2.2.2.2.2 Is SGD pulling scheming out of models by any means necessary? (00:18:51)

56 episodes
