
EA - AI governance needs a theory of victory by Corin Katzke

34:54
 
Content provided by The Nonlinear Fund. All podcast content including episodes, graphics, and podcast descriptions are uploaded and provided directly by The Nonlinear Fund or their podcast platform partner. If you believe someone is using your copyrighted work without your permission, you can follow the process outlined here: https://player.fm/legal.
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: AI governance needs a theory of victory, published by Corin Katzke on June 22, 2024, on The Effective Altruism Forum.

This post is part of a series by Convergence Analysis. An earlier post introduced scenario planning for AI risk. In this post, we argue for the importance of a related concept: theories of victory.

"In order to improve your game you must study the endgame before everything else; for, whereas the endings can be studied and mastered by themselves, the middlegame and the opening must be studied in relation to the endgame." - José Raúl Capablanca, Last Lectures, 1966

Overview

The central goal of AI governance should be achieving existential security: a state in which existential risk from AI is negligible, either indefinitely or for long enough that humanity can carefully plan its future. We can call such a state an AI governance endgame. A positive framing for AI governance (that is, achieving a certain endgame) can provide greater strategic clarity and coherence than a negative framing (that is, avoiding certain outcomes).

A theory of victory for AI governance combines an endgame with a plausible and prescriptive strategy to achieve it. It should also be robust across a range of future scenarios, given uncertainty about key strategic parameters.

Nuclear risk provides a relevant case study. Shortly after the development of nuclear weapons, scientists, policymakers, and public figures proposed various theories of victory for nuclear risk, some of which are striking in their similarity to theories of victory for AI risk. Proposals included an international moratorium on nuclear development, as well as unilateral enforcement of a monopoly on nuclear weapons.

We discuss three potential theories of victory for AI governance.

The first is an AI moratorium, in which international coordination indefinitely prevents AI development beyond a certain threshold. This would require an unprecedented level of global coordination and tight control over access to compute. Key challenges include the strong incentives to develop AI for strategic advantage and the historical difficulty of achieving this level of global coordination.

The second is an "AI Leviathan": a single well-controlled AI system or AI-enhanced agency that is empowered to enforce existential security. This could arise either from a first mover unilaterally establishing a moratorium or from voluntary coordination between actors. Challenges include the potential for dystopic lock-in if mistakes are made in its construction, and significant ambiguity about key details of implementation.

The third is defensive acceleration. The goal is an endgame in which the defensive applications of advanced AI sustainably outpace offensive applications, mitigating existential risk. The strategy is to coordinate differential technology development to preferentially advance defensive and safety-enhancing AI systems. Challenges include misaligned incentives in the private sector and the difficulty of predicting the offensive and defensive applications of future technology.

We do not decide between these theories of victory. Instead, we encourage actors in AI governance to make their preferred theories of victory explicit - and, when appropriate, public.

An open discussion and thorough examination of theories of victory for AI governance is crucially important to humanity's long-term future.

Introduction

What is the goal of AI governance? The goal of AI governance is (or should be) to bring about a state of existential security from AI risks. In such a state, AI existential risk would be indefinitely negligible, allowing humanity to carefully design its long-term relationship with AI. It might be objected that risk management should be conceived of as a continual process in which risk is never negligibl...