AF - AI Constitutions are a tool to reduce societal scale risk by Samuel Dylan Martin

The Nonlinear Library

Content provided by The Nonlinear Fund. All podcast content including episodes, graphics, and podcast descriptions are uploaded and provided directly by The Nonlinear Fund or their podcast platform partner. If you believe someone is using your copyrighted work without your permission, you can follow the process outlined here https://player.fm/legal.

1M ago 35:15

MP3•Episode home

Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: AI Constitutions are a tool to reduce societal scale risk, published by Samuel Dylan Martin on July 25, 2024 on The AI Alignment Forum. Sammy Martin, Polaris Ventures As AI systems become more integrated into society, we face potential societal-scale risks that current regulations fail to address. These risks include cooperation failures, structural failures from opaque decision-making, and AI-enabled totalitarian control. We propose enhancing LLM-based AI Constitutions and Model Specifications to mitigate these risks by implementing specific behaviours aimed at improving AI systems' epistemology, decision support capabilities, and cooperative intelligence. This approach offers a practical, near-term intervention to shape AI behaviour positively. We call on AI developers, policymakers, and researchers to consider and implement improvements along these lines, as well as for more research into testing Constitution/Model Spec improvements, setting a foundation for more responsible AI development that reduces long-term societal risks. Introduction There is reason to believe that in the near future, autonomous, LLM based AI systems, while not necessarily surpassing human intelligence in all domains, will be widely deployed throughout society. We anticipate a world where AI will be making some decisions on our behalf, following complex plans, advising on decision-making and negotiation, and presenting conclusions without human oversight at every step. While this is already happening to some degree in low-stakes settings, we must prepare for its expansion into high-stakes domains (e.g. politics, the military), and do our best to anticipate the systemic, societal scale risks that might result and act to prevent them. Most of the important work on reducing societal-scale risk will, by their very nature, have to involve policy changes, for example to ensure that there are humans in the loop on important decisions, but there are some technical interventions which we have identified that can help. We believe that by acting now to improve the epistemology (especially on moral or political questions), decision support capabilities and cooperative intelligence of LLM based AI systems, we can mitigate near-term risks and also set important precedents for future AI development. We aim to do this by proposing enhancements to AI Constitutions or Model Specifications. If adopted, we believe these improvements will reduce societal-scale risks which have so far gone unaddressed by AI regulation. Here, we justify this overall conclusion and propose preliminary changes that we think might improve AI Constitutions. We aim to empirically test and iterate on these improvements before finalising them. Recent years have seen significant efforts to regulate frontier AI, from independent initiatives to government mandates. Many of these are just aimed at improving oversight in general (for example, the reporting requirements in EO 14110), but some are directed at destructive misuse or loss of control (for example, the requirement to prove no catastrophic potential in SB 1047 and the independent tests run by the UK AISI). Many are also directed at near-term ethical concerns. However, we haven't seen shovel ready regulation or voluntary commitments proposed to deal with longer-term societal-scale risks, even though these have been much discussed in the AI safety community. Some experts, (e.g. Andrew Critch), argue these may represent the most significant source of overall AI risk and they have been discussed as 'societal scale risks', for example in Critch and Russel's TARSA paper. What are these "less obvious" 'societal scale' risks? Some examples: Cooperation failures: AI systems are widely integrated into society, used for advice on consequential decisions and delegated decision making power, but...

2445 episodes

#Podcasting Education #The Nonlinear Fund