Content provided by The Nonlinear Fund. All podcast content including episodes, graphics, and podcast descriptions are uploaded and provided directly by The Nonlinear Fund or their podcast platform partner. If you believe someone is using your copyrighted work without your permission, you can follow the process outlined here https://player.fm/legal.
AF - Finding the Wisdom to Build Safe AI by Gordon Seidoh Worley

13:27
 
Welcome to The Nonlinear Library, where we use text-to-speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Finding the Wisdom to Build Safe AI, published by Gordon Seidoh Worley on July 4, 2024 on The AI Alignment Forum.

We may soon build superintelligent AI. Such AI poses an existential threat to humanity, and to all life on Earth, if it is not aligned with our flourishing. Aligning superintelligent AI is likely to be difficult because smarts and values are mostly orthogonal and because Goodhart effects are robust, so we can neither rely on AI to naturally decide to be safe on its own nor expect to train it to stay safe.

We stand a better chance of creating safe, aligned, superintelligent AI if we create AI that is "wise", in the sense that it knows how to do the right things to achieve desired outcomes and doesn't fall into intellectual or optimization traps. Unfortunately, I'm not sure how to create wise AI, because I'm not exactly sure what it is to be wise myself.

My current, high-level plan for creating wise AI is to first get wiser myself, then help people working on AI safety to get wiser, and finally hope that wise AI safety researchers can create wise, aligned AI that is safe. For close to a decade now I've been working on getting wiser, and in that time I've figured out a bit of what it is to be wise. I'm starting to work on helping others get wiser by writing a book that explains some useful epistemological insights I picked up while pursuing wisdom and trying to solve a subproblem in AI alignment, and I have vague plans for another book that will be more directly focused on the wisdom I've found. I thus far have limited ideas about how to create wise AI, but I'll share my current thinking anyway in the hope that it inspires thoughts for others.

Why would wise AI safety researchers matter?

My theory is that it would be hard for someone to know what's needed to build a wise AI without first being wise themself, or at least without having a wiser person to check ideas against. Wisdom clearly isn't sufficient for knowing how to build wise and aligned AI, but it does seem necessary under my assumptions, in the same way that it would be hard to develop a good decision theory for AI if one could not reason for oneself about how to maximize expected value in games.

How did I get wiser?

Mostly by practicing Zen Buddhism, but also by studying philosophy, psychology, mathematics, and game theory to help me think about how to build aligned AI. I started practicing Zen in 2017. I picked Zen with much reluctance, after trying many other things that either didn't work for me or worked for a while and then had nothing more to offer me. Things I tried included Less Wrong-style rationality training, therapy, secular meditation, and various positive psychology practices. I even tried other forms of Buddhism, but Zen was the only tradition I felt at home with. Consequently, my understanding of wisdom is biased by Zen, but I don't think Zen has a monopoly on wisdom, and other traditions might produce different but equally useful theories of wisdom than the one I discuss below.

What does it mean to be wise?

I roughly define wisdom as doing the right thing at the right time for the right reasons. This definition puts the word "right" through a strenuous workout, so let's break it down. The "right thing" is that which causes outcomes we like upon reflection. The "right time" is when doing the right thing will have the desired impact. And the "right reasons" are an accurate model of the world that correctly predicts the right thing and time.

How can the right reasons be known?

The straightforward method is to have true beliefs and use correct logic. Alas, we're constantly uncertain about what's true and, in real-world scenarios, the logic becomes uncomputable, so instead we often rely on heuristics that lead to good outcomes and avoid optimization traps. That we rely on heuristi...