
LW - How do we know that "good research" is good? (aka "direct evaluation" vs "eigen-evaluation") by Ruby

Content provided by The Nonlinear Fund. All podcast content including episodes, graphics, and podcast descriptions are uploaded and provided directly by The Nonlinear Fund or their podcast platform partner. If you believe someone is using your copyrighted work without your permission, you can follow the process outlined here https://player.fm/legal.
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: How do we know that "good research" is good? (aka "direct evaluation" vs "eigen-evaluation"), published by Ruby on July 19, 2024 on LessWrong.

AI Alignment is my motivating context, but this could apply elsewhere too. The nascent field of AI Alignment research is pretty happening these days. There are multiple orgs and dozens to low hundreds of full-time researchers pursuing approaches to ensure AI goes well for humanity. Many are heartened that there's at least some good research happening, at least in the opinion of some of the good researchers. This is reason for hope, I have heard. But how do we know whether or not we have produced "good research"? I think there are two main routes to determining that research is good, and yet only one applies in the research field of aligning superintelligent AIs.

"It's good because it works"

The first and better way to know that your research is good is that it allows you to accomplish some goal you care about.[1] Examples:

- My work on efficient orbital mechanics calculation is good because it successfully lets me predict the trajectories of satellites.
- My work on the disruption of cell signaling in malign tumors is good because it helped me develop successful anti-cancer vaccines.
- My work on solid-state physics is good because it allowed me to produce superconductors at a higher temperature and lower pressure than previously attained.[2]

In each case, there's some outcome I care about pretty inherently for itself, and if the research helps me attain that outcome it's good (or conversely, if it doesn't, it's bad). The good researchers in my field are those who have produced a bunch of good research towards the aims of the field.

Sometimes it's not clear-cut. Perhaps I figured out some specific cell signaling pathways that will be useful if it turns out that cell signaling disruption in general is useful, and that depends on therapies currently being trialed, so we might not know how good (i.e. useful) my research was for many more years. This takes us into what I think is the second meaning of "good research".

"It's good because we all agree it's good"

If our goal is successfully navigating the creation of superintelligent AI such that humans are happy with the outcome, then it is too early to properly score existing research on how helpful it will be. No one has aligned a superintelligence. No one's research has contributed to the alignment of an actual superintelligence. At this point, the best we can do is share our predictions about how useful research will turn out to be. "This is good research" = "I think this research will turn out to be helpful." "That person is a good researcher" = "That person produces much research that will turn out to be useful and/or has good models and predictions of which research will turn out to help."

To talk about the good research that's being produced is simply to say that we have a bunch of shared predictions that there exists research that will eventually help. To speak of the "good researchers" is to speak of the people whose work many others agree is likely helpful and whose opinions are likely correct.

Someone might object that there's empirical research that we can see yielding results, in terms of interpretability/steering or demonstrating deception-like behavior and similar. While you can observe an outcome there, that's not the outcome we really care about, namely aligning superintelligent AI, and the relevance of this work is still just a prediction. It's like being successful at particular kinds of cell signaling modeling before we're confident that cell signaling disruption is a useful approach at all.

More like "good" = "our community PageRank eigen-evaluation of research rates this research highly"

It's a little bit interesting to unpack "agreeing that some research is good". Obviously, not everyone's opinion matters ...
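The "PageRank eigen-evaluation" idea the post names can be made concrete with a toy computation. The sketch below is not from the post; the researchers, the endorsement matrix, and the paper names are all invented for illustration. A researcher's credibility is taken to be the principal eigenvector of the who-endorses-whom graph (found by power iteration, as in PageRank), and a piece of research is then rated by the credibility-weighted endorsements it receives, in contrast to "direct evaluation", which would score it by real-world outcomes.

```python
# Minimal sketch of PageRank-style "eigen-evaluation" (illustrative only).
# All names, endorsements, and numbers below are invented assumptions.

import numpy as np

# endorse[i][j] = 1 means researcher i endorses researcher j's judgment.
researchers = ["A", "B", "C", "D"]
endorse = np.array([
    [0, 1, 1, 0],
    [1, 0, 1, 0],
    [1, 1, 0, 1],
    [0, 0, 1, 0],
], dtype=float)

# Row-normalize so each researcher's endorsements sum to 1 (like outgoing links).
M = endorse / endorse.sum(axis=1, keepdims=True)

damping = 0.85          # PageRank-style damping keeps the iteration well-behaved
n = len(researchers)
weights = np.full(n, 1.0 / n)
for _ in range(100):    # power iteration converges quickly on a small graph
    weights = (1 - damping) / n + damping * (M.T @ weights)

# A paper's "eigen-evaluation" score: credibility-weighted sum of its endorsers.
paper_endorsed_by = {"paper_X": ["A", "C"], "paper_Y": ["D"]}
idx = {name: i for i, name in enumerate(researchers)}
for paper, endorsers in paper_endorsed_by.items():
    score = sum(weights[idx[e]] for e in endorsers)
    print(paper, round(score, 3))
```

The point of the toy model is only that, under eigen-evaluation, whose endorsement counts is itself decided by the same web of endorsements, whereas direct evaluation would need an external outcome to check against.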