LW - So you want to work on technical AI safety by gw

The Nonlinear Library

Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: So you want to work on technical AI safety, published by gw on June 24, 2024 on LessWrong.
I've been to two EAGx events and one EAG, and the vast majority of my one-on-ones with junior people end up covering some subset of these questions. I'm happy to have such conversations, but hopefully this is more efficient and wide-reaching (and more than I could fit into a 30-minute conversation).
I am specifically aiming to cover advice on getting a job in empirically-leaning technical research (interp, evals, red-teaming, oversight, etc.) for new or aspiring researchers without being overly specific about the field of research. I'll try to be more agnostic than something like Neel Nanda's mechinterp quickstart guide but more specific than the wealth of existing career advice that applies to ~any career.
This also has some overlap with this excellent list of tips from Ethan Perez but is aimed a bit earlier in the funnel.
This advice is of course only from my perspective and background, which is that I did a PhD in combinatorics, worked as a software engineer at startups for a couple of years, did the AI Futures Fellowship, and now work at Timaeus as the research lead for our language model track. In particular, my experience is limited to smaller organizations, so "researcher" means some blend of research engineer and research scientist rather than strictly one or the other.
Views are my own and don't represent Timaeus and so on.
Requisite skills
What kind of general research skills do I need?
There's a lot of tacit knowledge here, so most of what I can offer is more about the research process. Items on this list aren't necessarily things you're expected to just have all of or otherwise pick up immediately, but they're much easier to describe than e.g. research taste. These items are in no particular order:
Theory of change at all levels. Yes, yes, theories of change, they're great. But theories of change are most often explicitly spoken of at the highest levels: how is research agenda X going to fix all our problems? Really, it's theories of change all the way down. The experiment you're running today should have some theory of change for how you understand the project you're working on. Maybe it's really answering some question about a sub-problem that's blocking you.
Your broader project should have some theory of change for your research agenda, even though it probably isn't solving it outright. If you can't trace up the stack why the thing you're doing day to day matters for your ultimate research ambitions, it's a warning flag that you're just spinning your wheels.
Be ok with being stuck. At a coarse resolution, being stuck is a very common steady state to be in. This can be incredibly frustrating, especially if you feel external pressure because you think you're not meeting the expectations others have of you, or if your time or money is running out (see also below, on managing burnout).
Two things that might help a new researcher: having a mentor who can reassure you that your rate of progress is fine (if you don't have access to a human, frontier LLMs are (un)surprisingly good!), and being more fine-grained about what progress means. If your experiment failed but you learned something new, that's progress!
Quickly prune bad ideas. Always look for cheap, fast ways to de-risk investing time (and compute) into ideas. If the thing you're doing is really involved, look for additional intermediates as you go that can disqualify it as a direction.
Communication. If you're collaborating with others, they should have some idea of what you're doing and why you're doing it, and your results should be clearly and quickly communicated. Good communication habits are kind of talked about to death, so I won't get into them too much here.
Write a lot. Wri...
