LW - LLMs as a Planning Overhang by Larks

Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: LLMs as a Planning Overhang, published by Larks on July 14, 2024 on LessWrong.

It's quite possible someone has already argued this, but I thought I should share just in case not.

Goal-Optimisers and Planner-Simulators

When people in the past discussed worries about AI development, this was often about AI agents - AIs that had goals they were attempting to achieve, objective functions they were trying to maximise. At the beginning we would make fairly low-intelligence agents, which were not very good at achieving things, and then over time we would make them more and more intelligent. At some point around human level they would start to take off, because humans are approximately intelligent enough to self-improve, and this would be much easier in silicon.

This does not seem to be exactly how things have turned out. We have AIs that are much better than humans at many things, such that if a human had these skills we would think they were extremely capable. In particular, LLMs are getting better at planning and forecasting, now beating many but not all people. But they remain worse than humans at other things, and most importantly the leading AIs do not seem to be particularly agentic - they do not have goals they are attempting to maximise; rather, they are just trying to simulate what a helpful redditor would say.

What is the significance for existential risk?

Some people seem to think this contradicts AI risk worries. After all, ignoring anthropics, shouldn't the presence of human-competitive AIs without problems be evidence against the risk of human-competitive AI? I think this is not really the case, because you can take a lot of the traditional arguments and just substitute 'agentic goal-maximising AIs, not just simulator-agents' wherever people said 'AI', and the argument still works. It seems like eventually people are going to make competent goal-directed agents, and at that point we will indeed have the problem of their exerting more optimisation power than humanity.

In fact, it seems like these non-agentic AIs might make things worse, because the goal-maximising agents will be able to use the non-agentic AIs. Previously we might have hoped for a period in which goal-seeking agents exerted influence on the world similar to a not-very-influential person who was not very good at planning or understanding the world. But if they can query the forecasting-LLMs and planning-LLMs, then as soon as an AI 'wants' something in the real world it seems like it will be much more able to get it. So these planning/forecasting non-agentic AIs might represent a sort of planning overhang, analogous to a Hardware Overhang: they don't directly give us existentially-threatening AIs, but they provide an accelerant for when agentic AIs do arrive.

How could we react to this?

One response would be to say that since agents are the dangerous thing, we should regulate/restrict/ban agentic AI development. In contrast, tool LLMs seem very useful and largely harmless, so we should promote them a lot and get a lot of value from them. Unfortunately, it seems like people are going to make AI agents anyway, because ML researchers love making things.
So an alternative possible conclusion would be that we should actually try to accelerate agentic AI research as much as possible, because eventually we are going to have influential AI maximisers, and we want them to occur before the forecasting/planning overhang (and the hardware overhang) gets too large. I think this also makes some contemporary safety/alignment work look less useful. If you are making our tools work better, perhaps by understanding their internal workings better, you are also making them work better for the future AI maximisers who will be using them. Only if the safety/alignment work applies directly to...