LW - My takes on SB-1047 by leogao

Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: My takes on SB-1047, published by leogao on September 9, 2024 on LessWrong.
I recently decided to sign a letter of support for SB 1047. Before deciding whether to do so, I felt it was important for me to develop an independent opinion on whether the bill was good, as opposed to deferring to the opinions of those around me, so I read through the full text of SB 1047.
After forming my opinion, I checked my understanding of tort law basics (definitions of "reasonable care" and "materially contribute") with a law professor who was recommended to me by one of the SB 1047 sponsors, but who was not directly involved in the drafting or lobbying for the bill. Ideally I would have consulted a completely independent lawyer, but this would have been prohibitively expensive and difficult on a tight timeline. This post outlines my current understanding; it is not legal advice.
My main impression of the final version of SB 1047 is that it is quite mild. Its obligations only cover models trained with $100M+ of compute, or finetuned with $10M+ of compute. [1] If a developer is training a covered model, they have to write an SSP (Safety and Security Protocol) that explains why they believe it is not possible to use the model (or a post-train/finetune of the model costing <$10M of compute) to cause critical harm ($500M+ in damage or mass casualties).
This would involve running evals, doing red teaming, etc. The SSP also has to describe what circumstances would cause the developer to decide to shut down training and any copies of the model that the developer controls, and how they will ensure that they can actually do so if needed. Finally, a redacted copy of the SSP must be made available to the public (and an unredacted copy filed with the Attorney General).
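To make the thresholds concrete, here is a toy Python sketch of the coverage rule and SSP obligations as the post describes them. The constants, function name, and structure are my own illustration and do not appear in the bill, which is written in legal prose, not code.

```python
# Toy illustration of the coverage thresholds as described above.
# All names and numbers here mirror the post's prose, not the bill's text.

TRAINING_THRESHOLD = 100_000_000    # $100M+ of training compute
FINETUNE_THRESHOLD = 10_000_000     # $10M+ of finetuning compute
CRITICAL_HARM_DAMAGE = 500_000_000  # $500M+ in damage (or mass casualties)

def is_covered(training_compute_cost: float,
               finetune_compute_cost: float = 0.0) -> bool:
    """A model is covered if trained with $100M+ of compute,
    or finetuned with $10M+ of compute."""
    return (training_compute_cost >= TRAINING_THRESHOLD
            or finetune_compute_cost >= FINETUNE_THRESHOLD)

# Per the post, a developer of a covered model must write an SSP that:
#   1. explains why the model (or a <$10M post-train/finetune of it) can't
#      be used to cause critical harm, backed by evals and red teaming;
#   2. specifies when training would be shut down, along with any copies
#      of the model the developer controls, and how that is ensured;
#   3. is published in redacted form, with an unredacted copy filed with
#      the Attorney General.
```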
This doesn't seem super burdensome, and is very similar to what labs are already doing voluntarily, but it seems good to codify these things because otherwise labs could stop doing them in the future. Also, current SSPs don't make hard commitments about when to actually stop training, so it would be good to have that.
If a critical harm happens, then the question for determining penalties is whether the developer met their duty to exercise "reasonable care" to prevent models from "materially contributing" to the critical harm. This is determined by looking at how good the SSP was (both in an absolute sense and when compared to other developers) and how closely it was adhered to in practice.
Reasonable care is a well-established concept in tort law that basically means you did the cost-benefit analysis that a reasonable person would have done. Importantly, it doesn't mean the developer has to be absolutely certain that nothing bad can happen.
For example, suppose you release an open source model after doing dangerous capabilities evals to make sure it can't make a bioweapon, but a few years later a breakthrough in scaffolding methods happens and someone makes a bioweapon using your model. As long as you were thorough in your dangerous capabilities evals, you would not be liable, because it would not have been reasonable for you to anticipate a breakthrough that would invalidate your evaluations.
Also, if mitigating the risk would be too costly, and the benefit of releasing the model far outweighs the risks of release, that is a valid reason not to mitigate the risk under the standard of reasonable care (e.g., the benefits of driving a car at a normal speed far outweigh the costs of car accidents, so reasonable care doesn't require driving at 2 mph to fully mitigate the risk of car accidents).
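The post doesn't cite it, but this cost-benefit idea has a classic formalization in tort law: the Learned Hand formula, under which a precaution is demanded by reasonable care only when its burden B is less than the expected loss it prevents, P times L. Below is a toy Python sketch applying that formula to the driving example; the formula is standard tort doctrine, but the function and all the numbers are invented for illustration and come from neither the post nor SB 1047.

```python
# Toy Learned Hand formula check (a classic tort-law formalization of
# reasonable care, not something SB 1047 itself spells out): a precaution
# is required only when its burden B is less than the expected loss it
# prevents, P * L. All numbers below are invented for illustration.

def precaution_required(burden: float, p_harm: float, loss: float) -> bool:
    """Return True if reasonable care demands the precaution: B < P * L."""
    return burden < p_harm * loss

# Driving at 2 mph would nearly eliminate accident risk, but the burden
# (losing almost all the value of driving) dwarfs the expected loss it
# prevents, so reasonable care doesn't demand it.
burden_of_crawling = 1_000_000  # hypothetical: value lost by driving at 2 mph
p_accident = 1e-4               # hypothetical per-trip accident probability
accident_loss = 100_000         # hypothetical cost of an accident

print(precaution_required(burden_of_crawling, p_accident, accident_loss))  # False
```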
My personal opinion is that the reasonable care standard is too weak to prevent AI from killing everyone. However, this also means that I think people opposing the current version of the bill because of the reasonable care requirement...