Discussion about this post

Felix Choussat:

I remember talking with Rudolf Laine about a similar concept a few months back: AI safety work disproportionately rewards incremental research and outlining new problems, as opposed to offering solutions. Because any solution will inevitably have flaws, and because the attack surface of the problem is so vast, it's much easier to publish work identifying new problems than to present a coherent solution to the issue you've pointed out.

I came into his talk with a similar problem. I basically don't buy the d/acc plan, and by extension the argument L&D give for promoting proliferation in the Intelligence Curse. But d/acc not working doesn't negate the inconvenient reality of concentration of power! If you take that problem seriously, your solution still needs to address it, along with everything else you're worried about: controlling institutions that are hard to coup, handoff to a night watchman ASI, preventing anyone from ever building superintelligence, etc.

It's also important to present theories of victory (ToVs) that are mutually coherent. A common example of this is:

1. Misuse of superintelligence to design superweapons is a catastrophic risk -> We should get the government to restrict the deployment/distributed development of at least the most powerful systems.

2. Concentration of power risks gradual disempowerment of the public -> We should redistribute the benefits of AI.

Each of these is individually reasonable to argue. But the problem is that most solutions to 1 will necessarily preclude 2: for the government to prevent proliferation of the most powerful AIs, it needs a monopoly on violence to enforce its laws (domestically, or as part of an international coalition). But it can only maintain this monopoly on violence by proactively monopolizing the AI systems capable of violence! And if it has this monopoly, the state has the power to revoke whatever redistribution it's doing whenever it likes.

Likewise: you can't stop a country from building nukes once it already has one. The only way international atomic nonproliferation can work is if the coalition of nuclear states has, effectively, a decisive strategic advantage over its competitors. You can't bomb Iran's enrichment facilities once it has finished developing a nuclear weapon; the risk of retaliation is too high.

If AI systems can cheaply/easily design new offense-dominant superweapons, then we'll end up in a similar dynamic: they'll need to be monopolized if you want the government to be able to enforce *any* restrictions on development, or else OpenAI will become the state (or worse, a regional warlord).

D/acc is, for all its faults, a coherent solution to this: it would remove the need to monopolize AIs in the first place. But what if I don't buy the "strong d/acc" view that every AI-enabled offensive tactic can be rendered defense-dominant, and instead think there is at least one capability/superweapon so offense-dominant that general superintelligence itself needs to be monopolized? Then we'd still need state control, and all the problems that come with it.

To reconcile this, we'd need to figure out why monopolizing nuclear weapons worked: why every government that has nukes doesn't just point them at its own citizens and at non-nuclear neighbors and extract resources. Is it because coordination across the chain of command is too hard? Retaliation from other nuclear states? Gains from trade exceeding what could be taken? Poor threat credibility? Which of these restraints would ASI break? (Probably threat credibility & gains from trade.) Which are useful for designing AI institutions? (Probably making the chain of command over ASI systems really big.)
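To make that concrete, here's a toy expected-value sketch of the monopolist's choice between coercion and trade. Everything in it (the payoffs, the probabilities, the reduction of the four restraints to a few parameters) is invented for illustration, not an estimate:

```python
# Toy model: a weapons monopolist chooses between coercing everyone
# else and trading with them. All numbers are invented illustrations.

def coerce_ev(loot, p_retaliation, retaliation_cost, p_command_refuses):
    """Coercion pays off only if the chain of command complies AND no
    rival retaliates; retaliation imposes a large cost either way."""
    p_success = (1 - p_retaliation) * (1 - p_command_refuses)
    return p_success * loot - p_retaliation * retaliation_cost

def trade_ev(gains_from_trade):
    return gains_from_trade

# Nuclear status quo: shaky threat credibility, real retaliation risk,
# large gains from trade. Coercion is a terrible bet.
print(coerce_ev(100, p_retaliation=0.5, retaliation_cost=1000,
                p_command_refuses=0.3), trade_ev(80))   # ~ -465 vs 80

# ASI monopolist: threats become credible (retaliation risk collapses)
# and near-autarky shrinks gains from trade. Coercion now dominates.
print(coerce_ev(100, p_retaliation=0.02, retaliation_cost=1000,
                p_command_refuses=0.3), trade_ev(20))   # ~ 48.6 vs 20

# Institutional lever: widen the chain of command so unilateral
# coercion almost certainly gets refused somewhere. Trade wins again.
print(coerce_ev(100, p_retaliation=0.02, retaliation_cost=1000,
                p_command_refuses=0.9), trade_ev(20))   # ~ -10.2 vs 20
```

Under these made-up numbers, the only restraint that survives the jump to ASI is the chain-of-command one, which is why widening the chain of command over ASI systems looks like the institutional lever worth pulling.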

In other words, if I think d/acc won't work, then I need to bite the bullet and offer a coherent plan that gets around these problems. We'd need to design government institutions that are difficult for any one actor to take over and use to disempower everyone else, while still giving those institutions the final say over superintelligence.

Julian:

Very interesting post!

> Yes, indeed, they are fragile, but you should know the causal story of how all the actions are going to lead to the correct things happening.

I feel like this is the step that needs more detail. You don't know what other people will do. You don't know all the weird second-order and nth-order effects that are going to happen. Given this, I don't see why a theory of victory (which is essentially one happy story you could make up about how the world goes) is useful. (I suppose it's useful if it helps you realize that your current actions fit into zero such stories, but I suspect motivated cognition and the huge number of degrees of freedom will prevent this from happening for most people who try the exercise.)

By analogy, what would a theory of victory for the industrial revolution or even the computer revolution have looked like?

I would be curious whether anyone has ever described successfully using a full theory of victory to good effect. Maybe Peter Thiel has; we should ask him what to do about AI.

There's an interesting comment by Richard Ngo on this topic that I think a lot about: [argh, I cannot find it, but it's him saying on LW that it's better to do science than to try to implement any specific theory of change, because we still don't understand enough about agents and intelligence and what we actually want, and at least doing science has a chance of improving that]
