Discussion about this post

Felix Choussat:

Takes on takes:

1. Does AI deterrence promote cooperation? - I think this strongly depends on how hard it is to break out of a stalemate. If mutual sabotage is easy and inflicts very long delays, then each country would be encouraged to cooperate and develop AI under mutually agreed terms. If sabotage is hard to sustain (ex: algorithmic progress advances far enough that you can easily train RSI-capable AIs at a covert blacksite), each country might bide its time and hope to break out first. I think the second is much more likely, since even the largest non-nuclear strikes on AI infrastructure would only buy you five or so years at the current rate of algorithmic efficiency improvement.

And in the short term, of course, you would clearly burn your bridges by taking such an aggressive position.
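The "five or so years" figure can be sanity-checked with some back-of-the-envelope arithmetic. This is my own illustrative sketch, not the author's calculation, and both numbers (97% of compute destroyed, efficiency doubling yearly) are assumptions chosen for illustration:

```python
import math

def delay_years(compute_destroyed_frac: float, efficiency_growth_per_year: float) -> float:
    """Years until algorithmic efficiency gains alone offset lost compute.

    If a strike leaves (1 - f) of compute intact, effective capability is
    recovered once growth^t >= 1 / (1 - f), i.e. t = log(1/(1-f)) / log(growth).
    """
    remaining = 1.0 - compute_destroyed_frac
    return math.log(1.0 / remaining) / math.log(efficiency_growth_per_year)

# Hypothetical: a strike destroys 97% of compute, efficiency doubles yearly.
print(round(delay_years(0.97, 2.0), 1))  # ~5.1 years
```

The point of the sketch is that even near-total destruction of compute only buys a logarithmic delay when the efficiency trend itself keeps compounding.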

5. AI research progress - There are a few competing explanations for the gap between leadership and employees. One is that the leaders are feeling pressure to speak AGI into reality by drumming up investor support, so they distort their public timelines.

From the research perspective, I think it's clear that there's several big unsolved capability bottlenecks (continual learning, sample efficiency, self-play in domains without perfect ground-truth). The bet from companies like Anthropic is that scaling LLMs will increase the productivity of their best human researchers enough that they can quickly come up with breakthroughs (such as by letting developers test more ideas more quickly), and that the solutions can then be stapled on top of the existing LLM paradigm.

7. AI Chips - My sense is that China might catch up qualitatively, but that the more important factor is their ability to scale how many they're making, which is still several years behind the U.S. See: Erich's analysis that most of the compute available to China in 2026 will come from legal H200 sales.

https://www.the-substrate.net/p/where-will-china-get-its-compute

To your other point, I think AI is clearly a government priority. It's just that the Chinese government is investing more heavily in robots and embodied AI, while the U.S. is focused on recursive software improvements to automate white-collar work. The Chinese perspective is roughly that it doesn't matter if the U.S. has slightly better models: China can stay close behind by iterating on U.S. improvements, and ultimately grow much more by preparing its industrial base to better integrate AI.

8. Alignment difficulty - Today's LLMs are both highly competent and mostly aligned, which has cut some of the edge from the original Yudkowskian foom-and-doom perspective. The division is between people who think that you can scale LLMs and that the HHH persona will remain robust, vs people who think that either a) LLMs cannot be scaled without some kind of self-play RL and this optimization pressure will misalign the model, or b) LLMs can't scale to ASI, and the only architecture that can will be heavily RL-based and prone to misalignment by default.

10. Good futures - State of the art thinking here is Forethought's Better Futures. The major crux in longtermist planning (to me) is whether the driving goal is to reduce existential risk as much as possible, or to try and maximize welfare. Ex: there's tension between a one-world government to avoid proliferation of offensive AI capabilities, and the freedom people would have to do a long-reflection-style consideration of what to do with future resources. I am much closer to the security end of the spectrum: "The main thing for us to do about Utopia is to protect its potential to someday be made real"

As far as your specific question on creating moral patients: if you can easily create new beings with welfare, and they have rights to some resources, there has to be a limit on creating new moral patients. Otherwise, some populations (digital or biological) will grow explosively and reduce us back to a Malthusian equilibrium.
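The Malthusian worry can be made concrete with a toy model. All numbers here are hypothetical, chosen only to illustrate the dynamic: a population that copies itself cheaply grows geometrically, while total resources grow slowly, so per-capita resources collapse toward a subsistence floor absent any cap on new patients.

```python
def per_capita_resources(years: int, pop_growth: float, resource_growth: float,
                         initial_per_capita: float) -> float:
    """Resources per being after `years`, given geometric growth rates.

    Per-capita share scales as (resource_growth / pop_growth)^years, so any
    population growing faster than resources drives the share toward zero.
    """
    return initial_per_capita * (resource_growth / pop_growth) ** years

# Hypothetical: population doubles yearly, resources grow 5%/year.
print(round(per_capita_resources(10, 2.0, 1.05, 1000.0), 2))  # ~1.59, down from 1000
```

A decade of unrestricted copying is enough to erase a 1000x resource surplus, which is why some limit on creating new moral patients seems forced.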

Shobha Dasari:

shard theory for the win
