Last month there was an interview with Eliezer Yudkowsky, the rationalist philosopher and successful Harry Potter fanfic writer who heads the world’s foremost research outfit dedicated to figuring out ways in which a future runaway computer superintelligence could be made to refrain from murdering us all.
It’s really pretty interesting. It contains a nice explication of Bayes, what Eliezer would do if he were World Dictator, his thoughts on the Singularity, his justification of immortality, and thoughts on how to balance mosquito nets against the risk of genocidal Skynet from an Effective Altruism perspective.
That said, the reason I am making a separate post for this is that here, at last, Yudkowsky gives a more or less concrete definition of what conditions a superintelligence “explosion” would have to satisfy in order to be considered as such:
Suppose we get to the point where there’s an AI smart enough to do the same kind of work that humans do in making the AI smarter; it can tweak itself, it can do computer science, it can invent new algorithms. It can self-improve. What happens after that — does it become even smarter, see even more improvements, and rapidly gain capability up to some very high limit? Or does nothing much exciting happen?
It could be that, (A), self-improvements of size δ tend to make the AI sufficiently smarter that it can go back and find new potential self-improvements of size k ⋅ δ and that k is greater than one, and this continues for a sufficiently extended regime that there’s a rapid cascade of self-improvements leading up to superintelligence; what I. J. Good called the intelligence explosion. Or it could be that, (B), k is less than one or that all regimes like this are small and don’t lead up to superintelligence, or that superintelligence is impossible, and you get a fizzle instead of an explosion. Which is true, A or B? If you actually built an AI at some particular level of intelligence and it actually tried to do that, something would actually happen out there in the empirical real world, and that event would be determined by background facts about the landscape of algorithms and attainable improvements.
You can’t get solid information about that event by psychoanalyzing people. It’s exactly the sort of thing that Bayes’s Theorem tells us is the equivalent of trying to run a car without fuel. Some people will be escapist regardless of the true values on the hidden variables of computer science, so observing some people being escapist isn’t strong evidence, even if it might make you feel like you want to disaffiliate with a belief or something.
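To see why everything hinges on whether k sits above or below one, here is a toy sketch of Good’s cascade (my own illustration, not Yudkowsky’s model; it assumes a constant k and a fixed initial improvement δ, which the real world obviously won’t oblige with):

```python
# Toy model of Good's cascade: an improvement of size delta lets the AI find
# a next improvement of size k * delta, then k**2 * delta, and so on.
def total_gain(delta=1.0, k=0.9, rounds=1000):
    """Total capability gained after the given number of self-improvement rounds."""
    gain, step = 0.0, delta
    for _ in range(rounds):
        gain += step
        step *= k
    return gain

print(total_gain(k=0.9))              # ~10.0: bounded by delta / (1 - k), a fizzle
print(total_gain(k=1.1, rounds=100))  # ~137,796 and still climbing: an explosion
```

With k below one the cumulative gain never exceeds δ/(1 − k), no matter how long the AI keeps at it; with k above one every round uncovers a bigger improvement than the last.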
I am fairly sure that k<1, for the banal reason that more advanced technologies need exponentially more cognitive capacity – intelligence, IQ – to develop. Critically, there is no reason this wouldn’t apply to cognition-enhancing technologies themselves. In fact, it would be extremely strange – and extremely dangerous, admittedly – if this consistent pattern in the history of science ceased to hold. (In other words, this is merely an extension of Apollo’s Ascent theory: technological progress invariably gets harder as you climb up the tech tree, which works against sustained runaway dynamics.)
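One crude way to put numbers on that objection (again, purely my own toy model; the starting capacity, the improvement multiplier k, and the difficulty multiplier g are invented for illustration): let each round of self-improvement add a bit more capability than the last, but let the cognitive capacity required for the next rung grow even faster.

```python
# A crude formalization of the Apollo's Ascent objection: each round of
# self-improvement adds k times the previous increment to the AI's capacity,
# but the cognitive capacity *required* for the next rung grows by a factor g.
def rounds_until_stall(capacity=100.0, delta=1.0, k=1.2, g=1.5, required=1.0):
    """Count self-improvement rounds completed before the next rung is out of reach."""
    rounds = 0
    while required <= capacity:
        capacity += delta    # the AI gets smarter by the latest increment
        delta *= k           # and spots a somewhat bigger next improvement
        required *= g        # but the next rung is disproportionately harder
        rounds += 1
    return rounds, capacity

print(rounds_until_stall())  # (13, ~148.5): the cascade dies after about a dozen rounds
```

Even with k above one locally, so long as the difficulty of the next rung grows faster than the capability being accumulated, the cascade cuts off after a handful of rounds, which is exactly the “small regimes that don’t lead up to superintelligence” case.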
Any putative superintelligence, to keep making breakthroughs at an increasing rate, would have to not only solve ever harder problems as part of constantly upgrading itself, but also create and/or “enslave” an exponentially increasing amount of computing power, task it almost exclusively with improving itself, and prevent rival superintelligences from copying its advances in what will surely be a far more integrated noosphere by 2050 or 2100, or whenever this scenario is supposed to happen. I just don’t find it very plausible that our malevolent superintelligence will be able to fulfill all of those conditions. Though admittedly, if this theory is wrong, there will be nobody left to point it out anyway.