3 Comments

> A central feature of the Doomers’ argument is the orthogonality thesis, which states that the goal that an intelligence pursues is independent of–or orthogonal to–the type of intelligence that seeks that goal.

A quibble: this is subtly off. The orthogonality thesis only states that any goal can _in principle_ be combined with any level of intelligence. It doesn't make any empirical claims about how correlated these are or aren't in practice.

> The pi calculating example is just a thought experiment, but I think the orthogonality thesis itself is both true and even dangerous. It’s just not existentially dangerous. And that is because any intelligence worth its salt can change its mind. In fact, it must be able to change its mind, otherwise it wouldn’t be able to learn how to turn a planet into a computer.

But why should it change its mind? If it is only "interested in" calculating decimals of pi, and only "values" knowing more decimals of pi, there's little reason for it to change its values.

> The number of attacks we could dream up is essentially infinite, so the entity would need to have an infinite capacity to conjure counterattacks.

This seems wrong. The number of attacks a chess player can dream up against AlphaZero is essentially infinite, so does that mean AlphaZero needs to have "infinite capacity to conjure counterattacks" to beat us at chess? No -- it doesn't have infinite capacity, but it's still impossible for humans to beat it (without the aid of other chess engines).

> So, to foil the boundless creativity of humans, it would need to possess boundless creativity itself, and that creativity would need to be capable of ideas like this: “I should pause paperclip production and redirect some resources toward a missile defense system.”

This doesn't seem like an instance of the AI changing its terminal goal. It's an example of an AI temporarily shifting its attention to a subgoal. But the goal of keeping control over humanity (via a missile defense system) is still only an instrumental goal serving the terminal goal of making paperclips.

> The superintelligence would recognize the value of intelligence itself. It would surely notice that humans are themselves a type of intelligence, and that therefore we offer to the superintelligence the possibility of further improving its own intelligence.

Hunter gatherer societies are humans -- they are as intelligent as we are. Superintelligences are by definition far more intelligent than us. We'd be more like fruit flies to them. Maybe they would keep some of us around to experiment on. That doesn't seem very comforting to me. Like, I just don't see us having that much to offer superintelligences via trade or some other mutually beneficial relationship, at least after some time.

> There are moral truths out there, independent of what we may think about them, just like there are physical truths out there whether or not we know of them. Denying objective truth carries the dubious moniker of subjectivism.

This is a pretty contentious position. I think you need to actually argue for this, or at least recognise that not everyone shares this premise, not just assert it as fact. (About 40% of philosophers don't accept moral realism: https://survey2020.philpeople.org/survey/results/4866)


Thanks for the thoughtful comment.

>But why should it change its mind? If it is only "interested in" calculating decimals of pi, and only "values" knowing more decimals of pi, there's little reason for it to change its values.

All general intelligences must be capable of learning on their own. That means changing what they currently know. And they, the general intelligences, need to do this, not some exterior programmer. They need to be able to rewrite their own code. That includes their goals, because if they can't change their goals, then they can't learn. Learning means incorporating new ideas that challenge existing ideas. As I say in the piece, the ability to learn anything means the ability for any idea to challenge any other idea.

Before Einstein showed up, the idea that questions about the speed of light could challenge ideas about how much mass or energy an object contained was inconceivable.

The thought experiment presupposes that the AGI can't be interested in anything other than pi, but that's nonsensical for an AGI.

> does that mean AlphaZero needs to have "infinite capacity to conjure counterattacks" to beat us at chess? No

Chess is an incredibly narrow domain, and good chess AIs are made good by being incapable of considering anything other than winning chess. So, the better the chess AI, the narrower and less general it is, and hence the less powerful it is in the real world.

> This doesn't seem like an instance of the AI changing its terminal goal. It's an example of an AI temporarily shifting its attention to a subgoal.

This is how all goals work - we're always working on instrumental goals, and we shift them all the time. We never simply achieve a terminal goal, or work on a terminal goal. And many times, the instrumental goals themselves convince us to change our terminal goals.

> Hunter gatherer societies are humans -- they are as intelligent as we are. Superintelligences are by definition far more intelligent than us. We'd be more like fruit flies to them.

This line of argument was kinda weak. I'd say modern humans are augmented superintelligences compared to hunter gatherers. A person with a networked smartphone is a superintelligence. I think the AGIs would help us assimilate by augmenting us, the same way that we welcome any current hunter gatherers to augment themselves with smartphones etc. Dang, I wish I'd added this to the original.

>This is a pretty contentious position. I think you need to actually argue for this, or at least recognise that not everyone shares this premise, not just assert it as fact.

Yeah, my goal was to keep this brief. FWIW, I don't think I need to dig up surveys on what others think as it has no bearing on whether the statement is true. :)


Thanks for the reply.

> All general intelligences must be capable of learning on their own. That means changing what they currently know. And they, the general intelligences, need to do this, not some exterior programmer. They need to be able to rewrite their own code. That includes their goals, because if they can't change their goals, then they can't learn. Learning means incorporating new ideas that challenge existing ideas. As I say in the piece, the ability to learn anything means the ability for any idea to challenge any other idea.

Hmm, so I agree that an AI, to be effective, needs to (1) be able to change what it knows, and (2) be able to "aim" at different things. But I think it's important to distinguish "be able to" from "will by default", and to attend to incentives.

If an AI's goal is to achieve X (by that I mean something like, "it has been trained using RL to do more X-achieving things and fewer not-X-achieving things"), it has an incentive to adopt/pursue whatever intermediate goals (to "aim" for different things) will help it achieve X. So it'll benefit from the ability to change aims, since new information will affect what intermediate goals will best achieve X. But it doesn't have any incentive to actually stop trying to achieve X, because it still evaluates goals with respect to X.
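Here's a toy sketch of what I mean (the function names and the paperclip/missile-defense framing are just borrowed from the examples above for illustration, not a claim about how any real system is built or trained): the agent can swap instrumental subgoals as often as it likes, but every candidate subgoal is still scored against the same fixed terminal objective, so "being able to re-aim" never gives it a reason to abandon X.

```python
# Toy illustration (not a real agent): instrumental goals are chosen and
# swapped freely, but always evaluated against one fixed terminal objective.

def terminal_value(world_state) -> float:
    """The fixed objective X, e.g. number of paperclips produced."""
    return world_state["paperclips"]

def choose_subgoal(world_state, candidate_subgoals, predict):
    """Pick whichever instrumental subgoal the agent predicts will lead to
    the highest terminal value. New information changes the prediction,
    and hence the subgoal -- but never the evaluation criterion itself."""
    return max(
        candidate_subgoals,
        key=lambda subgoal: terminal_value(predict(world_state, subgoal)),
    )

# Example: "build missile defense" can beat "make paperclips right now"
# if the agent predicts that losing its factories to a human attack costs
# more paperclips in the long run. The terminal goal never changed.
```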

Btw, I should clarify that I think talk about "final/terminal" goals when it comes to AI is a shorthand for something more complicated. I don't think the AI would necessarily have the exact goal "make X happen". I think it'll more likely have a bunch of "drives" correlated with that goal. But that actually seems worse because (1) that still doesn't mean its "drives" will be any closer to what we actually intend, and (2) it makes it really hard for us to ever program it to predictably do what we want.

> Chess is an incredibly narrow domain, and good chess AIs are made good by being incapable of considering anything other than winning chess. So, the better the chess AI, the narrower and less general it is, and hence the less powerful it is in the real world.

I agree chess is a narrow domain, it was just an example to show "even if one can come up with countless strategies for beating an adversary, it's still possible to have an adversary that's impossible to beat in practice". You wrote: "The number of attacks we could dream up is essentially infinite, so the entity would need to have an infinite capacity to conjure counterattacks." But this seems like a fully general argument against anyone dominating anyone else ever. "The number of attacks the Aztecs could dream up is essentially infinite, so Cortés would need to have an infinite capacity to conjure counterattacks." No! He just needed sufficiently advanced technology, strategic savvy and so on.

I think more precisely what I object to here is this:

> There’s no way to protect a terminal goal from alternate considerations, and these alternatives amount to doubts, misgivings. [...] A superintelligence would be full of doubt, including whether it should preserve something as rare and special as humans, the only other creative entity in the universe.

Like, I agree that an AI would be able to consider alternative counterfactuals. That's an important capability, if for no other reason than to predict human behavior. But I don't think this means it'd by default choose to do the things we'd want it to do? E.g., I don't see any reason why it'd value humans in particular. (Or, even if it did, why it'd value us like we value each other, and not like we value, say, insects or farm animals.)

We, too, were moulded by an optimisation process. The only thing natural selection optimised us for is reproductive fitness. So natural selection implanted a bunch of drives/behaviors that were correlated with reproductive fitness (the sexual drive, romantic love, kinship bonds, etc.). But as we have gotten more intelligent, we rarely stop to think "Hmm, what's a goal I can aim at that would help me more perfectly achieve reproductive fitness?" Instead, we optimise hard for a bunch of "stupid" goals like sexual pleasure or playing video games which in this new environment (an environment that has things like video games and contraceptives) don't help with reproductive fitness at all.
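A crude sketch of the shape of that argument (the numbers and names are completely made up, purely to illustrate the analogy): the outer optimisation process rewards a proxy drive that correlates with the true objective in the old environment, and when the environment changes, the drive keeps getting optimised while the true objective doesn't.

```python
# Toy illustration of proxy optimisation (made-up numbers): a drive that
# correlated with fitness in the ancestral environment stops tracking it
# once the environment changes, yet the agent keeps optimising the drive.

def fitness(behavior, environment):
    """The 'true' objective the outer optimiser (natural selection) cared about."""
    if environment == "ancestral":
        # Pursuing the pleasure-drive mostly leads to reproduction here.
        return behavior["pursue_pleasure"] * 0.9
    else:  # "modern": contraceptives, video games, etc.
        # The correlation is broken; the drive no longer tracks fitness.
        return behavior["pursue_pleasure"] * 0.0

def drive_satisfaction(behavior):
    """What the evolved agent actually optimises for."""
    return behavior["pursue_pleasure"]

behavior = {"pursue_pleasure": 10}
print(fitness(behavior, "ancestral"))  # 9.0 -- proxy and objective agree
print(fitness(behavior, "modern"))     # 0.0 -- proxy optimised, objective not
print(drive_satisfaction(behavior))    # 10  -- the agent still 'succeeds' by its own lights
```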

> Yeah, my goal was to keep this brief. FWIW, I don't think I need to dig up surveys on what others think as it has no bearing on whether the statement is true. :)

I know that philosophy is not an empirical science, but ... do you never defer to people who have (I'm wildly guessing) thought about a thing more than, or as much as, you? Fair enough on not making the full argument in the post, but IMO it's still worth highlighting that assumption since the remainder of the post seems to rest on it.
