PHYS771 Lecture 10.5: Penrose

Scott Aaronson


So, you guys finally finished reading Roger Penrose's The Emperor's New Mind? What did you think of it?

(Since I forgot to record this lecture, the class responses are tragically lost to history. But if I recall correctly, the entire class turned out to consist of -- YAWN -- straitlaced, clear-thinking materialistic reductionists who correctly pointed out the glaring holes in Penrose's arguments. No one took Penrose's side, even just for sport.)


Alright, so let me try a new tack: who can summarize Penrose's argument (or more correctly, a half-century-old argument adapted by Penrose) in a few sentences?

How about this: Gödel's First Incompleteness Theorem tells us that no computer, working within a fixed formal system F such as Zermelo-Fraenkel set theory, can prove the sentence

    G(F) = "This sentence cannot be proved in F."

But we humans can just "see" the truth of G(F) -- since if G(F) were false, then it would be provable, which is absurd! Therefore the human mind can do something that no present-day computer can do. Therefore consciousness can't be reducible to computation.

Alright, class: problems with this argument?

Yeah, there are two rather immediate ones:

  1. Why does the computer have to work within a fixed formal system F?
  2. Can humans really "see" the truth of G(F)?

Actually, the response I prefer encapsulates both of the above responses as "limiting cases." Recall from Lecture 3 that, by the Second Incompleteness Theorem, G(F) is equivalent to Con(F): the statement that F is consistent. Furthermore, this equivalence can be proved in F itself for any reasonable F. This has two important implications.

First, it means that when Penrose claims that humans can "see" the truth of G(F), really he's just claiming that humans can see the consistency of F! When you put it that way, the problems become more apparent: how can humans see the consistency of F? Exactly which F's are we talking about: Peano Arithmetic? ZF? ZFC? ZFC with large cardinal axioms? Can all humans see the consistency of all these systems, or do you have to be a Penrose-caliber mathematician to see the consistency of the stronger ones? What about the systems that people thought were consistent, but that turned out not to be? And even if you did see the consistency of (say) ZF, how would you convince someone else that you'd seen it? How would the other person know you weren't just pretending?

(Models of Zermelo-Fraenkel set theory are like those 3D dot pictures: sometimes you really have to squint...)

The second implication is that, if we grant a computer the same freedom that Penrose effectively grants to humans -- namely, the freedom to assume the consistency of the underlying formal system -- then the computer can prove G(F).
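
To make the bookkeeping explicit (these are just the standard facts from Lecture 3 restated in symbols, nothing specific to Penrose): for any reasonable F,

    F ⊢ G(F) ↔ Con(F),   and therefore   F + Con(F) ⊢ G(F).

In other words, a computer permitted to take Con(F) on faith proves G(F) by exactly the same leap of faith the human is making.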

So the question boils down to this: can the human mind somehow peer into the Platonic heavens, in order to directly perceive (let's say) the consistency of ZF set theory? If the answer is no -- if we can only approach mathematical truth with the same unreliable, savannah-optimized tools that we use for doing the laundry, ordering Chinese takeout, etc. -- then it seems we ought to grant computers the same liberty of being fallible. But in that case, the claimed distinction between humans and machines would seem to evaporate.

(Perhaps Turing himself said it best: "If a machine is expected to be infallible, it cannot also be intelligent. There are several theorems which say almost exactly that.")

In my opinion, then, Penrose doesn't need to be talking about Gödel's theorem at all. The Gödel argument turns out to be just a mathematical restatement of the oldest argument against reductionism in the book: "sure a computer could say it perceives G(F), but it'd just be shuffling symbols around! When I say I perceive G(F), I really mean it! There's something it feels like to be me!"

The obvious response is equally old: "what makes you so sure that it doesn't feel like anything to be a computer?"


Years ago I parodied Penrose's argument by means of the Gödel CAPTCHA. Recall from Lecture 4 that a CAPTCHA (Completely Automated Public Turing test to tell Computers and Humans Apart) is a test that today's computers can generate and grade, but not pass. These are those "retype the curvy-looking nonsense word" deals that Yahoo and Google use all the time to root out spambots. Alas, today's CAPTCHAs are far from perfect; some of them have even been broken by clever researchers.

By exploiting Penrose's insights, I was able to create a completely unbreakable CAPTCHA. How does it work? It simply asks whether you believe the Gödel sentence G(F) for some reasonable formal system F! Assuming you answer yes, it then (and this is a minor security hole I should really patch sometime) asks whether you're a human or a machine. If you say you're a human, you pass. If, on the other hand, you say you're a machine, the program informs you that, while your answer happened to be correct in this instance, you clearly couldn't have arrived at it via a knowably sound procedure, since you don't possess the requisite microtubules. Therefore your request for an email account must unfortunately be denied.
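
Here's a minimal sketch of the parody's logic in Python (the prompts, messages, and function name are mine, purely for illustration):

    # Toy sketch of the "Godel CAPTCHA" parody. The punchline is that the test
    # never inspects the content of your answer -- only your self-reported
    # hardware -- mirroring the asymmetry of trust in Penrose's argument.

    def godel_captcha() -> str:
        believes_G = input("Do you believe the Godel sentence G(F) of your "
                           "favorite reasonable formal system F? (yes/no) ")
        if believes_G.strip().lower() != "yes":
            return "Access denied: you don't even claim to see the truth of G(F)."

        # The "minor security hole": we simply ask, and take your word for it.
        kind = input("Are you a human or a machine? ")
        if kind.strip().lower() == "human":
            return "Congratulations! Your email account has been created."
        return ("Your answer happened to be correct, but lacking the requisite "
                "microtubules, you couldn't have reached it by a knowably sound "
                "procedure. Request denied.")

    if __name__ == "__main__":
        print(godel_captcha())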


Opening the Black Box

Alright, look: Roger Penrose is one of the greatest mathematical physicists on Earth. Is it possible that we've misconstrued his thinking?

To my mind, the most plausible-ish versions of Penrose's argument are the ones based on an "asymmetry of understanding": namely that, while we know the internal workings of a computer, we don't yet know the internal workings of the brain.

How can one exploit this asymmetry? Well, given any known Turing machine M, it's certainly possible to construct a sentence that stumps M:

    S(M) = "Machine M will never output this sentence."

There are two cases: either M outputs S(M), in which case it utters a falsehood, or else M doesn't output S(M), in which case there's a mathematical truth to which it can never assent.

The obvious response is, why can't we play the same game with humans?

Well, conceivably there's an answer: because we can formalize what it means for M to output something, by examining its inner workings. (Indeed, "M" is really just shorthand for the appropriate Turing machine state diagram.) But can we formalize what it means for Penrose to output something? The answer depends on what we believe about the internal workings of the brain (or more precisely, Penrose's brain)! And this leads to Penrose's view of the brain as "non-computational."

A common misconception is that Penrose thinks the brain is a quantum computer. In reality, a quantum computer would be much weaker than he wants! As we saw before, quantum computers don't even seem able to solve NP-complete problems in polynomial time. Penrose, by contrast, wants the brain to solve uncomputable problems, by exploiting hypothetical collapse effects from a yet-to-be-discovered quantum theory of gravity.

When Penrose visited Perimeter Institute a few years ago, I asked him: why not go further, and conjecture that the brain can solve problems that are uncomputable even given an oracle for the halting problem, or an oracle for the halting problem for Turing machines with an oracle for the halting problem, etc.? His response was that yes, he'd conjecture that as well.


My own view has always been that, if Penrose really wants to speculate about the impossibility of simulating the brain on a computer, then he ought to talk not about computability but about complexity. The reason is simply that, in principle, we can always simulate a person by building a huge lookup table, which encodes the person's responses to every question that could ever be asked within (say) a million years. If we liked, we could also have the table encode the person's voice, gestures, facial expressions, etc. Clearly such a table will be finite. So there's always some computational simulation of a human being -- the only question is whether or not it's an efficient one!
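
As a cartoon of what such a simulation would look like (the data structure is the whole point; the tiny example table here is obviously mine, not anyone's actual conversation):

    # Cartoon of the lookup-table "simulation." Conceptually, the table maps
    # every possible conversation history -- a finite string of bits -- to the
    # person's next response. If at most n bits are ever exchanged, there are
    # at most 2^n histories: a finite number, though astronomically larger
    # than the observable universe for any realistic n.

    def simulate_person(table: dict[str, str], history: str) -> str:
        """Return the canned response stored for this conversation history."""
        return table[history]

    # Toy table with histories of at most 2 bits; a real one would be unbuildable.
    toy_table = {
        "": "Hello!",
        "0": "No.",
        "1": "Yes.",
        "00": "Still no.",
        "01": "Hmm, let me think.",
        "10": "Why do you ask?",
        "11": "Certainly.",
    }
    print(simulate_person(toy_table, "10"))  # -> "Why do you ask?"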

You might object that, if people could live for an infinite or even just an arbitrarily long time, then the lookup table wouldn't be finite. This is true but irrelevant. The fact is, people regularly do decide that other people have minds after interacting with them for just a few minutes! (Indeed, maybe just a few minutes of email or instant messaging.) So unless you want to retreat into Cartesian skepticism about everyone you've ever met on MySpace, Gmail chat, the Shtetl-Optimized comment section, etc., there must be a relatively small integer n such that by exchanging at most n bits, you can be reasonably sure that someone else has a mind.

In Shadows of the Mind (the "sequel" to The Emperor's New Mind), Penrose concedes that a human mathematician could always be simulated by a computer with a huge lookup table. He then argues that such a lookup table wouldn't constitute a "proper" simulation, since (for example) there'd be no reason to believe that any given statement in the table was true rather than false. The trouble with this argument is that it explicitly retreats from what one might have thought was Penrose's central claim: namely, that a machine can't even simulate human intelligence, let alone exhibit it!

In Shadows, Penrose offers the following classification of views on consciousness:

  A. Consciousness is reducible to computation (the view of strong-AI proponents)
  B. Sure, consciousness can be simulated by a computer, but the simulation couldn't produce "real understanding" (John Searle's view)
  C. Consciousness can't even be simulated by computer, but nevertheless has a scientific explanation (Penrose's own view, according to Shadows)
  D. Consciousness doesn't have a scientific explanation at all (the view of 99% of everyone who ever lived)

Now it seems to me that, in dismissing the lookup table as not a "real" simulation, Penrose is retreating from view C to view B. For as soon as we say that passing the Turing Test isn't good enough -- that one needs to "pry open the box" and examine a machine's internal workings to know whether it thinks or not -- what could possibly be the content of view C that would distinguish it from view B?


Again, though, I want to bend over backwards to see if I can figure out what Penrose might be saying.

In science, you can always cook up a theory to "explain" the data you've seen so far: just list all the data you've got, and call that your "theory"! The obvious problem here is overfitting. Since your theory doesn't achieve any compression of the original data -- i.e., since it takes as many bits to write down your theory as to write down the data itself -- there's no reason to expect your theory to predict future data. In other words, your theory is a useless piece of shit.

So, when Penrose says the lookup table isn't a "real" simulation, perhaps what he means is this. Of course one could write a computer program to converse like Disraeli or Churchill, by simply storing every possible quip and counterquip. But that's the sort of overfitting up with which we must not put! The relevant question is not whether we can simulate Sir Winston by any computer program. Rather, it's whether we can simulate him by a program that can be written down inside the observable universe -- one that, in particular, is dramatically shorter than a list of all possible conversations with him.

Now, here's the point I keep coming back to: if this is what Penrose means, then he's left the world of Gödel and Turing far behind, and entered my stomping grounds -- the Kingdom of Computational Complexity. How does Penrose, or anyone else, know that there's no small Boolean circuit to simulate Winston Churchill? Presumably we wouldn't be able to prove such a thing, even supposing (for the sake of argument) that we knew what a Churchill simulator meant! All ye who would claim the intractability of finite problems: that way lieth the P versus NP beast, from whose 2^n jaws no mortal hath yet escaped.


At Risk of Stating the Obvious

Even if we supposed the brain was solving a hard computational problem, it's not clear why that would bring us any closer to understanding consciousness. If it doesn't feel like anything to be a Turing machine, then why does it feel like something to be a Turing machine with an oracle for the halting problem?


All Aboard the Holistic Quantum Gravy Train

Let's set aside the specifics of Penrose's ideas, and ask a more general question. Should quantum mechanics have any effect on how we think about the brain?

The temptation is certainly a natural one: consciousness is mysterious, quantum mechanics is also mysterious, therefore they must be related somehow! Well, maybe there's slightly more to it than that, since the source of the mysteriousness seems the same in both cases: namely, how do we reconcile a third-person description of the world with a first-person experience of it?

When people try to make the question more concrete, they often end up asking: "is the brain a quantum computer?" Well, it might be, but I can think of at least four good arguments against this possibility:

  1. The problems for which quantum computers are believed to offer dramatic speedups -- factoring integers, solving Pell's equation, simulating quark-gluon plasmas, approximating the Jones polynomial, etc. -- just don't seem like the sorts of things that would have increased Oog the Caveman's reproductive success relative to his fellow cavemen.
  2. Even if humans could benefit from quantum computing speedups, I don't see any evidence that they're actually doing so. (It's said that Gauss could immediately factor large integers in his head -- but if so, that only proves that Gauss's brain was a quantum computer, not that anyone else's is!)
  3. The brain is a hot, wet environment, and it's hard to understand how long-range coherence could be maintained there. (With today's understanding of quantum error-correction, this is no longer a knock-down argument, but it's still an extremely strong one.)
  4. As I mentioned earlier, even if we suppose the brain is a quantum computer, it doesn't seem to get us anywhere in explaining consciousness, which is the usual problem that these sorts of speculations are invoked to solve!


Alright, look. So as not to come across as a total curmudgeon -- for what could possibly be further from my personality? -- let me at least tell you what sort of direction I would pursue if I were a woo-woo quantum mystic.

Near the beginning of Emperor's New Mind, Penrose brings up one of my all-time favorite thought experiments: the teleportation machine. This is a machine that whisks you around the galaxy at the speed of light, by simply scanning your whole body, encoding all the cellular structures as pure information, and then transmitting the information as radio waves. When the information arrives at its destination, nanobots (of the sort we'll have in a few decades, according to Ray Kurzweil et al.) use the information to reconstitute your physical body down to the smallest detail.

Oh, I forgot to mention: since obviously we don't want two copies of you running around, the original is destroyed by a quick, painless gunshot to the head. So, fellow scientific reductionists: which one of you wants to be the first to travel to Mars this way?

What, you feel squeamish about it? Are you going to tell me you're somehow attached to the particular atoms that currently reside in your brain? As I'm sure you're aware, those atoms are replaced every few weeks anyway. So it can't be the atoms themselves that make you you; it has to be the patterns of information they encode. And as long as the information is safely on its way to Mars, who cares about the original meat hard-drive?

So, soul or bullet: take your pick!


Quantum mechanics does offer a third way out of this dilemma, one that wouldn't make sense in classical physics.

Suppose some of the information that made you you was actually quantum information. Then even if you were a thoroughgoing materialist, you could still have an excellent reason not to use the teleportation machine: because, as a consequence of the No-Cloning Theorem, no such machine could possibly work as claimed.
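
(For reference, the No-Cloning Theorem is just a statement about linearity: there's no unitary U satisfying

    U(|ψ⟩ ⊗ |0⟩) = |ψ⟩ ⊗ |ψ⟩   for every state |ψ⟩,

since applying U to a superposition (|ψ⟩ + |φ⟩)/√2 yields (|ψ⟩|ψ⟩ + |φ⟩|φ⟩)/√2 by linearity, rather than the product state (|ψ⟩ + |φ⟩)(|ψ⟩ + |φ⟩)/2 that cloning would demand.)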

This is not to say that you couldn't be teleported around at the speed of light. But the teleportation process would have to be very different from the one above: it could not involve copying you and then killing the original copy. Either you could be sent as quantum information, or else -- if that wasn't practical -- you could use the famous BBCJPW protocol, which sends only classical information, but also requires prior entanglement between the sender and the receiver. In either case, the original copy of you would disappear unavoidably, as part of the teleportation process itself. Philosophically, it would be just like flying from Newark to LAX: you wouldn't face any profound metaphysical dilemma about "whether to destroy the copy of you still at Newark."
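
In outline, here's why the original disappears (this is just the standard teleportation identity, not anything specific to Penrose's discussion). If Alice holds the qubit |ψ⟩ = α|0⟩ + β|1⟩ plus her half of a shared EPR pair |Φ⁺⟩ = (|00⟩ + |11⟩)/√2, with the other half at Bob's end, then

    |ψ⟩ ⊗ |Φ⁺⟩ = ½ [ |Φ⁺⟩ ⊗ |ψ⟩ + |Φ⁻⟩ ⊗ Z|ψ⟩ + |Ψ⁺⟩ ⊗ X|ψ⟩ + |Ψ⁻⟩ ⊗ XZ|ψ⟩ ],

where the first two qubits in each term belong to Alice and the last to Bob. Alice measures her two qubits in the Bell basis, sends Bob the two resulting classical bits, and Bob applies the corresponding Pauli correction to recover |ψ⟩. The measurement leaves Alice's side in one of the four Bell states, carrying no trace of α or β: the original is destroyed by the protocol itself, not by a gunshot afterwards.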

Of course, this neat solution can only work if the brain stores quantum information. But crucially, in this case we don't have to imagine that the brain is a quantum computer, or that it maintains entanglement across different neurons, or anything harebrained like that. As in quantum key distribution, all we need are individual coherent qubits.

Now, you might argue that in a hot, wet, decoherent place like the brain, not even a single qubit would survive for very long. And from what little I know of neuroscience, I'd tend to agree. In particular, it does seem that long-term memories are encoded as synaptic strengths, and that these strengths are purely classical information that a nanobot could in principle scan and duplicate without any damage to the original brain. On the other hand, what about (say) whether you're going to wiggle your left finger or your right finger three seconds from now? Is that decision determined in part by quantum events?

Well, whatever else you might think about such a hypothesis, it's clear what it would take to falsify it. You'd simply have to build a machine that scanned a person's brain, and correctly predicted which finger that person would wiggle three seconds from now. (If I remember correctly, computers hooked up to EEG machines can make these sorts of predictions today, but only a small fraction of a second in advance -- not three seconds.)


[Discussion of this lecture on blog]
