PHYS771 Lecture 18: Free Will

Scott Aaronson

Scribe: Chris Granade

So today we're going to ask---and hopefully answer---this question of whether there's free will or not. If you want to know where I stand, I'll tell you: I believe in free will. Why? Well, the neurons in my brain just fire in such a way that my mouth opens and I say I have free will. What choice do I have?

Before we start, there are two common misconceptions that we have to get out of the way. The first one is committed by the free will camp, and the second by the anti-free-will camp.

The misconception committed by the free will camp is the one I alluded to before: if there's no free will, then none of us are responsible for our actions, and hence (for example) the legal system would collapse. Well, I know of only one trial where the determinism of the laws of physics was actually invoked as a legal defense. It's the Leopold and Loeb trial in 1924. Have you heard of this? It was one of the most famous trials in American history, next to the OJ trial. So, Leopold and Loeb were these brilliant students at the University of Chicago (one of them had just finished his undergrad at 18), and they wanted to prove that they were Nietzschean supermen who were so smart that they could commit the perfect murder and get away with it. So they kidnapped this 14-year-old boy and bludgeoned him to death. And they got caught---Leopold dropped his glasses at the crime scene.

They were defended by Clarence Darrow---the same defense lawyer from the Scopes monkey trial, considered by some to be the greatest defense lawyer in American history. In his famous closing address, he actually made an argument appealing to the determinism of the universe. "Who are we to say what could have influenced these boys to do this? What kind of genetic or environmental influences could've caused them to commit the crime?" (Maybe Darrow thought he had nothing to lose.) Anyway, they got life in prison instead of the death penalty, but apparently it was because of their age, and not because of the determinism of the laws of physics.

Alright, what's the problem with using the non-existence of free will as a legal defense?

A: The judge and the jury don't have free will either.
Scott: Thank you! I'm glad someone got this immediately, because I've read whole essays about this, and the obvious point never gets brought up.

The judge can just respond, "The laws of physics might have predetermined your crime, but they also predetermined my sentence: DEATH!" (In the US, anyway. In Canada, maybe 30 days...)

Alright, that was the misconception of the free will camp. Now on to the misconception of the anti-free-will camp. I've often heard the argument which says that not only is there no free will, but the very concept of free will is incoherent. Why? Because either our actions are determined by something, or else they're not determined by anything, in which case they're random. In neither case can we ascribe them to "free will."

For me, the glaring fallacy in the argument lies in the implication Not Determined ⇒ Random. If that were correct, then we couldn't have complexity classes like NP---we could only have BPP. The word "random" means something specific: it means you have a probability distribution over the possible choices. In computer science, we're able to talk perfectly coherently about things that are non-deterministic, but not random.
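
To make the NP-versus-BPP contrast concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions: the function names and the toy subset-sum instance are hypothetical and do not come from the lecture. The same set of branches (one per guess string) is judged by two different acceptance rules: an NP-style rule that asks only whether some branch accepts, and a BPP-style rule that puts a uniform distribution on the branches and asks for a 2/3 majority.

```python
from itertools import product

def branches(numbers, target):
    """Yield True/False for every guess string: does the chosen subset hit target?"""
    for guess in product([0, 1], repeat=len(numbers)):
        chosen = sum(x for x, bit in zip(numbers, guess) if bit)
        yield chosen == target

def accepts_existential(numbers, target):
    # NP-style rule: accept if at least one branch accepts.
    # No probability distribution over the guesses is needed.
    return any(branches(numbers, target))

def accepts_bounded_error(numbers, target):
    # BPP-style rule: treat the guesses as uniformly random coin flips
    # and accept only if at least 2/3 of them accept.
    results = list(branches(numbers, target))
    return 3 * sum(results) >= 2 * len(results)

# Hypothetical toy instance: is there a subset of these numbers summing to 16?
numbers, target = [3, 5, 8, 13], 16
print(accepts_existential(numbers, target))    # some subset works
print(accepts_bounded_error(numbers, target))  # but most random guesses do not
```

The two rules disagree on this instance, which is the point: "several possibilities, one of which may be taken" is a well-defined notion even when nobody has said how likely each possibility is.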

Look, in computer science we have many different sources of non-determinism. Arguably the most basic source is that we have some algorithm, and we don't know in advance what input it's going to get. If it were always determined in advance what input it was going to get, then we'd just hardwire the answer. Even talking about algorithms in the first place, we've sort of inherently assumed the idea that there's some agent that can freely choose what input to give the algorithm.

Q: Not necessarily. You can look at an algorithm as just a big compression scheme. Maybe we do know all the inputs we'll ever need, but we just can't write them in a big enough table, so we write them down in this compressed form.
Scott: OK, but then you're asking a technically different question. Maybe there's no efficient algorithm for some problem such that there is an efficient compression scheme. All I'm saying is that the way we use language---at least in talking about computation---it's very natural to say there's some transition where we have this set of possible things that could happen, but we don't know which is going to happen or even have a probability distribution over the possibilities. We would like to be able to account for all of them, or maybe at least one of them, or the majority of them, or whatever other quantifier we like. To say that something is either determined or random is leaving out whole swaths of the Complexity Zoo. We have lots of ways of producing a single answer from a set of possibilities, so I don't think it's logically incoherent to say that there could exist transitions in the universe with several allowed possibilities over which there isn't even a probability distribution.
Q: Then they're determined.
Scott: What?
Q: According to classical physics, everything is determined. Then, there's quantum mechanics, which is random. You can always build a probability distribution over the measurement outcomes. I don't think you can get away from the fact that those are the only two kinds of things you can have. You can't say that there's some particle which can go to one of three states, but that you can't build a probability distribution over them. Unless you want to be a frequentist about it, that's something that just can't happen.
Scott: I disagree with you. I think it does make sense. As one example, we talked about hidden-variable theories. In that case, you don't even have a probability distribution over the future until you specify which hidden-variable theory you're talking about. If we're just talking about measurement outcomes, then yes, if you know the state that you're measuring and you know what measurement you're applying, quantum mechanics gives you a probability distribution over the outcomes. But if you don't know the state or the measurement, then you don't even get a distribution.
Q: I know that there are things out there that aren't random, but I don't concede this argument.
Scott: Good! I'm glad someone doesn't agree with me.
Q: I disagree with your argument, but not your result that you believe in free will.
Scott: My "result"?
Q: Can we even define free will?
Scott: Yeah, that's an excellent question. It's very hard to separate the question of whether free will exists from the question of what the definition of it is. What I was trying to do is, by saying what I think free will is not, give some idea of what the concept seems to refer to. It seems to me to refer to some transition in the state of the universe where there are several possible outcomes and we can't even talk coherently about a probability distribution over them.
Q: Given the history?
Scott: Given the history.
Q: Not to beat this to death, but couldn't you at least infer a probability distribution by running your simulation many times and seeing what your free will entity chooses each time?
Scott: I guess where it becomes interesting is, what if (as in real life) we don't have the luxury of repeated trials?

Newcomb's Paradox

So let's put a little meat on this philosophical bone with a famous thought experiment. Suppose that a super-intelligent Predictor shows you two boxes: the first box has $1,000, while the second box has either $1,000,000 or nothing. You don't know which is the case, but the Predictor has already made the choice and either put the money in or left the second box empty. You, the Chooser, have two choices: you can either take the second box only, or both boxes. Your goal, of course, is money and not understanding the universe.

Here's the thing: the Predictor made a prediction about your choice before the game started. If the Predictor predicted you'll take only the second box, then he put $1,000,000 in it. If he predicted you'll take both boxes, then he left the second box empty. The Predictor has played this game thousands of times before, with thousands of people, and has never once been wrong. Every single time someone took only the second box, they found a million dollars in it. Every single time someone took both boxes, they found that the second box was empty.

First question: Why is it obvious that you should take both boxes? Right: because whatever's in the second box, you'll get $1,000 more by taking both boxes. The decision of what to put in the second box has already been made; your taking both boxes can't possibly affect it.

Second question: Why is it obvious that you should take only the second box? Right: because the Predictor's never been wrong! Again and again you've seen one-boxers walk away with $1,000,000, and two-boxers walk away with only $1,000. Why should this time be any different?

Q: How good is the Predictor's computer?
Scott: Well, clearly it's pretty good, given that he's never been wrong. We're going to get to that later.

This paradox was popularized by a philosopher named Robert Nozick in 1969. There's a famous line from his paper about it: "To almost everyone, it is perfectly clear and obvious what should be done. The difficulty is that these people seem to divide almost evenly on the problem, with large numbers thinking that the opposing half is just being silly."

There's actually a third position---a boring "Wittgenstein" position---which says that the problem is simply incoherent, like asking about the unstoppable force that hits the immovable object. If the Predictor actually existed, then you wouldn't have the freedom to make a choice in the first place; in other words, the very fact that you're debating which choice to make implies that the Predictor can't exist.

Q: Why can't you get out of the paradox by flipping a coin?
Scott: That's an excellent question. Why can't we evade the paradox using probabilities? Suppose the Predictor predicts you'll take only the second box with probability p. Then he'll put $1,000,000 in that box with the same probability p. So your expected payoff is:
1,000,000 p² + 1,001,000 p(1 − p) + 1,000 (1 − p)² = 1,000,000 p + 1,000 (1 − p)
leading to exactly the same paradox as before, since your earnings will be maximized by setting p = 1. So my view is that randomness really doesn't change the fundamental nature of the paradox at all.
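
The expected-payoff formula can be checked numerically. The following Python sketch is an illustration with a hypothetical function name; it assumes, as in the answer above, that you take only the second box with probability p and that the Predictor independently fills it with the same probability p. It spells out all four cases, including the zero-payoff case (one box taken, box empty) that the formula leaves implicit.

```python
def expected_payoff(p):
    """Expected winnings when you one-box with probability p and the
    Predictor independently fills the second box with probability p."""
    one_box_full = p * p                 # take box 2 only, it holds $1,000,000
    one_box_empty = p * (1 - p)          # take box 2 only, it is empty: $0
    two_box_full = (1 - p) * p           # take both, box 2 holds $1,000,000
    two_box_empty = (1 - p) * (1 - p)    # take both, box 2 is empty: $1,000
    return (1_000_000 * one_box_full + 0 * one_box_empty
            + 1_001_000 * two_box_full + 1_000 * two_box_empty)

# Compare the case-by-case sum with the simplified right-hand side.
for p in (0.0, 0.25, 0.5, 0.75, 1.0):
    closed_form = 1_000_000 * p + 1_000 * (1 - p)
    print(p, expected_payoff(p), closed_form)
```

Both columns agree, and since the simplified form is linear in p with positive slope, the payoff grows all the way up to p = 1.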

To review, there are three options: are you a one-boxer, a two-boxer or a Wittgenstein? Let's take a vote.

Results of Voting
Take both boxes: 1
Take only one box: 9
The question is meaningless: 9
Note: there were many double votes.

Well, it looks like the consensus coincides with my own point of view: (1) the question is meaningless, and (2) you should take only one box!

Q: Is it really meaningless if you replace the question "what do you choose to do" with "how many boxes will you take?" It's not so much that you're choosing; you're reflecting on what you would in fact do, whether or not there's choice involved.
Scott: That is, you're just predicting your own future behavior? That's an interesting distinction.
Q: How good of a job does the Predictor have to do?
Scott: Maybe it doesn't have to be a perfect job. Even if he only gets it right 90% of the time, there's still a paradox here.
Q: So by the hypothesis of the problem, there's no free will and you have to take the Wittgenstein option.
Scott: Like with any good thought experiment, it's never any fun just to reject the premises. We should try to be good sports.
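
The remark above that a Predictor who is right only 90% of the time still produces the paradox can be made concrete with a short calculation. This Python sketch rests on an assumption that the discussion does not spell out: the Predictor is correct with the same probability q whichever way you choose, and you compare the two choices by their conditional expected winnings. The function names are hypothetical.

```python
def one_box_value(q):
    # Box 2 is full exactly when the Predictor correctly foresaw one-boxing.
    return q * 1_000_000

def two_box_value(q):
    # You always get the $1,000; box 2 is full only if the Predictor was wrong.
    return 1_000 + (1 - q) * 1_000_000

q = 0.9  # accuracy mentioned in the discussion
print(one_box_value(q), two_box_value(q))

# Accuracy at which the two expected values coincide under this model:
# q * 1,000,000 = 1,000 + (1 - q) * 1,000,000  gives  q = 0.5005
break_even = 1_001_000 / 2_000_000
print(break_even)
```

Under this model, one-boxing has the higher expected value for any accuracy noticeably better than a coin flip, while the dominance argument for two boxes is untouched, so the tension survives an imperfect Predictor.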

I can give you my own attempt at a resolution, which has helped me to be an intellectually-fulfilled one-boxer. First of all, we should ask what we really mean by the word "you." I'm going to define "you" to be anything that suffices to predict your future behavior. There's an obvious circularity to that definition, but what it means is that whatever "you" are, it ought to be closed with respect to predictability. That is, "you" ought to coincide with the set of things that can perfectly predict your future behavior.

Now let's get back to the earlier question of how powerful a computer the Predictor has. Here's you, and here's the Predictor's computer. Now, you could base your decision to pick one or two boxes on anything you want. You could just dredge up some childhood memory and count the letters in the name of your first-grade teacher or something and based on that, choose whether to take one or two boxes. In order to make its prediction, therefore, the Predictor has to know absolutely everything about you. It's not possible to state a priori what aspects of you are going to be relevant in making the decision. To me, that seems to indicate that the Predictor has to solve what one might call a "you-complete" problem. In other words, it seems the Predictor needs to run a simulation of you that's so accurate it would essentially bring into existence another copy of you.

Let's play with that assumption. Suppose that's the case, and that now you're pondering whether to take one box or two boxes. You say, "all right, two boxes sounds really good to me because that's another $1,000." But here's the problem: when you're pondering this, you have no way of knowing whether you're the "real" you, or just a simulation running in the Predictor's computer. If you're the simulation, and you choose both boxes, then that actually is going to affect the box contents: it will cause the Predictor not to put the million dollars in the box. And that's why you should take just the one box.

Q: I think you could predict very well most of the time with just a limited dataset.
Scott: Yeah, that's probably true. In a class I taught at Berkeley, I did an experiment where I wrote a simple little program that would let people type either "f" or "d" and would predict which key they were going to push next. It's actually very easy to write a program that will make the right prediction about 70% of the time. Most people don't really know how to type randomly. They'll have too many alternations and so on. There will be all sorts of patterns, so you just have to build some sort of probabilistic model. Even a very crude one will do well. I couldn't even beat my own program, knowing exactly how it worked. I challenged people to try this and the program was getting between 70% and 80% prediction rates. Then, we found one student that the program predicted exactly 50% of the time. We asked him what his secret was and he responded that he "just used his free will."
Q: It seems like a possible problem with "you-completeness" is that, at an intuitive level, you is not equal to me. But then, anything that can simulate me can also presumably simulate you, and so that means that the simulator is both you and me.
Scott: Let me put it this way: the simulation has to bring into being a copy of you. I'm not saying that the simulation is identical to you. The simulation could bring into being many other things as well, so that the problem it's solving is "you-hard" rather than "you-complete."
Q: What happens if you have a "you-oracle" and then decide to do whatever the simulation doesn't do?
Scott: Right. What can we conclude from that? If you had a copy of the Predictor's computer, then the Predictor is screwed, right? But you don't have a copy of the Predictor's computer.
Q: So this is a theory of metaphysics which includes a monopoly on prediction?
Scott: Well, it includes a Predictor, which is a strange sort of being, but what do you want from me? That's what the problem stipulates.
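
The "f"/"d" key-press predictor mentioned earlier in this discussion is easy to sketch. The following Python code is not the program used in the Berkeley class; it is a minimal guess at "a very crude probabilistic model", with a hypothetical class name and an assumed context length of 3 (the `order` parameter, which should be at least 1). It counts which key followed each recent context and predicts the more frequent one.

```python
from collections import defaultdict

class KeyPredictor:
    """Predicts the next 'f' or 'd' from counts of what followed each recent context."""

    def __init__(self, order=3):
        self.order = order  # assumed context length, must be >= 1
        self.counts = defaultdict(lambda: {"f": 0, "d": 0})
        self.history = ""

    def predict(self):
        seen = self.counts[self.history[-self.order:]]
        # Guess the key that followed this context more often; ties go to 'f'.
        return "f" if seen["f"] >= seen["d"] else "d"

    def update(self, key):
        # Record which key actually followed the current context.
        self.counts[self.history[-self.order:]][key] += 1
        self.history += key

def hit_rate(keys, order=3):
    """Fraction of keys guessed correctly, predicting before seeing each key."""
    predictor, hits = KeyPredictor(order), 0
    for key in keys:
        hits += predictor.predict() == key
        predictor.update(key)
    return hits / len(keys)

# A strictly alternating typist is almost fully predictable once the
# pattern has been seen a couple of times.
print(hit_rate("fd" * 50))
```

Feeding it a string typed by a person who is trying to be random is the interesting experiment; the 70% to 80% figures quoted above are for the original program, not for this sketch.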

One thing that I like about my solution is that it completely sidesteps the mystery of whether there's free will or not, in much the same way that an NP-completeness proof sidesteps the mystery of P versus NP. What I mean is that, while it is mysterious how your free will could influence the output of the Predictor's simulation, it doesn't seem more mysterious than how your free will could influence the output of your own brain! It's six of one, half a dozen of the other.

One reason I like Newcomb's Paradox is that it gets at a connection between "free will" and the inability to predict future behavior. Inability to predict the future behavior of an entity doesn't seem sufficient for free will, but it does seem somehow necessary. If we had some box, and if without looking inside this box, we could predict what the box was going to output, then we would probably agree among ourselves that the box doesn't have free will. Incidentally, what would it take to convince me that I don't have free will? If after I made a choice, you showed me a card that predicted what choice I was going to make, well, that's the sort of evidence that seems both necessary and sufficient. Modern neuroscience does get close to this for certain kinds of decisions. There were some famous experiments in the 1980s, where someone would attach electrodes to someone's brain and would tell them that they could either press button 1 or 2. Something like 200ms before the person was conscious of making the decision of which button to press, certainly before they physically moved their finger, you could see the neurons spiking for that particular finger. So you can actually predict which button the person is going to press a fraction of a second before they're aware of having made a choice. This is the kind of thing that forces us to admit that some of our choices are less free than they feel to us---or at least, that whatever is determining these choices acts earlier in time than it seems to subjective awareness.

If free will depends on an inability to predict future behavior, then it would follow that free will somehow depends on our being unique: on it being impossible to copy us. This brings up another of my favorite thought experiments: the teleportation machine.

Suppose that in the far future, there's a very simple way of getting to Mars---the Mars Express---in only 10 minutes. It encodes the positions of all the atoms in your body as information, then transmits it to Mars as radio waves, reconstitutes you on Mars, and (naturally) destroys the original. Who wants to be the first to sign up and buy tickets? You can assume that destroying the original is painless. If you believe that your mind consists solely of information, then you should be lining up to get a ticket, right?

Q: I think there's a big difference between the case where you take someone apart then put them together on the other end, and the case where you look inside someone to figure out how to build a copy, build a copy at the end and then kill the original. There's a big difference between moving and copying. I'd love to get moved, but I wouldn't go for the copying.
Scott: The way moving works in most operating systems and programming languages is that you just make a copy then delete the original. In a computer, moving means copy-and-deleting. So, say you have a string of bits x1, ..., xn and you want to move it from one location to another. Are you saying it matters whether we first copy all of the bits then delete the first string, or copy-and-delete just the first bit, then copy-and-delete the second bit and so on? Are you saying that makes a difference?
Q: It does if it's me.
Q: I think I'd just want to be copied, then based on my experiences decide whether the original should be destroyed or not, and if not, just accept that there's another version of me out there.
Scott: OK. So which of the two yous is going to make the decision? You'll make it together? I guess you could vote, but you might need a third you to break the tie.
Q: Are you a quantum state or a classical state?
Scott: You're ahead of me, which always makes me happy. One thing that's always really interested me about the famous quantum teleportation protocol (which lets you "dematerialize" a quantum state and "rematerialize" it at another location) is that in order for it to work, you need to measure---and hence destroy---the original state. But going back to the classical scenario, it seems even more problematic if you don't destroy the original than if you do. Then you have the problem of which one is "really" you.
Q: This reminds me of the many-worlds interpretation.
Scott: At least there, two branches of a wave function are never going to interact with each other. At most, they might interfere and cancel each other out, but here the two copies could actually have a conversation with each other! That adds a whole new layer of difficulties.
Q: So if you replaced your classical computer with a quantum computer, you couldn't just copy-and-delete to move something...
Scott: Right! This seems to me like an important observation. We know that if you have an unknown quantum state, you can't just copy it, but you can move it. So then the following question arises: is the information in the human brain encoded in some orthonormal basis? Is it copyable information or non-copyable information? The answer does not seem obvious a priori. Notice that we aren't asking if the brain is a quantum computer (let alone a quantum gravity computer a la Penrose), or whether it can factor 300-digit integers. Maybe Gauss could, but it's pretty clear that the rest of us can't. But even if it's only doing classical computation, the brain could still be doing it in a way that involves single qubits in various bases, in such a way that it would be physically impossible to copy important parts of the brain's state. There wouldn't even have to be much entanglement for that to be the case. We know that there are all kinds of tiny effects that can play a role in determining whether a given neuron will fire or not. So, how much information do you need from a brain to predict a person's future behavior (at least probabilistically)? Is all the information that you need stored in "macroscopic" variables like synaptic strengths, which are presumably copyable in principle? Or is some of the information stored microscopically, and possibly not in a fixed orthonormal basis? These are not metaphysical questions. They are, in principle, empirically answerable ones.

Now that we've got quantum in the picture, let's stir the pot a little bit more and bring in relativity. There's this argument (again, you can read whole Ph.D. theses about all these things) called the block-universe argument. The idea is that somehow special relativity precludes the existence of free will. Here you are, and you're trying to decide whether to order pizza or Chinese take-out. Here's your friend, who's going to come over later and wants to know what you're going to order. As it happens, your friend is traveling close to the speed of light in your rest frame. Even though you perceive yourself agonizing over the decision, from her perspective, your decision has already been made.

Q: You and your friend are spacelike-separated, so what does that even mean?
Scott: Exactly. I don't really think, personally, that this argument says anything about the existence or non-existence of free will. The problem is that it only works with spacelike-separated observers. Your friend can say, in principle, that in what she perceives to be her spacelike hypersurface, you've already made your decision---but she still doesn't know what you actually ordered! The only way for the information to propagate to your friend is from the point where you actually made the decision. To me, this just says that we don't have a total time-ordering on the set of events---we just have a partial ordering. But I've never understood why that should preclude free will.
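
The point about partial ordering can be illustrated with the Lorentz transformation. The Python sketch below is a worked example under stated assumptions: one space dimension, units with c = 1, and hypothetical event coordinates measured relative to a reference event at the origin. It shows that the time order of two spacelike-separated events depends on the observer's velocity, while the order of timelike-separated events does not.

```python
import math

def boosted_time(t, x, v):
    """Time coordinate of event (t, x) in a frame moving at velocity v (c = 1)."""
    gamma = 1.0 / math.sqrt(1.0 - v * v)
    return gamma * (t - v * x)

def is_spacelike(t, x):
    # Separation from the origin event (0, 0) is spacelike when |x| > |t|.
    return abs(x) > abs(t)

# Hypothetical "decision" event, spacelike-separated from the origin event.
decision = (1.0, 5.0)   # t = 1, x = 5
print(is_spacelike(*decision))
for v in (0.0, 0.5):
    # Positive: the decision comes after the origin event in this frame.
    # Negative: it comes before.
    print(v, boosted_time(*decision, v))

# A timelike-separated event keeps its time order in every frame.
later = (5.0, 1.0)      # t = 5, x = 1
print(all(boosted_time(*later, v) > 0 for v in (-0.9, -0.5, 0.0, 0.5, 0.9)))
```

The sign of the boosted time flips for the spacelike pair and never flips for the timelike pair, which is what "a partial ordering rather than a total ordering" amounts to here. No signal can connect the spacelike pair, so neither ordering lets the friend learn the decision early.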

I have to rattle you up somehow, so let's throw quantum, relativity and free will all into the stew. There was a paper recently by Conway and Kochen called The Free Will Theorem, which got a fair bit of press. So what is this theorem? Basically, Bell's Theorem, or rather an interesting consequence of Bell's Theorem. It's kind of a mathematically-obvious consequence, but still very interesting. You can imagine that there's no fundamental randomness in the universe, and that all of the randomness we observe in quantum mechanics and the like was just predetermined at the beginning of time. God just fixed some big random string, and whenever people make measurements, they're just reading off this one random string. But now suppose we make the following three assumptions:

  1. We have the free will to choose in what basis to measure a quantum state. That is, at least the detector settings are not predetermined by the history of the universe.
  2. Relativity gives some way for two actors (Alice and Bob) to perform a measurement such that in one reference frame Alice measures first, and in another frame Bob measures first.
  3. The universe cannot coordinate the measurement outcomes by sending information faster than light.

Given these three assumptions, the theorem concludes that there exists an experiment---namely, the standard Bell experiment---whose outcomes are also not predetermined by the history of the universe. Why is this true? Basically, because supposing that the two outcomes were predetermined by the history of the universe, you could get a local hidden-variable model, in contradiction to Bell's Theorem. You can think of this theorem as a slight generalization of Bell's Theorem: one that rules out not only local hidden-variable theories, but also hidden-variable theories that obey the postulates of special relativity. Even if there were some non-local communication between Alice and Bob in their different galaxies, as long as there are two reference frames such that Alice measures first in one and Bob measures first in the other, you can get the same inequality. The measurement outcomes can't have been determined in advance, even probabilistically; the universe must "make them up on the fly" after seeing how Alice and Bob set their detectors. I wrote a review of Stephen Wolfram's book a while ago where I mentioned this, as a basic consequence of Bell's Theorem that ruled out the sort of deterministic model of physics that Wolfram was trying to construct. I didn't call my little result the Free Will Theorem, but now I've learned my lesson: if I want people to pay attention, I should be talking about free will! Hence this lecture.
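
The Bell-inequality step can be checked by brute force. The Python sketch below assumes the "standard Bell experiment" is cast as the CHSH game, a common formulation that this lecture does not name: Alice and Bob receive independent uniformly random settings x and y, answer with bits a and b, and win when a XOR b equals x AND y. It enumerates every way of fixing all outcomes in advance, with each answer depending only on the local setting, and compares the best win rate with the known quantum value cos²(π/8). The function name is hypothetical, and the sketch covers only the local hidden-variable case, not the relativistic generalization.

```python
from itertools import product
import math

def best_predetermined_win_rate():
    """Best CHSH win rate when every outcome is fixed in advance.

    Alice's answer may depend only on her setting x, Bob's only on his
    setting y. They win when a XOR b == x AND y.
    """
    best = 0.0
    for a0, a1, b0, b1 in product([0, 1], repeat=4):
        a, b = (a0, a1), (b0, b1)   # a[x] and b[y] are the predetermined answers
        wins = sum((a[x] ^ b[y]) == (x & y)
                   for x, y in product([0, 1], repeat=2))
        best = max(best, wins / 4)
    return best

# Win rate achievable with a shared entangled pair and suitable measurements.
quantum_win_rate = math.cos(math.pi / 8) ** 2

print(best_predetermined_win_rate(), quantum_win_rate)
```

Shared randomness does not help the predetermined strategies, since a random mixture of them cannot beat the best single one. The gap between the two printed numbers is what rules out outcomes that were fixed before the detector settings were chosen.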

Years ago, I was at one of John Preskill's group meetings at Caltech. Usually, it was about very physics-y stuff and I had trouble understanding. But once, we were talking about a quantum foundations paper by Chris Fuchs, and things got very philosophical very quickly. Finally, someone got up and wrote on the board: "Free Will or Machine?" And asked for a vote. "Machine" won, seven-to-five. But we don't have to accept their verdict! We can take our own vote.

The Results
Free Will: 6
Machines: 5

Note: The class was largely divided between those who abstained and those who voted for both.

I'll leave you with the following puzzle for next time:
Dr. Evil is on his moon base, and he has a very powerful laser pointed at the Earth. Of course, he's planning to obliterate the Earth, being evil and all. At the last minute, Austin Powers hatches a plan, and sends Dr. Evil the following message: "Back in my lab here on Earth, I've created a replica of your moon base down to the last detail. The replica even contains an exact copy of you. Everything is the same. Given that, you actually don't know if you're in your real moon base or in my copy here on Earth. So if you obliterate the Earth, there's a 50% chance you'll be killing yourself!" The puzzle is, what should Dr. Evil do? Should he fire the laser or not? (See here for the paper about this.)
