PHYS771 Lecture 19: Time Travel

Scott Aaronson

Scribe: Chris Granade


Last time we talked about free will, superintelligent predictors, and Dr. Evil planning to destroy the earth from his moon base. Today I'd like to talk about a more down-to-earth topic: time travel. The first point I have to make is one that Carl Sagan made: we're all time travelers—at the rate of one second per second! Har har! Moving on, we have to distinguish between time travel into the distant future and into the past. Those are very different.

Travel into the distant future is by far the easier of the two. There are several ways to do it; the relevant one here is to travel at close to the speed of light, so that, by relativistic time dilation, far more time passes on Earth than passes for you.

This suggests one of my favorite proposals for how to solve NP-complete problems in polynomial time: why not just start your computer working on an NP-complete problem, then board a spaceship traveling at close to the speed of light and return to Earth to pick up the solution? If this idea worked, it would let us solve much more than just NP. It would also let us solve PSPACE-complete and EXP-complete problems, maybe even all computable problems, depending on how much speedup you want to assume is possible. So what are the problems with this approach?

A: The Earth ages, too.
Scott: Yeah, so all your friends will be dead when you get back. What's a solution to that?
A: Bring the whole Earth with you, and leave your computer floating in space.
Scott: Well, at least bring all your friends with you!

Let's suppose you're willing to deal with the inconvenience of the Earth having aged exponentially many years. Are there any other problems with this proposal? The biggest problem is: how much energy does it take to accelerate to relativistic speed? Ignoring the time spent accelerating and decelerating, if you travel at a v fraction of the speed of light for proper time t, then the elapsed time in your computer's reference frame is

t′ = t / √(1 − v²)

It follows that, if you want t′ to be exponentially larger than t, then v has to be exponentially close to 1. There might already be fundamental difficulties with that, coming from quantum gravity, but let's ignore them for now. The more obvious problem is, you're going to need an exponential amount of energy to accelerate to this speed v. Think about your fuel tank, or whatever else is powering your spaceship. It's going to have to be exponentially large! Just for locality reasons, how is the fuel from the far parts of the tank going to affect you? Here, I'm using the fact that spacetime has a constant number of dimensions. (Well, and I'm also using the Schwarzschild bound, which limits the amount of energy that can be stored in a finite region of space: your fuel tank certainly can't be any denser than a black hole!)
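
To get a feel for the numbers, here's a quick sanity check in Python (my own sketch, not part of the lecture): if you want a dilation factor t′/t of 2^n, how close to the speed of light do you have to get?

```python
import math

def dilation(v):
    """Time-dilation factor t'/t at speed v (as a fraction of c)."""
    return 1.0 / math.sqrt(1.0 - v * v)

def gap_below_c(factor):
    """The gap 1 - v you can afford if you want t'/t = factor.
    Uses 1 - sqrt(1 - x) = x / (1 + sqrt(1 - x)) to avoid
    catastrophic cancellation when the gap is tiny."""
    x = 1.0 / factor ** 2
    return x / (1.0 + math.sqrt(1.0 - x))

for n in (5, 10, 20):
    gap = gap_below_c(2.0 ** n)
    print(n, gap, dilation(1.0 - gap))  # gap shrinks like 2^(-2n-1)
```

Already at n = 20 you have to match the speed of light to about twelve decimal places, and the required kinetic energy grows in proportion to the dilation factor itself.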

Let's talk about the more interesting kind of time travel: the backwards kind. Can closed timelike curves (CTCs) exist in Nature? This question has a very long history of being studied by physicists on weekends. It was discovered early on, by Gödel and others, that classical general relativity admits CTC solutions. All of the known solutions, however, have some element that can be objected to as being "unphysical." For example, some solutions involve wormholes, but that requires "exotic matter" having negative mass to keep the wormhole open. They all, so far, involve either non-standard cosmologies or else types of matter or energy that have yet to be experimentally observed. But that's just classical general relativity. Once you put quantum mechanics in the picture, it becomes an even harder question. General relativity is not just a theory of some fields in spacetime, but of spacetime itself, and so once you quantize it, you'd expect there to be fluctuations in the causal structure of spacetime. The question is, why shouldn't that produce CTCs?

Incidentally, there's an interesting metaquestion here: why have physicists found it so hard to create a quantum theory of gravity? The technical answer usually given is that, unlike (say) Maxwell's equations, general relativity is not renormalizable. But I think there's also a simpler answer, one that's much more understandable to a doofus layperson like me. The real heart of the matter is that general relativity is a theory of spacetime itself, and so a quantum theory of gravity is going to have to be talking about superpositions over spacetime and fluctuations of spacetime. One of the things you'd expect such a theory to answer is whether closed timelike curves can exist. So quantum gravity seems "CTC-hard", in the sense that it's at least as hard as determining if CTCs are possible! And even I can see that this can't possibly be a trivial question to settle. Even if CTCs are impossible, presumably they're not going to be proven impossible without some far-reaching new insight. Of course, this is just one instantiation of a general problem: that no one really has a clear idea of what it means to treat spacetime itself quantum-mechanically.

In the field I come from, it's never our place to ask whether some physical object exists or not; our job is to assume it exists and see what computations we can do with it. Thus, from now on, we'll assume CTCs exist. What would the consequences be for computational complexity? Perhaps surprisingly, I'll be able to give a clear and specific answer to that.

So how would you exploit a closed timelike curve to speed up computation? First let's consider the naïve idea: compute the answer, then send it back in time to before your computer started.

From my point of view, this "algorithm" doesn't work even considered on its own terms. (It's nice that, even with something as wacky as time travel, we can definitively rule certain ideas out!) I know of at least two reasons why it doesn't work. Anyone want to take a shot?

A: The universe can still end in the time you're computing the answer.
Scott: Yes! Even in this model where you can go back in time, it seems to me that you still have to quantify how much time you spend in the computation. The fact that you already have the answer at the beginning doesn't change the fact that you still have to do the computation! Refusing to count the complexity of that computation is like maxing out your credit card, then not worrying about the bill. You're going to have to pay up later!
A: Couldn't you just run the computation for an hour, go back in time, continue the computation for another hour, then keep repeating until you're done?
Scott: Ah! That's getting toward my second reason. You just gave a slightly less naïve idea, which also fails, but in a more interesting way.
A: The naïve idea involves iterating over the solution space, which could be uncountably large.
Scott: Yeah, but let's assume we're talking about an NP-complete problem, so that the solution space is finite. If we could merely solve NP-complete problems, we'd be pretty happy.

Let's think some more about the proposal where you compute for an hour then go back in time, compute for another hour then go back again and so on. The trouble with this proposal is that it doesn't take seriously that you're going back in time. You're treating time as a spiral, as some sort of scratchpad that you can keep erasing and writing over, but you're not going back to some other time, you're going back to the time that you started from. Once you accept that this is what we're talking about, you immediately start having to worry about the Grandfather Paradox (i.e., where you go back in time and kill your grandfather). For example, what if your computation takes as input a bit b from the future, and produces as output a bit ¬b, which then goes back in time to become the input? Now when you use ¬b as input, you compute ¬¬b = b as output, and so on. This is just the Grandfather Paradox in a computational form. We have to come up with some account of what happens in this situation. If we're talking about closed timelike curves at all, then we're talking about something where this sort of behavior can happen, and we need some theory of what results.

My own favorite theory was proposed by David Deutsch in 1991. His proposal was that if you just go to quantum mechanics, the problem is solved. Indeed, quantum mechanics is overkill: it works just as well to go to a classical probabilistic theory. In the latter case, you have some probability distribution (p1,...,pn) over the possible states of your computer. Then the computation that takes place within the closed timelike curve can be modeled as a Markov chain, which transforms this distribution to a different one. What should we impose if we want to avoid grandfather paradoxes?

A: That the output distribution should be the same as the input one?
Scott: Exactly.

We should impose the requirement that Deutsch calls causal consistency: the computation within the CTC must map the input probability distribution to itself. In deterministic physics, we know that this sort of consistency can't always be achieved—that's just another way of stating the Grandfather Paradox. But as soon as we go to probabilistic theories, well, it's a basic fact that every Markov chain has at least one stationary distribution. In the case of the Grandfather Paradox, the unique solution is that you're born with probability ½, and if you're born, you go back in time and kill your grandfather. Thus, the probability that you go back in time and kill your grandfather is ½, and hence you're born with probability ½. Everything is consistent; there's no paradox.
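
To make this concrete, here's a toy numpy sketch (mine, not Deutsch's): the Grandfather Paradox as a two-state Markov chain, whose unique stationary distribution is exactly the half-and-half mixture just described.

```python
import numpy as np

# One bit circulates in the CTC: b = 1 means "you're born". The story
# "if you're born, you go back and kill your grandfather (so you're not
# born)" is just the NOT map, written as a column-stochastic matrix.
NOT = np.array([[0.0, 1.0],
                [1.0, 0.0]])

# Causal consistency asks for a distribution p with NOT @ p = p.
eigvals, eigvecs = np.linalg.eig(NOT)
p = eigvecs[:, np.argmin(np.abs(eigvals - 1.0))].real
p = p / p.sum()
print(p)  # [0.5 0.5]: born with probability 1/2, consistently
```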

One thing that I like about Deutsch's resolution is that it immediately suggests a model of computation. First we get to choose a polynomial-size circuit C : {0,1}^n → {0,1}^n. Then Nature chooses a probability distribution D over strings of length n such that C(D)=D, and gives us a sample y drawn from D. (If there's more than one fixed-point distribution D, then to be conservative we'll suppose that Nature makes her choice adversarially.) Finally, we can perform an ordinary polynomial-time computation on the sample y. We'll call the complexity class resulting from this model PCTC.

What can we say about this class? My first claim is that NP ⊆ PCTC; that is, closed timelike curve computers can solve NP-complete problems in polynomial time. Does anyone see why? More concretely, suppose we have a Boolean formula φ in n variables and we want to know if there's a satisfying assignment. What should our circuit C do?

A: If the input is a satisfying assignment, spit it back out?
Scott: Good. And what if the input isn't a satisfying assignment?
A: Iterate to the next assignment?
Scott: Right! And go back to the beginning if you've reached the last assignment.

We'll just have this loop over all possible assignments, and we stop as soon as we get to a satisfying one. Assuming there exists a satisfying assignment, the only stationary distributions will be concentrated on satisfying assignments. So when we sample from a stationary distribution, we'll certainly see such an assignment. (If there are no satisfying assignments, then the stationary distribution is uniform.)
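
Here's a brute-force Python sketch of this circuit and its cycle structure (my own illustration; the toy formula is hypothetical, and of course enumerating cycles this way takes exponential time):

```python
def make_C(phi, n):
    """The CTC circuit: fix satisfying assignments, increment the rest.
    Assignments are encoded as integers x in [0, 2^n), bit i of x being x_i."""
    def C(x):
        bits = tuple((x >> i) & 1 for i in range(n))
        return x if phi(bits) else (x + 1) % (2 ** n)
    return C

def on_cycle(C, x, n):
    """Does x eventually return to itself under iteration of C?"""
    y = x
    for _ in range(2 ** n):
        y = C(y)
        if y == x:
            return True
    return False

# Toy formula: x_0 AND (NOT x_1), with n = 3 variables.
n = 3
phi = lambda b: b[0] == 1 and b[1] == 0
C = make_C(phi, n)

# Stationary distributions of a deterministic map live on its cycles,
# and here the only cycles are the self-loops at satisfying assignments.
print([x for x in range(2 ** n) if on_cycle(C, x, n)])  # [1, 5]
```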

Q: So we're assuming that Nature gives us this stationary distribution for free?
Scott: Yes. Once we set up the CTC, its evolution has to be causally consistent to avoid grandfather paradoxes. But that means Nature has to solve a hard computational problem to make it consistent! That's the key idea that we're exploiting.

Related to this algorithm for solving NP-complete problems is what Deutsch calls the "knowledge creation paradox." The paradox is best illustrated through the movie Star Trek IV. The Enterprise crew has gone back in time to the present (meaning to 1986) in order to find a humpback whale and transport it to the 23rd century. But to build a tank for the whale, they need a type of plexiglass that hasn't been invented yet. So in desperation, they go to the company that will invent the plexiglass, and reveal the molecular formula to that company. They then wonder: how did the company end up inventing the plexiglass? Hmmmm....

Note that the knowledge creation paradox is a time travel paradox that's fundamentally different from the grandfather paradox, because here there's no actual logical inconsistency. This paradox is purely one of computational complexity: somehow this hard computation gets performed, but where was the work put in? In the movie, somehow this plexiglass gets invented without anyone ever having taken the time to invent it!

As a side note, my biggest pet peeve about time travel movies is how they always say, "Be careful not to step on anything, or you might change the future!" "Make sure this guy goes out with that girl like he was supposed to!" Dude—you might as well step on anything you want. Just by disturbing the air molecules, you've already changed everything.

OK, so we can solve NP-complete problems efficiently using time travel. But can we do more than that? What is the actual computational power of closed timelike curves? I claim that certainly, PCTC is contained in PSPACE. Does anyone see why?

Well, we've got this exponentially large set of possible inputs x ∈ {0,1}^n to the circuit C, and our basic goal is to find an input x that eventually cycles around (that is, such that C(x)=x, or C(C(x))=x, or...). For then we'll have found a stationary distribution. But finding such an x is clearly a PSPACE computation. For example, we can iterate over all possible starting states x, and for each one apply C up to 2^n times and see if we ever get back to x. Certainly, this is in PSPACE.
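
In code, the argument looks like this (a sketch of mine; Python won't enforce the space bound, but notice the simulation only ever stores two n-bit counters and the current state):

```python
def pctc_in_pspace(C, n, postprocess):
    """Simulate the P_CTC model: find some x on a cycle of C (so that a
    stationary distribution is supported on it), then run the ordinary
    polynomial-time computation on x. Exponential time, polynomial space."""
    for x in range(2 ** n):          # one n-bit counter
        y = x
        for _ in range(2 ** n):      # a second n-bit counter
            y = C(y)
            if y == x:               # x cycles around
                return postprocess(x)
    raise AssertionError("every map on a finite set has a cycle")
```

Fed the circuit C from the satisfiability example above, pctc_in_pspace(C, 3, lambda x: x) returns the satisfying assignment 1.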

My next claim is that PCTC is equal to PSPACE. That is, CTC computers can solve not just NP-complete problems, but all problems in PSPACE. Why?

Well, let M0, M1, ... be the successive configurations of a PSPACE machine M. Also, let Macc be the "halt and accept" configuration of M, and let Mrej be the "halt and reject" configuration. Our goal is to find which of these configurations the machine goes into. Note that each of these configurations takes a polynomial number of bits to write down. Then, we can define a polynomial-size circuit C that takes as input some configuration of M plus some auxiliary bit b. The circuit will act as follows:

C(⟨Mi, b⟩) = ⟨Mi+1, b⟩
C(⟨Macc, b⟩) = ⟨M0, 1⟩
C(⟨Mrej, b⟩) = ⟨M0, 0⟩
So, for each configuration that isn't the accepting or rejecting configuration, C increments to the next configuration, leaving the auxiliary bit as it was. If it reaches an accepting configuration, then it loops back to the beginning and sets the auxiliary bit to 1. Similarly, if it reaches a rejecting configuration, then it loops back and sets the auxiliary bit to 0.

[Figure: Simulating a PSPACE computation in PCTC.]

Now if we think about what's going on, we have two parallel computations: one with the answer bit set to 0, the other with the answer bit set to 1. If the true answer is 0, then the rejecting computation will go around in a loop, while the accepting computation will lead into that loop. Likewise, if the true answer is 1, it's the accepting computation that will go around in a loop. The only stationary distribution, then, is a uniform distribution over the computation steps with b set to the correct answer. We can then read off a sample and look at b, to find out whether the PSPACE machine accepts or rejects.
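
Here's the construction as a Python sketch (mine; the five-step toy "machine" is hypothetical, standing in for an arbitrary PSPACE computation):

```python
def make_ctc_circuit(step, is_acc, is_rej, m0):
    """The CTC circuit for a PSPACE machine: the CTC state is a pair
    (config, b), where b is the answer bit that gets set on loop-back."""
    def C(state):
        config, b = state
        if is_acc(config):
            return (m0, 1)            # accept: restart with b = 1
        if is_rej(config):
            return (m0, 0)            # reject: restart with b = 0
        return (step(config), b)      # otherwise run one more step
    return C

# Toy machine: configurations 0..4, counting up, accepting at 4.
C = make_ctc_circuit(step=lambda i: i + 1,
                     is_acc=lambda i: i == 4,
                     is_rej=lambda i: False,
                     m0=0)

state = (0, 0)          # even if we start on the "wrong answer" branch...
for _ in range(20):
    state = C(state)
print(state)            # (0, 1): we've merged into the loop with b = 1
```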

Thus we can tightly characterize PCTC as equal to PSPACE. One way to think about it is that having a closed timelike curve makes time and space equivalent as computational resources. In retrospect, maybe we should have expected that all along, but we still have to show it!

Now, there's an obvious question that we have to ask: what if we have a quantum computer acting inside the closed timelike curve? Obviously, we need to know the answer. How does this work? Now we have a polynomial-sized quantum circuit instead of a classical circuit, and we say that we have two sets of qubits: "closed timelike curve qubits" and "chronology-respecting qubits." We can do some quantum computation on both of them, but we're only really going to care about the CTC qubits. There will be some induced superoperator S that acts on the CTC qubits. (Recall that a superoperator is just a general quantum operation, not necessarily unitary.) Then, Nature will adversarially find a mixed state ρ that is a fixed point of S: i.e., such that S(ρ) = ρ. It's not always possible to find a pure state ρ=|ψ⟩⟨ψ| with that property, but by basic linear algebra (Deutsch worked out the details, and I won't drag you through them) there is always such a mixed state.

Q: So ρ is a state just over the CTC qubits?
Scott: Yes. The only real reason for the other qubits is that without them, the superoperator would always be unitary, in which case the maximally mixed state I/2^n would always be a fixed point. And that would trivialize the model.
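
To see the quantum condition concretely, here's a small numpy sketch (my own): represent the superoperator S by Kraus operators, turn it into an ordinary matrix acting on vec(ρ), and read off a fixed point. (The same "vectorization" trick reappears in the PSPACE upper bound below.)

```python
import numpy as np

def transfer_matrix(kraus):
    """Matrix M with M @ vec(rho) = vec(S(rho)), where S(rho) is
    sum_i K_i rho K_i^dagger and vec stacks rows (numpy reshape order):
    M = sum_i K_i (tensor) conj(K_i)."""
    return sum(np.kron(K, K.conj()) for K in kraus)

def fixed_point(kraus, dim):
    """An eigenvalue-1 eigenvector of M, reshaped into a density matrix.
    (If the fixed point isn't unique, this just returns one of them.)"""
    M = transfer_matrix(kraus)
    vals, vecs = np.linalg.eig(M)
    rho = vecs[:, np.argmin(np.abs(vals - 1.0))].reshape(dim, dim)
    rho = (rho + rho.conj().T) / 2    # clean up numerical noise
    return rho / np.trace(rho)

# Example: a qubit channel that dephases in the computational basis.
K0 = np.array([[1, 0], [0, 0]], dtype=complex)
K1 = np.array([[0, 0], [0, 1]], dtype=complex)
print(np.round(fixed_point([K0, K1], 2), 6))  # a diagonal (classical) state
```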

As a general principle, quantum computers can simulate classical ones, and (as is easily shown) it's no different when we throw in closed timelike curves. So we can certainly say that BQPCTC contains PSPACE. But what's an upper bound on BQPCTC?

A: EXPSPACE?
Scott: EXPSPACE would certainly work, yes. Can we give a better upper bound?

So we're given an n-qubit superoperator (specified implicitly by a circuit) and we want to find a fixed point of it. This is basically a linear algebra problem. We know that you can do linear algebra in time polynomial in the dimension of the Hilbert space, which in this case is 2^n. This implies that we can simulate BQPCTC in EXP. So we now have that BQPCTC is somewhere between PSPACE and EXP. In my survey paper on NP-complete Problems and Physical Reality a few years ago, pinning this down further was the main technical open problem!

Recently John Watrous and I were able to solve the problem. Our result is that BQPCTC = PCTC = PSPACE. In other words, if closed timelike curves existed, then quantum computers would be no more powerful than classical ones.

Q: Do we know anything about other classes with closed timelike curves? Like PSPACECTC?
Scott: That one is going to be PSPACE again. On the other hand, you can't just take any complexity class and append a CTC to it. You have to say what that means, and for some classes (like NP) it won't even make any sense.

In the last part of the lecture I can give you a little hint of why BQPCTC ⊆ PSPACE. Given a superoperator S that's described by a polynomial-size quantum circuit, which maps n qubits to n qubits, our goal is to compute a mixed state ρ such that S(ρ) = ρ. We won't be able to write down ρ explicitly (it would be far too large to fit in a PSPACE machine's memory), but all we're really aiming to do is to simulate the result of some polynomial-time computation that could have been performed on ρ.

Let vec(ρ) be the "vectorization" of ρ: a vector with 2^2n components, one for each matrix entry of ρ. Then there exists a 2^2n × 2^2n matrix M such that M vec(ρ) = vec(S(ρ)) for all ρ. In particular, S(ρ) = ρ if and only if M vec(ρ) = vec(ρ). In other words, we can just expand everything out from matrices to vectors, and then our goal is to find a +1 eigenvector of M.

Define P := lim_{z→1} (1 − z)(I − zM)⁻¹. Then by Taylor expansion,

MP = M lim_{z→1} (1 − z)(I + zM + z²M² + ···)
   = lim_{z→1} (1 − z)(M + zM² + z²M³ + ···)
   = lim_{z→1} [(1 − z)/z](zM + z²M² + z³M³ + ···)
   = lim_{z→1} [(1 − z)/z][(I − zM)⁻¹ − I]
   = lim_{z→1} [(1 − z)/z](I − zM)⁻¹    (the −I term vanishes in the limit)
   = lim_{z→1} (1 − z)(I − zM)⁻¹
   = P.

In other words, P projects onto the fixed points of M: for all v, we have M(Pv) = Pv.
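
You can watch this limit do its job numerically (a quick check of mine, using M = diag(1, 0, 0, 1), the transfer matrix of the dephasing channel from the earlier sketch):

```python
import numpy as np

M = np.diag([1.0, 0.0, 0.0, 1.0])   # eigenvalue-1 eigenspace: span{e0, e3}
I = np.eye(4)

z = 1 - 1e-9                         # approach the limit z -> 1
P = (1 - z) * np.linalg.inv(I - z * M)
print(np.round(P, 6))                # the projector diag(1, 0, 0, 1)

v = np.array([1.0, 2.0, 3.0, 4.0])
print(np.allclose(M @ (P @ v), P @ v))   # True: Pv is a fixed point of M
```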

So now all we need to do is start with some arbitrary vector v (say v = vec(I/2^n), the vectorization of the maximally mixed state), and then compute

Pv = lim_{z→1} (1 − z)(I − zM)⁻¹ v,

which is the vectorization of a fixed point of S.

But how do we apply this matrix P in PSPACE? Well, we can apply M in PSPACE, since that's just a polynomial-time quantum computation. But what about taking a matrix inverse? Here, we borrow something from computational linear algebra. Csanky's algorithm, proposed in the 1970s, lets us compute the inverse of an n×n matrix not merely in polynomial time, but by a circuit of depth O(log² n). Similar algorithms are actually used in practice today, for example when doing scientific computing with lots of parallel processors. Now, "shifting everything up" by an exponential, we find that it's possible to invert a 2^2n × 2^2n matrix using a circuit of size 2^O(n) and depth O(n²). But computing the output of an exponential-size, polynomial-depth circuit (which is described to us implicitly) is a PSPACE computation; in fact, it's PSPACE-complete. As a final step, one can take the limit as z → 1 using algebraic rules, and some further tricks due to Beame, Cook, and Hoover.

Obviously I'm skipping a lot of details.

Q: So does this P always project onto the vectorization of a density matrix?
Scott: OK, that's an additional point that needs to be argued. If you look at the power series above, each individual term maps a vectorization of a density matrix onto another such vectorization, so the sum has to project onto vectorizations of density matrices as well. (Well, you might worry about the normalization, but that works out also.)

As usual, I'll end with a puzzle for next lecture. Suppose you can only fit a single bit at a time through a CTC. You can make as many CTCs as you like, but you can only send one bit through each, not a polynomial number of bits. (After all, we don't want to be extravagant!) In this alternate model, can you solve NP-complete problems in polynomial time?

