NSA in P/poly: The Power of Precomputation

Even after the Snowden revelations, there remained at least one big mystery about what the NSA was doing and how.  The NSA’s classified 2013 budget request mentioned, as a priority item, “groundbreaking cryptanalytic capabilities to defeat adversarial cryptography and exploit internet traffic.”  There was a requested increase, of several hundred million dollars, for “cryptanalytic IT services” and “cryptanalysis and exploitation services program C” (whatever that was).  And a classified presentation slide showed encrypted data being passed to a high-performance computing system called “TURMOIL,” and decrypts coming out.  But whatever was going on inside TURMOIL seemed to be secret even within NSA; someone at Snowden’s level wouldn’t have had access to the details.

So, what was (or is) inside the NSA’s cryptanalytic black box?  A quantum computer?  Maybe even one that they bought from D-Wave?  (Rimshot.)  A fast classical factoring algorithm?  A proof of P=NP?  Commentators on the Internet rushed to suggest each of these far-reaching possibilities.  Some of us tried to pour cold water on these speculations—pointing out that one could envision many scenarios that were a little more prosaic, a little more tied to the details of how public-key crypto is actually used in the real world.  Were we just naïve?

This week, a new bombshell 14-author paper (see also the website) advances an exceedingly plausible hypothesis about what may have been the NSA’s greatest cryptanalytic secret of recent years.  One of the authors is J. Alex Halderman of the University of Michigan, my best friend since junior high school, who I’ve blogged about before.  Because of that, I had some advance knowledge of this scoop, and found myself having to do what regular Shtetl-Optimized readers will know is the single hardest thing in the world for me: bite my tongue and not say anything.  Until now, that is.

Besides Alex, the other authors are David Adrian, Karthikeyan Bhargavan, Zakir Durumeric, Pierrick Gaudry, Matthew Green, Nadia Heninger, Drew Springall, Emmanuel Thomé, Luke Valenta, Benjamin VanderSloot, Eric Wustrow, Santiago Zanella-Béguelin, and Paul Zimmermann (two of these, Green and Heninger, have previously turned up on Shtetl-Optimized).

These authors study vulnerabilities in Diffie-Hellman key exchange, the “original” (but still widely-used) public-key cryptosystem, the one that predates even RSA.  Diffie-Hellman is the thing where Alice and Bob first agree on a huge prime number p and a number g, then Alice picks a secret a and sends Bob g^a (mod p), and Bob picks a secret b and sends Alice g^b (mod p), and then Alice and Bob can both compute (g^a)^b = (g^b)^a = g^(ab) (mod p), but an eavesdropper who’s listening in only knows p, g, g^a (mod p), and g^b (mod p), and one can plausibly conjecture that it’s hard from those things alone to get g^(ab) (mod p).  So then Alice and Bob share a secret unknown to the eavesdropper, which they didn’t before, and they can use that secret to start doing cryptography.
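
(For readers who like seeing protocols in code: here is a minimal Python sketch of the exchange, with a deliberately toy prime.  Any real deployment would use a much larger, carefully chosen prime and would derive a symmetric key from the shared value rather than using it directly.)

    import secrets

    # Toy sketch of Diffie-Hellman key exchange -- illustrative only, not secure.
    # Real deployments use a ~2048-bit, carefully chosen prime (e.g. the RFC 3526 groups).
    p = 2**127 - 1   # a Mersenne prime: far too small and too structured for real use
    g = 3

    a = secrets.randbelow(p - 2) + 1   # Alice's secret exponent
    b = secrets.randbelow(p - 2) + 1   # Bob's secret exponent

    A = pow(g, a, p)   # Alice publishes g^a (mod p)
    B = pow(g, b, p)   # Bob publishes g^b (mod p)

    # Each party combines its own secret with the other's public value:
    assert pow(B, a, p) == pow(A, b, p)   # both equal g^(ab) (mod p)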

As far as anyone knows today, the best way to break Diffie-Hellman is simply by calculating discrete logarithms: that is, solving the problem of recovering a given only g and h = g^a (mod p).  At least on a classical computer, the fastest known algorithm for discrete logarithms (over fields of prime order) is the number field sieve (NFS).  Under plausible conjectures about the distribution of “smooth” numbers, NFS uses time that grows like exp((1.923+o(1))(log p)^(1/3)(log log p)^(2/3)), where the exp and logs are base e (and yes, even the lower-order stuff like (log log p)^(2/3) makes a big difference in practice).  Of course, once you know the running time of the best-known algorithm, you can then try to choose a key size (that is, a value of log(p)) that’s out of reach for that algorithm on the computing hardware of today.

(Note that the recent breakthrough of Antoine Joux, solving discrete log in quasipolynomial time in fields of small characteristic, also relied heavily on sieving ideas.  But there are no improvements from this yet for the “original” discrete log problem, over prime fields.)

But there’s one crucial further fact, which has been understood for at least a decade by theoretical cryptographers, but somehow was slow to filter out to the people who deploy practical cryptosystems.  The further fact is that in NFS, you can arrange things so that almost all the discrete-logging effort depends only on the prime number p, and not at all on the specific numbers g and h for which you’re trying to take the discrete log.  After this initial “precomputation” step, you then have a massive database that you can use to speed up the “descent” step: the step of solving g^a = h (mod p), for any (g,h) pair that you want.

It’s a little like the complexity class P/poly, where a single, hard-to-compute “advice string” unlocks exponentially many inputs once you have it.  (Or a bit more precisely, one could say that NFS reveals that exponentiation modulo a prime number is sort of a trapdoor one-way function, except that the trapdoor information is subexponential-size, and given the trapdoor, inverting the function is still subexponential-time, but a milder subexponential than before.)

The kicker is that, in practice, a large percentage of all clients and servers that use Diffie-Hellman key exchange use the same few prime numbers p.  This means that, if you wanted to decrypt a large fraction of all the traffic encrypted with Diffie-Hellman, you wouldn’t need to do NFS over and over: you could just do it for a few p‘s and cache the results.  This fact can singlehandedly change the outlook for breaking Diffie-Hellman.

The story is different depending on the key size, log(p).  In the 1990s, the US government insisted on “export-grade” cryptography for products sold overseas (what a quaint concept!), which meant that the key size was restricted to 512 bits.  For 512-bit keys, Adrian et al. were able to implement NFS and use it to do the precomputation step in about 7 days on a cluster with a few thousand cores.  After this initial precomputation step (which produced 2.5GB of data), doing the descent, to find the discrete log for a specific (g,h) pair, took only about 90 seconds on a 24-core machine.

OK, but no one still uses 512-bit keys, do they?  The first part of Adrian et al.’s paper demonstrates that, because of implementation issues, even today you can force many servers to “downgrade” to the 512-bit, export-grade keys—and then, having done so, you can stall for time for 90 seconds as you figure out the session key, and then do a man-in-the-middle attack and take over and impersonate the server.  It’s an impressive example of the sort of game computer security researchers have been playing for a long time—but it’s really just a warmup to the main act.

As you’d expect, many servers today are configured more intelligently, and will only agree to 1024-bit keys.  But even there, Adrian et al. found that a large fraction of servers rely on just a single 1024-bit prime (!), and many of the ones that don’t rely on just a few other primes.  Adrian et al. estimate that, for a single 1024-bit prime, doing the NFS precomputation would take about 45 million years using a single core—or to put it more ominously, 1 year using 45 million cores.  If you built special-purpose hardware, that could go down by almost two orders of magnitude, putting the monetary cost at a few hundred million dollars, completely within the reach of a sufficiently determined nation-state.  Once the precomputation was done, and the terabytes of output stored in a data center somewhere, computing a particular discrete log would then take about 30 days using 1 core, or mere minutes using a supercomputer.  Once again, none of this is assuming any algorithmic advances beyond what’s publicly known.  (Of course, it’s possible that the NSA also has some algorithmic advances; even modest ones could obviate the need for special-purpose hardware.)

While writing this post, I did my own back-of-the-envelope calculation, and got that using NFS, calculating a 1024-bit discrete log should be about 7.5 million times harder than calculating a 512-bit discrete log.  So, extrapolating from the 7 days it took Adrian et al. to do it for 512 bits, this suggests that it might’ve taken them about 143,840 years to calculate 1024-bit discrete logs with the few thousand cores they had, or 1 year if they had 143,840 times as many cores (since almost all this stuff is extremely parallelizable).  Adrian et al. mention optimizations that they expect would improve this by a factor of 3, giving us about 100 million core-years, in the same ballpark as Adrian et al.’s estimate of 45 million core-years (the lower-order terms in the running time of NFS might account for some of the remaining discrepancy).
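
(If you want to redo that arithmetic yourself, here is the kind of crude estimate I mean: just plug the key sizes into the NFS running-time formula from above and ignore the o(1) term.  It’s a back-of-the-envelope, not the careful cost model from Adrian et al.’s paper.)

    from math import exp, log

    def nfs_work(bits, c=1.923):
        """Crude NFS cost estimate exp(c * (ln p)^(1/3) * (ln ln p)^(2/3)),
        ignoring the o(1) term, for a prime p with the given bit length."""
        ln_p = bits * log(2)
        return exp(c * ln_p ** (1/3) * log(ln_p) ** (2/3))

    print(nfs_work(1024) / nfs_work(512))    # ~7 million: 1024-bit vs. 512-bit precomputation
    print(nfs_work(2048) / nfs_work(1024))   # ~1 billion: 2048-bit vs. 1024-bit precomputation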

Adrian et al. mount a detailed argument in their paper that all of the details about NSA’s “groundbreaking cryptanalytic capabilities” that we learned from the Snowden documents match what would be true if the NSA were doing something like the above.  The way Alex put it to me is that, sure, the NSA might not have been doing this, but if not, then he would like to understand why not—for it would’ve been completely feasible within the cryptanalytic budget they had, and the NSA would’ve known that, and it would’ve been a very good codebreaking value for the money.

Now that we know about this weakness of Diffie-Hellman key exchange, what can be done?

The most obvious solution—but a good one!—is just to use longer keys.  For decades, when applied cryptographers would announce some attack like this, theorists like me would say with exasperation: “dude, why don’t you fix all these problems in one stroke by just, like, increasing the key sizes by a factor of 10?  when it’s an exponential against a polynomial, we all know the exponential will win eventually, so why not just go out to where it does?”  The applied cryptographers explain to us, with equal exasperation in their voices, that there are all sorts of reasons why not, from efficiency to (maybe the biggest thing) backwards-compatibility.  You can’t unilaterally demand 2048-bit keys, if millions of your customers are using browsers that only understand 1024-bit keys.  On the other hand, given the new revelations, it looks like there really will be a big push to migrate to larger key sizes, as the theorists would’ve suggested from their ivory towers.

A second, equally-obvious solution is to stop relying so much on the same few prime numbers in Diffie-Hellman key exchange.  (Note that the reason RSA isn’t vulnerable to this particular attack is that it inherently requires a different composite number N for each public key.)  In practice, generating a new huge random prime number tends to be expensive—taking, say, a few minutes—which is why people so often rely on “standard” primes.  At the least, we could use libraries of millions of “safe” primes, from which a prime for a given session is chosen randomly.
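
(As a rough illustration of the cost, here is a brute-force safe-prime generator sketched in Python with the sympy library.  This isn’t what any particular TLS or IPsec stack does; it’s just the obvious approach, which is slow at real key sizes precisely because p and (p-1)/2 both have to come out prime.)

    import secrets
    from sympy import isprime

    def random_safe_prime(bits):
        """Generate a random 'safe' prime p, i.e. p and (p-1)/2 both prime.
        Brute-force sketch; at 1024+ bits this takes a while, which is
        precisely why implementations tend to reuse a few standard primes."""
        while True:
            # random odd candidate q with exactly (bits-1) bits, so p = 2q+1 has `bits` bits
            q = secrets.randbits(bits - 1) | (1 << (bits - 2)) | 1
            if isprime(q) and isprime(2 * q + 1):
                return 2 * q + 1

    p = random_safe_prime(128)   # small size so the demo finishes in seconds
    assert isprime(p) and isprime((p - 1) // 2)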

A third solution is to migrate to elliptic-curve cryptography (ECC), which as far as anyone knows today, is much less vulnerable to descent attacks than the original Diffie-Hellman scheme.  Alas, there’s been a lot of understandable distrust of ECC after the Dual_EC_DRBG scandal, in which it came out that the NSA had backdoored one of NIST’s elliptic-curve-based pseudorandom generators by choosing particular parameters that it knew how to handle.  But maybe the right lesson to draw is that mod-p groups and elliptic-curve groups both seem to be pretty good for cryptography, but the mod-p groups are less good if everyone is using the same few prime numbers p (and those primes are “within nation-state range”), and the elliptic-curve groups are less good if everyone is using the same few parameters.  (A lot of these things do seem pretty predictable with hindsight, but how many did you predict?)

Many people will use this paper to ask political questions, like: hasn’t the NSA’s codebreaking mission once again usurped its mission to ensure the nation’s information security?  Doesn’t the 512-bit vulnerability that many Diffie-Hellman implementations still face, as a holdover from the 1990s export rules, illustrate why encryption should never be deliberately weakened for purposes of “national security”?  How can we get over the issue of backwards-compatibility, and get everyone using strong crypto?  People absolutely should be asking such questions.

But for readers of this blog, there’s one question that probably looms even larger than those of freedom versus security, openness versus secrecy, etc.: namely, the question of theory versus practice.  Which “side” should be said to have “won” this round?  Some will say: those useless theoretical cryptographers, they didn’t even know how their coveted Diffie-Hellman system could be broken in the real world!  The theoretical cryptographers might reply: of course we knew about the ability to do precomputation with NFS!  This wasn’t some NSA secret; it’s something we discussed openly for years.  And if someone told us how Diffie-Hellman was actually being used (with much of the world relying on the same few primes), we could’ve immediately spotted the potential for such an attack.  To which others might reply: then why didn’t you spot it?

Perhaps the right lesson to draw is how silly such debates really are.  In the end, piecing this story together took a team that was willing to do everything from learning some fairly difficult number theory to coding up simulations to poring over the Snowden documents for clues about the NSA’s budget.  Clear thought doesn’t respect the boundaries between disciplines, or between theory and practice.

(Thanks very much to Nadia Heninger and Neal Koblitz for reading this post and correcting a few errors in it.  For more about this, see Bruce Schneier’s post or Matt Green’s post.)

94 Responses to “NSA in P/poly: The Power of Precomputation”

  1. Geoffrey Irving Says:

    The other lesson is that if you want to bring theory and practice closer together, you should use systems with fewer “moving parts”. The export side of this is a bug in protocol negotiation, which is entirely separate from the theoretical concerns.

  2. Brian Sniffen Says:

    Some of us DID notice. When there was a big push to move from SSHv1, which did up a fresh group for each connection, to SSHv2, which used a few fixed groups, some people did notice, and said so, loudly. SSHv2 implementations fixed this a few years later.

    So it’s worse than just theory vs. practice. The community of practice is so vast and, with many meanings, so dense, that it takes a while for information to penetrate.

  3. wolfgang Says:

    This is all very interesting, but it is also quite silly imho.

    1) We the People pay for various government agencies to secretly spy on us.
    2) But actually we don’t like to be spied on, so we also pay researchers, at universities and various companies, to figure out how the NSA and others might spy on us.
    3) Once a vulnerability is detected we upgrade browsers, servers etc. at significant cost.

    Is this just an inevitable inefficiency of the democratic process, a clever way to finance CS research or pure insanity?

  4. Dan Miller Says:

    As a part-time complexity enthusiast and full-time technology professional with real-world security responsibilities, it seems to me that this result shines a light on the intersection (or lack thereof) between crypto theory and practice.

    Why did it take a fortuitous interdisciplinary effort of 14 researchers to discover this combination of arcane implementation trivia (servers can be made to fall back to 1024-bit DH; DH implementations typically use few primes) and theoretical knowledge (1024-bit DH with few primes is weak crypto)?

    Is there some hypothetical way that real-world implementationists such as myself can guard against deploying infrastructure that is known (at least to esoteric crypto eggheads) to be at risk of a successful present-day attack with attainable resources?

  5. Greg Maxwell Says:

    Hi Scott, You’re making a lot of assumptions about what practicing cryptographic engineers know which I think aren’t generally correct. (and it only took me a second to find cases years ago where I was happily telling laymen about pre-computation gains for discrete log– and my level of knowledge is not unusual).

    That you can precompute for discrete log problem solving (and efficiently store the results of precomputation for lookup) is _very_ well known. The normal engineering goal is to adopt parameters which are strong enough to make even a single compromise infeasible. If it is feasible, then the linear advantages of grouping 100,000 high value targets really is a very frighteningly weak defense, and you’ve basically already lost.

    Using a different session (or user) unique group has considerable costs– not just the computation needed to make sure the group has a cryptographically useful order (e.g. picking safe primes for mod p, or running SEA for EC groups), but in communicating the group to the far side: Communicating a random group can mean effectively doubling the amount of communicated data.

    If you’re able to afford the extra computation or bandwidth: why not then use a somewhat larger _static_ group, and gain an exponential advantage vs the attacker, vs using the same bandwidth to communicate a random group and only get at best a linear advantage in the number of targets?

    There are also more pedestrian engineering concerns, you might note their paper reports many users with non-standard groups using groups that had low order; turns out the more you ask software to do, the more things it’ll do wrong. In the case of EC, specializing the group logic allows enormous implementation speedups (e.g. 10+ fold for specialized code vs generic code); which again could be used to permit larger groups or may make the difference of getting cryptography to be used at all (this is hopefully becoming less true but “performance” was still successfully used as a part of the argument against mandatory always on crypto in HTTP2).

    Perhaps the suggestion of a big list of static groups would address the communications costs, but it’s still more implementation to get right, and still leaves the performance concerns. Moreover, as you see with the downgrade attacks: attacks on negotiation are the most common attacks against modern (IMO overly flexible) cryptosystems. With a big list and a vulnerable negotiation it may well be the case that you only need to pre-compute one and then you trick your targets into using it; and this can be hard to get right (what: randomly pick them? per connection? what happens when I just kill every connection using the wrong one until you pick one I’ve compromised, etc.)

  6. Will Says:

    Greg Maxwell #5

    Using a list of primes does sort of give an exponential advantage. If we have a known list of a million safe primes, as Scott suggests, it takes 20 bits to communicate which one, but makes it a million times harder to do the precomputation attack.

    As long as your primes are hard enough to break that anyone trying to break them has to be big enough to want to break lots of them, this seems helpful. Although, as you point out, this doesn’t help you if you care about the security of any particular computation.

    But I guess this would require universal adoption of the same list, and the storage would start to get significant – 2^20 x 1,024 bits = 128 megabytes, around the same size as the version of Firefox on my computer.

  7. Quantum.Mechanic Says:

    Please quit using “a” as a variable — it’s also an article, and makes it hard to parse.

  8. Scott Says:

    wolfgang #3:

      1) We the People pay for various government agencies to secretly spy on us.
      2) But actually we don’t like to be spied on, so we also pay researchers, at universities and various companies, to figure out how the NSA and others might spy on us.
      3) Once a vulnerability is detected we upgrade browsers, servers etc. at significant cost.

    There are many, many cases where a complicated human organization (a government, a corporation), full of agents with conflicting agendas, can be made to sound stupid if you treat it as if it were a single, rationally-deliberating individual. E.g., the people of California subsidize alfalfa and almond growers, who have now used up all of California’s water, and therefore the people of California will likely be forced to subsidize the growers even more to not use so much water.

    OK, scratch that, a lot of these are pretty stupid. 😉

  9. g Says:

    There’s some discussion of this over on Hacker News.

    Completely irrelevantly: Scott, in my web browser (up-to-date Firefox on Windows) the “just one piece of information” in your blog header overruns the box it’s in and its last line disappears against the white background below. So the “one piece of information” becomes: “Quantum computers would not solve hard search problems”. That may be a stronger statement than you wish to be seen making :-).

    (It’s fine in Chrome. I haven’t tried other browsers. The thing that’s different between Chrome and Firefox is the size of the text in the header; all of it, including the main title, is smaller in Chrome and larger in Firefox. Perhaps use explicit pixel sizes in the header, since it’s all against the background of a fixed-size image?)

  10. Scott Says:

    Greg Maxwell #5:

      The normal engineering goal is to adopt parameters which are strong enough to make even a single compromise infeasible. If it is feasible, then the linear advantages of grouping 100,000 high value targets really is a very frighteningly weak defense, and you’ve basically already lost.

    On reflection, you’re right: the “giant list of primes” would be, at best, a stopgap measure, if we can’t currently upgrade from 1024 to 2048 bits for backwards-compatibility reasons, but we can switch around which 1024-bit primes we use. (Knowledgeable commenters: is that currently the situation with any widely-used protocols?)

    Ultimately, increasing the key size—what I called the “most obvious solution”—is also the best solution, the one whose advantages swamp everything else in the asymptotic limit.

  11. On solving real world problems | Entertaining Research Says:

    […] of the details in this post can be better understood only by those who work in cryptography, probably. However, there are some […]

  12. Amir Says:

    At least in IPSEC, the standard defines only 3 DH groups modulo primes: one 768 bit, one 1024 bit, and one 1536 bit.

    However, vendor-specific extensions are always used, and I don’t know if VPN vendors define other groups and set them as the default. And if they’re not the default, or if both VPN nodes don’t come from the same vendor, they won’t be used at all.

  13. Scott Says:

    g #9: OK, I tried to move the text upwards; let me know if it looks any better in your browser.

  14. Amir Says:

    A more accurate complexity class for the title would be something like L(⅓,1.232)/L(⅓,1.923), since these are the actual complexities of the descent and precomputation steps of NFS, assuming standard number-theoretic assumptions.

  15. Scott Says:

    Amir #14: Yup.

  16. jonas Says:

    There’s something I’ve wanted to ask about, and this post seems like a good occasion.

    We know that both factoring large composite integers and finding discrete logarithms modulo a large prime can be solved efficiently using a quantum computer running Shor’s algorithm (in quantum polynomial time). You mention in this post that the number field sieve algorithm is the fastest known classical algorithm for the discrete logarithm problem, and is likely also the fastest for factoring large composite numbers. Is there some inherent connection that makes us expect that these two problems are similar, and that if someone found a faster (randomized) classical algorithm for one of these problems, there would likely also be one for the other problem? Further, how are elliptic curves connected to all these?

  17. Douglas Knight Says:

    Jonas, my understanding is that it is somewhat mysterious that factoring and discrete logarithm in prime fields have such analogous algorithms.

    Elliptic curves do not seem to have any analogous structure that can be exploited. Thus one can use very small elliptic curves, and this is why they are popular.

    However, there are lots of classes of elliptic curves that have extra structure that can be exploited, usually by comparing them to the multiplicative group of fields, and then applying QS/NFS. It is very easy to imagine that new such classes will be discovered in the future. Indeed, it is easy to imagine that NSA already has such a class and that the Suite B curves that it chose were chosen specifically to be (in)vulnerable.

  18. Job Says:

    Maybe the real trapdoor is that finding an exploit for a given implementation has lower average-case complexity than breaking an encryption scheme.

    In some cases, it’s as easy as deliberately introducing or surfacing a hidden vulnerability (e.g. heartbleed in OpenSSL).

    How can we increase the average-case complexity of exploit-finding?

  19. Joshua Zelinsky Says:

    That’s very interesting. I had no idea that the same large primes were being used for practical implementation of Diffie-Hellman. If I had to make a guess I would have guessed that when people set up a server they select their own large prime for that server.

    Actually, given all of this, I have to wonder if there are at this point any advantages whatsoever in using Diffie-Hellman key exchange rather than just using RSA for everything. Can someone involved in the practical end explain why using DH still makes sense?

  20. Scott Says:

    Jonas #16: Yeah, you ask an excellent question to which I wish better answers were available. Empirically, factoring algorithms and discrete log algorithms have tracked each other almost perfectly, and in particular instances, one can explain why: for example, Shor’s algorithm handles anything whatsoever reducible to order-finding or to an abelian hidden subgroup problem, which factoring and discrete log both are (after a bit of massaging). But we currently have no reduction from factoring to discrete log or vice versa.

  21. Andy H Says:

    Given the amount of resources invested in the hack versus the relative simplicity of the fix, aren’t there better things for the NSA to invest in just for its own spycraft, much less for the U.S. government in things that aren’t borderline evil?

  22. Scott Says:

    Andy #20: If they were indeed doing something like this, it might have given them amazing access to much of the world’s encrypted data for a few years (that would be hard to obtain otherwise), and have been well worth the few hundred million dollars it would’ve cost from that standpoint. Whether the US government should be doing such things is, of course, the broader question that people ought to be asking, and have been asking, post-Snowden.

  23. Scott Says:

    Quantum.Mechanic #7: OK, just for you, I italicized all the variables.

  24. William Hird Says:

    Andy #20:
    Maybe the question we should be asking here is this: Is it the de-facto policy of the NSA (regardless of who is “President” or who runs Congress) to NEVER allow the public ( or any entity for that matter) to use encryption that doesn’t have some weakness or back-door that they can exploit at will?

  25. Rahul Says:

    “A third solution is to migrate to elliptic-curve cryptography (ECC), which as far as anyone knows today, is much less vulnerable to descent attacks than the original Diffie-Hellman scheme.”

    But has elliptic curve cryptography been studied for as long and as hard as Diffie-Hellman? Have people tried to exploit it as hard?

    Perhaps if we migrated to ECC, the motivation for exploits would rise & we would find analogous flaws?

  26. Rahul Says:

    If this exploit is so pervasive why does the Logjam site report that only 8% of the top million websites are vulnerable to it currently? Does that mean the rest are using 512+ bit keys? Or are protected in other ways?

    Even with the hypothetical hundreds-of-millions-of-dollars infrastructure that would let the most common 1024-bit group be broken, only 17% of the top million domains seem vulnerable?

    And the top million seems like a lot of sites to base your assessment on. The sites most of us care about (for secure traffic at least) will be in the top segment of these one million sites, so they might be way less vulnerable than 8% or 17%?

  27. En vrac | Le blog de Balise Says:

    […] NSA in P/poly: The Power of Precomputation – a pleasant-to-read post on the latest events around the fact that Diffie-Hellman is in practice less secure than it could/should be in theory, with various calculations on the theme of “so, if the NSA-or-some-other-government-agency wanted to do this, how many resources/how much time/etc. would it take them?” (in English) […]

  28. How NSA can decrypt ciphers with 1024 bit keys? | Mahdix Says:

    […] Here is the complete story. The story is: Most of today's secure network data transfers (e.g. when you enter your credit card number in a website), is done based on protocols which are using Diffie-Hellman algorithm. This algorithm's implementation uses some fixed numbers as its input parameters. NSA (supposedly) has pre-calculated different computation results for the single most common used parameter and using this set of pre-calculated data (plus several hundred CPU cores), it can decrypt text which is encrypted using 1024 bit keys. If you use 2048 bit keys they will not be able to do this but currently most of client and server applications are pre-configured to use this type of key.  […]

  29. a Says:

    Professor:
    This post is very serious. Do you really think Factoring and discrete log are in P/Poly?

  30. Scott Says:

    a #28: No. Did you even read the post?

    I don’t know if factoring and discrete log are in P/poly—but at any rate, I don’t think anyone knows anything like that today.

    The conjecture is that the NSA might be exploiting precomputation vaguely analogous to P/poly. Both the precomputation and the post-computation would still take subexponential time, but the post-computation would be a milder subexponential than the pre-computation, and calculations suggest that the combination could put 1024-bit discrete logs just within the range of feasibility, if you had a few hundred million dollars. (Also, the precomputations would depend on the exact value of p, not only on the input size log(p).) Again, none of this is assuming any algorithms that aren’t already known.

  31. Scott Says:

    Rahul #25: Firstly, I believe the 8% figure refers only to the sites vulnerable to the 512-bit downgrade attack—i.e., the one that Adrian et al. could demonstrate using just the computing resources that they had easily available as academics. A much larger percentage of websites would be vulnerable to the 1024-bit attack, the one within range of a nation-state. But secondly, an attack that affects 8% of the top million sites already seems like a big deal—that’s 80,000 sites that we’re talking about, more than enough for (e.g.) some criminal enterprise to make a killing by exploiting them.

  32. Joshua Zelinsky Says:

    a #28 does bring up an actual issue. If we were to phrase this in terms of computational complexity, what would we want it to be?

    I don’t know a good name for the following class, so I’m going to just call it NSA for now. A language L is in NSA if instances consist of two strings, (x,y), and there’s a polynomial-time algorithm for determining membership of (x,y) in L given x, f(x), and y. Additionally, we insist that f(x) can be computed in PSPACE.

    Is this the natural actual computational class here?

  33. Rahul Says:

    The Logjam site assesses all of my internet browser clients (Chrome, Opera & Internet Explorer) as vulnerable to attack.

    Is there a fix for this yet?

  34. Scott Says:

    Rahul #31: Well, here’s what it says on weakdh.org:

      Make sure you have the most recent version of your browser installed, and check for updates frequently. Google Chrome (including Android Browser), Mozilla Firefox, Microsoft Internet Explorer, and Apple Safari are all deploying fixes for the Logjam attack.

  35. si Says:

    I wonder what NSA thinks about this paper. Are they saying “trouble, they are onto us”, or “if this is the best academics can do, we are safe”? Unfortunately, I have no idea which one is the truth (a priori probability of 0.5).

  36. Yuval Levental Says:

    Unrelated :( http://www.bbc.com/news/world-us-canada-32865248

  37. matt Says:

    Can the pre-computation be done efficiently on a quantum computer? Of course, if you have a qc, you could do the whole computation on the qc, but if a qc is available but time on it is a very limited resource, it might be worth pre-computing on the qc and then using classical computers to do the rest.

  38. Scott Says:

    Yuval #34: This is a tragedy. I’ve often gotten uneasy when I’m in a taxi that’s weaving madly in and out of traffic, but then I reflect, “the driver does this every day … so he must know what he’s doing, right??” I had agreed a few months ago to contribute an essay about P vs. NP to a book about famous open problems that John Nash was editing, which would’ve been my first interaction with him (not counting seeing him around in the IAS cafeteria). Don’t know what will happen to the book now. RIP.

  39. Scott Says:

    si #33: Even if what the NSA is doing differs in important ways from what the paper described, if the paper finally spurs people to switch to 2048-bit keys, it’s hard to see how that wouldn’t cause trouble for NSA.

  40. Scott Says:

    matt #35: Well, the output of the precomputation already has subexponential size—indeed, depending on what tradeoffs you want, the output can be as large as exp((1.923+o(1))(log p)^(1/3)(log log p)^(2/3)), comparable to the entire running time of the precomputation. So it’s far from obvious how a quantum computer would speed this up: certainly it couldn’t do it in polylog(p) time; at best it could shave off lower-order terms.

  41. a Says:

    Scott #29: Sorry, I only looked at the title and it was an impulsive posting. I read it after posting though :).

    In any case, do you think integer factoring and DLog being in P/Poly is a reasonable possibility? Or is this also like ‘when pigs shall fly’? Either integer factoring or DLog being in P/Poly does not seem to affect other complexity classes. In general, is every OWF being invertible in P/Poly a reasonable possibility?

  42. Scott Says:

    a #39: For factoring and discrete log to be in P/poly—or P, for that matter, or (say) TIME(n^log(n))—I regard as within the realm of possibility. As far as anyone knows, these are “just” specific problems, and their solution wouldn’t cause any wider collapse of complexity classes or other disastrous structural consequences. On the other hand, I think it’s safe to say that any such algorithm would constitute, or depend on, a revolutionary advance in number theory. And I also strongly disagree with the people who claim there “must” be a fast classical factoring algorithm: I’d say we simply don’t know right now, and the number field sieve or something like it being close to optimal (or it being improvable, but only to some smaller subexponential) are also perfectly reasonable possibilities.

    For arbitrary OWFs to be efficiently invertible I’d regard as much more surprising, almost as surprising as P=NP.

    As a final remark, I doubt there’s a great deal of difference between the uniform and nonuniform cases (i.e., between P and P/poly) for any of these questions.

  43. a Says:

    Scott #42 Thank you very much.

  44. a Says:

    Sorry, regarding your last line ‘I doubt there’s a great deal of difference between the uniform and nonuniform cases (i.e., between P and P/poly)’: I did some background reading. It does seem that getting that poly-length string could be very hard. It seems that if NP is in P/Poly, it would still be reasonable to speculate that P is not NP, right? (I mean with the same strength as ‘when pigs shall fly’.) It just seems to collapse PH to Pi_2 or Sigma_2. Remaining problems could still be hard. So it does seem there is a significant difference between P and P/Poly for problems in the polynomial hierarchy.

  45. Bram Cohen Says:

    Using multiple different primes incurs huge amounts of problems, because specific primes can be weak in a number of ways, possibly in some novel ways, and generally have much lower performance than ones which are selected specifically because they have efficient implementations. Selecting fresh primes for opportunistic encryption, in particular, would be a nightmare of susceptibility to potential new attacks.

    Standard primes have the big advantages that they can be specifically studied for weaknesses and implementation tricks. There are even standards for what they should be: https://tools.ietf.org/html/rfc3526

    As others who beat me to making these points earlier have pointed out, the one and only thing which is really needed here is increasing the sizes.

  46. Joshua Zelinsky Says:

    a #44,

    While NP being in P/poly doesn’t imply that P=NP, NP being in P/poly would be almost as shocking as finding out that P=NP. If we found out that NP was in P/poly we’d have to so fundamentally reevaluate our entire approach to things that any degree of confidence that P != NP would have to be put aside until we had very carefully examined the proof that NP is in P/poly and understood exactly where it was coming from and what the limitations of the technique were.

  47. Rahul Says:

    What are the practical downsides of going to a single but really large prime?

    How large would the key have to be to make the NFS precomputation infeasible, even for a state entity with lots of resources, given whatever hardware scale-up rates we foresee for, say, the next decade?

  48. Scott Says:

    Rahul #47: I back-of-the-enveloped it again, and got that taking a 2048-bit discrete log using NFS should be about a billion times harder than taking a 1024-bit discrete log. Which means: we’d be talking 45 quadrillion core-years. That seems plausibly beyond the current range of human civilization, without some serious algorithmic advance.

  49. David Says:

    Hopefully we’ll soon see enhancements to IKEv1 and IKEv2 to support site-generated DH groups. For now one is stuck with Group 5 (1536 bits) and above. Here’s where TLS beats IPSEC (for once) by supporting custom group generation.

  50. Rahul Says:

    Scott #48:

    Interesting. I found this link online which seems to say that Google has been using 2048 bit keys for almost two years now?

    Not sure if those are the same Diffie Hellman keys we are discussing here.

    http://www.computing.co.uk/ctg/news/2285984/google-updates-ssl-certificates-to-2048bit-encryption

    If so, are the client browsers already fixed to work with 2048 bit keys? The implementation aspects are so confusing.

    e.g. To move to 2048 bit keys, does that need changes on both server & client ends? Or do only the servers need to change?

  51. anonymous Says:

    Is the disclosure in a recent interview with Terence Tao related to this matter:

    http://www.smh.com.au/good-weekend/terence-tao-the-mozart-of-maths-20150306-13fwcv.html

    Perhaps he inadvertently assisted in refining the choice for the NFS polynomials?

  52. BlackSails Says:

    Do you think the NSA using this attack is more likely than the NSA having a functioning quantum computer?

  53. Scott Says:

    BlackSails #52: Without question.

  54. Alon Rosen Says:

    A true theoretical cryptographer does not know what DH Key-Exchange is, or at least so they say: https://jonkatz.wordpress.com/2010/05/12/what-do-we-know/

  55. Rahul Says:

    Wasn’t it rather irresponsible to publish this before the major browsers had a chance to announce a fix / patch?

    I’ve searched a lot for Chrome but I don’t even see a development thread yet.

    Couldn’t these guys have waited some more before they announced this?

  56. John and Alicia Nash, 1928,1933–2015 | Gödel's Lost Letter and P=NP Says:

    […] the new Logjam attack on Diffie-Hellman key exchange. This is excellently covered by Scott Aaronson here, and has echoes of faults with RSA implementations that likewise fail to diversify the prime […]

  57. fred Says:

    A question regarding discrete logarithm.

    E.g. to solve

    3^m = 13, modulo 17 (solution is m=4).

    if I were to use brute force to check

    k * 17 +13 = 3^m
    for m=1, m=2, m=3, … m=N

    How many powers of 3 would I have to consider before I know for sure there is no solution? (i.e. what is N?)

  58. Jay Says:

    Rahul #55, it’d take huge resources to exploit this, so it’s not as if some random folk could read your mails tomorrow morning. On the other hand, the information that the NSA could soon decrypt most information they happen to collect now, that is relevant now.

  59. Tommaso G Says:

    > and the elliptic-curve groups are less good if everyone is using the same few elliptic curves.

    I see a problem here: I am not an expert in elliptic curves, but AFAIK the curves suitable for crypto use are relatively sparse, and this is why there are just a few of them “standardized” (by e.g., NIST or NSA Suite B). The reason is that these curves are very complex objects and there are endless categories of them, with very different properties, and many of them are vulnerable to one attack or another (see e.g. http://safecurves.cr.yp.to/ ).

    It is also possible to generate random curves, but testing them for crypto suitability involves expensive computation which I doubt is easier than generating a large safe prime for DH use.

  60. Scott Says:

    Rahul #55: Adrian et al. have a whole section of their paper (Section 6, “Disclosure and Response”) about how they notified companies to fix the 512-bit vulnerability before going public with it, and the companies did so. (As Jay #58 pointed out, disclosure isn’t really relevant for the 1024-bit vulnerability, since that one requires nation-state-level resources to exploit, and the relevant nation-states probably all know about it already—and if not, it would take them a few years to develop the capability.) Disclosing to the affected parties before going public, to give them the chance to fix whatever vulnerability you found, is standard practice in the computer security community, and people are pretty careful about it.

  61. Rahul Says:

    Scott #60:

    When you go to weakdh.org & it warns you “Warning! Your web browser is vulnerable to Logjam and can be tricked into using weak encryption. You should update your browser.” is that triggered by the 512-bit vulnerability or the 1024-bit vulnerability?

    My impression is that no browser has rolled out fixes yet even for the 512 bit downgrade exploit.

  62. Sniffnoy Says:

    Fred: See Fermat’s little theorem (more generally, Euler’s totient theorem (more generally, Lagrange’s theorem)).

  63. Amir Says:

    fred #57: For your specific problem, solving modulo 17 with generator 3, the most steps needed are 16, since 3^16=1 and there is no smaller solution. (If you start the brute-force from m=0, the hardest case would be 3^15=6.)

    The complexity actually depends on the choice of generator. Any course or textbook on basic number theory proves Fermat’s little theorem, and from that it follows that you’d need at most N-1 steps to brute-force when working modulo a prime N (and for many bases that bound is tight).
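
    In code, the brute force for fred’s example is just a loop (a toy Python sketch):

        def brute_force_dlog(g, h, p):
            """Return the smallest m >= 0 with g^m = h (mod p), or None if there isn't one.
            By Fermat's little theorem, exponents 0..p-2 suffice when p is prime."""
            value = 1
            for m in range(p - 1):
                if value == h % p:
                    return m
                value = (value * g) % p
            return None

        print(brute_force_dlog(3, 13, 17))   # prints 4, since 3^4 = 81 = 13 (mod 17)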

  64. bg Says:

    Scott #31 :

    an attack that affects 8% of the top million sites already seems like a big deal—that’s 80,000 sites that we’re talking about, more than enough for (e.g.) some criminal enterprise to make a killing by exploiting them.

    This reminds me of those hedge funds with consistently amazing returns that mostly hire cryptographers. I know in reality it’s for their pattern recognition abilities, but I’ve always wondered if there could be something else going on behind the scenes.

  65. dimethylethylhexabutylpeptopeyotine derivative Says:

    The point is that as machines get faster at breaking encryption, they can also handle higher bit encryption. Furthermore, people that have more sensitive data should be the most willing to upgrade. Sometimes upgrades require hardware-level op codes or new algorithms but that’s why the theory is less interesting. From the engineering point of view, things just need to scale properly.

  66. fred Says:

    Amir #63, Sniffnoy #62

    thanks!

    I wonder, if working modulo N requires ~N exponentiations, does it follow that the QM algorithm from Shor also requires ~N simultaneous state superpositions to work?

  67. MarkH Says:

    @Bram Cohen (#45):

    As far as I know, if q is prime and p = 2q+1 is also prime, then the multiplication group mod p is no less secure a setting for Diffie-Hellman than any other group of the same size (i.e., same modulus length in bits).

    It is simple to verify such a p: test it for primality, and then test (p-1)/2 for primality.

    This verification is somewhat costly, but is no barrier to creating a big list of safe primes, or to doing a one-time check of a server or domain that uses a “custom” prime.

    I am aware of no other properties of multiplication group moduli that materially alter the cost of computing discrete logs using publicly known algorithms.

    Your comment suggests that you know of several other criteria for the security of p. Would you kindly enlighten me?

  68. Amir Safavi Says:

    “Clear thought doesn’t respect the boundaries between disciplines, or between theory and practice.”

  69. fred Says:

    Reading all this you’d think the NSA hasn’t actually recruited any theoretical or practical cryptographers from US academia. How’s that possible?
    I’m not sure what the alternatives would be… Self-taught geniuses/hackers? Captured aliens from the Roswell crash site?

    I was just reading about John Nash (RIP) and came across declassified letters he had sent to the NSA.
    http://www.gwern.net/docs/1955-nash
    Is it typical for members of US academia to work “on the side” with/for the NSA (out of patriotism, etc), as opposed to full time?

  70. Scott Says:

    fred #69: It’s not “typical,” but it certainly happens; there are lots of mathematicians who have done consulting work for NSA.

    NSA is often on campuses trying to recruit. They also have a prestigious summer research program for undergrads and grad students, and they often recruit people from that. It’s an uphill battle for them, since excellent researchers tend to like to tell everyone what they discovered! (And some might have moral qualms about working for them, especially post-Snowden.) But I believe NSA is still the single largest employer of mathematicians in the world.

  71. Raoul Ohio Says:

    Re #70:

    The NSA is to mathematics Ph.D’s what Starbucks is to philosophy Ph. D’s.

  72. a Says:

    @Raoul Ohio #71: Then what is the analogue for theoretical CS PhDs and physics PhDs?

  73. Joshua Zelinsky Says:

    Scott,

    Are you going to blog on Aram Harrow’s recent work on improved quantum error correcting codes?

  74. Scott Says:

    Joshua #73: Could you give me a link to it?

  75. Joshua Zelinsky Says:

    Sorry, looks like the result is old. Paper is here http://arxiv.org/abs/1411.3334 and is listed as from November 2014. It looks like it is just now getting picked up by the general press (e.g. here http://phys.org/news/2015-05-protocol-virtually-errors-quantum-memory.html ). Did you blog about this already at some point and I missed it?

  76. C Says:

    Hi Scott, any thoughts on this?

    http://arxiv.org/abs/1505.06506

  77. Scott Says:

    C #76: Sorry, this one looks covered by the second question in my FAQ. 🙂

  78. Joshua Zelinsky Says:

    C #76,

    Not Scott, but at a glance, the paper claims that they’ve proven that NP= coNP. However, they don’t address anywhere how they’ve gotten around the fact that we have oracles such that NP^A != coNP^A. They don’t discuss relativization at all. This is a major red flag.

  79. Scott Says:

    Joshua #75: I’d somehow missed that paper, but it looks great! I’m sorry that I don’t have anything more intelligent to say right now.

  80. Rahul Says:

    Is the University / student reaction to NSA trying to recruit from CS Departments the same as Dow recruiters got in Chemistry Departments back in the ’70’s when Dow used to manufacture Napalm? Or Halliburton recruiters got during the Bush era?

  81. Bram Cohen Says:

    MarkH #67:

    I don’t know of any specific attacks, but it’s an attack vector I’d rather not deal with, and the complexity of doing collaborative prime generation sounds quite frightening, especially if the selection of the prime itself is supposed to be part of the security and you’re worried about an attacker being able to select or force reuse of the modulus.

  82. Joshua Zelinsky Says:

    MarkH @67,

    I’m not sure how great an idea that is. For a large n, the probability that n is prime is around 1 / log n. The probability that n and (n-1)/2 are both prime is around 1/(log n)^2. That’s a lot smaller. So finding safe primes is more computationally intensive because you need to search a lot more to find them. I don’t know enough about the practical sizes involved to know if that would really be a practical barrier to implementing what you are envisioning.

  83. Links for May 2015 - foreXiv Says:

    […] Scott Aaronson on the Logjam attack. […]

  84. MarkH Says:

    @Joshua Zelinsky:

    I ran the exercise some years ago, and it does take a goodly number of minutes. Also, an interesting optimization is available: once a prime q is found, 2q+1 can be tested for primality efficiently using the (poetic sounding) Pocklington criterion.

    I don’t see this as any obstacle to each site using its own prime (which would effectively defeat the precomputation attack) — the generation need only be done once. Likewise, the safe prime verification (which is a little costly, but only a matter of seconds or less) need only be made once by each party connecting to that site.

    If p-1 has only small factors then discrete logs can be computed efficiently in the multiplication group mod p. Accordingly, for a DH multiplication group modulus to be secure, it is necessary that p-1 have at least one “large” prime factor.

    Because it is necessary to know something about the factorization of p-1, primes chosen at random are not guaranteed to be secure (unlike in RSA, where the factors of the semiprime modulus need no special properties). So whether or not a DH modulus is a “safe” prime (that is, one plus the double of a Sophie Germain prime), finding an assuredly secure modulus takes extra time.*

    It is also important to know the order (group size) of the generator g. This is easy to check, when the factorization of p-1 is known. For a safe-prime modulus, every generator must have an order of q, 2q, or 2.

    If I remember correctly, choosing a g of order q eliminates any bit-leakage of the shared secret.

    Finally, I note that if everybody used their own DH group, then the precomputation attack would be impractical even at 1024 bits (it costs a lot to do the precomputation for even one modulus), and very far beyond the range of the feasible at 2048 bits. Going to 4096 bits is an overreaction, in my judgment, if everybody sets up their own DH group parameters.
    ________________________

    * In practice, a randomly chosen prime p is extremely unlikely to have only small factors of p-1; however, the custom is to generate DH moduli in a manner that guarantees at least one large factor.
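
    To make the order check concrete, here is a small Python sketch, assuming the sympy library for the primality tests:

        from sympy import isprime

        def is_safe_prime(p):
            """A 'safe' prime: p and (p-1)/2 are both prime."""
            return isprime(p) and isprime((p - 1) // 2)

        def lies_in_order_q_subgroup(g, p):
            """For a safe prime p = 2q+1, an element g with 1 < g < p-1 has order
            q exactly when g^q = 1 (mod p); otherwise its order is 2q."""
            q = (p - 1) // 2
            return 1 < g < p - 1 and pow(g, q, p) == 1

        p = 23   # toy safe prime: 23 = 2*11 + 1
        assert is_safe_prime(p)
        print([g for g in range(2, p - 1) if lies_in_order_q_subgroup(g, p)])
        # -> the ten nontrivial quadratic residues mod 23, each of order q = 11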

  85. Joshua Zelinsky Says:

    MarkH #84,

    Good points all, especially regarding Pocklington.

    I’m not sure the problem of p-1 needing to not have too many small factors is that serious though: just take p-1 and try to factor it using naive division for small primes. If nothing turns up other than the factor of 2, out to a fairly large bound, then you are good to go, but I’m enough on the theoretical end that I’m probably missing something here.

    I agree with you that in that context 4096 would probably be an overreaction, especially because actual bandwidth starts mattering.

    I do think there’s another issue that hasn’t been touched upon that is a cause for (slight) concern: if everyone is producing their own primes then people need more random bits. Also, as long as one is reusing the same prime even for a single server, it makes the pre-computation type attack still somewhat viable.

  86. CJ3 Says:

    The explanation of D-H is not the same as at wiki.

  87. O. S. Dawg Says:

    Doesn’t RSA have a similar vulnerability to precomputation as DH?

    See: http://arxiv.org/abs/1003.5461

    In particular, see remark 7: “A great algorithmic advantage of the approach of Schnorr over that of Adleman is that the choice of the basis can be essentially independent of the number N.”

  88. Greg Says:

    Though this only applies to the standardized NIST curves, here is a decent ECC server, browser, and OS compatibility list:

    https://support.globalsign.com/customer/portal/articles/1995283-ecc-compatibility

  89. Crypto-Gram June 15, 2015 | privacytips.net Says:

    […] Good analysis of the cryptography: https://scottaaronson-production.mystagingwebsite.com/?p=2293 […]

  90. jonas Says:

    Scott re #20. On Theoretical Computer Science StackExchange, Arul points out in the recent question (id 32760) that there’s a 1984 note [1] about this. It seems the note claims that the factoring problem can be reduced to discrete logarithm. There’s a catch: this requires discrete logarithm over a composite modulus, the same modulus that you want to factor.

    [1] Eric Bach, “Discrete Logarithms and Factoring”, technical report (1984). [I’d give you direct links but your blog software hates me when I do that.]

  91. Douglas Knight Says:

    Jonas, that is a lot less interesting a result than it sounds. You just use discrete logarithm in Z/n^* to extract the mere order of that group, the totient φ(n). In the RSA case φ(pq)=(p-1)(q-1) and it’s trivial to factor from knowing that. But using discrete logarithm to extract the order of a blackbox group is really overkill.

    Discrete logarithm in Z/n^* is pretty much the same as factoring n plus discrete logarithm in the multiplicative groups of the prime power factors of n. Factoring is the reduction step and this doesn’t really relate discrete logarithm to anything other than discrete logarithm.
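
    For concreteness, the “trivial to factor” step is just the quadratic formula (a toy Python sketch):

        from math import isqrt

        def factor_from_totient(N, phi):
            """Recover p and q from N = p*q and phi = (p-1)*(q-1):
            p + q = N - phi + 1, so p and q are the roots of x^2 - (p+q)x + N."""
            s = N - phi + 1           # p + q
            d = isqrt(s * s - 4 * N)  # |p - q|
            return (s - d) // 2, (s + d) // 2

        print(factor_from_totient(3233, 3120))   # 3233 = 53 * 61, phi = 52 * 60 -> (53, 61)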

  92. jonas Says:

    Douglas Knight #91: exactly. The same report also tells that discrete logarithm with a composite modulus reduces the way you’re saying.

  93. turbo Says:

    Which of two is the correct definition of discrete logarithm problem?

    Problem 1:
    Input: p, g, h.
    Output: x such that g^x=h mod p

    Problem 2:
    Input: h.
    Output: x such that g^x=h mod p for some fixed cyclic group Zp with generator g.

    I think the NSA may have a P/poly algorithm for Problem 2.

    What does it mean to have a P/poly algorithm for Problem 1?

    Which of Problem 1 and Problem 2 is the correct discrete log problem, and which one, if it had a P/poly algorithm, would make you consider discrete log to be in P/poly?

    Is Problem 1 the discrete log problem and Problem 2 the Diffie-Hellman problem, or vice versa?

  94. turbo Says:

    Professor, is there any importance to Problem 1 being in P/poly? Would you consider that possible compared to the more general Problem 2, or would even that be a breakthrough?