After the reviews from Eurocrypt 2013 (http://hyperelliptic.org/tanja/nonuniform-reviews.txt, also posted anonymously as http://pastebin.com/BxQkkq6y) we made huge changes to this paper. This is the svn diff output (limited to TeX files for sections; also, some lines are removed from the output for anonymity) between our Eurocrypt 2013 submission and our Crypto 2013 submission. Note that this is more than 2000 lines. Shortly after the Crypto 2013 submission we made various further edits, mainly to analyze the new formalization of collision resistance but also to add several small clarifications at various points in the paper. Those further edits aren't shown here. Index: intro.tex =================================================================== --- intro.tex (revision 87) +++ intro.tex (revision 178) @@ -21,8 +21,8 @@ The first part is a concrete definition of what it means for a cipher or a MAC to be secure. We quote from the classic paper \cite[Section 1.3]{2000/bellare-cbc} by Bellare, Kilian, and Rogaway: -the PRP-insecurity of a cipher such as AES -(denoted ``$\Adv_{\AES}^{\prp}(q',t)$'') +the PRP-``insecurity'' of a cipher such as AES +(denoted ``$\Adv_{\AES}^{\prp}(q',t')$'') is defined as the ``maximum, over all adversaries restricted to $q'$ input-output examples @@ -36,14 +36,12 @@ (denoted ``$\Adv_{{\CBC^m}\hbox{-}{\AES}}^{\prf}(q,t)$'') is defined similarly, using a uniform random function rather than a uniform random permutation. -These definitions have become completely standard, -except for a minor change in notation: -``the $(q,t)$-security of AES is at most $\epsilon$'' -is usually abbreviated ``AES is $(q,t,\epsilon)$-secure''. The second part of the answer is a concrete security theorem bounding the insecurity of AES-CBC-MAC -in terms of the insecurity of AES. +in terms of the insecurity of AES, +or more generally the insecurity of $F$-CBC-MAC +in terms of the insecurity of $F$ for any $\ell$-bit block cipher $F$. Specifically, here is the main theorem of \cite{2000/bellare-cbc}: ``for any integers $q,t,m\ge 1$, @@ -120,16 +118,44 @@ into an AES security conjecture and a CBC-MAC security proof drastically simplifies the cryptanalyst's job. -The same pattern can be found throughout the literature on provable concrete security. +The same three-part pattern has +(as illustrated by Appendix~\ref{literature}) +become completely standard +throughout the literature on concrete ``provable security''. +First part: +The insecurity of $X${\dash}where $X$ is a primitive such as AES or RSA, +or a higher-level protocol such as AES-CBC-MAC or RSA-PSS{\dash}is defined as +the maximum, over all algorithms $A$ (``attacks'') that cost at most $C$, +of the probability (or advantage in probability) that $A$ succeeds in breaking $X$. +This insecurity is explicitly a function of the cost limit $C$; +typically $C$ is separated into +(1) a time limit $t$ and +(2) a limit $q$ on the number of oracle queries. +Note that this function depends implicitly on how the ``cost'' of an algorithm is defined. + +Often ``the $(q,t)$-insecurity of $X$ is at most $\epsilon$'' +is abbreviated ``$X$ is $(q,t,\epsilon)$-secure''. +Many papers prefer the more concise notation +and do not even mention the insecurity function. +We emphasize, however, that this is merely a superficial change in notation, +and that both of the quotes in this paragraph refer to exactly the same situation: +namely, the nonexistence of algorithms that cost at most $(q,t)$ +and that break $X$ with probability more than $\epsilon$. 
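(As a reading aid, here is the ``first part'' written out as a formula. This is our paraphrase in generic notation, not a quotation from \cite{2000/bellare-cbc} or from any of the papers surveyed in Appendix~\ref{literature}; ``xxx'' stands for the attack goal, e.g., prp or prf.

\[ \Adv^{\mathrm{xxx}}_X(q,t) \;=\; \max_A \Adv^{\mathrm{xxx}}_X(A), \qquad \text{$A$ ranging over all algorithms making at most $q$ queries and taking time at most $t$,} \]

and ``$X$ is $(q,t,\epsilon)$-secure'' means $\Adv^{\mathrm{xxx}}_X(q,t)\le\epsilon$, i.e., no algorithm within those limits breaks $X$ with probability (or advantage) above $\epsilon$, whether or not anyone can exhibit such an algorithm.)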
+ +Second part: Concrete ``provable security'' -theorems state that the insecurity of a complicated object -is bounded in terms of the insecurity of a simpler object; +theorems state that the insecurity (or security) of a complicated object +is bounded in terms of the insecurity (or security) of a simpler object. +Often these theorems require restrictions on the types of attacks allowed +against the complicated object: for example, Bellare and Rogaway in \cite{1995/bellare-oaep} showed that RSA-OAEP has similar security to RSA against generic-hash attacks (attacks in the ``random-oracle model''). + +Third part: The insecurity of a well-studied primitive such as AES or RSA-1024 -is then conjectured to match the success probability of the best attack known. +is conjectured to match the success probability of the best attack known. For example, Bellare and Rogaway in \cite[Section 1.4]{1996/bellare-sigs}, evaluating the concrete-security of RSA-FDH and RSA-PSS, @@ -143,10 +169,9 @@ in attacking these primitives. - -\subheading{Contributions of this paper} +\subheading{Primary contribution of this paper} Our primary goal in this paper is to convincingly undermine -all of the standard security conjectures discussed above. +all of the standard security conjectures reviewed above. Specifically, Sections \ref{aes}, \ref{ecc}, \ref{dsa}, and \ref{rsa} show{\dash}assuming standard, amply tested heuristics{\dash}% @@ -157,14 +182,15 @@ the insecurity of AES, NIST P-256, DSA-3072, and RSA-3072, according to the standard concrete-security definitions, reaches essentially 100\% for a time bound considerably below $2^{128}$. -The conjectures by Bellare, Kilian, and Rogaway in +The conjectures by Bellare and Rogaway in \cite[Section 1.4]{1996/bellare-sigs}, \cite[Section 3.6]{2005/bellare-intro}, \cite[Section 3.2]{2006/bellare-hmac}, etc.~are false for every reasonable assignment of the unspecified constants. The same ideas show that there {\it exist\/} high-probability attacks -against AES-CBC-MAC, RSA-PSS, RSA-OAEP, and thousands of other ``provably secure'' protocols, +against AES-CBC-MAC, RSA-3072-PSS, RSA-3072-OAEP, +and thousands of other ``provably secure'' protocols, in each case taking considerably less than $2^{128}$ time. It is not clear that similar attacks exist against {\it every\/} such protocol in the literature, since in some cases the security reductions are unidirectional, @@ -173,10 +199,10 @@ We do not claim that this reflects any actual security problem with AES, NIST P-256, DSA-3072, and RSA-3072, -or with the higher-level protocols built on top of these primitives. +or with higher-level protocols built from these primitives. On the contrary! -It is clear that these attacks {\it exist}, -but we conjecture that any fast {\it construction\/} of these attacks +Our constructions of these attacks are very slow; +we conjecture that any {\it fast\/} construction of these attacks has negligible probability of success. Users have nothing to worry about. @@ -198,30 +224,87 @@ -Our secondary goal in this paper is to articulate a recommended rescue strategy. -Appendix~\ref{salvage} -analyzes several possible responses to the existence of these attacks: -\begin{itemize} -\item Circle the wagons: abandon the conjectures but preserve the definitions. -\item Switch the definitions to the NAND metric. -\item Switch the definitions to the $AT$ metric. -\item Add effectivity to the definitions. -\item Add uniformity to the definitions. 
-\end{itemize} -We recommend switching to the $AT$ metric, adding effectivity, -and refactoring theorems to simplify further changes. -Stating and proving new definitions and theorems will require considerable effort, -but we think that the attacks amply justify this effort. +\subheading{Secondary contribution of this paper} +Our secondary goal in this paper is to propose a rescue strategy: +a new way to define security{\dash}a definition that restores, to the maximum extent possible, +the attractive three-part security arguments described above. +All of the gaps considered in this paper come from errors in quantifying feasibility. +Each of the high-probability attacks presented in this paper +(1) has a cost $t$ according to the standard definitions, +but (2) is obviously infeasible, +even for an attacker able to carry out a ``reasonable'' algorithm +that costs $t$ according to the same definitions. +The formalization challenge is to say exactly what ``reasonable'' means. +Our core objective here +is to give a new definition that accurately captures +what is actually feasible for attackers. -\subheading{Priority dates and credits} +This accuracy has two sides. +First, the formally defined set of algorithms must be large enough. +Security according to the definition does not imply actual security +if the definition ignores algorithms that are actually feasible. +Second, the formally defined set of algorithms must be small enough. +One cannot conjecture security on the basis of cryptanalysis +if infeasible attacks ignored by cryptanalysts +are misdeclared to be feasible by the security definition. + +We actually analyze four different ideas for modifying the notion of feasibility +inside existing definitions: +(1) switching the definitions from the RAM metric used in \cite{2000/bellare-cbc} +to the NAND metric, an ``alternative'' mentioned in \cite{2000/bellare-cbc}; +(2) switching the definitions to the $AT$ metric, +a standard hardware-design metric +formally defined by Brent and Kung in \cite{1981/brent-areatime} in 1981; +(3) adding constructivity to the definitions, +by a simple trick that we have not seen before (see Appendix~\ref{salvage-effectivity}); +and +(4) adding uniformity to the definitions. +Readers unfamiliar with the RAM, NAND, and $AT$ metrics +should see Appendix~\ref{cost} +for a summary and pointers to the literature. + +Ultimately we recommend the second and third modifications +as producing much more accurate models of actual feasibility. +We also recommend refactoring theorems to simplify further changes, +whether those changes are for even better accuracy or for other reasons. +We recommend against the first and fourth modifications. +Full details of our analysis appear in Appendix~\ref{salvage}; +the NAND and $AT$ analyses for individual algorithms +appear in Sections~\ref{aes}, \ref{ecc}, \ref{dsa}, and \ref{rsa}. +Appendix~\ref{faq} is a frequently-asked-questions list, +serving a role for this paper +comparable to the role that a traditional index serves for a book. + +These two modifications have several positive consequences. +Incorrect conjectures in the literature regarding the concrete security +of primitives such as AES can +be replaced by quite plausible conjectures using the new definitions. 
+Our impression is that +{\it most\/} of the proof ideas in the literature are compatible with the new definitions, +modulo quantitative changes, +so {\it most\/} concrete-security theorems in the literature can be replaced +by meaningful concrete-security theorems using the new definitions.\footnote +{We do not claim that {\it all\/} proofs can be rescued, +and it is even possible that some theorems will have to be abandoned entirely. +Some troublesome examples have been pointed out by Koblitz and Menezes +in \cite{2012/koblitz} and \cite{2012/koblitz-nonuniformity}. +Our experience indicates, however, that such examples are unusual. +For example, there is nothing troublesome about the CBC-MAC proof or the FDH proof; +these proofs simply need to be placed in a proper framework +of meaningful definitions, conjectures, and theorem statements.} +The conjectures and theorems together will produce exactly the desired conclusions +regarding the concrete security of protocols such as AES-CBC-MAC. + + +\subheading{Priority dates; credits; new analyses} On 20 March 2012 we publicly announced the trouble with the standard AES conjectures; on 17 April 2012 we publicly announced the trouble with the standard NIST P-256, DSA-3072, and RSA-3072 conjectures. The low-probability case of the AES trouble was observed independently by Koblitz and Menezes and announced earlier in March 2012; -further credits to Koblitz and Menezes appear in Appendix~\ref{koblitz}. +further credits to Koblitz and Menezes appear below. We are not aware of previous publications disputing the standard concrete-security conjectures. @@ -232,11 +315,11 @@ and new algorithm improvements in Sections~\ref{ecc}, \ref{dsa}, and \ref{rsa}; our improvements are critical for beating $2^{128}$ in Section~\ref{rsa}. In Sections~\ref{aes}, \ref{ecc}, and \ref{dsa} -the previous techniques were already adequate -to disprove the standard $2^{128}$ concrete-security conjectures, +the standard techniques were already adequate +to (heuristically) disprove the standard $2^{128}$ concrete-security conjectures, but as far as we know we were the first to point out these contradictions. We do not think the contradictions were obvious; -in many cases the techniques were published decades {\it before\/} the conjectures! +in many cases the standard techniques were published decades {\it before\/} the conjectures! Index: koblitz.tex =================================================================== --- koblitz.tex (revision 87) +++ koblitz.tex (revision 178) @@ -1,5 +1,4 @@ -\section{Appendix: Credits to Koblitz and Menezes}\label{koblitz} - This paper was triggered by a 23 February 2012 paper \cite{2012/koblitz}, in which Koblitz and Menezes objected to the non-constructive nature of Bellare's security proof \cite{2006/bellare-hmac} for NMAC. @@ -35,3 +34,13 @@ between one algorithm cost metric and another, and we raise the possibility of eliminating the difficulties by carefully selecting a cost metric. + +Readers who find these topics interesting +may also be interested in the followup paper \cite{2012/koblitz-nonuniformity}, +especially the detailed discussion in \cite[Section 2]{2012/koblitz-nonuniformity} +of ``two examples where the non-uniform model led researchers astray''. +See also Appendices +\ref{uniform-vs-constructive}, +\ref{non-uniformity-purely-asymptotic}, +and \ref{non-uniformity-definition} +of our paper for further comments on the concept of non-uniformity. 
Index: aes.tex =================================================================== --- aes.tex (revision 87) +++ aes.tex (revision 178) @@ -16,21 +16,32 @@ \subheading{Breaking AES with MD5}\label{aes-md5} -We begin with a low-probability attack that does not use any precomputations. +We begin with an attack that does not use any precomputations. +This attack is feasible, and in fact quite efficient; +its success probability is low, but not nearly as low as one might initially expect. +This is a warmup for the higher-success-probability attack +of Section~\ref{aes-bigprecomp}. -Let $P$ be a uniform random permutation of the set $\setof{0,1}^{128}$. +Let $P$ be a uniform random permutation of the set $\setof{0,1}^{128}$; +we label elements of this set in little-endian form as integers $0,1,2,\ldots$ +without further comment. The pair $(P(0),P(1))$ is nearly a uniform random $256$-bit string: it avoids $2^{128}$ strings of the form $(x,x)$ but is uniformly distributed among the remaining $2^{256}-2^{128}$ strings. If $k$ is a uniform random $128$-bit string -then the pair $(\AES_k(0),\AES_k(1))$ is a highly nonuniform random $256$-bit string. +then the pair $(\AES_k(0),\AES_k(1))$ is a highly nonuniform random $256$-bit string, +obviously limited to far fewer than $2^{256}$ possibilities. One can reasonably guess that -an easy way to distinguish this from $(P(0),P(1))$ +an easy way to distinguish this string from $(P(0),P(1))$ is to feed it through MD5 and inspect the first bit of the result. -The success probability of this attack is far below $1$, +The success probability of this attack{\dash}% +the absolute difference between its average output for input $(\AES_k(0),\AES_k(1))$ +and its average output for input $(P(0),P(1))${\dash}% +is far below $1$, but it is almost certainly above $2^{-80}$, and therefore many orders of magnitude above $2^{-128}$. +See Appendix~\ref{verif} for relevant computer experiments. To understand why this works, imagine replacing the first bit of MD5 @@ -66,7 +77,8 @@ but it would be astonishing for MD5 to interact with AES in such a way as to spoil this attack. More likely is that there are some collisions in $k\mapsto (\AES_k(0),\AES_k(1))$; -but any such collisions will tend to push $\delta$ away from $0$, helping the attack. +but such collisions are rare unless AES is deeply flawed, +and in any event will tend to push $\delta$ away from $0$, helping the attack. \subheading{Precomputing larger success probabilities}\label{aes-bigprecomp} @@ -100,11 +112,11 @@ There are $2^{3\cdot 2^{2n}}$ choices of $(3\cdot 2^{2n})$-bit strings $s$, and $2^{3\cdot 2^{2n}}$ is considerably larger than $\exp(2^{2n+1})$, so presumably at least one of these values of $s$ -will have $D_s$ has success probability at least ${\approx}2^{n-64}$. +will have $D_s$ having success probability at least ${\approx}2^{n-64}$. Similar comments apply to essentially any short-key cipher. There almost certainly {\it exists\/} a $(3\cdot 2^{2n})$-bit string $s$ -such that the following simple attack achieves insecurity ${\approx}2^{n-K/2}$, +such that the following simple attack achieves success probability ${\approx}2^{n-K/2}$, where $K$ is the number of bits in the cipher key: query $2K$ bits of ciphertext, append $s$, and hash the result to $1$ bit. 
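To see the two attacks just described at a glance, here is a minimal Python sketch; it is ours, not part of the diff or of the paper, and it assumes the pycryptodome package for raw AES, with ``first bit'' read as the bottom bit of the first MD5 output byte as in Appendix~\ref{verif}. The first function is the precomputation-free MD5 attack; the second is the $D_s$ template, where the advice string s supplied by the caller is a hypothetical placeholder rather than an actual precomputed string.

    # Sketch of the MD5-based distinguishers described above; not the paper's code.
    # Assumes pycryptodome for raw AES-ECB.
    import hashlib
    from Crypto.Cipher import AES

    def md5_attack(block0: bytes, block1: bytes) -> int:
        # Precomputation-free attack: hash the 256-bit input, either
        # (AES_k(0), AES_k(1)) or (P(0), P(1)), with MD5 and output one bit.
        return hashlib.md5(block0 + block1).digest()[0] & 1

    def d_s(ciphertext_2K_bits: bytes, s: bytes) -> int:
        # D_s template: take 2K bits of ciphertext, append the advice string s,
        # hash the result to 1 bit.  A good s is merely asserted to exist;
        # whatever s is supplied here is a placeholder.
        return hashlib.md5(ciphertext_2K_bits + s).digest()[0] & 1

    if __name__ == "__main__":
        key = bytes(16)                                  # one example 128-bit key
        aes = AES.new(key, AES.MODE_ECB)
        c0 = aes.encrypt((0).to_bytes(16, "little"))     # AES_k(0)
        c1 = aes.encrypt((1).to_bytes(16, "little"))     # AES_k(1)
        print(md5_attack(c0, c1))

The success probability is the gap between this output averaged over AES keys and averaged over a uniform random permutation; the scaled-down experiments in Appendix~\ref{verif} estimate exactly this gap.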
@@ -114,7 +126,7 @@ It grows more quickly in the $AT$ metric: storing the $3\cdot 2^{2n}$ bits of $s$ uses area at least $3\cdot 2^{2n}$, and even a heavily parallelizable hash function -will take time $\Theta(2^n)$ simply to communicate across this area, +will take time proportional to $2^n$ simply to communicate across this area, for a total cost proportional to $2^{3n}$. In each metric there are also lower-order terms reflecting the cost of hashing per bit; @@ -123,10 +135,9 @@ \subheading{Iteration}\label{aes-iteration} -High levels of insecurity are more efficiently achieved +Large success probabilities are more efficiently achieved by a different type of attack that iterates, e.g., -the function $f_7:\setof{0,1}^{128}\to\setof{0,1}^{128}$ defined by $f_7(k)=\AES_k(0)\oplus 7$, -where $7$ is represented in little-endian binary form. +the function $f_7:\setof{0,1}^{128}\to\setof{0,1}^{128}$ defined by $f_7(k)=\AES_k(0)\oplus 7$. Choose an attack parameter $n$. Starting from $f_7(k)$, @@ -144,9 +155,9 @@ etc. If $n$ is not too large (see the next paragraph) then there are close to $2^{2n}$ different keys here. -The computation involves $\le 2^n$ initial iterations; +The computation involves ${\le}2^n$ initial iterations; $2^n$ table lookups; -and, in case of a match, $\le 2^n$ iterations to recompute $f_7^{2^n-j}(i)$. +and, in case of a match, ${\le}2^n$ iterations to recompute $f_7^{2^n-j}(i)$. The {\it precomputation\/} performs many more iterations, but this precomputation is only the cost of {\it finding\/} the algorithm, not the cost of {\it running\/} the algorithm. @@ -216,8 +227,8 @@ the precomputation covers $2^{2n}U$ keys instead of just $2^{2n}$ keys. To avoid excessive chain collisions one must limit $2^n$ to $2^{K/3}U^{-1/3}$; -the attack then finds each key with probability $2^{-K/3}U^{1/3}$, -with a cost of $2^{K/3}U^{-1/3}$ per key, +the attack then finds each key with probability $2^{2n}U/2^K=2^{-K/3}U^{1/3}$, +with a cost of $2^n=2^{K/3}U^{-1/3}$ per key, a factor of $U^{2/3}$ better than handling each key separately. Finding each key with high probability costs $2^{2K/3}U^{-2/3}$ per key. @@ -228,14 +239,16 @@ taking time on the scale of $2^n$, but the $2^n U$ precomputed keys occupy area $2^n U$, for a total cost on the scale of $2^{2n} U$, -i.e., $2^{2n}$ per key. -Precomputation again has only a small benefit. -There is still a benefit in success probability from handling $U$ keys together: -achieving success probability $p$ costs essentially $(2^K/U)p$. +i.e., $2^{2n}$ per key, for success probability $2^{2n}U/2^K$ per key. +Note that one can carry out the precomputation +using essentially the same area and time. +There is a large benefit from handling $U$ keys together{\dash}% +finding all $U$ keys costs essentially $2^K$, i.e., $2^K/U$ per key{\dash}% +but this benefit exists whether or not precomputation costs are taken into account. \subheading{Comparison} -We summarize the insecurity established by the best attacks discussed above. +We summarize the insecurity established by the best attacks presented above. Achieving success probability $p$ against $U$ keys costs \begin{itemize} \item @@ -265,21 +278,22 @@ \cite[Section 3.6]{2005/bellare-intro} and \cite[Section 3.2]{2006/bellare-hmac} were made, they were already inconsistent with known attacks. -The iteration idea was introduced by Hellman in \cite{1980/hellman} for the special case $U=1$. +The iteration idea +was introduced by Hellman in \cite{1980/hellman} for the special case $U=1$. 
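To show the shape of the iteration attack in code, here is a toy Python sketch; it is ours, not part of the diff or of the paper. It scales the key down to 18 bits zero-padded to 128 bits (as in Appendix~\ref{verif}), truncates $f_7$ back to the toy key space so that the iteration stays inside it, uses $n=K/3$ as suggested by the chain-collision limit above, and assumes the pycryptodome package for raw AES. The measured success probability should come out near $2^{2n-K}$.

    # Toy sketch of the Hellman-style iteration attack; not the paper's code.
    import random
    from Crypto.Cipher import AES

    KBITS = 18                    # toy key size; the paper's K is 128
    N = 1 << (KBITS // 3)         # chain length and chain count, 2^n with n = K/3

    def f7(x):
        # f_7(k) = AES_k(0) xor 7, with k zero-padded to 128 bits; the output is
        # truncated to KBITS bits so that the toy iteration stays in the key space.
        key = x.to_bytes((KBITS + 7) // 8, "little").ljust(16, b"\0")
        c = AES.new(key, AES.MODE_ECB).encrypt(bytes(16))
        return (int.from_bytes(c, "little") ^ 7) % (1 << KBITS)

    # Precomputation (the cost of *finding* the algorithm): N chains of length N.
    table = {}
    for start in range(N):
        x = start
        for _ in range(N):
            x = f7(x)
        table[x] = start          # endpoint -> starting point

    def attack(target):
        # Online phase (the cost of *running* the algorithm): at most N steps from
        # target = f_7(k), plus a rewalk of a chain whenever a stored endpoint is hit.
        x = target
        for _ in range(N):
            if x in table:
                y = table[x]
                for _ in range(N):
                    if f7(y) == target:
                        return y  # candidate key; false alarms simply fall through
                    y = f7(y)
            x = f7(x)
        return None

    trials = 300
    hits = 0
    for _ in range(trials):
        k = random.randrange(1 << KBITS)
        hits += attack(f7(k)) == k        # a wrong preimage of f7(k) counts as a miss
    print(hits / trials)                  # roughly 2^(2n-K) = 1/64 for these parameters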
Many subsequent papers have explored variants and refinements of Hellman's attack, including the easy generalization to larger $U$. Hellman's goal was to attack many keys -for a lower cost than attacking each key separately; +for a lower RAM cost than attacking each key separately; Hellman advertised a ``cost per solution'' of $2^{2K/3}$ using a precomputed table of size $2^{2K/3}$. The generalization to larger $U$ achieves the same goal at lower cost, but the special case $U=1$ remains of interest as a non-uniform single-key attack. Koblitz and Menezes in \cite{2012/koblitz} -recently considered the family of attacks $D_s$. +recently considered a family of attacks analogous to $D_s$. They explained that there should be a short string $s$ where $D_s$ has success probability at least $\approx 2^{-K/2}$, -and analyzed the consequences for provable concrete secret-key security. +and analyzed some consequences for provable concrete secret-key security. However, they did not analyze higher levels of insecurity. Replacing $D_s$ with a more structured family of attacks, @@ -288,14 +302,14 @@ (See, for example, \cite[Section 7]{2009/dodis-mac}, which says that this is ``well known in complexity theory''.) -De, Trevisan, and Tulsiani in \cite{2009/de} +De, Trevisan, and Tulsiani in \cite{2010/de} proved cost ${\approx}2^K p^2$, for both the RAM metric and the NAND metric, for any insecurity level $p$. A lucid discussion of the gap between these attacks and exhaustive search -appears in \cite[Section 1]{2009/de}, -but without any discussion of the resulting trouble +appears in \cite[Section 1]{2010/de}, +but without any analysis of the resulting trouble for the literature on provable concrete secret-key security, -and without any discussion of possible fixes. +and without any analysis of possible fixes. Biham, Goren, and Ishai in \cite[Section 1.1]{2008/biham-weak} pointed out that Hellman's attack Index: ecc.tex =================================================================== --- ecc.tex (revision 87) +++ ecc.tex (revision 178) @@ -14,6 +14,8 @@ a hyperelliptic curve, a torus, etc.)~does not stop the attack. +We focus on NIST P-256 for both concreteness and practical relevance, +as in the previous section. @@ -102,6 +104,9 @@ observe that each new step has chance $2^{-t}$ of reaching a distinguished point, and chance $O(2^{2t}/\ell)=O(2^{-t})$ of reaching one of the previous $O(2^{2t})$ points represented by the database. +Computer experiments reported in \cite{2012/bernstein-cuberoot}, +as a followup to this paper, +show that all the $O$ constants here are reasonably close to $1$. Now consider a walk starting from $aP+bQ$. This walk has chance approximately $1/e$ of continuing for at least $2^t$ steps. @@ -166,9 +171,13 @@ None of these papers noticed any implications for provable security, and none of them went beyond the RAM metric. After we released a preprint of this paper in June 2012, a followup paper \cite{2012/bernstein-cuberoot} by Bernstein and Lange -experimentally verified our algorithm, +experimentally verified the algorithm stated above, improved it to $1.77\cdot \sqrt[3]{\ell}$ additions using $\sqrt[3]{\ell}$ distinguished points, extended it to DLPs in intervals (using slightly more additions), and showed constructive applications in various protocols. 
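To illustrate the walk structure analyzed above, here is a toy Python sketch; it is ours, not part of the diff, and not the code from \cite{2012/bernstein-cuberoot}. To keep it short it uses the multiplicative group modulo the Mersenne prime $2^{31}-1$ with the well-known generator $7$ rather than an elliptic-curve group; the precomputed-distinguished-point structure is the same. The precomputation performs roughly $\ell^{2/3}$ walk steps (a few seconds in pure Python), after which each discrete logarithm costs on the order of $\ell^{1/3}$ steps.

    # Toy sketch of the precomputed-distinguished-point discrete-log attack;
    # not the paper's code.  Group: nonzero residues mod 2^31 - 1, generator 7.
    import random

    p = 2**31 - 1
    g = 7
    n = p - 1                 # group order, playing the role of "ell"
    T = 10                    # a point is distinguished with probability about 2^-T
    WALKS = 1290              # about ell^(1/3) walks of expected length 2^T

    # Pseudorandom additive walk: the jump depends only on the current group
    # element, so two walks that ever collide coincide from then on.
    cs = [random.randrange(1, n) for _ in range(32)]
    jumps = [(c, pow(g, c, p)) for c in cs]

    def walk(u, a, limit=100 * 2**T):
        # Walk from u, carrying a known exponent a, until a distinguished point.
        for _ in range(limit):
            if u % 2**T == 0:                 # distinguished: low T bits are zero
                return u, a
            c, gc = jumps[u % 32]
            u, a = (u * gc) % p, (a + c) % n
        return None, None                     # essentially never happens

    # Precomputation: distinguished points with known discrete logarithms.
    table = {}
    for _ in range(WALKS):
        a0 = random.randrange(n)
        u, a = walk(pow(g, a0, p), a0)
        if u is not None:
            table[u] = a                      # u = g^a

    # Online phase: solve h = g^x, restarting with a fresh offset until the walk
    # merges with a precomputed one (roughly constant success chance per attempt).
    x_secret = random.randrange(n)
    h = pow(g, x_secret, p)
    while True:
        b0 = random.randrange(n)
        w, b = walk((h * pow(g, b0, p)) % p, b0)
        if w is not None and w in table:
            x = (table[w] - b) % n            # g^table[w] = h * g^b
            break
    print(x == x_secret)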
Index: dsa.tex =================================================================== --- dsa.tex (revision 87) +++ dsa.tex (revision 178) @@ -44,7 +44,7 @@ The same is true for $h_2$; overall the attack requires between $2^{107.85}$ and $2^{108.85}$ iterations, depending on $2^{3071}/p$. -Batch trial division, discussed in detail in Section~\ref{rsa}, +Batch trial division, analyzed in detail in Section~\ref{rsa}, finds the $y$-smooth values among many choices of $h_1$ at very low cost in both the RAM metric and the NAND metric. This attack is much slower in the $AT$ metric. Index: rsa.tex =================================================================== --- rsa.tex (revision 87) +++ rsa.tex (revision 178) @@ -37,7 +37,7 @@ For example, with parameters $m=2^{256}$, $d=7$, $H=2^{55}$, and $y=2^{50}$, the attack factors integers between $2^{1792}$ and $2^{2048}$. -Parameter selection is discussed later in more detail. +Parameter selection is analyzed later in more detail. The following three paragraphs explain how the attack handles $N$. Write $N$ in radix $m$: @@ -55,7 +55,7 @@ to emphasize two important ways that this attack differs from conventional NFS: first, conventional NFS chooses $m$ as a function of $N$, while this attack does not; second, -conventional NFS computes $R$ by sieving all pairs $(a,b)$ +conventional NFS computes $R$ by sieving all pairs $(a,b)$ with $-H\le a\le H$ and $0<b\le H$ [...] -\xy <0.007cm,0cm>:<0cm,0.007cm>:: +\xy <0.009cm,0cm>:<0cm,0.009cm>:: (-600,0)="origin"; (-650,0) *{\llap{\strut $1$}}; +(-650,150) *{\llap{\strut $2^{K/4}$}}; +(-600,150) *{+}; +(-650,300) *{\llap{\strut $2^{K/2}$}}; +(-600,300) *{+}; +(-650,450) *{\llap{\strut $2^{3K/4}$}}; +(-600,450) *{+}; (-650,600) *{\llap{\strut $2^K$}}; -(-650,300) *{\llap{\strut \txt{Cost}}}; +(-850,300) *{\llap{\strut \txt{Cost}}}; (-600,600)="yaxis"; -(-600,-50) *{\strut \txt{$1/2^K$}}; +(-600,-50) *{\strut \txt{$2^{-K}$}}; +(-450,0) *{+}; +(-450,-50) *{\strut \txt{$2^{-3K/4}$}}; +(-300,0) *{+}; +(-300,-50) *{\strut \txt{$2^{-K/2}$}}; +(-150,0) *{+}; +(-150,-50) *{\strut \txt{$2^{-K/4}$}}; (0,-50) *{\strut \txt{$1$}}; (0,0)="xaxis"; -(-300,-50) *{\strut \txt{Success probability}}; +(-300,-150) *{\strut \txt{Success probability}}; "origin"; "yaxis" **@{-}; "origin"; "xaxis" **@{-}; (-600,0); (0,600) **[green]@{-}; @@ -37,24 +51,31 @@ \label{aes-graph} \end{figure} - \begin{figure}[h] $$\vbox{ -\xy <0.007cm,0cm>:<0cm,0.007cm>:: -(-600,0)="origin"; -(-650,0) *{\llap{\strut $1$}}; -(-650,300) *{\llap{\strut $\sqrt{\ell}$}}; -(-650,150) *{\llap{\strut \txt{Cost}}}; -(-600,300)="yaxis"; -(-600,-50) *{\strut \txt{$1/\ell$}}; -(0,-50) *{\strut \txt{$1$}}; -(0,0)="xaxis"; -(-300,-50) *{\strut \txt{Success probability}}; +\xy <0.009cm,0cm>:<0cm,0.009cm>:: +(0,0)="origin"; +(-50,0) *{\llap{\strut $1$}}; +(-50,150) *{\llap{\strut $\ell^{1/4}$}}; +(0,150) *{+}; +(-50,300) *{\llap{\strut $\ell^{1/2}$}}; +(-250,150) *{\llap{\strut \txt{Cost}}}; +(0,300)="yaxis"; +(0,-50) *{\strut \txt{$\ell^{-1}$}}; +(150,-50) *{\strut \txt{$\ell^{-3/4}$}}; +(150,0) *{+}; +(300,-50) *{\strut \txt{$\ell^{-1/2}$}}; +(300,0) *{+}; +(450,-50) *{\strut \txt{$\ell^{-1/4}$}}; +(450,0) *{+}; +(600,-50) *{\strut \txt{$1$}}; +(600,0)="xaxis"; +(300,-150) *{\strut \txt{Success probability}}; "origin"; "yaxis" **@{-}; "origin"; "xaxis" **@{-}; -(-600,0); (0,200) **[red]@{-}; -(-600,0); (0,300) **[blue]@{-}; +"origin"; (600,200) **[red]@{-}; +"origin"; (600,300) **[blue]@{-}; \endxy }$$ \caption{Cost summary of discrete-logarithm attacks for a group of size $\ell$.
@@ -69,3 +90,4 @@ } \label{ecc-graph} \end{figure} + Index: literature.tex =================================================================== --- literature.tex (revision 0) +++ literature.tex (revision 178) @@ -0,0 +1,152 @@ +\setcounter{section}{11} + +\section{Appendix: Literature samples +}\label{literature} + +To collect data regarding standard practice +in the literature on provable concrete security, +we scanned the proceedings of Crypto 2012, +searching for concrete security definitions and claims of provable concrete security. +This appendix reports the results of this scan. + +Most of the ``provable security'' papers +do {\it not\/} prove anything about concrete security. +Some use information-theoretic security notions +that cannot be broken by any amount of attacker computation; +the majority use purely asymptotic security notions +such as ``every probabilistic polynomial-time adversary +has negligible success probability''. + +There are, however, five provable-concrete-security papers. +Each of these five papers +explicitly claims to ``show'' or ``prove'' or ``guarantee'' +bounds that involve the ``security'' (or ``insecurity'' or ``Adv'') +of some cryptographic systems. +Proving such bounds +obviously requires a quantitative definition of the ``security'' of a system; +the central question here is what definitions the papers are actually using. + +Inspection of these papers{\dash}see below for details{\dash}% +suggests that it is completely standard to use definitions +that express exactly the following situation: +there does not exist an algorithm that +breaks the system with probability above $\epsilon$ +while satisfying specified limits on ``time'', queries, etc. +{\it None\/} of the five papers deviate from this standard. +Three of the papers explicitly state new definitions +that follow this standard. +The other two papers use a classic special case, +namely PRP ``security'', without defining it; +after many years of papers repeating the same definition of PRP ``security''{\dash}a +settled definition that also follows this standard{\dash}% +the reader is forced to assume that these two papers are also using that definition. + +\subheading{Details} +The five papers are as follows, +in the same order that they appear on Springer's web site. + +Jager, Kohlar, Sch\"age, and Schwenk in \cite{2012/jager} +give a ``formal proof that [one version of TLS-DHE] is a secure authenticated +key exchange protocol'', +define ``the notion of authenticated and confidential +channel establishment'', +and prove that a particular protocol +``forms a secure ACCE protocol''. +\cite[Definition 2]{2012/jager} +defines an encryption scheme to be $(t,\epsilon)$-``secure'' +if ``all adversaries $A$ running in time at most $t$'' +have success probability at most $\epsilon$; +similar definitions continue throughout the paper. +\cite[Theorem 1]{2012/jager} +assumes that various primitives are $(t,\epsilon)$-``secure'' +and concludes that ``any adversary'' that $(t',\epsilon')$-breaks a particular protocol +with ``$t\approx t'$'' must have $\epsilon'\le\cdots$. +This is trivially equivalent to a conclusion that the protocol is $(t',\epsilon')$-``secure'' +under the definitions in the paper. 
+ +Bellare, Ristenpart, and Tessaro in \cite[Section 2]{2012/bellare-multiinstance} +define a ``$(t,q,q_c)$-adversary'' +as an algorithm that ``runs in time $t$ and makes at most $q[i]$ encryption queries +of the form $\cdots$ and makes at most $q_c$ corruption queries'' +and then define $\Adv^{{\rm uku}}_{{\rm SE},m}(t,q,q_c)$ +as the maximum advantage among ``all $(t,q,q_c)$-adversaries''. +\cite[Theorem 1]{2012/bellare-multiinstance} +bounds ``$\Adv^{{\rm uku}}_{{\rm SE},m}(t,q,q_c)$'' +relative to another similarly defined insecurity function. + +Hoang, Morris, and Rogaway in \cite[Section 4]{2012/hoang} +briefly discuss the ``complexity-theoretic interpretation'' +of an information-theoretic result. +They do not state a formal theorem +but they say that ``the PRP-security of $E$ is the PRF-security of $F$ minus [something small]''. +The paper neither gives nor cites a definition of ``the PRP-security of $E$'', +so (as above) the reader is forced to assume +that the paper reuses the definition given in many previous papers, +referring to a low success probability of all algorithms +subject to specified limits on time and queries. +Similar comments apply to \cite[Section 5]{2012/hoang}, +which asserts that one is ``guaranteed'' +that attacks have a success probability of ``less than $10^{-10}$'' +plus the ``insecurity'' of AES. + +Landecker, Shrimpton, and Terashima in \cite[page 16]{2012/landecker} +say that ``the bulk of the paper is dedicated to showing that'' +the proposed ``CLRW2 TBC'' is ``a strong tweakable-PRP'' under various hypotheses, +one hypothesis being that the underlying cipher $E$ ``is a secure strong-PRP''. +The paper neither gives nor cites a definition of ``secure strong-PRP'', +so (as above) the reader is again forced to assume the standard definition. +This assumption is consistent with the formulation of +the ``main technical result'' of the paper, \cite[Theorem 1]{2012/landecker}, +which hypothesizes that $A$ is ``an adversary asking a total of $q$ queries to its oracles, +these of total length $\mu$, and running in time $t$', +and concludes that ``there exists an adversary $B$ using the same resources'' +whose success probability has a particular lower bound +in terms of the success probability of $A$. +The {\it existence\/} of an algorithm with specified success probability, time, etc.~is +exactly what is logically required to prove insecurity under the standard definition. + +Hofheinz and Jager in \cite[Theorem 5]{2012/hofheinz} +state, under various hypotheses, +that a particular system is ``$(\epsilon,t,\mu,q)$-IND-CCA secure''. +This means, by \cite[Definition 5]{2012/hofheinz}, +that ``there exists no adversary'' +that $(\epsilon,t)$-breaks the $(\mu,q)$-IND-CCA security of the system, +i.e., there exists no adversary ``that runs in time $t$ and wins with probability $1/2+\epsilon$'' +using $\mu$ public keys and $q$ queries. + +\subheading{Controversy} +An earlier version of this paper, predating the scan described in this appendix, +had already asserted that such definitions are completely standard. +This assertion was based on what we had seen in many papers over many years, +and we did not expect it to be controversial. +In response we encountered +a puzzling claim that such definitions are ``not widespread'' and that, as a result, +most provable-concrete-security papers do not fall into the gaps that we have identified. 
+ +It was of course possible that our earlier selection of papers was biased by our own interests, +but this new literature sample has no obvious bias +and makes the ``not widespread'' claim quite difficult to believe. +If such definitions are ``not widespread'' +then why are they used in all five +of the provable-concrete-security papers at a recent IACR flagship conference{\dash}% +papers from +Bellare, +Hoang, +Hofheinz, +Jager, +Kohlar, +Landecker, +Morris, +Ristenpart, +Rogaway, +Sch\"age, +Schwenk, +Shrimpton, +Terashima, +and +Tessaro? +It is of course {\it still\/} possible that these five papers +are not representative of +standard practice in the literature as a whole, +but we think that anyone making such a claim +is required to supply a considerable volume of evidence. Index: faq.tex =================================================================== --- faq.tex (revision 87) +++ faq.tex (revision 178) @@ -1,6 +1,7 @@ \setcounter{section}{16} -\section{Appendix: Frequently asked questions}\label{faq} +\section{Appendix: Frequently asked questions +}\label{faq} This appendix answers several questions that we have been asked regarding this paper. @@ -9,12 +10,12 @@ \subheadingnodot{In the real world, an attack is applied to many targets. Doesn't this make the precomputation effectively free?} No, not in general. -The fallacy here is addressed in Section \ref{salvage}.1.1. +The fallacy here is addressed in Appendix \ref{salvage}.1.1. \subheadingnodot{Aren't we simply making the user safer if we underestimate attack cost by ignoring precomputation?} No. -The fallacy here is addressed in Section \ref{salvage}.1.2. +The fallacy here is addressed in Appendix \ref{salvage}.1.2. \subheadingnodot{Why are you analyzing the cost of precomputation in these attacks? Didn't you just say that this cost is irrelevant to the security definitions?} @@ -23,8 +24,12 @@ and $2^{170}$ for the NIST P-256 attack) is critical for understanding the real-world inapplicability of the attacks. We recommend fixing the security definitions to take this cost into account; -see Sections~\ref{salvage-effectivity} and \ref{salvage-recommendations}. +see Appendices~\ref{salvage-effectivity} and \ref{salvage-recommendations}. +\subheadingnodot{I don't understand how MD5 could possibly break AES. Are you sure?} +Yes. The analysis is easy, using standard heuristics, +and is backed by the computer experiments reported in Appendix~\ref{verif}. + \subheadingnodot{If one special precomputation is enough to completely break AES, isn't that serious enough that the metric should capture it?} @@ -33,50 +38,148 @@ But the difficulty of carrying out the precomputation is also important, and is ignored in the standard security definitions. We propose taking all of these features into account. -See Section \ref{salvage-recommendations}. +See Appendix \ref{salvage-recommendations}. -\subheadingnodot{ What... is the air-speed velocity of an unladen swallow?} -What do you mean? An African or European swallow? +\subheadingnodot{Sorting costs $n\log_2 n$. How can you take a metric seriously if it says that sorting costs $n^{1.5}$?} +The best reported results for the energy consumption of sorting \cite{2012/nyberg-sortbenchmark} +are 43700 items sorted per joule for $10^{10}$ 100-byte items, +and 9700 items sorted per joule for $10^{12}$ 100-byte items. 
+The ratio $43700/9700$ is $4.5$, +and the chip evolution reviewed in Appendix~\ref{cost-at} +can be expected to improve the small sort by a larger factor than the large sort, +pushing the ratio up towards $10$. +For comparison, +the ratio $(\log_2 10^{12})/(\log_2 10^{10})$ is only $1.2$, +obviously understating the heavy communication costs of sorting. +See \cite{1995/bilardi} for a more detailed analysis +of physical constraints upon algorithm performance. +\subheadingnodot{I spend time $A$ setting up my area-$A$ hardware, +plus time $T$ waiting for the results. +Isn't $A+T$ my actual cost?} +If you have $n$ processing cores, each carrying out a series of $n$ separate computations, +then you have carried out $n^2$ computations and used energy proportional to $n^2$, +but $A+T$ is only linear in $n$. +The RAM metric avoids this pathology +by imposing an unrealistic prohibition upon parallel computations, +but it falls prey to other pathologies. + \subheadingnodot{Isn't your $AT$ metric simply a reinvention of Wiener's full-cost metric?} It's not {\it our\/} $AT$ metric. The $AT$ metric has decades of history in algorithm analysis, and more than a decade of history in cryptanalytic algorithm analysis; see, e.g., \cite{1981/brent-areatime} and \cite{2001/bernstein-nfscircuit}. -Wiener's 2004 full-cost metric -appears to be equivalent to the $VT$ metric analyzed in, e.g., -Preparata's 1983 paper \cite{1983/preparata}. +Wiener's 2004 full-cost metric \cite{2004/wiener-fullcost} +appears to be equivalent to the $VT$ metric analyzed in the 1983 papers +\cite{1983/rosenberg} and \cite{1983/preparata}. +See \ref{faq-3d} for further analysis of $VT$. \subheadingnodot{The world is three-dimensional! Isn't $VT$ a more realistic metric than $AT$?} +\label{faq-3d} The world is three-dimensional, but power consumption and heat dissipation are two-dimensional. -A modern billion-transistor CPU is much more like a $32000\times 32000$ square +A modern billion-transistor CPU is much more like a $32768\times 32768$ square than a $1024\times 1024\times 1024$ cube; -``three-dimensional'' manufacturing technology is limited to a small number of layers. +``three-dimensional'' manufacturing technology is limited to a small number of layers +(e.g., a flat $4\times 16384\times 16384$ box). Similarly, a ``three-dimensional'' computer cluster actually provides orders of magnitude faster communication within two-dimensional chips than it does between chips. + At a larger scale, -each square meter of the earth's atmosphere receives at most $1361$ watts from the sun, +each square meter of the Earth's atmosphere receives at most $1361$ watts from the Sun, so one cannot hope to sustain an $A$-square-meter computer that consumes more than $1361A$ watts, no matter what the computer technology is. -Shifting energy through time, as illustrated by oil drilling, -can temporarily violate this maximum but evidently is not sustainable; -similar comments apply to shifting energy around the earth's surface. +A three-dimensional $VT$-optimized computer +does not have enough surface area to receive and dissipate the energy it uses, +and if one tries to ``spread it out'' to give room for more energy +then one ends up with $VT$ cost matching $AT$ cost. +One might argue that it is possible to temporarily violate the $1361A$ maximum +by shifting energy through time, as illustrated by oil drilling, +or similarly by shifting energy around the Earth's surface, +but this argument ignores the energy cost of the shifting mechanisms. 
-\subheadingnodot{Have you considered constructivity?} -Yes. Constructivity is another name for effectivity, -which is analyzed in Section~\ref{salvage-effectivity}. -We recommend a specific strategy for incorporating effectivity into security definitions. +Furthermore, +three-dimensional designs pose severe {\it temperature\/} problems +even if enough energy magically becomes available at the surface of the computer. +Increasing the size of a $VT$-optimized design, +while still providing the energy needed for each bit operation, +requires a corresponding increase in +the amount of power that passes from the surface {\it through\/} each point of the design. +This is incompatible with the temperature limits +that are needed for adequate error correction +in any physical system for computation. +This is true even for limited three-dimensional systems +that use only one layer for transistors and the rest for wires: +wires require energy proportional to their length. -\subheadingnodot{When you say that the algorithms are non-uniform, aren't you really trying to say that the algorithms are non-constructive?} -Almost all of the algorithms we discuss are non-uniform: -they require different precomputations for different targets. -Almost all of the algorithms we discuss are {\it also\/} non-constructive. -These are not the same concept. +We have checked that the $AT$ analyses of high-probability attacks in this paper +would produce the same results if $AT$ were replaced by $VT$, +so the criterion of avoiding these ``pathologies'' is equally served by $AT$ and by $VT$, +but $AT$ is much more accurate in modeling physical reality. -\subheadingnodot{Isn't ``non-uniformity'' a purely asymptotic concept?} +\subheadingnodot{Doesn't the physical reality actually say that each computation is $O(1)$?} +Quite possibly, yes. +We recommend Aaronson's exposition in \cite{2006/aaronson} +of a physics argument concluding that the +``maximum number of bits that could ever be used in a computation in the physical world'' +is about $10^{122}$, inversely proportional to the cosmological constant. +The two major steps in the argument are as follows. +First, trying to pack information too densely +would violate the Bekenstein--Hawking bound (see, e.g., \cite{1981/bekenstein}), +which states that the entropy inside a spherical region of space +is at most $1/4$ of the surface area (not volume) of the region measured in Planck units. +Second, as observed by Bousso in \cite{2000/bousso}, +information located too far away +is forced by the nonzero cosmological constant +to accelerate away so quickly as to become unobservable. + +This bound on the number of bits used by a computation +implies that every terminating physical computation +has overwhelming probability of terminating +within a particular constant number of bit operations. +Standard asymptotic complexity classes +such as P, BPP, BQP, etc.~have no logical connection to such short computations. +Aaronson's only suggestions for allowing +a physical realization of the standard complexity classes +are to switch to an imaginary alternate universe in which the cosmological constant is zero, +or a sequence of imaginary alternate universes with cosmological constant converging to zero. + +Fortunately, +these asymptotic issues +also have no logical connection to concrete security questions +such as the security of AES-128, NIST P-256, DSA-3072, and RSA-3072. 
+This paper focuses almost exclusively on concrete security questions, +with asymptotics appearing only occasionally as inspiration. +For example, +Section~\ref{rsa-asymp} briefly analyzes the asymptotics of one algorithm, +but this is merely a warmup +for the concrete analysis of the same algorithm in Section~\ref{rsa-concrete}. + + +\subheadingnodot{What{\ldots} is the air-speed velocity of an unladen swallow?} +What do you mean? An African or European swallow? + +\subheadingnodot{Have you considered effectivity?} +Yes. Effectivity is another name for constructivity, +which is analyzed in Appendix~\ref{salvage-effectivity}. +We recommend a specific strategy (which appears to be new) +for incorporating a quantitative form of constructivity into security definitions. + +\subheadingnodot{When you say that the algorithms are non-uniform, aren't you really trying to +say that the algorithms are non-constructive?}\label{uniform-vs-constructive} +Almost all of the algorithms we present are non-uniform: +for example, the RSA attack requires different precomputations +for sufficiently different modulus sizes, +and the ECC attack requires different precomputations for different curves. +Almost all of the algorithms we present are {\it also\/} non-constructive, +i.e., very hard to find; +see Appendix~\ref{salvage-effectivity} for a quantitative formalization of this intuition. +Non-uniformity and non-constructivity are not the same concept. + +\subheadingnodot{Isn't ``non-uniformity'' a purely asymptotic +concept?}\label{non-uniformity-purely-asymptotic} No. Consider, for example, a theorem proving that for each $512$-bit string $s$ there is a fast algorithm $A$ that, @@ -85,19 +188,55 @@ Consider a stronger theorem proving that there is a fast algorithm $A$ that, given $512$-bit strings $s$ and $t$ as input, prints collisions in the $512$-bit-to-$128$-bit function $x\mapsto \MDfive(s,t,x)$. -The difference betwen these theorems +The difference between these theorems is precisely in the amount of uniformity imposed on $A$. There are no asymptotics here. -\subheadingnodot{I see books defining ``non-uniform'' algorithms -specifically as families of circuits, one circuit for each input size. -Isn't this definition completely standard?} -This is one type of non-uniformity but not the only type; -one can partition inputs by size but one can also partition inputs in many other ways. -Consider, for example, -Shor's discussion in \cite[Section 2]{2001/shor} of the ``additional uniformity condition'' -required to define the complexity class BQP. +\subheadingnodot{My favorite textbook defines ``non-uniform'' algorithms +specifically as families of polynomial-size circuits, one circuit for each input size, +and defines ``uniform'' algorithms specifically as polynomial-time algorithms. +Aren't these definitions completely standard?}\label{non-uniformity-definition} +Yes, these are two quite standard uses of the word ``uniform'', +and often the only uses described in introductory textbooks, +the same way that many introductory mathematics textbooks define ``small'' +as comparing real numbers in the usual ordering. +However, +the complexity literature actually uses the name ``uniform'' +for a much wider range of definitions, +the same way that the mathematics literature uses ``small'' +for a much wider range of definitions. +Example 1: +Consider Goldreich's discussion in \cite{2009/goldreich} +of the ``amount of non-uniformity'' in a circuit family. 
+This is a quantitative metric (the length of an advice string), +easily capturing the introductory notions of ``uniform'' (advice limited to 0) +and ``non-uniform'' (no limit on the advice) +but also allowing analysis of intermediate amounts of uniformity. + +Example 2: +Consider Katz's discussion in \cite{2011/katz} +of ``logspace-uniform'' families of circuits. +Compared to the introductory polytime-uniform concept, +logspace-uniform restricts non-uniformity in another dimension, +different from the length of an advice string. + +Example 3: +Consider Shor's discussion in \cite[Section 2]{2001/shor} +of the ``additional uniformity condition'' +used to define the complexity class BQP: +a condition on families of real numbers used as constants in algorithms, +yet another dimension of uniformity. + +Example 4: +Consider Koiran's comment in \cite[Section 6]{2007/koiran} +that the ``only difference between VPSPACE$^0$ and uniform VPSPACE$^0$ +is the nonuniformity of the coefficient function; +VPSPACE is even more nonuniform since arbitrary constants are allowed''. + +\subheadingnodot{?} 42. + \subheadingnodot{Isn't the hashing attack in Section~\ref{aes} outperformed by the linear-cryptanalysis approach of De, Trevisan, and Tulsiani?} No. @@ -128,29 +267,92 @@ \cite[Section 3.6]{2005/bellare-intro}, \cite[Section 3.2]{2006/bellare-hmac}, etc. +Rogaway begins \cite{2006/rogaway} by describing ``The Foundations-of-Hashing Dilemma'' +with no indication that the dilemma extends beyond hashing. +\subheadingnodot{These AES-128 and NIST P-256 and DSA-3072 and RSA-3072 issues +are just quantitative, right?} +Yes, they're ``just'' quantitative, +but getting the quantitative details right has always been a major theme of +the concrete-security literature. +This paper is aiming at large gaps that affect wide portions of the literature; +for comparison, +the gaps in many well-known ``tightness'' papers +are quantitatively smaller and apply to much narrower portions of the literature. + +Bellare, Desai, Jokipii, and Rogaway wrote in \cite{1997/bellare-concrete} +that when reductions are loose +``one must use a larger security parameter to be safe, reducing efficiency. +Thus, in the end, one pays for inefficient reductions in either assurance or running time.'' +Our attacks against AES-128, NIST P-256, et al.~show that there is an even larger loss of +``either assurance or running time'' from a flaw in the standard security definitions: +specifically, an inaccurate model of the set of algorithms available to the attacker. + +\subheadingnodot{Doesn't Rogaway's ``human ignorance'' model already solve these definitional problems?} +No. +Stinson in \cite{2001/stinson} proposed studying explicit hash-function reductions +as a workaround for the difficulties of defining collision resistance, +and Rogaway in \cite{2006/rogaway} gave examples of nontrivial theorems following this proposal; +in Appendix~\ref{salvage-recommendations} we review these proposals +and recommend further modularization of such theorems. +However, this line of work +does nothing to solve the problem of finding a realistic formalization of the statement +``SHA-512 is collision-resistant''{\dash}a problem that remains unsolved today +(but see Appendix~\ref{salvage-effectivity}). 
+Similarly, this line of work + does nothing to solve the problem of finding a realistic formalization of statements such as +``NIST P-256 is secure''{\dash}a problem that was incorrectly believed +to be solved many years ago and that is shown in this paper to be much more subtle. + +\subheadingnodot{Are you saying that the $AT$ metric fixes everything?} +No. +Appendix~\ref{salvage-at} evaluates switching to the $AT$ metric; +our analyses suggest that this switch fixes essentially all pathologies +{\it except\/} in one corner case. +Appendix~\ref{salvage-effectivity} evaluates adding constructivity; +our analyses suggest that this fixes some corner pathologies. +Appendix~\ref{salvage-recommendations} recommends both of these changes, +and also recommends modularizing theorems to accommodate further changes. + \subheadingnodot{Are you seriously suggesting redoing all the security definitions and proofs in the literature -to use effectivity and the $AT$ metric?} +to use constructivity and the $AT$ metric?}\label{faq-seriously} Yes. A few theorems are already factored -in the way we recommend in Section~\ref{salvage-recommendations}, +in the way we recommend in Appendix~\ref{salvage-recommendations}, but the vast majority of definitions and theorems require new work. -Tightness will often change dramatically -(see, e.g., the repeated-queries discussion in Section~\ref{salvage-nand}), -and it is important to actually do the work for each theorem -so that users know what security guarantees they are actually being provided. Some proofs, such as the ``coin-fixing'' proofs criticized in \cite{2012/koblitz}, -will not survive the transition. +will not survive the transition to constructivity. +Our impression is that most proofs in the literature are constructive, +but tightness will often change dramatically +(see, e.g., the comments on repeated queries in Appendix~\ref{salvage-nand}), +imposing surprisingly small limits on the number of queries that can be tolerated +before a proof becomes vacuous +and raising the question of whether there are tighter proof strategies. +If a proof {\it has\/} been checked +then the best way to record this fact is as a new theorem, +so that users know what security guarantees they are actually being provided. +\subheadingnodot{Can't we just assume that $AT$ theorems will look the same as current theorems?} +No. +See \ref{faq-seriously}. +\subheadingnodot{Can't we just assume that constructive theorems will look the same as current theorems?} +No, not always. +See \ref{faq-seriously}. + +\subheadingnodot{How do you expect authors to learn how to do $AT$ algorithm analyses?} +The $AT$ metric is used in thousands of papers. +We recommend the classic paper \cite{1981/brent-areatime} for definitions +and for illustrative algorithm-analysis examples. + \subheadingnodot{Isn't it inappropriate to switch definitions and start writing papers using the new definitions?} There are ample precedents for this. For example, Newtonian physics was replaced when it was discovered to be a poor model of reality. -We briefly discuss a much closer example from mathematics. +We briefly review a much closer example from mathematics. 
In the 19th century, Kronecker questioned the significance of proofs of existence that are not {\it effective}, @@ -158,7 +360,8 @@ The classic example is the Bolzano--Weierstrass theorem, which states that an infinite sequence of real numbers $x_0,x_1,\ldots\in[0,1]$ -(more generally, a sequence of elements of a compact topological space) has an infinite subsequence that converges. The critical observation in the usual proof is that there must be infinitely many $i$ with $x_i\in [0,0.5]$, @@ -181,9 +384,18 @@ However, the introduction of ``constructive mathematics'' showed that one {\it can\/} formalize mathematics in a way that gives different meanings to these two notions, -disallowing the Bolzano--Weierstrass proof +disallowing the Bolzano--Weierstrass proof (and theorem) while preserving more explicit mathematical proofs. -This is part of the background for our quantitative approach to effectivity. +This is part of the background for our quantitative approach to constructivity. - -\subheadingnodot{?} 42. +\subheadingnodot{I told you $X$ in my last review. Why didn't you change your paper?} +Chances are excellent that we did add something to our paper in response to your review, +either to comment positively on $X$ or to explain why $X$ is incorrect. +In many cases we've added FAQ entries +to help you find the answer. +If you still haven't found the answer, +please look for comments on your review in the following anonymous web page: +\showurl{http://pastebin.com/BxQkkq6y} Index: verif.tex =================================================================== --- verif.tex (revision 0) +++ verif.tex (revision 178) @@ -0,0 +1,115 @@ +\setcounter{section}{21} + +\section{Appendix: Verification that MD5 breaks AES}\label{verif} + +For each $8$-bit string $k$ +define $E_k(p)=\AES_{k,0}(p)$, +where $k,0$ means $k$ zero-padded to $128$ bits. +This cipher $E$ is a scaled-down version of AES. + +Consider the following very fast attack, +a scaled-down version of the simple attack described in Section~\ref{aes-md5}: +take $16$ bits of ciphertext, +zero-pad to $128$ bits, +feed the result through MD5, +and take the first bit of the result +(specifically, the bottom bit of the first byte of the result). + +This attack outputs $1$ for exactly $32949$ out of all $65536$ $16$-bit strings: +i.e., the average attack output against a uniform random function +is $32949/65536\approx 0.502762$. +If this attack is applied to the first $16$ bits of $E_k(0)$ +then it outputs $1$ for exactly $114$ out of the $256$ keys $k$; +i.e., the average attack output against $E$ is $114/256\approx 0.445312$. +The success probability of the attack against $E$ +is, by definition, the absolute value of the difference of these two averages, +namely $3765/65536\approx 0.057449$. + +Table~\ref{verif-table} displays the results of similar experiments +for $K$-bit ciphers for a range of values of $K$. +In each case $2K$ bits of ciphertext are zero-padded to $128$ bits and fed through MD5. +The table is consistent with the theory +that the success probability of the attack drops as roughly $1/\sqrt{2^K}$, +and very far from consistent with the naive theory +that all very fast attacks have success probability dropping as $1/2^K$. 
+ +\begin{table}[t] +\centerline{ +\begin{tabular}{|r|r|r|r|r|r|r|} +\noalign{\hrule} +$K$&uniform&&cipher&&success\phantom{?}&scaled\phantom{?}\\ +\noalign{\hrule} +1&1&0.250000\phantom{?}&0&0.000000&0.250000\phantom{?}&0.353553\phantom{?}\\ +2&7&0.437500\phantom{?}&4&1.000000&0.562500\phantom{?}&1.125000\phantom{?}\\ +3&27&0.421875\phantom{?}&4&0.500000&0.078125\phantom{?}&0.220971\phantom{?}\\ +4&117&0.457031\phantom{?}&6&0.375000&0.082031\phantom{?}&0.328125\phantom{?}\\ +5&491&0.479492\phantom{?}&21&0.656250&0.176758\phantom{?}&0.999893\phantom{?}\\ +6&2038&0.497559\phantom{?}&37&0.578125&0.080566\phantom{?}&0.644531\phantom{?}\\ +7&8195&0.500183\phantom{?}&64&0.500000&0.000183\phantom{?}&0.002072\phantom{?}\\ +8&32949&0.502762\phantom{?}&114&0.445312&0.057449\phantom{?}&0.919189\phantom{?}\\ +9&131279&0.500790\phantom{?}&245&0.478516&0.022274\phantom{?}&0.504003\phantom{?}\\ +10&524921&0.500604\phantom{?}&498&0.486328&0.014276\phantom{?}&0.456818\phantom{?}\\ +11&2098847&0.500404\phantom{?}&1029&0.502441&0.002037\phantom{?}&0.092197\phantom{?}\\ +12&8389411&0.500048\phantom{?}&2020&0.493164&0.006884\phantom{?}&0.440563\phantom{?}\\ +13&33555992&0.500023\phantom{?}&4063&0.495972&0.004052\phantom{?}&0.366706\phantom{?}\\ +14&134215688&0.499992\phantom{?}&8070&0.492554&0.007439\phantom{?}&0.952152\phantom{?}\\ +15&536855969&0.499986\phantom{?}&16399&0.500458&0.000472\phantom{?}&0.085383\phantom{?}\\ +16&2147481189&0.499999\phantom{?}&32644&0.498108&0.001892\phantom{?}&0.484228\phantom{?}\\ +17&?&0.500000?&65347&0.498558&0.001442?&0.522044?\\ +18&?&0.500000?&131291&0.500835&0.000835?&0.427734?\\ +19&?&0.500000?&261618&0.498997&0.001003?&0.726442?\\ +20&?&0.500000?&523667&0.499408&0.000592?&0.606445?\\ +21&?&0.500000?&1048417&0.499924&0.000076?&0.109795?\\ +22&?&0.500000?&2097213&0.500015&0.000015?&0.029785?\\ +23&?&0.500000?&4193272&0.499877&0.000123?&0.356316?\\ +24&?&0.500000?&8386407&0.499869&0.000131?&0.537354?\\ +25&?&0.500000?&16776146&0.499968&0.000032?&0.184718?\\ +26&?&0.500000?&33552224&0.499967&0.000033?&0.269531?\\ +27&?&0.500000?&67108562&0.499998&0.000002?&0.026068?\\ +28&?&0.500000?&134239208&0.500080&0.000080?&1.311035?\\ +29&?&0.500000?&268434302&0.499998&0.000002?&0.049805?\\ +30&?&0.500000?&536878982&0.500008&0.000008?&0.246277?\\ +31&?&0.500000?&1073756269&0.500007&0.000007?&0.311711?\\ +32&?&0.500000?&2147489600&0.500001&0.000001?&0.090820?\\ +\noalign{\hrule} +\end{tabular} +} +\medskip +\caption{Success probability of MD5 +as an attack against zero-padded $K$-bit AES keys. +``Uniform'' is the number of $2K$-bit strings $s$ +such that $\bits_1\MDfive(s,0)=1$, +where $s,0$ means $s$ zero-padded to $128$ bits +and $\bits_1$ means the first bit. +The subsequent column is ``uniform'' divided by $2^{2K}$. +``Cipher'' is the number of $K$-bit strings $k$ +such that $\bits_1\MDfive(\bits_{2K}\AES_{k,0}(0),0)=1$, +where $\bits_{2K}$ means the first $2K$ bits. +The subsequent column is ``cipher'' divided by $2^{K}$. +``Success'' is the success probability of the attack: +the absolute difference between ``uniform'' divided by $2^{2K}$ +and ``cipher'' divided by $2^K$. +``Scaled'' is ``success'' times $\sqrt{2^K}$. +``?'' means that ``uniform'' was not computed but was estimated to be $2^{2K}/2$. 
+} +\label{verif-table} +\end{table} + +The analysis in Section~\ref{aes-md5} +states that (if $K$-bit keys +do not produce surprisingly frequent collisions in $2K$ bits of ciphertext) +replacing MD5 with a uniform random function +would have probability approximately $1-\erf(x\sqrt{2})$ +of producing an attack whose success probability is at least $x/\sqrt{2^K}$; +e.g., probability approximately $0.31731$ +of producing an attack whose success probability is at least $0.5/\sqrt{2^K}$, +and probability approximately $0.98404$ +of producing an attack whose success probability is at least $0.01/\sqrt{2^K}$. +With this replacement, +the second average described above +is the average of $2^K$ independent uniform random bits +and thus has a bell-curve distribution of width roughly $1/\sqrt{2^K}$, +while the first average +is the average of $2^{2K}$ independent uniform random bits +and thus has a much narrower bell-curve distribution of width roughly $1/2^K$, +so the difference will almost always be on the scale of $1/\sqrt{2^K}$.
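To rerun a piece of this verification, here is a short Python sketch; it is ours, not part of the diff or of the paper's scripts, and it assumes the pycryptodome package for raw AES. It recomputes the $K=8$ row of Table~\ref{verif-table} under the conventions stated in the caption; if the byte conventions here match the paper's, it should print 32949, 114, a success probability of about 0.057449, and a ``scaled'' value of about 0.919.

    # Sketch reproducing the K = 8 experiment described above; not the paper's code.
    import hashlib
    from Crypto.Cipher import AES

    K = 8

    def attack_bit(prefix):
        # Zero-pad to 128 bits, feed through MD5, take the bottom bit of byte 0.
        return hashlib.md5(prefix.ljust(16, b"\0")).digest()[0] & 1

    # "Uniform": attack output summed over all 2K-bit strings s.
    uniform = sum(attack_bit(s.to_bytes(2 * K // 8, "little"))
                  for s in range(2 ** (2 * K)))

    # "Cipher": attack output summed over the first 2K bits of E_k(0), where E_k
    # is AES with the K-bit key k zero-padded to 128 bits.
    cipher = 0
    for k in range(2 ** K):
        key = k.to_bytes(K // 8, "little").ljust(16, b"\0")
        c = AES.new(key, AES.MODE_ECB).encrypt(bytes(16))
        cipher += attack_bit(c[: 2 * K // 8])

    success = abs(uniform / 2 ** (2 * K) - cipher / 2 ** K)
    print(uniform, cipher, success, success * 2 ** (K / 2))   # last column: "scaled"

Other values of $K$ that are multiples of $8$ (given the byte-level padding used here) re-derive further rows of the table; the ``scaled'' column hovering around a constant is the $1/\sqrt{2^K}$ behavior derived in the final paragraph above.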