http://hyperelliptic.org/tanja/nonuniform-reviews.txt (also posted
anonymously as http://pastebin.com/BxQkkq6y):
the reviews from Asiacrypt 2012 and Eurocrypt 2013, and our
comments dated 2013.02.15.
http://hyperelliptic.org/tanja/nonuniform-diff.txt (also posted
anonymously as http://pastebin.com/fLtiPYQ0): the svn diff output,
more than 2000 lines, between our Eurocrypt 2013 submission and our
Crypto 2013 submission.
http://hyperelliptic.org/tanja/nonuniform-reviews-2.txt (also posted
anonymously as http://pastebin.com/kUrpYZHE): the initial reviews from
Crypto 2013, and our comments dated 2013.04.04.
========================================================================
Crypto 2013 initial review (labelled #2):
> This paper discusses the question: if one considers nonuniform methods for
> breaking various specific cryptosystems deemed to be secure, just how well can
> one do if one is willing to settle for low probability of success and/or large
> solutions? This paper presents a number of new upper bounds (according to a
> few different measures)for breaking: a function generator, discrete log, DSA,
> integer factorization; both for specific common parameters, and more
> generally. For example, a function generator with a K-bit key can be
> distinguished from random by making 2 queries, with distinguishing probability
>
> 2^{-K/3}, using circuit size 2^{K/3}. These results are all heuristically
> argued, with some strong experimental evidence given involving small examples.
> These results are inspired by, and (in some cases build upon) other results
> in this direction by a number of researchers, including Koblitz/Menezes and
> De/Trvisan/Tulsiani.
>
> The point is that all these bounds are much smaller than standard, uniform
> attacks. I find these results interesting. They are often hard to keep
> straight, and a number of sections would benefit from clear summaries as in
> Section 2.5.
>
> Appendix Q consists of the continuation of a long argument (complete with
> Monty Python references) that I feel I've come in in the middle of. Deep
> philosophical musings occur there as well as in Appendix B. Some I agree
> with, some I disagree with, and some make my head hurt. However, I don't wish
> to enter this argument here.
>
> I feel this paper should stand, not as some rebuke to the field, but merely as
> a collection of heuristically argued, nonuniform (or is it nonconstructive?)
> upper bounds.
=======================
Our comments:
> This paper discusses the question: if one considers nonuniform methods for
> breaking various specific cryptosystems deemed to be secure, just how well can
> one do if one is willing to settle for low probability of success and/or large
> solutions? This paper presents a number of new upper bounds (according to a
> few different measures)for breaking: a function generator, discrete log, DSA,
> integer factorization; both for specific common parameters, and more
> generally. For example, a function generator with a K-bit key can be
> distinguished from random by making 2 queries, with distinguishing probability
>
> 2^{-K/3}, using circuit size 2^{K/3}. These results are all heuristically
> argued, with some strong experimental evidence given involving small examples.
> These results are inspired by, and (in some cases build upon) other results
> in this direction by a number of researchers, including Koblitz/Menezes and
> De/Trvisan/Tulsiani.
>
> The point is that all these bounds are much smaller than standard, uniform
> attacks. I find these results interesting.
Thanks.
> They are often hard to keep straight, and a number of sections would
> benefit from clear summaries as in Section 2.5.
Thanks for the suggestion. We'll look for places to add summaries.
> Appendix Q consists of the continuation of a long argument (complete with
> Monty Python references) that I feel I've come in in the middle of. Deep
> philosophical musings occur there as well as in Appendix B. Some I agree
> with, some I disagree with, and some make my head hurt. However, I don't wish
> to enter this argument here.
Okay, but if you think there's a questionable statement in the paper
then it would be nice of you to explain in more detail.
> I feel this paper should stand, not as some rebuke to the field, but merely as
> a collection of heuristically argued, nonuniform (or is it nonconstructive?)
> upper bounds.
They're both non-uniform and non-constructive. See Appendix Q.13.
We're pointing out a common error in the literature, and analyzing ideas
for fixing the error. Is someone else calling this a "rebuke to the
field"?
========================================================================
Crypto 2013 initial review (labelled #4):
> This paper is a "foundations of concrete security" paper. It shows/recalls the
> existence of so-called "non-uniform" attacks on AES, discrete logs, RSA, etc.
> A non-uniform attack is an attack that requires an infeasible (or maybe just
> impractical) amount of precomputation, or that requires knowledge of some
> hard-to-find constant.
>
> For example, the attack might require a very large precomputed table (where
> each entry of the table takes considerable time to compute), or might depend
> on knowing a particular set of bits whose XOR shows a bias (even though such a
> set can be shown to exist using a probabilistic argument, finding it might be
> another matter). Essentially all of the attacks presented were previously
> known (as the authors admit), though the authors do contribute some small
> improvements here are there.
>
> The paper's main interest is at least twofold (and probably more):
>
> (i) it recalls and collects these non-uniform attacks (which seem to have been
> mostly forgotten) in one place
>
> (ii) it points out these attacks are incompatible with current security
> conjectures regarding, eg, AES. For example, section 2 points out a
> high-probability attack on AES (as a PRP) that takes time only ~ 2^{(2/3)128}
> = 2^86 and also space 2^86. (But which requires precomputation time 2^128.)
> Depending on one's definition of "security" (more on this below), the
> conjecture "AES has security > 2^100" becomes false.
>
> At this stage, we should insert a word about how the security of AES is
> typically quantified in provable security papers. First off, provable security
> papers are (almost always) interested in *reducing* the security of some other
> construction to that of AES; their main interest is to establish a theorem
> along the lines of "If AES is a good PRP, this other construction [eg,
> CBC-MAC] is good"; as such, the main object of interest is the implication
> (i.e., the existence of an efficient computational reduction), not giving a
> careful definition of what constitutes "2^100 security". Nonetheless, there is
> by now a fairly standard security metric which speaks of the existence (or
> non-existence) of a hypothetic adversary parameterized by three parameters,
> like so:
>
> - epsilon: the success probability
> - "time" t: defined as:
>
> (the size of the adversary) + (running time of adversary)
>
> where everything is in the RAM model. (And the "size" of the adversary is the
> size of its program)
> - number of queries q made by the adversary; since this is always upper
> bounded by the running time t, we will mostly ignore q
>
> Thus we may speak of an adversary that achieves parameters, say,
>
> (epsilon, t, q) = (0.9, 2^100, 2^100)
>
> in breaking the PRP-security of AES. In fact, Rogaway and Bellare (and
> others?) have made conjectures to the effect to that no such adversary exists
> for AES, while this paper presents (as mentioned above) an adversary achieving
> parameters
>
> (0.9, 2^86 + 2^86, 2^86) = (0.9, 2^87, 2^86)
>
> thus disproving the conjecture. (Even while the attack (Hellman's) has long
> been known; it's the non-uniform nature of the attack that has seemingly
> caused it to be overlooked.)
>
> One's reaction might be that "ok, so some conjecture was wrong" but as the
> authors point out, this is not really a satisfactory reaction, since we don't
> really want our security definitions to ascribe security of 2^86 to AES, when
> practical attacks have much higher cost. Thus, the authors suggest that the
> three-parameter security definition should be traded in for something that
> gives a harsher grade to non-uniform attacks.
>
> The authors have two main proposals for "repairing the definitions": (1)
> switching to something called the "AT metric" for re-quantifying the
> adversary's "running time" (the parameter t above); (2) "adding uniformity" to
> the definition of an adversary. Since these proposals are separate, I will
> consider each in turn.
>
> The AT metric: The AT metric is something coming from the hardware world,
> apparently (had never heard of it before); A stands for "area" (as in, area
> taken up on a chip) and T for "time"; thus "AT = area times time" roughly
> meaning (as far as I could tell) the space taken by the program times the
> amount of time taken by the program. The authors point the ignorant reader
> (myself) to a paper by Brent and Kung as an introduction on the topic. I
> diligently downloaded the Brent/Kung paper but, poor me, found it to be
> incomprehensible. (The current paper gives no further tutorial on the AT
> metric, either.) In my mind's approximation, however, I understand the metric
> as meaning roughly "space times time" or maybe more exactly "the integral of
> the amount of space taken up by the program, over the program's running time"
> (?). (Since the space usage is nearly constant throughout most of these
> attacks, the distinction between these two seems unimportant anyway.)
> Moreover, wherever AT measures appear in the paper, I find that my own guessed
> interpretation of the metric seems consistent with the numbers given. My first
> few questions to the authors are therefore related to this:
>
> Q1a: Is the AT metric cost of an adversary simply equivalent (up to small
> constants) to:
>
> (size of program) * (amount of time taken by the program) //
> (everything in RAM)
>
> Q1b: Is the AT metric cost of an adversary simply equivalent (up to small
> constants) to:
>
> (max space usage) * (amount of time taken by the program) //
> (everything in RAM)
>
> I suspect the answers to be no (otherwise, why use the fancy AT jargon?). Thus
> my second question is:
>
> Q2: Is there any adversary *in the paper* whose AT cost is different from
>
> (max space usage) * (amount of time taken by the program)
>
> ...? (Excluding, as the authors do, small factors.) If so, can you explain
> where the difference stems from, and which has the higher cost? (I think,
> moreover, that such an example would deserve to be pointed out somewhere in
> the paper's intro, given the prominence of the AT metric in this paper.)
>
> In the event that the answer to Q2 should be "no", my first major complaint
> with the paper is its focus on the AT metric when the much simpler to
> understand metric "(size of program) * (amount of time)" could be used for
> practical purposes instead. Indeed, I sometimes get the impression that
> technical jargon is thrown about this paper in a somewhat silly attempt to
> impress the reader, which I don't like. (Discussions about thermodynamics and
> the cosmological constant that appear in the appendix come to mind.)
>
> Taking "(size of program) * (amount of time)" as our approximation of the AT
> metric (for this review), the switching-to-the-AT-metric proposal thus
> essentially amounts to advocating the switch
>
> (size of program) + (amount of time) to (size of
> program) * (amount of time)
>
> in the definition of the "time t" taken by an adversary (see the
> three-parameter definition above). (Possibly, replacing "size of program" by
> "maximum space usage" would reflect the AT metric more accurately, but this
> doesn't seem to matter for the attacks in the paper.)
>
> Given the examples in the paper, I initially thought the above change made
> good sense, but further consideration has made me more dubious. Consider two
> adversaries A1 and A2 (whose programs are not hard to find: someone has
> actually printed the programs):
>
> A1: takes time 2^45, has program size (and space usage) 2^20
> A2: takes time 2^50, has program size (and space usage) 2^10
>
> Obviously, A1 is much more advantageous than A2 (we don't care about a 1
> megabyte program; that's fine), but A1 scores worse than A2 on the AT metric.
> The truth is, nobody cares about the difference between a 1k and a 1Mb
> adversary; we certainly don't want to count a 1Mb-program attack as "costing"
> a factor 2^10 more than a 1kb adversary. Hence, I'm not convinced the AT
> metric accurately measures what cryptanalysts do in the real world. Instead,
> the AT metric seems to open its own can of worms. (?)
>
> In fact, the entire idea of reducing an adversary's cost to a single number
> seems futile and doomed to be flawed in one way or the other. What makes more
> sense to me is simply to quantify (as is already done to a certain extent) an
> adversary's cost by in terms of several different dimensions, such as size of
> program, space consumption, time consumption, query complexity, etc. A
> computational reduction then (boringly, what other way is there) accounts for
> the blowup in each of these dimensions separately. (?)
>
> Moreover, the authors recommend "rewriting every proof in the AT metric". Of
> course, it would be very flattering for the authors if a string of 50 papers
> appeared re-doing every provable security result (possibly in "factored form",
> as the authors suggest) of the last 20 years in the AT metric, and referencing
> this paper. But what would be different in such proofs, except for the
> rephrased (possibly "factored") theorem statements? Pretty much nothing at
> all, actually: the current proofs are mostly concerned with the reduction, and
> only pay fleeting attention to the reduction's efficiency, anyway. What used
> to be a theorem/proof like:
>
> "Theorem: Let A be a such-and-such adversary with parameters (e1, t1, q1),
> then there exists a this-and-that adversary B with parameters (e2, t2, q2).
>
>
> Proof: B runs A and when... blah-blah.... Finally it is easy to see that B has
> running time at most t2, and that B makes q2 queries."
>
> This will become, in the "new form":
>
> "Theorem: Let A be such-and-such adversary with advantage e1 and making q1
> queries, then the adversary B defined by B = (function of A) is a
> this-and-that adversary with advantage e2 and making q2 queries.
>
> Proof: [exact same as before, but excludes part about t1 and t2]
>
> Theorem (second part): For the adversary B above, B has [AT] cost t2 if A has
> [AT] cost t1.
>
> Proof: It is easy to see that B's running time is only this-and-that more than
> A's."
>
> My point is that people will be *waiving their hands either way* when it comes
> to assessing the "cost" of a reduction, whether the underlying metric is AT or
> something else, as this part of the proof is always singularly uninteresting.
> That the underlying cost metric (i.e., definition of "time") has changed won't
> make the accounting any less boring or any more noteworthy. So the proposal to
> rewrite every proof in the provable security literature---as if this might
> uncover some interesting mathematics, or change our understanding of
> crypto---strikes me as pretty odd, to say the least.
>
> Having said this, there are cases where the reduction from B to A is
> non-uniform, as pointed out recently by Koblitz and Menezes. In that case
> there *is* something to be said for theorem statements that explicitly mention
> the fact (i.e., warn about the presence of non-uniformity). Put differently,
> this gives a strong reason to advocate phrasing theorems in the form "B = B(A)
> has such-and-such properties" instead of "for every A, there exists B with
> such-and-such properties". However, the occurence of such non-uniform
> reductions in the literature seems to be very rare (I am only aware of two
> examples), and what makes sense to me would be to make an inventory (and
> possibly try to "repair") those rarities, as opposed to going back and
> "rewriting" (not!) every proof. Also note the explicit-reduction theorem style
> "B = B(A)" was already advocated by Rogaway, so this isn't a new idea.
>
> So much for the AT metric and the proposal to rewrite every proof in the AT
> metric. Next up: the proposal to "add uniformity" to the security definitions.
> Here (appendix B4) the authors have some quite cute ideas. Note, firstly, that
> it's not obvious at all how to speak of a "uniform" program in a
> concrete-security setting. The authors circumvent this as follows: a program A
> can be deemed realistic (or "uniform") if there is a "small" program B that
> prints A. (Standard resources of time, program size, etc, might be considered
> on B.) The idea is that discarding programs A that don't have a "small
> descriptor" B will give a better approximation of reality. In the authors own
> words:
>
> "In general, it seems reasonable to redefine the insecurity of X as the
> maximum, over all size-limited cost-limited algorithms B that print
> cost-limited algorithms A, of the probability that A succeeds in breaking X."
>
> One thing does puzzle me: B can not only print A but also *run* A, once A is
> printed; thus, doesn't this bring us back to simply considering size-limited
> cost-limited algorithms B? (Ok: the cost limit [running time] imposed on B
> might not be equal to the cost limit [running time] imposed on A, so the
> two-step definition gives some more fine-grained control, but I'm not sure
> when such fine-grained control would actually be useful. Examples?)
>
> I find the ideas of section B4 quite interesting but I doubt they will see
> implementation in the provable security literature. The reason is similar to
> above: changing the security definition of a primitive (eg AES) to mention,
> eg, such Kolmogorov-type metrics on the attacker doesn't have any bearing on
> the computational reduction, ie, on the proof. It would just complicate
> definitions in the "preliminaries" section of the paper (and in some sense for
> nought, as various pathological cases---take collision resistance, or the
> attack of section 2.1---still slip through the net) without changing the
> provable security paper's core message, which is the reduction.
>
> Ultimately, I don't see any very good way out of the conundrum posed by
> non-uniform attacks, and in particular I am not convinced by the solutions
> advanced in this paper. At the end of the day, I suspect people will state
> their reductions, while emphasizing the reduction's efficiency in the four
> metrics of running time overhead, space usage overhead, query complexity
> overhead (if applicable) and program size overhead, and emphasizing the
> reduction's uniformity. As for conjecturing numerical bounds on the security
> of the target problem, if this is required, the numbers to use might simply be
> those of the actual best known (feasible, explicit) attacks. This essentially
> brings us back to the "human ignorance" approach of Rogaway, which I fear we
> will have to learn to live with.
>
> BOTTOM LINE: I am sympathetic to this paper and I'd like to see it published
> (at CRYPTO, why not). The main reasons I like it is that pays attention to a
> topic that has for too long been swept under the rug. The paper does have a
> number of stylistic idiosyncrasies which make it hard to love, though. These
> are:
>
> -- Refusal to properly discuss Wiener's "full-cost" paper, even while it seems
> closely related to the AT metric. We're in a community, we should all be
> polite to one another and discuss the most closely related work within
> previous crypto papers, even if the results of that work are not directly
> used. (Put such discussion in an appendix, if space is an issue.)
>
> -- Ditto, more or less, for Rogaway's human ignorance paper. Given the kinship
> between the topics (with Rogaway focusing exclusively on collision resistance)
> I would have expected that paper to be mentioned earlier than on p21. But ok,
> at least Rogaway is not inside of an FAQ-about-why-he-isn't-cited... :)
>
> -- Lots of different numbers flying about, little summary. Why not highlight
> some of the "best of" attack parameters in each section? For example, I had to
> fish out the (0.9, 2^86, 2^86) parameters given above by myself, from a medley
> of 2^K/3's and 2^2n's. I think this would greatly improve readability (while
> allowing the reader to check whether they're actually understanding the stated
> bounds or not).
>
> -- Lack of tutorial on the AT metric, general attitude that technical jargon
> is cool. If you want people to adopt a new paradigm in their field (I think
> it's fair to say most cryptographers are not familiar with the AT metric), you
> have to do a better job of introducing it.
>
> Small editorial comment: The intro mentions "2^100 bit operations" and
> mentions t as denoting time, while it's well-known that t denotes (size of
> machine) + (time) in provable security definitions. This threw me off. I would
> mention earlier the correct (traditional) definition of t as being
> (description size) + (time).
>
> Oddball question: In FAQ 22 you mention "tightness will often change
> dramatically" referring to appendix B2 for supporting evidence, but appendix
> B2 talks about switching to the NAND metric, and I don't see any relation. Can
> you comment on how/why "tightness will often change dramatically" in the AT
> metric? Specific examples?
=======================
Our comments:
> This paper is a "foundations of concrete security" paper. It shows/recalls the
> existence of so-called "non-uniform" attacks on AES, discrete logs, RSA, etc.
> A non-uniform attack is an attack that requires an infeasible (or maybe just
> impractical) amount of precomputation, or that requires knowledge of some
> hard-to-find constant.
Wording nitpick: non-uniformity is what allows the precomputation to be
dependent upon the cryptosystem, whereas non-constructivity is what
allows the precomputation to be infeasible. A definition of AES-128
security can't be uniform but can be constructive. We find the
distinction useful for many reasons.
> For example, the attack might require a very large precomputed table (where
> each entry of the table takes considerable time to compute), or might depend
> on knowing a particular set of bits whose XOR shows a bias (even though such a
> set can be shown to exist using a probabilistic argument, finding it might be
> another matter).
Right.
> Essentially all of the attacks presented were previously
> known (as the authors admit), though the authors do contribute some small
> improvements here are there.
This understates the contributions. Sections 2, 3, 4, and 5 have new
algorithm analyses; the analyses are a critical part of the paper.
Sections 3, 4, and 5 have algorithm improvements. The improvement in
Section 5 is quite large, a subexponential factor in the RAM cost.
> The paper's main interest is at least twofold (and probably more):
> (i) it recalls and collects these non-uniform attacks (which seem to have been
> mostly forgotten) in one place
> (ii) it points out these attacks are incompatible with current security
> conjectures regarding, eg, AES. For example, section 2 points out a
> high-probability attack on AES (as a PRP) that takes time only ~ 2^{(2/3)128}
> = 2^86 and also space 2^86. (But which requires precomputation time 2^128.)
> Depending on one's definition of "security" (more on this below), the
> conjecture "AES has security > 2^100" becomes false.
Yes. Note that for the RSA-3072 attack (Section 5) our algorithm
improvements are critical for beating 2^128. The paper also states and
analyzes ideas for _fixing_ the definitions.
> At this stage, we should insert a word about how the security of AES is
> typically quantified in provable security papers. First off, provable security
> papers are (almost always) interested in *reducing* the security of some other
> construction to that of AES; their main interest is to establish a theorem
> along the lines of "If AES is a good PRP, this other construction [eg,
> CBC-MAC] is good"; as such, the main object of interest is the implication
> (i.e., the existence of an efficient computational reduction),
Note that a typical paper in this area emphasizes the construction of
some new cryptosystem more heavily than the reductions; only a small
fraction of the papers are analyzing proofs for previously known
systems. But we fully agree that the existence of the reductions is
important for all of these papers.
> not giving a careful definition of what constitutes "2^100 security".
Actually, formal security definitions play a critical role throughout
the provable-security literature; in particular, definitions using
numerical time bounds play a critical role throughout the literature on
provable concrete security. Here are some illustrative examples.
1. 2007 Katz--Lindell (emphasis in original):
The Basic Principles of Modern Cryptography ...
Principle 1---Formulation of Exact Definitions
One of the key intellectual contributions of modern cryptography has
been the realization that formal definitions of security are
_essential_ prerequisites for the design, usage, or study of any
cryptographic primitive or protocol.
2. 1994 Bellare--Kilian--Rogaway, Section 1.2, similarly emphasizes the
"need" to define security (and then gives a definition):
In this paper we will show that CBC MAC construction is secure if the
underlying block cipher is secure. To make this statement meaningful
we need first to discuss what we mean by security in each case.
3. 2000 Bellare--Kilian--Rogaway admitted that the 1994 definition
allowed "pathologies". Did they abandon the idea of defining security?
No. They again stated that they needed a definition of security; they
modified the 1994 definition to avoid the "pathologies".
4. Bellare, Hoang, Hofheinz, Jager, Kohlar, Landecker, Morris,
Ristenpart, Rogaway, Schage, Schwenk, Shrimpton, Terashima, and Tessaro
published provable-concrete-security papers at Crypto 2012. _Every
single one_ of those papers _explicitly_ claims to "show" or "prove" or
"guarantee" bounds that involve the "security" (or "insecurity" or
"Adv") of some cryptographic systems.
Three of the papers explicitly define "security"/"insecurity"/"Adv" in
the main body of the paper, and two of the papers reuse the standard
concept of PRP-security. See Appendix L of our paper for more details.
> Nonetheless, there is
> by now a fairly standard security metric which speaks of the existence (or
> non-existence) of a hypothetic adversary parameterized by three parameters,
> like so:
> - epsilon: the success probability
> - "time" t: defined as:
> (the size of the adversary) + (running time of adversary)
> where everything is in the RAM model. (And the "size" of the adversary is the
> size of its program)
> - number of queries q made by the adversary; since this is always upper
> bounded by the running time t, we will mostly ignore q
> Thus we may speak of an adversary that achieves parameters, say,
> (epsilon, t, q) = (0.9, 2^100, 2^100)
> in breaking the PRP-security of AES. In fact, Rogaway and Bellare (and
> others?) have made conjectures to the effect to that no such adversary exists
> for AES, while this paper presents (as mentioned above) an adversary achieving
> parameters
> (0.9, 2^86 + 2^86, 2^86) = (0.9, 2^87, 2^86)
> thus disproving the conjecture. (Even while the attack (Hellman's) has long
> been known; it's the non-uniform nature of the attack that has seemingly
> caused it to be overlooked.)
Agreed.
> One's reaction might be that "ok, so some conjecture was wrong" but as the
> authors point out, this is not really a satisfactory reaction, since we don't
> really want our security definitions to ascribe security of 2^86 to AES, when
> practical attacks have much higher cost. Thus, the authors suggest that the
> three-parameter security definition should be traded in for something that
> gives a harsher grade to non-uniform attacks.
Yes. To be precise, the number of parameters isn't a problem per se, but
the standard definition of t is clearly a problem; and the goal is to
bring the definitions in line with the reality faced by cryptanalysts
and attackers, not specifically to punish non-uniformity.
> The authors have two main proposals for "repairing the definitions": (1)
> switching to something called the "AT metric" for re-quantifying the
> adversary's "running time" (the parameter t above); (2) "adding uniformity" to
> the definition of an adversary. Since these proposals are separate, I will
> consider each in turn.
Note that the terminology for our second recommendation is adding
"constructivity"; we separately discuss "uniformity" (see above for the
distinction) and don't recommend it. Aside from this naming, the
discussion below correctly represents our recommendations.
> The AT metric: The AT metric is something coming from the hardware world,
> apparently (had never heard of it before); A stands for "area" (as in, area
> taken up on a chip) and T for "time"; thus "AT = area times time" roughly
> meaning (as far as I could tell) the space taken by the program times the
> amount of time taken by the program.
Very roughly, yes, but for the correct numbers you have to (1) allow
parallelism and (2) account for communication cost; chip AT naturally
does this, while most definitions of "program" don't.
> The authors point the ignorant reader
> (myself) to a paper by Brent and Kung as an introduction on the topic. I
> diligently downloaded the Brent/Kung paper but, poor me, found it to be
> incomprehensible. (The current paper gives no further tutorial on the AT
> metric, either.)
Another paper that you might try, focusing on cryptanalysis, is [14].
Much lengthier expositions can be found in hardware-design textbooks.
> In my mind's approximation, however, I understand the metric
> as meaning roughly "space times time" or maybe more exactly "the integral of
> the amount of space taken up by the program, over the program's running time"
> (?). (Since the space usage is nearly constant throughout most of these
> attacks, the distinction between these two seems unimportant anyway.)
> Moreover, wherever AT measures appear in the paper, I find that my own
> guessed interpretation of the metric seems consistent with the numbers
> given.
For many critical algorithms there are huge differences among the
metrics, in fact so huge that they're easily visible in asymptotic
exponents.
A simple example to keep in mind is sorting n numbers, each having
n^{o(1)} bits. The AT cost is n^(1.5+o(1)); let's suppress n^(o(1))
factors and simplify this to n^1.5. Specifically, a square chip of area
n^1 finishes sorting in time n^0.5, using any of the following sorting
algorithms: 1977 Thompson, 1986 Schnorr--Shamir, 1987 Schimmler.
RAM time+space and RAM time*program size are much smaller, n^1, because
RAM time ignores communication cost. Note that n^1 is physically
unrealistic; see Q.6.
RAM time*space is much larger, n^2; it pays for space while failing to
allow any compensating parallelism.
The exponents here also apply to n-bit multiplication and many other
fundamental communication-intensive subroutines; see 1981 Brent--Kung.
> My first
> few questions to the authors are therefore related to this:
> Q1a: Is the AT metric cost of an adversary simply equivalent (up to small
> constants) to:
> (size of program) * (amount of time taken by the program) //
> (everything in RAM)
No. See the sorting example above: AT is n^(1.5+o(1)), while the metric
here is n^(1+o(1)), a quite severe underestimate.
> Q1b: Is the AT metric cost of an adversary simply equivalent (up to small
> constants) to:
>
> (max space usage) * (amount of time taken by the program) //
> (everything in RAM)
No. See the sorting example above: the metric here is n^(2+o(1)), a
quite severe overestimate.
> I suspect the answers to be no (otherwise, why use the fancy AT jargon?).
AT is not jargon. It's the product of A (area) and T (time).
> Thus my second question is:
> Q2: Is there any adversary *in the paper* whose AT cost is different from
> (max space usage) * (amount of time taken by the program)
> ...? (Excluding, as the authors do, small factors.)
Yes. Gaps analogous to the sorting example appear in Section 5, again so
large as to be easily visible in asymptotic exponents. The algorithm in
Section 4 has the same issues, but Section 5 is more detailed. A similar
gap is stated in Section 2.2 and appears prominently as the middle line
in the summary graph G.1.
Here are the Section 5 exponents in more detail. The best RAM time+space
for NFS (1993 Coppersmith) is L^1.902 without precomputation but L^1.638
with precomputation. The best AT for NFS (2001 Bernstein) is L^1.976
either way. The gap between RAM time+space and AT appears for exactly
the reasons stated above: NFS needs space and large-scale communication,
and is not parallelizable enough to hide all of the communication cost.
The best RAM time*space for NFS is L^2.760 (2002 Pomerance). The paper
doesn't mention the RAM time*space metric at all, so of course it
doesn't cite this. The exponent here is huge for exactly the reasons
stated above: RAM time*space pays for space while failing to allow any
compensating parallelism.
> If so, can you explain
> where the difference stems from, and which has the higher cost?
See explanations above.
> (I think,
> moreover, that such an example would deserve to be pointed out somewhere in
> the paper's intro, given the prominence of the AT metric in this paper.)
Thanks for the suggestion.
> In the event that the answer to Q2 should be "no", my first major complaint
> with the paper is its focus on the AT metric when the much simpler to
> understand metric "(size of program) * (amount of time)" could be used for
> practical purposes instead.
The answer to Q2 is "yes", so this paragraph is inapplicable.
> Indeed, I sometimes get the impression that
> technical jargon is thrown about this paper in a somewhat silly attempt to
> impress the reader, which I don't like. (Discussions about thermodynamics and
> the cosmological constant that appear in the appendix come to mind.)
Q.9, which discusses thermodynamics, answers an earlier reviewer
claiming at considerable length that AT "denies the 3D-ness of our
world".
Q.10, which discusses the cosmological constant, answers an earlier
reviewer asking "Couldn't we ... declare all cryptography secure because
breaking it seems to take at least n steps, but the universe has a
finite amount of energy and can only perform O(1) operations?"
> Taking "(size of program) * (amount of time)" as our approximation of the AT
> metric (for this review), the switching-to-the-AT-metric proposal thus
> essentially amounts to advocating the switch
> (size of program) + (amount of time) to (size of
> program) * (amount of time)
> in the definition of the "time t" taken by an adversary (see the
> three-parameter definition above). (Possibly, replacing "size of program" by
> "maximum space usage" would reflect the AT metric more accurately, but this
> doesn't seem to matter for the attacks in the paper.)
This isn't accurate. See above.
> Given the examples in the paper, I initially thought the above change made
> good sense, but further consideration has made me more dubious. Consider two
> adversaries A1 and A2 (whose programs are not hard to find: someone has
> actually printed the programs):
>
> A1: takes time 2^45, has program size (and space usage) 2^20
> A2: takes time 2^50, has program size (and space usage) 2^10
It's certainly possible for time and area to be evaluated differently,
producing different optima on this two-point area-time tradeoff curve,
and obviously an arbitrary curve can't be mathematically captured by a
single number. As the paper says, hardware designers often consider
functions of (A,T) more general than AT.
However, AT is the _primary_ metric in thousands of hardware-design
papers, including pretty much every serious paper on cryptanalytic
hardware. The core reason is parallelism. For example, even if your
(2^10,2^50) chip can't be parallelized into (2^20,2^40), it can
certainly be parallelized into (2^20,2^50) solving 2^10 problems at
once, and more generally into (2^20 M,2^50 N) solving 2^10 MN problems.
For comparison, your (2^20,2^45) chip implies (2^20 M,2^50 N) solving
just 2^5 MN problems, which isn't as good. This is perfectly captured by
AT, and more generally by the standard concept of price-performance
ratio.
Of course, if one of the algorithms can be adapted to achieve better
than (2^20,2^50) solving 2^10 problems, great. This improvement is also
visible in AT.
> Obviously, A1 is much more advantageous than A2 (we don't care about a 1
> megabyte program; that's fine), but A1 scores worse than A2 on the AT metric.
We _do_ care about a 1-megabyte chip. See above.
> The truth is, nobody cares about the difference between a 1k and a 1Mb
> adversary; we certainly don't want to count a 1Mb-program attack as "costing"
> a factor 2^10 more than a 1kb adversary. Hence, I'm not convinced the AT
> metric accurately measures what cryptanalysts do in the real world. Instead,
> the AT metric seems to open its own can of worms. (?)
See above.
> In fact, the entire idea of reducing an adversary's cost to a single number
> seems futile and doomed to be flawed in one way or the other. What makes more
> sense to me is simply to quantify (as is already done to a certain extent) an
> adversary's cost by in terms of several different dimensions, such as size of
> program, space consumption, time consumption, query complexity, etc. A
> computational reduction then (boringly, what other way is there) accounts for
> the blowup in each of these dimensions separately. (?)
It's of course possible that allowing more dimensions will capture more
information, and we specifically recommend refactoring theorems to allow
further changes in the cost metric, but at this point we don't have any
evidence to support going beyond AT+constructivity. As for "boring", see
below.
> Moreover, the authors recommend "rewriting every proof in the AT metric". Of
> course, it would be very flattering for the authors if a string of 50 papers
> appeared re-doing every provable security result (possibly in "factored form",
> as the authors suggest) of the last 20 years in the AT metric, and referencing
> this paper. But what would be different in such proofs, except for the
> rephrased (possibly "factored") theorem statements? Pretty much nothing at
> all, actually: the current proofs are mostly concerned with the reduction, and
> only pay fleeting attention to the reduction's efficiency, anyway.
This is a serious overgeneralization; see, e.g., the Shoup RSA-OAEP
example below.
> What used
> to be a theorem/proof like:
> "Theorem: Let A be a such-and-such adversary with parameters (e1, t1, q1),
> then there exists a this-and-that adversary B with parameters (e2, t2, q2).
>
> Proof: B runs A and when... blah-blah.... Finally it is easy to see that B has
> running time at most t2, and that B makes q2 queries."
>
> This will become, in the "new form":
>
> "Theorem: Let A be such-and-such adversary with advantage e1 and making q1
> queries, then the adversary B defined by B = (function of A) is a
> this-and-that adversary with advantage e2 and making q2 queries.
>
> Proof: [exact same as before, but excludes part about t1 and t2]
>
> Theorem (second part): For the adversary B above, B has [AT] cost t2 if A has
> [AT] cost t1.
>
> Proof: It is easy to see that B's running time is only this-and-that more than
> A's."
For theorems where the cost analysis really _is_ easy, yes, the second
theorem is easy, and that's the factorization we're recommending.
However, there are many important cases where the cost analysis is much
more subtle, and then there's much more work in the second theorem. See
below.
> My point is that people will be *waiving their hands either way* when it comes
> to assessing the "cost" of a reduction, whether the underlying metric is AT or
> something else, as this part of the proof is always singularly uninteresting.
This is again an overgeneralization.
The big picture is that there is an interplay between analysis and
optimization of algorithms. Often the analysis is the interesting part.
Reduction algorithms are no exception; for example, the CBC-MAC
reduction is trivial (compose A with CBC), while its probability
analysis is much more interesting.
Sometimes the _speed_ analysis is the interesting part. For example,
Shoup's RSA-3-OAEP security proof needs Coppersmith's famous LLL-based
root-finding algorithm, and relies critically on the subtle fact that
LLL runs in polynomial time. Concrete bounds need more detailed LLL
analyses.
There are, furthermore, a huge number of security proofs that rely
critically on eliminating repeated queries. This is uninteresting in the
RAM metric but a huge research challenge in the AT metric---it's quite
unclear how to preserve tightness in the AT metric.
> That the underlying cost metric (i.e., definition of "time") has changed won't
> make the accounting any less boring or any more noteworthy.
This really depends on which security proof you're looking at. There are
certainly cases where the cost accounting is very easy, and we're not
suggesting making that any harder.
> So the proposal to
> rewrite every proof in the provable security literature---as if this might
> uncover some interesting mathematics, or change our understanding of
> crypto---strikes me as pretty odd, to say the least.
As the LLL/Coppersmith/Shoup RSA-3-OAEP example shows, there are
important cases where performance analysis and optimization are by far
the largest part of the work that goes into reduction algorithms.
> Having said this, there are cases where the reduction from B to A is
> non-uniform, as pointed out recently by Koblitz and Menezes. In that case
> there *is* something to be said for theorem statements that explicitly mention
> the fact (i.e., warn about the presence of non-uniformity). Put differently,
> this gives a strong reason to advocate phrasing theorems in the form "B = B(A)
> has such-and-such properties" instead of "for every A, there exists B with
> such-and-such properties".
Yes, this is another advantage of stating theorems the way we recommend.
> However, the occurence of such non-uniform
> reductions in the literature seems to be very rare (I am only aware of two
> examples), and what makes sense to me would be to make an inventory (and
> possibly try to "repair") those rarities, as opposed to going back and
> "rewriting" (not!) every proof.
How does the reviewer expect to make this inventory without doing the
work of checking every proof? Why should the results of this inventory
be recorded in some ad-hoc form instead of as useful theorems? This
suggestion seems strictly worse than the recommendations in the paper.
> Also note the explicit-reduction theorem style
> "B = B(A)" was already advocated by Rogaway, so this isn't a new idea.
We trace the history in Section B.6.
> So much for the AT metric and the proposal to rewrite every proof in the AT
> metric. Next up: the proposal to "add uniformity" to the security definitions.
> Here (appendix B4) the authors have some quite cute ideas. Note, firstly, that
> it's not obvious at all how to speak of a "uniform" program in a
> concrete-security setting. The authors circumvent this as follows: a program A
> can be deemed realistic (or "uniform") if there is a "small" program B that
> prints A. (Standard resources of time, program size, etc, might be considered
> on B.)
The details are important here. What we actually recommend is to
consider small _chips_ B; this puts a limit on the total space that B is
using while it runs.
> The idea is that discarding programs A that don't have a "small
> descriptor" B will give a better approximation of reality. In the authors own
> words:
> "In general, it seems reasonable to redefine the insecurity of X as the
> maximum, over all size-limited cost-limited algorithms B that print
> cost-limited algorithms A, of the probability that A succeeds in breaking X."
>
> One thing does puzzle me: B can not only print A but also *run* A, once A is
> printed;
No, it typically can't, at least not at the same speed.
The first example in the section is a large chip A "containing billions
of standard AES key-search units". This chip is printed at moderate cost
by a very small chip B. Yes, B can _print_ A, but if it tries to _run_
the same brute-force search then it ends up running much more slowly
than A. Parallelism is critical here: because A is much larger than B it
can perform many more computations in parallel.
An even more striking example is a small chip B that prints a large chip
A that runs NFS. Trying to run NFS in a small amount of space is a
performance disaster; A benefits both from its parallelism and from the
sheer amount of storage that it has available.
> thus, doesn't this bring us back to simply considering size-limited
> cost-limited algorithms B? (Ok: the cost limit [running time] imposed on B
> might not be equal to the cost limit [running time] imposed on A, so the
> two-step definition gives some more fine-grained control, but I'm not sure
> when such fine-grained control would actually be useful. Examples?)
See above.
> I find the ideas of section B4 quite interesting but I doubt they will see
> implementation in the provable security literature. The reason is similar to
> above: changing the security definition of a primitive (eg AES) to mention,
> eg, such Kolmogorov-type metrics on the attacker doesn't have any bearing on
> the computational reduction, ie, on the proof. It would just complicate
> definitions in the "preliminaries" section of the paper (and in some sense for
> nought, as various pathological cases---take collision resistance, or the
> attack of section 2.1---still slip through the net) without changing the
> provable security paper's core message, which is the reduction.
Some reduction algorithms, notably the 2006 Bellare reduction for NMAC,
are highly non-constructive; this would have been detected by analysis
using the cost metric proposed in this section. The problems with that
particular proof can also be detected in other ways, but this example
provides some support for the general idea that improvements in the
realism of a security metric are visible in reduction algorithms, not
just in attack algorithms.
> Ultimately, I don't see any very good way out of the conundrum posed by
> non-uniform attacks, and in particular I am not convinced by the solutions
> advanced in this paper.
This paper points out a common error in the literature---a gap between
actual feasibility and common definitions of feasibility---and analyzes
the merits of several ideas for closing this gap. Our recommendations
clearly make the gap much smaller, and at this point there's no evidence
that there's any remaining gap---but we also recommend a theorem style
that will easily accommodate any future changes that turn out to be
desired.
> At the end of the day, I suspect people will state
> their reductions, while emphasizing the reduction's efficiency in the four
> metrics of running time overhead, space usage overhead, query complexity
> overhead (if applicable) and program size overhead, and emphasizing the
> reduction's uniformity.
This sounds like it would be at least a partial step forward. We are
more optimistic about the ability of the provable-security literature
to adopt realistic cost metrics for algorithms.
> As for conjecturing numerical bounds on the security
> of the target problem, if this is required, the numbers to use might simply be
> those of the actual best known (feasible, explicit) attacks.
What do these "security" conjectures mean if there isn't a definition of
"security"?
> This essentially brings us back to the "human ignorance" approach of
> Rogaway, which I fear we will have to learn to live with.
We fully agree with the idea of stating explicit reduction theorems,
but this does nothing to help us define security.
> BOTTOM LINE: I am sympathetic to this paper and I'd like to see it published
> (at CRYPTO, why not). The main reasons I like it is that pays attention to a
> topic that has for too long been swept under the rug.
Thanks.
> The paper does have a
> number of stylistic idiosyncrasies which make it hard to love, though.
We're happy to take concrete suggestions regarding exposition,
citations, etc.
> These are:
> -- Refusal to properly discuss Wiener's "full-cost" paper, even while it seems
> closely related to the AT metric. We're in a community, we should all be
> polite to one another and discuss the most closely related work within
> previous crypto papers, even if the results of that work are not directly
> used. (Put such discussion in an appendix, if space is an issue.)
Can you please identify the idea (or other contribution) that you'd like
us to credit to 2004 Wiener, and explain what you see as the connection
between that idea and our paper?
> -- Ditto, more or less, for Rogaway's human ignorance paper. Given the kinship
> between the topics (with Rogaway focusing exclusively on collision resistance)
> I would have expected that paper to be mentioned earlier than on p21. But ok,
> at least Rogaway is not inside of an FAQ-about-why-he-isn't-cited... :)
Same questions here.
> -- Lots of different numbers flying about, little summary. Why not highlight
> some of the "best of" attack parameters in each section? For example, I had to
> fish out the (0.9, 2^86, 2^86) parameters given above by myself, from a medley
> of 2^K/3's and 2^2n's. I think this would greatly improve readability (while
> allowing the reader to check whether they're actually understanding the stated
> bounds or not).
We'll look for places to add more summaries.
> -- Lack of tutorial on the AT metric, general attitude that technical jargon
> is cool. If you want people to adopt a new paradigm in their field (I think
> it's fair to say most cryptographers are not familiar with the AT metric), you
> have to do a better job of introducing it.
It's really not appropriate to try to squeeze an AT tutorial into this
paper. We'll add further references to hardware-design textbooks.
The comment regarding "technical jargon" is too vague to evaluate. We're
happy to consider more specific comments.
> Small editorial comment: The intro mentions "2^100 bit operations" and
> mentions t as denoting time, while it's well-known that t denotes (size of
> machine) + (time) in provable security definitions. This threw me off. I would
> mention earlier the correct (traditional) definition of t as being
> (description size) + (time).
The first mention of "time" is simply a quote from [8], the classic
Bellare--Kilian--Rogaway paper introducing provable concrete security.
Later, when we review the definition of the RAM metric, we say that [8]
defines running time as "A's actual execution time plus the length of
A's description". This isn't our favorite terminology but we don't have
a good enough reason to change it.
> Oddball question: In FAQ 22 you mention "tightness will often change
> dramatically" referring to appendix B2 for supporting evidence, but appendix
> B2 talks about switching to the NAND metric, and I don't see any relation. Can
> you comment on how/why "tightness will often change dramatically" in the AT
> metric? Specific examples?
The paper highlights repeated-query elimination. This is cheap for RAM,
but very expensive for NAND, and very expensive for AT.
========================================================================
Crypto 2013 initial review (labelled #1):
> In 1980, Hellman published his well-known Time-Memory trade-off which showed
> the existence of a time/memory 2^{2k/3} key-recovery attack on any blockcipher
> with k-bit keys. This is 2^{86} for AES, much better than exhaustive
> key-search or known attacks. However from a practical perspective this attack
> has been ignored because Hellman's proof is non-constructive. It shows the
> attack EXISTS but not how to find it. Fiat and Naor (1999) generalize this.
> De, Trevisan and Tulsiani (CRYPTO 2010) provide similar attacks on PRGs. The
> submission extends this to certain elliptic curves, DSA and RSA.
>
> The fact that there exist attacks better than we can actually find and
> implement has been well known for a while, and, as the authors themselves say,
> of no particular practical relevance. So why renewed attention to this theme?
> The authors' say that these attacks disprove certain conjectures made in
> theoretical papers about the security of, say, AES.
>
> That's true, and it is a good point. But why does this matter? This is where
> the story of the paper breaks down. It fails to provide a clear and complete
> picture of the implications. As a result the paper is misleading, leaving the
> impression that something is hugely wrong with the theory. This impression is
> false.
>
> The bottom line is the following. The assurance we get from the bulk of
> security proofs in the practice-oriented theory literature is not changed ONE
> BIT by this paper. It remains exactly what it was, bounds included. How can
> this be? Because I was careful above to talk of the assurance one gets from
> the proofs, not the theorem staments. Some (not all, or even most!) theorem
> statements in the literature use the notation, discussed in the paper, of
> advantage maximized over all adversaries of certain resources. But the
> maximization covers attacks one cannot effectively find. This means that if
> you attempt to get conclusions directly from the theorem statements, you will
> get quantitatively weaker ones than you would like. This (in my view) is the
> actual import of the paper. But almost all proofs in the practice-oriented
> literature provide blackbox reductions. Any attack on the scheme (eg. CBC-MAC)
> translates into an attack on the primitive (eg. AES) of the complexity stated
> in the theorem and proof. Attacks that one cannot find thus become irrelevant,
> and the conclusions one draws are in fact exactly the ones one expects and
> wants from the bounds.
>
> For example, the paper begins by talking of the belief that an attacker
> limited to 2^{100} operations and 2^{50} message blocks cannot break
> AES-CBC-MAC with probability more than 2^{-20}. Fact is that the assurance
> existing proofs provide about this claim is entirely unchanged by the contents
> and arguments of this paper.
>
> That is NOT the message you would get from the paper. The paper says ``but
> undermining these conjectures also means undermining all of the security
> arguments that have those conjectures as hypotheses.'' This is not true. I do
> not dispute that the conjectures made in some papers are false. But the
> conclusions that we want to draw, from a practical perspective, remain in most
> cases the same as prior to this paper.
>
> The exception is proofs not underlain by blackbox reductions, for example ones
> that exploit non-uniformity. But in the practice-oriented literature this
> situation is rare. (The only example I know is Bellare's 2006 proof for
> HMAC.) Even in this case, the proof and approach are not invalidated. It is
> just that the numerical guarantees one can conclude are not at as good as one
> would like.
>
> The paper discusses various possible ``remedies'' for the issue they raise.
> They fail to discuss the most obvious remedy, one that works and is already in
> use, namely that theorems make explicit the use of a blackbox reduction and
> entirely avoid the notation of maximum adversary advantage. Thus the theorem
> statement would be that there is an algorithm B, explicitly given in the
> proof, such that for any adversary A against the scheme, adversary B^A breaks
> the primitive, giving now the relation between the advantages of A and B^A,
> and their running times. Advantages here pertain to adversaries, not resource
> bounds, so there is no maximization. To avoid possible anamolies associated to
> proving existence of a weird B that one cannot find, it is stated that B is
> explicitly given in the proof, and if one wants to be pedantic, the code of B
> could be provided in the theorem statement. This paradigm was explicitly used
> in the past and now implicitly underlies theorem statements in the literature,
> and could simply be made more explicit. The submission does not discuss this,
> but instead discuss at length the deficiencies of numerous other possible
> remedies, leaving the false impression that the problem is insoluble.
>
> Another fact that one does not learn from the paper is that use of the
> offending notation is not so widespread, and in fact later papers have tended
> to adopt the paradigm outline above, in which theorems are stated directly in
> terms of adversary advantage, saying that given A we can build B so that the
> advantage of A is bounded above in a certain way by the advantage of B. In
> this case (given that the derivation of B from A is constructive and
> blackbox), the issue raised by the paper would not arise.
>
> I would state the point of the paper as follows. Let us suppose that the users
> of cryptographic proofs want to look only at theorem statements and ignore the
> proofs. Now if one wants to obtain practical bounds from the theorem
> statements, the maximum advantage notation should be avoided. Not because it
> is wrong, but because the concrete security bounds that emanate are weaker
> than what we would like and expect. This is because the maximum advantage
> notation inadvertently includes in its quantification attacks that can be
> proven to exist but cannot be efficiently found. These issues of language are
> important, but they should not displace the bottom line, which is that the
> bulk of proofs are still giving us the assurance of their stated bounds, and
> that there are simple ways to re-state theorems so that the problem
> disappears. The paper does not make this clear and indeed leaves a reader with
> the opposite impression. The result is to spread mud on provable security in a
> broad and unjustified way.
>
> This paper does make a good point and one that has surprisingly escaped
> attention for years. (The point relies on no more than Hellman's 1980 results
> which were known at the time BKR appeared.) And I believe our field is open
> to, and receptive of, criticism. But criticism should be part of a fair,
> complete and balanced picture. The paper does not provide this.
>
> Some might suggest that it is the job of others to provide the complete and
> balanced perspective lacking in this paper. I disagree. It is the job of any
> scientist, in critiquing, to provide such a perspective themsleves, at least
> to
> some, limited extent. Otherwise, the field is going to degnerate into a series
> of arguments.
>
> The technical content of the paper is disjoint from its message. As noted
> above, the conjectures they claim to disprove were already disproved by
> Hellman's 1980 results. The results of the paper are not needed to do this.
> The paper goes on to provide more proofs of existence of attacks that cannot
> actually be mounted, on various primitives, but why they matter is not clear.
>
> Section 2 of the paper, breaking AES, is 3 pages long, and contains nothing
> new. As the authors indicate at the end of the section, all the attacks they
> discuss have appeared before.
=======================
Our comments:
After receiving the Eurocrypt 2013 reviews we made huge changes to this
paper---the svn diff output from the Eurocrypt submission to the Crypto
submission is more than 2000 lines. We also added Q.27 pointing to our
anonymous page http://pastebin.com/BxQkkq6y with thorough comments on
the reviews.
This Crypto 2013 review is word-for-word identical to one of the
Eurocrypt 2013 reviews, except for the edits discussed below. It
includes many obsolete statements, misconceptions, and outright errors,
all handled or rebutted as part of our Crypto submission.
We've already posted our comments on the Eurocrypt review, and there's
no point in posting those comments again. Many parts of the review have
become even more absurd as statements about the Crypto submission than
they were as statements about the Eurocrypt submission, but elaborating
on this doesn't seem to serve much purpose, since it's clear that the
reviewer didn't even read Section 1 of the Crypto submission.
Let's just look at the edits that were made to the review. - means a
removed line, + means an added line.
>-(They do not cite the CRYPTO 2010 paper but they do refer to its 2009
>-ECCC version.)
Evidently the reviewer checked one bibliography entry, the one he had
complained about, to see that we had updated it.
>-The authors say that these attacks disprove certain conjectures made in
>+The authors' say that these attacks disprove certain conjectures made in
Typo---not a typo removed, but a typo added. Weird.
> That's true, and it is a good point.
>-But what are its implications?
>-Here the discussion of the paper is incomplete and unclear,
>-and as
>+But why does this matter?
>+This is where the story of the paper breaks down.
>+It fails to provide a clear and complete picture of the implications.
>+As
> a result the paper is misleading,
Irrelevant wording changes.
>-the proofs, not the theorem statements.
>+the proofs, not the theorem staments.
Another typo added. Maybe the Eurocrypt system did a spell check.
> But almost all proofs in the practice-oriented
>-literature provide uniform, blackbox reductions.
>+literature provide blackbox reductions.
Could discuss this tweak but the whole sentence is irrelevant---the
referee's central "most proofs are fine" message has never contradicted
anything in the paper. Section 1 now says quite explicitly that our
impression is that most proofs are fine.
>-The exception is proofs underlain by non-uniform complexity.
>+The exception is proofs not underlain by blackbox reductions, for example ones
>+that exploit non-uniformity.
Again could discuss; again not relevant.
>+The paper discusses various possible ``remedies'' for the issue they raise.
>+They fail to discuss the most obvious remedy, one that works and is already in
>+use, namely that theorems make explicit the use of a blackbox reduction and
>+entirely avoid the notation of maximum adversary advantage. Thus the theorem
>+statement would be that there is an algorithm B, explicitly given in the
>+proof, such that for any adversary A against the scheme, adversary B^A breaks
>+the primitive, giving now the relation between the advantages of A and B^A,
>+and their running times. Advantages here pertain to adversaries, not resource
>+bounds, so there is no maximization. To avoid possible anamolies associated to
>+proving existence of a weird B that one cannot find, it is stated that B is
>+explicitly given in the proof, and if one wants to be pedantic, the code of B
>+could be provided in the theorem statement. This paradigm was explicitly used
>+in the past and now implicitly underlies theorem statements in the literature,
>+and could simply be made more explicit. The submission does not discuss this,
>+but instead discuss at length the deficiencies of numerous other possible
>+remedies, leaving the false impression that the problem is insoluble.
A whole new paragraph, full of errors. The primary error is the claim
that this is a solution, and the secondary error is the claim that we
don't discuss this solution.
First of all, Section 1 states
We also recommend refactoring theorems to simplify further changes,
whether those changes are for even better accuracy or for other
reasons.
For example, we recommend factoring the main Bellare--Kilian--Rogaway
CBC-MAC security theorem into "(1) a lower-level theorem stating
Theorem. AdvPRF_{CBC^m(F)}(A) <= AdvPRP_F(UseCBC(A)) + q^2m^2/2^{l-1}
and (2) a cost comparison of UseCBC(A) with A." The details and analysis
of this recommendation, including historical credits to 2001 Stinson and
2006 Rogaway and other papers, occupy almost a full page in B.6.
The reviewer's "obvious remedy" is almost exactly this recommendation
from the paper. The only difference is that the reviewer, copying 2006
Rogaway, merges the two theorems into one theorem (stating definitions
of reductions such as UseCBC inside the theorem), while we factor the
theorems into two (stating definitions of reductions outside the
theorems). B.6 explains the advantages of this factorization; as far as
we know there are no disadvantages.
At this point it should be clear that the reviewer is completely wrong
in claiming that we "fail to discuss" this "remedy". Not only do we
discuss it, but we explicitly _recommend_ a more advanced version of it.
The bigger problem is that this is _not_, in fact, a solution for the
definitional problems raised in the paper. This error is already covered
in Q.20.
By the way, "remedies" is a fabricated quote; the paper does not say
"remedies" (or "remedy").
>-to state theorems
>+to adopt the paradigm outline above, in which theorems are stated
Irrelevant wording tweak.
>-the issue raised by the paper would not appear to arise.
>+the issue raised by the paper would not arise.
Minor increase in confidence level in a paragraph that's already solidly
demonstrated (by Appendix L) to be completely wrong.
>-That
>+Let us suppose that the users
>+of cryptographic proofs want to look only at theorem statements and ignore the
>+proofs. Now
> if one wants to obtain practical bounds from the theorem statements,
We already have detailed comments on the importance of theorems; no need
to repeat.
>-That we should, as a community, consider if there is some replacement.
>-(No convincing replacement emerges from the paper.)
Previously the reviewer was endorsing the idea of _fixing_ the problem.
The paper puts considerable effort into exactly this, but apparently the
reviewer never read any of that. Now the reviewer has even removed this
endorsement.
>+Not because it is wrong,
>+but because the concrete security bounds that emanate are weaker
>+than what we would like and expect.
Addressed in Q.19.
>+This is because the maximum advantage notation
>+inadvertently includes in its quantification
>+attacks that can be proven to exist but cannot be efficiently found.
As before, this is not just a matter of "notation"; there are important
differences between the _sets_ of attacks being considered. "Proven" is
irrelevant.
> the bulk of proofs are still giving us the assurance of their stated
>-bounds.
>+bounds, and that there are simple ways to re-state theorems so that the problem disappears.
See the "remedies" discussion above.
>-provide such a perspective themselves, at least to
>+provide such a perspective themsleves, at least to
Typo.
>-Otherwise, the field is going to degenerate into a series
>+Otherwise, the field is going to degnerate into a series
Typo.
>+The technical content of the paper is disjoint from its message.
This appears to be a claim that the paper is making a false or
unjustified statement. However, the reviewer fails to identify any such
statement. The paper itself quite clearly states what its message is---
Primary contribution of this paper. Our primary goal in this paper is
to convincingly undermine all of the standard security conjectures
reviewed above. Specifically, Sections 2, 3, 4, and 5 show---assuming
standard, amply tested heuristics---that there _exist_
high-probability attacks against AES, the NIST P-256 elliptic curve,
DSA-3072, and RSA-3072 taking considerably less than 2^128 time.
... Secondary contribution of this paper. Our secondary goal in this
paper is to propose a rescue strategy: a new way to define
security---a definition that restores, to the maximum extent
possible, the attractive three-part security arguments described
above.
---but this reviewer goes on and on attacking the strawman message that
there's a problem with many proofs. Sure, if the paper were saying that
there's a problem with many proofs then there would be a lack of
technical justification, but the paper _doesn't_ say any such thing.
>+As noted
>+above, the conjectures they claim to disprove were already disproved by
>+Hellman's 1980 results. The results of the paper are not needed to do this.
False. The reviewer's basic error here is discussed below, but let's
start with some examples showing how absurd the reviewer's position is.
This paper points to the Eurocrypt 1996 Bellare--Rogaway signatures
paper (which has been cited 807 times), the 2005 Bellare--Rogaway
"Introduction to modern cryptography" (cited 70 times), and the Crypto
2006 Bellare HMAC paper (cited 172 times), and in particular to various
"security" conjectures in those papers.
We say, and the reviewer seems to admit, that all of these conjectures
are false. Does the reviewer claim that this was obvious from the
previous literature? If so, how does the reviewer explain
* these papers making the false conjectures in the first place, and
* out of the ~1000 followup papers, not a single one before ours
stating that the conjectures are false?
The paper is quite clear in giving credit for what was actually
published before:
Our attacks on AES, NIST P-256, DSA-3072, and RSA-3072 use many
standard cryptanalytic techniques cited in Sections 2, 3, 4, and 5.
We introduce new cost analyses in all four sections, and new
algorithm improvements in Sections 3, 4, and 5; our improvements are
critical for beating 2^128 in Section 5. In Sections 2, 3, and 4 the
standard techniques were already adequate to (heuristically) disprove
the standard 2^128 concrete-security conjectures, but as far as we
know we were the first to point out these contradictions. We do not
think the contradictions were obvious; in many cases the standard
techniques were published decades _before_ the conjectures!
The reviewer
* takes one part of this---the fact that Hellman's algorithm is
adequate for our disproof of the 2^128 AES security conjecture;
* uses the passive voice to suppress our role here and to falsely
move the disproof back in time---claiming that the 2^128 security
conjecture was "already disproved by Hellman's 1980 results"; and
* ignores everything else that the paragraph says.
The reviewer seems to think that
(1) stating the attack,
(2) analyzing the speed of the attack, and
(3) observing that the attack violates a "security" conjecture
are all the same thing---but that's obviously incorrect; for example, #3
depends on how "security" is defined, while #1 and #2 don't. Note that
quite a few other reviewers seem to understand this distinction.
>+The paper goes on to provide more proofs of existence of attacks that cannot
>+actually be mounted, on various primitives, but why they matter is not clear.
AES, NIST P-256, DSA-3072, and RSA-3072 are the most commonly used
primitives believed to offer 2^128 security. The attacks in Sections
2--5 show that formal-definition "security" is below 2^128, but the
difference depends strongly on the system. This is interesting per se
and is also important for B.1.2.
>+Section 2 of the paper, breaking AES, is 3 pages long, and contains nothing
>+new.
False. As Section 1 states: "We introduce new cost analyses in all four
sections, and new algorithm improvements in Sections 3, 4, and 5." Both
Section 1 and Section 2 clearly state that
* the algorithms in Section 2 are not new, but
* some of the cost analyses in Section 2 are new.
Most importantly, the AT analyses are new, and are important for our
subsequent analyses of various ideas to fix the definitions.
>+As the authors indicate at the end of the section, all the attacks they
>+discuss have appeared before.
This is correct, but our statement "All of the attacks described here
have appeared before." most certainly does not justify this reviewer's
false "Section 2 ... contains nothing new" statement.
The number of errors piled up by this reviewer is truly astonishing.
========================================================================
Crypto 2013 initial review (labelled #3):
> I have sent comments to the authors before. I know they have read them, since
> I also read the authors' response at http://pastebin.com/BxQkkq6y
>
> None of this changed my opinions on the paper.
=======================
Our comments:
That web page quotes ten conflicting reviews---some positive, some
negative. This reviewer says he wrote _one_ of those reviews but doesn't
specify which. This creates an impressive level of ambiguity as to what
this reviewer's position actually is.
Crypto 2013 claims that it has a "rebuttal" process is giving authors
"the opportunity to comment on the initial reviews written on their
submissions". Obviously this reviewer _has_ comments for the authors,
but is quite effectively hiding those comments from us, denying us the
opportunity to comment and subverting the rebuttal process.
Of course, we have posted thorough comments on all of the reviews and
linked to those comments from the paper---but the formal rules say that
committee members aren't required to read these comments. This isn't
just a theoretical possibility; it's crystal clear that some of our
Asiacrypt/Eurocrypt/Crypto reviewers haven't even read through the
_paper_, never mind the comments.
Any self-respecting scientist, seeing that his review is out of whack
with other reviews, would feel obliged to investigate further and not
evade discussions. But the formal rules don't say this. We're faced with
at least two Crypto reviewers who don't seem to have read the paper
itself, and at least one who obviously hasn't read our comments, despite
the obvious conflicts between reviews.
Supposedly the rebuttal is different: the program chairs _do_ tell the
reviewers to read the rebuttal. Reviewers who ignore the rebuttal aren't
merely embarrassing themselves as scientists; they're also violating the
conference rules.
_This_ reviewer says that he has read our response to his earlier review
and that he isn't convinced. He has, however, sabotaged our opportunity
to convince the _other_ reviewers that this reviewer's comments are
wrong.
This review doesn't make even the slightest pretense of normalcy. The
reviewer was obliged to write, and failed to write, a proper scientific
review that states the reviewer's views of the submission. Even if this
had been merely a copy and paste of an earlier review, we would at least
have known the reviewer's position and had an opportunity to respond to
it in our formal rebuttal.
There are two other glaring problems with this review. First, it's nice
to hear that the reviewer read through http://pastebin.com/BxQkkq6y, but
there's a strong indication that the reviewer didn't actually read our
Crypto submission, a quite serious reviewing error. Specifically, even
though our Crypto submission is massively changed from our Eurocrypt
submission---see the link above to the svn diffs---the reviewer says
"the paper" as if the paper had been static throughout.
Second, the reviewer simply dismisses our response in toto. Given the
level of detail in our response to the reviewer's comments (whichever
comments those were), normal scientific standards oblige the reviewer to
address our response at a similar level of detail.