
The second part is taught by

Andreas Hülsing and
Tanja Lange

Coding Theory and Cryptology

Eindhoven Institute for the Protection of Systems and Information

Department of Mathematics and Computer Science

Technische Universiteit Eindhoven

P.O. Box 513

5600 MB Eindhoven

Netherlands

The easiest way to reach Andreas and
Tanja is by e-mail:

The webpage for the first part is http://homepages.cwi.nl/~stevens/mastermath/2019/. The second part will be taught on March 26 and from April 9 onwards by Andreas Hülsing and Tanja Lange.

We will cover only the very basics of quantum algorithms. I recommend you take Ronald de Wolf's Quantum Computing course (also Mastermath) for much more information. That course runs Monday afternoons at the UvA.

Details are filled in as time permits. The official home page is
here.

The course takes place Tuesday 10:00 - 12:45 at the University of Utrecht,
room BBG 017.

**Goal**

The goal of this course is to provide insight into cryptography secure
against quantum computers (post-quantum cryptography) as well as
various methods for the mathematical cryptanalysis of cryptographic
systems.

Cryptology deals with mathematical techniques for design and analysis of algorithms and protocols for digital security in the presence of malicious adversaries. For example, encryption and digital signatures are used to construct private and authentic communication channels, which are instrumental to secure Internet transactions.

This course in cryptology consists of two main topics:

The first part of the course will cover various generic attacks against common cryptographic primitives (e.g., block ciphers, hash functions) and cover important cryptanalytic attack techniques like time-memory tradeoffs, linear cryptanalysis, differential cryptanalysis and algebraic cryptanalysis.

The second part focuses on post-quantum cryptography: the design and analysis of cryptographic systems that remain secure even given the existence of quantum computers.

After a general introduction to cryptography (the constructive side of cryptology) and cryptanalysis, the course introduces the main contenders for secure systems: lattice-based encryption, code-based encryption, hash-based signatures, and multivariate-quadratic-based signatures. These systems are examples of public-key systems and this is the main area affected by quantum computers; symmetric-key systems, such as hash functions and block and stream ciphers, are used as building blocks inside them and for the transmission of data in bulk.

The lecture on 16 April will start at 11:00 instead of 10:00 and end (as always) at 12:45.

There will be no lecture on 21 May.

There will be a written exam on June 11, 10:00-13:00, BOL 0.204 (change of room) at University of Utrecht. The retake will be done as oral exams on 4 July in Eindhoven.

Here are some test exercises.

The exercises for the first exam 2015 are here. I will not make solutions available for my part but you can email me what you think is the correct solution and I'll give you feedback.

The exercises for the second exam 2015 are here.

The exercises for the first exam 2017 are here.

The exercises for the second exam 2017 are here.

The exercises for the first exam 2019 are here.

There are a few books covering topics we touch in class but there is
no 'textbook' that covers everything. The following list will be
updated as the course proceeds. For general background on
cryptography the Handbook
of Applied Cryptography by
Menezes, van Oorschot, and Vanstone
has everything you wanted to know
(and more); you can download the chapters freely from the web.

Lots of references on post-quantum cryptography are collected at
pqcrypto.org.

A (only slightly outdated) reference on post-quantum crypto is
Post-Quantum Cryptography, edited by Bernstein, Buchmann, and Dahmen.
Some sample pages are available on the Springer page and
the authors of the chapters are free to host their chapters on their
personal webpages.

It is not necessary to purchase any book to follow the course.

**26 Mar 2019**

Lecture by Andreas on motivation for post-quantum cryptography and quantum
algorithms.

Slides of the first part are available; we got up to the section heading 'post-quantum cryptography'. We covered what part of today's cryptography breaks when facing quantum adversaries and why it is necessary to switch to cryptography that resists such adversaries now. Then I used a slide set by Daniel J. Bernstein to introduce quantum algorithms in a slightly different way than the usual physics notation. We covered NOT, CNOT, Toffoli, and Hadamard gates, discussed how to build more complex circuits out of those, and eventually looked at Simon's and Grover's algorithms.
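As a small extra illustration (not from the lecture; a numpy sketch under the linear-algebra view of circuits), the gates act as matrices on state vectors, and a Hadamard followed by a CNOT already produces an entangled state:

```python
import numpy as np

# Gates as unitary matrices acting on state vectors of length 2^n;
# amplitudes are indexed by the basis states |00>, |01>, |10>, |11>.
X = np.array([[0, 1], [1, 0]])                  # NOT
H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)    # Hadamard
CNOT = np.array([[1, 0, 0, 0],
                 [0, 1, 0, 0],
                 [0, 0, 0, 1],
                 [0, 0, 1, 0]])                 # controlled NOT

state = np.kron([1, 0], [1, 0])                 # |00>
state = np.kron(H, np.eye(2)) @ state           # Hadamard on the first qubit
state = CNOT @ state                            # now (|00> + |11>)/sqrt(2)
flipped = np.kron(np.eye(2), X) @ state         # NOT on the second qubit
print(state, flipped)
```

Building the two-qubit operators with Kronecker products is exactly the "compose larger circuits from small gates" step; Toffoli works the same way as an 8x8 matrix.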

As exercise you are asked to redo the example for Simon's algorithm with paper and pen for yourself in your preferred notation.
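For the classical half of Simon's algorithm the following Python sketch may help. The quantum circuit itself is not simulated; only its measurement statistics are mimicked (each run returns a uniformly random y with y·s = 0 mod 2), and for this toy size s is recovered by brute-forcing the orthogonality conditions instead of Gaussian elimination over GF(2):

```python
import random

n = 4
s = 0b1011          # the hidden period (unknown to the algorithm)
random.seed(5)

def parity(x):      # GF(2) dot product helper: parity of the bits of x
    return bin(x).count("1") % 2

def quantum_sample():
    """Stand-in for one run of Simon's quantum circuit: returns a
    uniformly random y with y . s = 0 (mod 2)."""
    while True:
        y = random.randrange(2 ** n)
        if parity(y & s) == 0:
            return y

# Collect samples; with high probability they span the hyperplane
# orthogonal to s, and then s is the unique nonzero vector orthogonal
# to all of them.
ys = {quantum_sample() for _ in range(32)}
candidates = [t for t in range(1, 2 ** n)
              if all(parity(y & t) == 0 for y in ys)]
print(candidates)   # should contain exactly the hidden s
```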

Further reading:

- A nice discussion of Simon's algorithm in the usual Dirac notation can be found here.
- Simon's algorithm can be used to break secret key cryptography if we run the cryptography on a quantum computer (see this paper). Note, this is a far stronger model than post-quantum cryptography where we assume honest users use conventional computers. However, it nicely demonstrates the power of quantum algorithms.

**09 Apr 2019**

Short introduction to coding theory and code-based crypto (linear
block codes, length n, dimension k, minimum distance d=2t+1, Hamming
weight, Hamming distance, generator matrix, parity-check matrix,
c=mG, H(c+e)=He).

General decoding problem is NP-complete; this seems to hold also on
average (but no proof there). McEliece crypto system (1978) uses this
to encrypt. Public key is generator matrix G'=SGP, where G is the
generator matrix of a code with a good decoding algorithm, and
invertible matrix S and permutation matrix P are chosen randomly to
hide the nice structure. The secret key consists of G, S, and P.
Bob encrypts m as y=mG'+e using some random error e of weight t.
An attacker Eve is confronted with decoding a random(-looking) code
G'. Alice decrypts by computing yP^-1 and then using her secret
decoding algorithm to recover mS (note that e and eP^-1 have the
same weight t). Since S is invertible, Alice recovers m.
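To make these equations concrete, here is a toy sketch in Python/numpy with the [7,4] Hamming code (so t=1) standing in for a Goppa code; it shows only the mechanics of G'=SGP, y=mG'+e, and decryption, and says nothing about security at this tiny size:

```python
import numpy as np

rng = np.random.default_rng(1)

# [7,4] Hamming code in systematic form; minimum distance 3, so t = 1
G = np.array([[1,0,0,0,0,1,1],
              [0,1,0,0,1,0,1],
              [0,0,1,0,1,1,0],
              [0,0,0,1,1,1,1]])
H = np.array([[0,1,1,1,1,0,0],
              [1,0,1,1,0,1,0],
              [1,1,0,1,0,0,1]])

S = np.array([[1,1,0,0],[0,1,1,0],[0,0,1,1],[0,0,0,1]])     # invertible scrambler
Sinv = np.array([[1,1,1,1],[0,1,1,1],[0,0,1,1],[0,0,0,1]])
assert (S @ Sinv % 2 == np.eye(4, dtype=int)).all()

P = np.eye(7, dtype=int)[rng.permutation(7)]                # permutation matrix
Gpub = S @ G @ P % 2                                        # public key G' = SGP

def encrypt(m):
    e = np.zeros(7, dtype=int)
    e[rng.integers(7)] = 1                                  # random error of weight t=1
    return (m @ Gpub + e) % 2                               # y = mG' + e

def decode(y):
    """Syndrome decoding for the Hamming code: correct one error,
    then read the message off the systematic positions."""
    syn = H @ y % 2
    if syn.any():
        y = y.copy()
        y[np.flatnonzero((H.T == syn).all(axis=1))[0]] ^= 1
    return y[:4]

def decrypt(y):
    mS = decode(y @ P.T % 2)        # undo P; e P^-1 still has weight 1
    return mS @ Sinv % 2            # undo S

m = np.array([1, 0, 1, 1])
assert (decrypt(encrypt(m)) == m).all()
```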

Next week we'll deal with attacks, already note that we don't use this
to encrypt messages but use it to transmit a key, Hash(m), which is
then used with a symmetric cipher to encrypt the actual message.

We then did a detailed investigation of how Goppa codes work and why
they have dimension at least n-tm and minimum distance at least 2t+1.
Take a look at the slides for the nice representation of
the parity-check matrix and the decoding algorithm.
Michiel Marcus has kindly agreed to make his report available, which gives
a nicely worked out explanation of Goppa codes and the McEliece
system. He also wrote a Jupyter notebook which you can download and use in
Sage. Some of the material
on my slides and Michiel's text will only be covered next week.

For further reading take the overview chapter by Overbeck and Sendrier
on coding theory from the book on post-quantum crypto. The pdf is
available here. You
should also be able to download the full Springer version via your
university account.
Sendrier also lectured in a summer school at TU/e; you can
find the slides and exercises at the
PQCRYPTO Summer School on Post-Quantum Cryptography 2017. Doing these exercises and the coding-theory
ones from prior exams gives you a good overview of the interesting
problems. Note that attacks and Niederreiter's system will be covered
next week.

Here are photos
from the lecture.

**16 April 2019**

Niederreiter view on encryption.
Niederreiter's system uses the parity-check matrix instead of a
generator matrix. Instead of using the parity-check matrix H of the
easy-to-decode code we pick a random invertible (n-k)x(n-k) matrix S,
and nxn permutation matrix P and put K=SHP. K is the public key and
S and P together with a decoding algorithm for H form the private key.
For codes that are suitable for code-based crypto, K looks like a
random matrix. Given the syndrome s=Ke a user knowing S computes
S^-1s=S^-1(SHP)e=H(Pe). Pe has the same weight as e (because P is a
permutation matrix) so this can be decoded by the efficient decoding
algorithm for H. Then applying P^-1 gives e.
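The same toy Hamming-code setup as for McEliece can be used to sketch the Niederreiter flow K=SHP, s=Ke (Python/numpy, t=1, purely to show the mechanics):

```python
import numpy as np

rng = np.random.default_rng(3)

# parity-check matrix of the [7,4] Hamming code (t = 1)
Hm = np.array([[0,1,1,1,1,0,0],
               [1,0,1,1,0,1,0],
               [1,1,0,1,0,0,1]])

S = np.array([[1,1,0],[0,1,1],[0,0,1]])      # invertible (n-k)x(n-k) scrambler
Sinv = np.array([[1,1,1],[0,1,1],[0,0,1]])
assert (S @ Sinv % 2 == np.eye(3, dtype=int)).all()

P = np.eye(7, dtype=int)[rng.permutation(7)] # n x n permutation matrix
K = S @ Hm @ P % 2                           # public key K = SHP

def encrypt(e):                              # the plaintext is a weight-t vector e
    return K @ e % 2                         # syndrome s = Ke

def decrypt(s):
    HPe = Sinv @ s % 2                       # S^-1 s = H(Pe)
    Pe = np.zeros(7, dtype=int)
    Pe[np.flatnonzero((Hm.T == HPe).all(axis=1))[0]] = 1   # decode the weight-1 error
    return P.T @ Pe % 2                      # apply P^-1 to recover e

e = np.zeros(7, dtype=int)
e[4] = 1
assert (decrypt(encrypt(e)) == e).all()
```

Note how the ciphertext is a syndrome of length n-k rather than a full length-n vector, which is what makes the Niederreiter view attractive for KEMs.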

KEM/DEM paradigm (which means that using
the weight-t encoding in Niederreiter is not a problem); discussion on
how we can skip applying P and S when using Niederreiter with Goppa codes
if we use systematic form.

Using the slides I
briefly covered how to decode Goppa codes. Note that for the XGCD we use
that in computing gcd(v,g)=f*v+l*g
we start with g=0*v+1*g and v=1*v+0*g and slowly decrease the
left-hand side while increasing the coefficients on the right-hand
side. In the algorithm we stop once we reach a state of a=b*v+h*g with
a and b of degree bounded by t/2 and (t-1)/2; taken modulo g this
gives the desired equation and thus σ.
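The early-abort XGCD can be sketched as follows. This toy works over a small prime field rather than the GF(2^m) arithmetic used with real Goppa codes, and g and v below are arbitrary stand-ins for the Goppa polynomial and syndrome polynomial, but the stopping rule (return the first remainder of degree at most t/2 together with its cofactor of v) is the same:

```python
# Polynomials over GF(P) as coefficient lists, constant term first.
P = 101

def deg(a):
    d = len(a) - 1
    while d >= 0 and a[d] == 0:
        d -= 1
    return d

def polmul(a, b):
    c = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            c[i + j] = (c[i + j] + ai * bj) % P
    return c

def polmod(a, g):
    a = a[:]
    while deg(a) >= deg(g):
        sh = deg(a) - deg(g)
        c = a[deg(a)] * pow(g[deg(g)], -1, P) % P
        for i in range(deg(g) + 1):
            a[i + sh] = (a[i + sh] - c * g[i]) % P
    return a[:deg(g)]

def partial_xgcd(v, g, stop):
    """Run xgcd(v, g), maintaining r = b*v + h*g throughout, but abort
    as soon as deg(r) <= stop; returns (a, b) with a = b*v mod g."""
    r0, b0 = g[:] + [0], [0] * (len(g) + 1)
    r1, b1 = v[:] + [0] * (deg(g) - deg(v) + 1), [1] + [0] * len(g)
    while deg(r1) > stop:
        while deg(r0) >= deg(r1):
            sh = deg(r0) - deg(r1)
            c = r0[deg(r0)] * pow(r1[deg(r1)], -1, P) % P
            for i in range(deg(r1) + 1):
                r0[i + sh] = (r0[i + sh] - c * r1[i]) % P
            for i in range(len(b1) - sh):
                b0[i + sh] = (b0[i + sh] - c * b1[i]) % P
        r0, b0, r1, b1 = r1, b1, r0, b0
    return r1, b1

t = 4
g = [1, 2, 0, 0, 0, 0, 1]        # stand-in for a degree-6 Goppa polynomial
v = [3, 1, 4, 1, 5, 9]           # stand-in syndrome polynomial, deg v < deg g
a, b = partial_xgcd(v, g, t // 2)
assert deg(a) <= t // 2
assert polmod(a, g) == polmod(polmul(b, v), g)   # indeed a = b*v mod g
```

For genuine syndromes of weight-at-most-t errors the cofactor b additionally has degree at most (t-1)/2, which is what makes the returned pair the unique solution of the key equation.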

Then I moved on to explain how to find words of small weight in a code
and covered brute force search and algorithms by Prange,
Lee-Brickell, Leon, and Stern.
We didn't do the running time analysis for Leon or Stern.
I gave some motivation why the improvements are improvements but didn't
trace all the binomial coefficients. Here is a taste of this. I will
denote the binomial coefficient n choose k by C(n,k) to save my sanity
in typing this.

There are C(n,t) ways the t error positions can be distributed
over the n positions. When we select the n-k columns to make a
left-right split and to transform the right-hand side into the
identity matrix we determine whether step 2 can be successful. On
that level, the likelihood of a selection to work changes from
C(k,p)*C(n-k,t-p)/C(n,t) for Lee-Brickell to C(k,p)*C(n-k-z,t-p)/C(n,t)
for Leon (because there are p columns chosen out of the k columns on
the left, and t-p of those on the right; for Leon we in addition
insist on z positions being zero in return for a cheaper
step). For Stern the probability is (C(k/2,p))^2*C(n-k-z,t-2p)/C(n,t).
Take a look at the original Stern paper or
our paper reporting on our 2008 attack.
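These success probabilities are easy to evaluate numerically; the following Python snippet does so for McEliece's original parameters n=1024, k=524, t=50 (the choices p=2 and z=20 are merely illustrative, not optimized):

```python
from math import comb

def C(n, k):
    return comb(n, k) if 0 <= k <= n else 0

n, k, t = 1024, 524, 50           # McEliece's original 1978 parameters

def prange():                     # Lee-Brickell with p = 0: error-free selection
    return C(n - k, t) / C(n, t)

def lee_brickell(p):
    return C(k, p) * C(n - k, t - p) / C(n, t)

def leon(p, z):
    return C(k, p) * C(n - k - z, t - p) / C(n, t)

def stern(p, z):
    return C(k // 2, p) ** 2 * C(n - k - z, t - 2 * p) / C(n, t)

for name, pr in [("Prange", prange()), ("Lee-Brickell", lee_brickell(2)),
                 ("Leon", leon(2, 20)), ("Stern", stern(2, 20))]:
    print(f"{name:12s} success probability {pr:.3e}, "
          f"expected iterations {1 / pr:.3e}")
```

Keep in mind that the per-selection success probability is only half the story: Leon and Stern trade probability for a much cheaper check per iteration, so the expected iteration counts must be weighted by the cost of one iteration.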

I did not cover the attacks which abuse a decryption oracle
(Berson and Sloppy Alice); take a look at the slides to understand
the exercises.

- Henk van Tilborg made his book Coding Theory available.
- Christiane Peters wrote her PhD thesis on curves (not post-quantum) and code-based crypto in 2011. You can download her thesis from her publications page.

Here are photos
from the lecture.

**23 April 2019**

A lattice is a discrete subgroup of R^n, usually given by some
basis B={v1, v2, ..., vn}. Then L={Σ aivi | ai in Z}.
An *integer lattice* is one where all vi are vectors in
Z^n. In
cryptography one is usually given a very skewed basis and the
computational task is to find a nice basis in which the vectors are
relatively short and
close to orthogonal. If the coefficients ai were in R rather than in Z we
could do this easily with the Gram-Schmidt orthonormalization
algorithm, but for lattices of sufficiently high dimension it is even
hard to find a vector of shortest length in a lattice (Shortest-Vector
Problem (SVP)). SVP and the related problem of finding a closest lattice
point to a given point in R^n, the Closest Vector Problem (CVP) form
the basis of several lattice-based cryptosystems. See the slides
(links below) for pictures.
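To see why basis quality matters, here is a small numpy sketch (a toy 2-dimensional example, with basis vectors as rows) of Babai's rounding approach to approximate CVP using a good and a skewed basis of the same lattice:

```python
import numpy as np

B_good = np.array([[1, 0], [0, 3]])      # short, nearly orthogonal basis
U = np.array([[4, 1], [3, 1]])           # unimodular: det U = 1
B_bad = U @ B_good                       # same lattice, skewed basis [[4,3],[3,3]]

def babai_round(B, t):
    """Approximate CVP: write t in coordinates w.r.t. the rows of B,
    round each coordinate to an integer, and map back to the lattice."""
    return np.rint(t @ np.linalg.inv(B)) @ B

t = np.array([0.4, 2.6])
good = babai_round(B_good, t)            # finds the truly closest lattice point
bad = babai_round(B_bad, t)              # a lattice point, but farther from t
print(good, bad)
```

With the good basis the rounding regions are nice boxes around the lattice points; with the skewed basis they are long thin parallelepipeds, so rounding frequently returns a point that is not closest. In cryptographically relevant dimensions this gap becomes enormous.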

Goldreich, Goldwasser, and Halevi showed how to build an encryption and a signature system on top of CVP. Each signature leaks a bit of information on the secret (good) basis; with enough signatures one can easily see the fundamental parallelepiped (the set of points Σ aivi with ai in (-0.5,0.5]). For details on an optimized version of the attack see Learning a Parallelepiped: Cryptanalysis of GGH and NTRU Signatures by Phong Nguyen and Oded Regev.

Finally I explained how NTRU encryption works, why reductions modulo q and 3 can be compatible, and how finding the secret key is related to finding a short vector. The NTRU system has two general public parameters, namely positive integers N and q where gcd(3,q)=1 and q is much larger than 3. Toy parameters are q=64, N=107; the low end of reasonable is q=128 and N=503. For hand calculations you can use q=101 and N=7.

All computations take place in the ring R=Z[x]/(x^N-1), i.e. all elements are represented by polynomials of degree <N. Some computations additionally reduce modulo 3 or modulo q.

The private key of user Alice is a polynomial f(x) in R which satisfies that f is invertible in R/3=(Z/3)[x]/(x^N-1) and in R/q=(Z/q)[x]/(x^N-1).

To generate her public key, Alice picks a polynomial g(x) in R and computes f_3=f^{-1} in R/3, f_q=f^{-1} in R/q, and h=3g*f_q in R/q. Alice's public key is h along with the public parameters q and N.

To encrypt a message m(x), an element of R with coefficients in the small interval [-1,1], to a user with public key h, take a random polynomial r(x) in R and compute c=r*h+m in R/q.

To decrypt ciphertext c in R/q use private key f and compute a=f * c in R/q, choosing coefficients in [-(q-1)/2,(q-1)/2]. [If you're a mathematician, lift a to R, i.e. forget about the reduction modulo q]. Then compute m'=a* f_3 in R/3, taking coefficients from [-1,1].

The idea here is that the computation modulo 3 removes the first part, so that we are left with f*f_3*m, which is congruent to m modulo 3, because f_3 is the inverse of f modulo 3. However, this mixes computations modulo q and 3, which usually is a bad idea. The reason that this is still likely to work is that the first part is not just some multiple of 3: h=3g*f_q, and thus the first part of a(x) equals f*r*h=r*f*f_q*3g, which is equivalent to r*3g modulo q. Now, if r and g are sparse (have very few nonzero coefficients), the existing coefficients are small, and 3 is small (all relative to q), then it is very likely that r*3g in R (i.e. without reduction) equals r*3g with reduction to coefficients in [-(q-1)/2,(q-1)/2], which means that all coefficients of this part are indeed divisible by 3 and thus the first part vanishes modulo 3.

NTRU chooses to have all random polynomials have coefficients in {-1,0,1} and to limit the number of nonzero coefficients. Let Lf be the set of polynomials in R with df coefficients equal to 1, df-1 coefficients equal to -1, and the remaining coefficients equal to 0. The secret key f is chosen to be an element of Lf. Similarly let Lg (and Lr) be the sets of polynomials with exactly dg (and dr) coefficients equal to 1, the same number equal to -1, and the rest equal to 0. For the parameters with N=107 choose df=15, dg=12, and dr=5; for N=503 choose df=216, dg=72, and dr=55.
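Putting the pieces together, here is a toy NTRU round trip in plain Python with the hand-calculation parameters N=7, q=101 from above. Both moduli are prime here so that inverses can be computed with a polynomial extended Euclidean algorithm; real parameter sets use a power-of-2 q and much larger N, and the tiny weights chosen below are illustrative only:

```python
import random

N, Q = 7, 101              # tiny hand-calculation parameters
random.seed(4)

def conv(a, b, m):         # multiplication in (Z/m)[x]/(x^N - 1)
    c = [0] * N
    for i in range(N):
        for j in range(N):
            c[(i + j) % N] = (c[(i + j) % N] + a[i] * b[j]) % m
    return c

def center(a, m):          # coefficients represented in [-(m-1)/2, (m-1)/2]
    return [((x + (m - 1) // 2) % m) - (m - 1) // 2 for x in a]

def inverse(f, p):         # inverse of f in (Z/p)[x]/(x^N - 1), p prime, or None
    def deg(a):
        d = len(a) - 1
        while d >= 0 and a[d] == 0:
            d -= 1
        return d
    L = 2 * N + 2          # generous padding for all intermediate polynomials
    r0 = [p - 1] + [0] * (N - 1) + [1] + [0] * (L - N - 1)   # x^N - 1 mod p
    r1 = [x % p for x in f] + [0] * (L - N)
    b0, b1 = [0] * L, [1] + [0] * (L - 1)
    while deg(r1) > 0:     # extended Euclid, maintaining r = b*f mod (x^N - 1)
        while deg(r0) >= deg(r1):
            sh = deg(r0) - deg(r1)
            c = r0[deg(r0)] * pow(r1[deg(r1)], -1, p) % p
            for i in range(deg(r1) + 1):
                r0[i + sh] = (r0[i + sh] - c * r1[i]) % p
            for i in range(L - sh):
                b0[i + sh] = (b0[i + sh] - c * b1[i]) % p
        r0, b0, r1, b1 = r1, b1, r0, b0
    if deg(r1) != 0:
        return None        # gcd is non-constant: f is not invertible
    c = pow(r1[0], -1, p)
    return [c * x % p for x in b1[:N]]

def sparse(ones, minus):   # random ternary polynomial with the given weights
    a = [0] * N
    idx = random.sample(range(N), ones + minus)
    for i in idx[:ones]:
        a[i] = 1
    for i in idx[ones:]:
        a[i] = -1
    return a

# key generation: f must be invertible mod 3 and mod Q
while True:
    f = sparse(2, 1)
    f3, fQ = inverse(f, 3), inverse(f, Q)
    if f3 is not None and fQ is not None:
        break
g = sparse(2, 2)
h = [3 * x % Q for x in conv([y % Q for y in g], fQ, Q)]    # h = 3*g*f_q in R/Q

# encryption: c = r*h + m in R/Q
m = [1, 0, -1, 1, 0, 0, -1]
r = sparse(1, 1)
c = [(x + y) % Q for x, y in zip(conv([y % Q for y in r], h, Q), m)]

# decryption: a = f*c in R/Q with centered coefficients, then m' = a*f_3 in R/3
a = center(conv([y % Q for y in f], c, Q), Q)
m_prime = center(conv([y % 3 for y in a], f3, 3), 3)
assert m_prime == m
```

With polynomials this sparse the coefficients of 3*r*g + f*m stay far below (q-1)/2, so the centered lift of a is exact and decryption cannot fail here.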

I showed how to estimate sizes so that one knows how likely decryption failures are. You can check that this is usually OK for the proposed parameters but that failures can happen.
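The estimate can also be checked experimentally; the following Python sketch samples random f, g, r, m for the toy parameters N=107, q=64 and measures how large the coefficients of 3*r*g + f*m (the quantity that must not wrap modulo q) actually get:

```python
import random

N, q = 107, 64
df, dg, dr = 15, 12, 5
random.seed(0)

def sparse(ones, minus):
    a = [0] * N
    idx = random.sample(range(N), ones + minus)
    for i in idx[:ones]:
        a[i] = 1
    for i in idx[ones:]:
        a[i] = -1
    return a

def conv(a, b):                    # multiplication in Z[x]/(x^N - 1), over Z
    c = [0] * N
    for i in range(N):
        if a[i]:
            for j in range(N):
                c[(i + j) % N] += a[i] * b[j]
    return c

trials, fails, worst = 200, 0, 0
for _ in range(trials):
    f, g, r = sparse(df, df - 1), sparse(dg, dg), sparse(dr, dr)
    m = [random.choice((-1, 0, 1)) for _ in range(N)]
    a = [3 * x + y for x, y in zip(conv(r, g), conv(f, m))]
    big = max(abs(x) for x in a)
    worst = max(worst, big)
    fails += big > q // 2          # a coefficient wrapped: decryption failure
print(f"largest coefficient: {worst}, bound ~q/2 = {q // 2}, "
      f"failures: {fails}/{trials}")
```

Typically the largest coefficient stays well under q/2, matching the claim that failures are possible but rare for these parameters.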

I showed how to turn finding g and f into a lattice problem; this
means that all the lattice attacks apply.
You can find some
other attacks on NTRU in the first part of the
slide set. I will cover BLISS and general lattice
attacks next week.

I also showed some slides from the latticehacks talk I gave with Daniel J. Bernstein and Nadia Heninger at 34C3. Slides
are posted at the link above. There is also a video for the talk; I suggest
stopping many times while watching to do some examples with Sage.
We will not cover attacking RSA with lattices (though it's fun!) and, as
said above, will cover algorithms for SVP next week.

Here are photos from the lecture.

**30 April 2019**

I started by repeating that the unimodular matrix U should be on the left,
rather than on the right. In general, papers and books vary in whether they
put the basis vectors as rows or columns, so watch out, but since I put them
as rows it should be B'=U*B. In the explanation of why NTRU can be broken
using lattice algorithms one can make the resulting vector smaller by taking
H to be the multiplication-by-h/3 matrix.

I showed slides 36 - 52 from Thijs Laarhoven's lec1.pdf slides for a more professional animation of the GGH attack. Take a look at the slides before this to see the effect of the basis on rounding. Of course, you should think of this in many more dimensions.

I briefly mentioned Lyubashevsky's lattice-based signature scheme and that it uses rejection sampling to hide a secret distribution. I then showed some simplified version of BLISS using slides 14-18 of my slides. To see the full details, check out the BLISS paper.

One of the oldest methods to find short vectors is LLL, named after A. Lenstra, H. Lenstra, and L. Lovász. The algorithm finds a short vector that is at most a factor d longer than a shortest one. This approximation factor d is important when checking whether a system is secure: the secret short vector must be such that the algorithm is more likely to find other vectors instead. See pages 18 and 19 of my slides for details on LLL.
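As a two-dimensional taste of what reduction achieves, here is a Python sketch of the Lagrange-Gauss algorithm, the dimension-2 special case that LLL generalizes (the input basis is a toy example, not from the lecture):

```python
import numpy as np

def lagrange_gauss(u, v):
    """Reduce a 2-dimensional lattice basis: returns a basis of the
    same lattice in which u is a shortest nonzero vector."""
    u, v = np.array(u, dtype=float), np.array(v, dtype=float)
    if u @ u > v @ v:
        u, v = v, u
    while True:
        m = round((u @ v) / (u @ u))    # nearest-integer size reduction
        v = v - m * u
        if v @ v >= u @ u:              # swapping no longer helps: done
            return u, v
        u, v = v, u

# the skewed basis [[4, 3], [3, 3]] spans the same lattice as [[1, 0], [0, 3]]
u, v = lagrange_gauss([4, 3], [3, 3])
print(u, v)
```

In dimension 2 this finds a genuinely shortest vector; LLL relaxes the swap condition so that the procedure still terminates quickly in high dimension, at the price of the approximation factor discussed above.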

To illustrate the working of LLL I showed slides 63 - 67 of Thijs Laarhoven's lec1.pdf (see above). Open the slides in acroread and click on the arrows to play. BKZ finds shorter vectors than LLL by solving the SVP exactly on blocks of smaller dimension.

I showed slides 56 - 58 of the latticehacks.pdf slides to explain how to find shortest vectors by enumeration. Note that the search tree is exponentially large in the lattice dimension. Pruning reduces the success probability but also reduces the cost, so much that even "extreme pruning" has been proposed and is now used.

I used Thijs Laarhoven's lec2.pdf slides (starting with slide 69) to explain sieving algorithms for computing shortest vectors. This brings you all the way to research results (by him) from a paper that appeared at Crypto 2015. There are some more improvements in more recent papers.

Here are photos from the lecture.

Further reading: I mentioned the page on Lattice challenges including a page on SVP challenge.

The lattices used in cryptography are often not general lattices but
carry more structure. E.g. the lattice in NTRU comes from computing in
the polynomial ring modulo x^N-1. Such *ideal lattices* might be
easier to break than general lattices and there is a separate ideal
lattice challenge.

The PQCRYPTO summer school produced lots of slides, videos and exercises on lattices. Take a look here.

Short summary of elliptic curves over finite fields. Please look up
the group operation so that you can actually add points. I focused on
explaining concepts in class. We covered the structure of the
l-torsion group: for l coprime to the characteristic this is
isomorphic to Z/l x Z/l, else it depends on whether the curve is
ordinary or supersingular. In the former case the structure is
isomorphic to Z/l, else E[l] is just the point at infinity.

An
isomorphism between two curves is an invertible group homomorphism
between them; for curves in short Weierstrass form this amounts to
scaling the coordinates. The j-invariant is an invariant of
isomorphism classes. An isogeny between two curves is a non-constant
homomorphism. This map is typically not injective and maps l-to-1 for
some l; the dual isogeny of such an l-isogeny maps in the opposite
direction so that their composition corresponds to the
multiplication-by-l map on the first curve. Tate's theorem says that
two curves over a field F_{q} are isogenous if and only if
they have the same number of points. Note that the image of the isogeny
is a subset of the image curve and the kernel is a group of points of
order dividing l. Every cyclic subgroup of an elliptic curve defines
an isogeny; explicit formulas are given by Vélu.

We covered the CRS key exchange, first abstractly using the action of a commutative group on a set and then giving some intuition of how to use it for isogenies acting on the set of elliptic curves. To give a proper definition we need that the isogenies are in one-to-one correspondence to ideal classes in the ideal class group of the endomorphism ring of the curve. We discussed the structure of the l-isogeny graph and how these form volcanoes for ordinary curves. On the crater we have a sense of direction and the CRS system works on the overlay of isogeny graphs for different primes l_i. For many more details and really nice pictures of how the different isogeny graphs overlay see these slides by Chloe Martindale and Lorenz Panny, which also explain the rest of CRS and our new CSIDH scheme. Take a look at https://csidh.isogeny.org for more information on CSIDH.

Some students stayed a bit longer and I explained the volcano part in more detail for l=3; you can see this on the third photo.

I wrote an overview paper on isogenies; the plan was to cover all of this and I would still recommend reading the SIDH part; please let me know if you find errors. Here are photos from the lecture.

The PQCRYPTO summer school produced slides, videos and exercises on isogenies. Take a look here. If you are less familiar with elliptic curves do the first exercises. Note that some part of this is on SIDH which we did not cover.

**14 May 2019**

Lecture given by Andreas Hülsing.

Hash-based signatures make minimal security assumptions in the sense that they
only require a secure hash function and no additional mathematical hard problem
like SIS, LWE, RSA, or DL. It can be shown that secure schemes can be built
from hash functions that are resistant
against preimage and second-preimage attacks (and some variants thereof).
It is easy to build a
one-time signature from these functions. To extend to multiple signatures
Merkle trees and trees of Merkle trees are used.

Andy covered Lamport and Winternitz one-time signatures and showed how to build the stateful signature scheme XMSS and the stateless signature scheme SPHINCS.
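A minimal Lamport one-time signature sketch in Python (SHA-256 as the hash and 256 secret-value pairs; purely illustrative choices) shows how little is needed beyond a hash function:

```python
import hashlib
import secrets

H = lambda data: hashlib.sha256(data).digest()

def keygen(n=256):
    """Secret key: n pairs of random values; public key: their hashes."""
    sk = [(secrets.token_bytes(32), secrets.token_bytes(32)) for _ in range(n)]
    pk = [(H(x0), H(x1)) for x0, x1 in sk]
    return sk, pk

def msg_bits(msg, n=256):
    return bin(int.from_bytes(H(msg), "big"))[2:].zfill(n)

def sign(sk, msg):
    """Reveal one secret value per bit of H(msg) -- usable only ONCE."""
    return [sk[i][int(b)] for i, b in enumerate(msg_bits(msg))]

def verify(pk, msg, sig):
    return all(H(x) == pk[i][int(b)]
               for i, (b, x) in enumerate(zip(msg_bits(msg), sig)))

sk, pk = keygen()
sig = sign(sk, b"post-quantum")
assert verify(pk, b"post-quantum", sig)
assert not verify(pk, b"another message", sig)
```

Signing a second message with the same key reveals further secret values and enables forgeries, which is exactly why Merkle trees (and trees of trees, as in XMSS and SPHINCS) are used to turn many one-time key pairs into a single many-time public key.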

Finally he briefly covered PICNIC but skipped the last few slides on multivariate signatures because of time. If you want to learn more about multivariate signatures take a look at Albrecht Petzoldt's presentation at the 2017 summer school and solve the exercises.

Andy made his slides available
so that you can check the details.

Further reading/watching:

- Andy also lectured on hash-based signatures at the 2017 summer school. Follow the link for the video, slides, and exercises.
- For the NIST competition we submitted an improved version of SPHINCS which we called SPHINCS+. On the website you can also find the original SPHINCS publication. SPHINCS+ is now in the second round of the NIST competition.
- Adam Langley from Google wrote a blog post giving more motivation for stateless hash-based signatures ... should you need any.