2WF80 -- Introduction to cryptology - Winter 2017

Contents Announcements Exams Literature Pictures and slides Old exams

Tanja Lange
Coding Theory and Cryptology
Eindhoven Institute for the Protection of Information
Department of Mathematics and Computer Science
Room MF 6.104B
Technische Universiteit Eindhoven
P.O. Box 513
5600 MB Eindhoven
Netherlands

Phone: +31 (0) 40 247 4764

The easiest ways to reach me wherever I am:
e-mail:tanja@hyperelliptic.org

This page belongs to course 2WF80 - Introduction to cryptology. This course is offered at TU/e as part of the bachelor's elective package 'Security'. The official page is here.

Contents
Classical systems (Caesar cipher, Vigenère, Playfair, rotor machines), shift register sequences, DES, RC4, RSA, Diffie-Hellman key exchange, cryptanalysis by using statistics, factorization, attacks on WEP (aircrack).

Some words up front: Crypto is an exciting area of research. Learning crypto makes you more aware of the limitations of security and privacy which might make you feel less secure but that's just a more accurate impression of reality and it a good step to improve your security.
Here is a nice link collection of software to help you stay secure https://prism-break.org/en/ and private https://www.privacytools.io/.

Announcements

You should have participated in "2WF50 - Algebra" or "2WF90 - Algebra for security" before taking this course. If not you can find some material in the Literature section.

All lectures take place Mondays 10:45 - 12:30 in AUD 10 and Thursdays 13:45 - 17:30 in Flux 1.03. There is a holiday break between Christmas and New year so that there are no lectures between 22 Dec and 07 Jan. There will also be no lectures on 08 Jan and 11 Jan but on 15 Jan and 18 Jan.

Gustavo Banegas is the teaching assistant for this course.

Literature and software

It is not necessary to purchase a book to follow the course.

For some background on algebra see

Some nice books on crypto (but going beyond what we need for this course) are For easy prototyping of cryptoimplementations I like the computer algebra system Sage. It is based on python and you can use it online or install it on your computer (in a virtual box in case you're running windows).
For encrypting your homeworks you should use GPG/PGP. If you're running linux then GnuPG is easy to install. If you're using windows I recommend using GPG4win; if you're using MAC-OS you can use GPG Suite. We are OK with having only the attachment encrypted, but for proper encryption of your email you might want to look into Enigmail which works well with Thunderbird.

Examination

30% of the grade is determined by homeworks. There will be two sets of homework during the quarter. You may hand in your homework in groups of 2 or 3. To make sure that you get used to crypto we require solutions to be sent encrypted with GPG/PGP. Each participant must have communicated with me at least once using GPG/PGP. You can find my public key for tanja@hyperelliptic.org on the key servers and on my homepage here. For Gustavo use gustavo@cryptme.in as email address and check for fingerprint 83458248E2E3D43F.

There will be an exam on 22 January 2018, 13:30 - 16:30 (with a retake on April 16, 18:00 - 21:00) which accounts for the remaining 70% of the grade. You may use a simple (non-programmable) calculator but no cell-phones or other devices containing calculator applications. You may not use books or your class notes.

Here is a test exam. Note that the CRT exercise would have somewhat smaller keys for the exam.

Class notes

This section will fill in gradually after the lectures. I'll provide short summaries and links to pictures of the blackboards. The homeworks will be posted here as well.

13 Nov 2017
Substitution cipher, Caesar cipher, Viginere, Playfair system. Some statements about the number of possible keys for these schemes.
Pictures of black boards and videos are here.

16 Nov 2017
Here is the exercise sheet for block 5 and 6: exercise-1.pdf. See also the raw data if paste fails.

For most of the exercises the solution is obvious when you have it. There are many more pages on the web with tools for cryptanalysis of classical ciphers, e.g. https://www.guballa.de/vigenere-solver, http://www.braingle.com/brainteasers/codes/index.php, http://www.cryptool-online.org, http://axion.physics.ubc.ca/cbw.html.

In the lecture we discussed the column transposition cipher (see pictures). You can play with it in the C1.3 exercise of the old Mystery Twister if you have Flash Player installed.

We discussed a bit about rotor machines and cryptanalysis. Visit the exhibition of rotor machines from the Cryptomuseum that is currently on show in the MetaForum and take a look at their website (and museum if you get a chance).

We discussed how to break the Hill cipher given some plaintext-ciphertext pairs and in particular repeated the extended Euclidean algorithm.

One-time pad with bits, and problems with reuse. Try the 4th challenge in the old Mystery Twister to see the reuse problems.

Finally, we discussed stream ciphers. These are much more practical than the OTP in that the key is much shorter. To encrypt a message, expand the key into a stream of pseudo-random bits and xor those to the message, i.e., treat the stream-cipher output as the one-time pad. To encrypt multiple message it becomes necessary to remember how many bits have been used and either stay in that state or forward by that many positions the next time one uses the cipher. This is impractical. Initialization Vectors (IVs) deal with that problem in that they move the beginning of the stream to a random postion. The IV is then sent in clear along with the ciphertext, so that the receiving end can compute the same starting position. More on this on Monday.
Pictures of black boards and videos are here.

21 Nov 2017
Feedback shift registers and how to use them for encryption; k-th order feedback sequence (=a sequence with k coefficients), period, pre-period(=tail), ultimately periodic sequences, Linear feedback shift registers (LFSRs), want c0=1 (else there is just a delay in output); can run backwards, is periodic, max period is 2^n-1, relation to matrix multiplication. characteristic polynomial of the matrix.
Note that we did the last part only for characteristic 2, in general we need to pay attention to signs and P(x)=det(xI-C).
Pictures of black boards and videos are here.

23 Nov 2017
Here is the exercise sheet for block 5 and 6: exercise-2.pdf.

We summarized the results of exercise 1 to derive some conjectures on the periods and factorization patterns (see blackboard pictures). For exercise 3 we clarified what α in the expression would mean and did a quick round of finite fields.

Quick primer on matrices, covering eigenvalues, eigenvectors, the characterisitic polynomial P(x) and that P(C)=O.

Defintion of irreducible polynomials, order/period of a polynomial, some examples using the characterisitc polynomials from the exercises; this order matches the order of the matrix C -- because P is C's characteristic polynomial; Order of C matches order of its characteristic polynomial because P(C)=0 means C and x play the same role, i.e. we compute the powers of C modulo P. Rabin's irreducibility test, order of an irreducible polynomial of degree n divides 2n-1.

Pictures of black boards and videos are here.

The first homework is due on December 07, 2017 at 13:45. Here is the first homework sheet. At this point you can solve the first half of the exercises, you will have all material necessary by next Thursday.
Please remember to submit your homework by encrypted, signed email. Don't forget to include your public key for me to reply.

The second homework sheet will be posted on Dec 14, again you will not have all background for the exercises, but can solve the first part of it. The homework will be due on Jan 8. I plan on skipping lectures that week and instead doing lectures on Jan 15 and 18, so you can still ask questions about the corrections.

27 Nov 2017
An irreducible polynomial generates a sequence with period equal to its order, i.e ord(C)=ord(P) is the period length of the longest period. An irreducible polynomial is called primitive if it generates a sequence of period 2^k1, i.e. the maximum possible.

Generating function S(x) satisfies S(x)=F(x)/P*(x), where * gives the reciprocal, P is the characteristic polynomial and F(x) is a polynomial of degree less than deg(P). The characteristic polynomial is the polynomial of smallest degree with this property. The characteristic polynomial of the sequence {si+ti} is the lcm of the characteristic polynomials of the two sequences. This means, we can now analyze LFSRs by analyzing the irreducible factors of their characteristic polynomials. We did not study how to handle repeated factors but from the examples on Thursday this does not look like a good idea.

One can recover the state of an LFSR from k output bits and then compute all future (and past) outputs. If the polynomial P is unkown we need to solve a linear system in the c_i, which needs 2k consecutive outputs. If k is unknown, attack for k=1, k=2, ... For each of them compute the candidate P and check for consistency with the following outputs to get confirmation or contradictions.

LFSRs are used in practice because they are small and efficient, but they need a non-linear output filter. As an example I mentioned Grain which is one of the finalists of the eStream competition on stream ciphers. Grain uses an LFSR together with a NFSR and an output function.

As a bad example I mentioned A5/2 which is a stream cipher used in GSM encryption. I mentioned a bit of the weird history of it but forgot to mention that the design was secret. One of the first postings on it with some details on the history an attack idea for A5/1 by Ross Anderson is from 1994, but lots of details were missing. The full algorithm descriptions of A5/1 and its purposefully weakened sibling A5/2 were reverse engineered and posted in 1999 by Marc Briceno, Ian Goldberg, and David Wagner. The same group also showed a devastating attack on A5/2, allowing for real-time decryption. Sadly enough, the A5 algorithms allow downgrade attacks, so this is a problem for any phone which has code for it, which is most until recently. Also A5/1 does not offfer 2^54 security (54 bits is the effective key length) but only 2^24 (with some precomputation/space). However, A5/2 is broken even worse, in 2^16 computations, with efficient code online, e.g A5/2 Hack Tool.

A nice overview of lightweight ciphers, including more modern and less broken ones is given by Alex Biryukov and Leo Perrin.
Further reading on finite fields and LFSRs, is in Lidl/Niederreiter (see literature section), or David Kohel's lecture notes.

Pictures of black boards and videos are here.

30 Nov 2017
Here is the exercise sheet for block 5 and 6: exercise-3.pdf.

Results from the exercises:
RC4 has a strong bias towards 0 in the second byte. We saw this experimentally and also gave the explanation. The first byte has a strong bias towards the first key byte unless the first key byte it zero, we showed this by tracing the state vector S.

RC4 does not provide for refreshing the key stream, so one needs to remember the last values for j and i. WEP needs a place to put some per connection data and uses key bits for that, so we get the known biases plus known key bits plus known plaintext/ciphertext pairs. Aircrack uses these to break WEP encryption.
I mentioned slides on more biases of RC4 by Daniel J. Bernstein. They are available here.

Concept of public-key cryptography: each user has two key, a private/secret key and a public key. Public-key crypto is also called asymmetric crypto,
PGP (which you need to use to submit your homework) is based on public-key crypto, you need to send your public key to Gustavo and Tanja along with your homework. Some explanantions why it is OK to publish the pubic key; some discussions on fingerprints.

Cryptographic hash functions need to provide preimage resistance, second preimage resistance, and collision resistance. If the output of the hash function has n bits then finding a collision takes on average 2n/2 trials (use the birthday paradox to see this) and finding a preimage or second preimage takes on average 2n trials.

Pictures of black boards and videos are here.

04 Dec 2017
Stream ciphers are susceptible to attacks flipping bits in the ciphertext, which cause the same bits to flip in the plaintext. A fingerprint protects against accidental bit filps, but a proper Message Authentication Codes (MACs) need to resist adversarially chosen changes. Communicating parties A and B need an authentication key along with the encryption key. The easiest version of a MAC is to use a hash function to compute cryptographic checksum over the authentication key and the ciphertext. Want to have checksum on the ciphertext for easy and quick rejection of forged packets = Encrypt, then MAC.

We covered design of hash functions using the Merkle-Damgaard construction and how this enables length-extension attacks (and how putting a fixed padding plus length information in the final block avoids these issues). Short summary of hash functions: MD4 is completely broken; for MD5 it's esasy to find collisions, first SHA-1 collisions were computed this year (see https://shattered.io/). SHA-256, SHA-512 and SHA-3 (and the other SHA-3 finalists) are likely to be OK.

Block cipers. ECB (electronic code book) mode encrypts each block separately, this means that identical blocks encrypt the same way. A famous example of how weak this is is the ECB penguin. More reasonable modes are CBC, OFB, and CTR. These modes ensure that identical plaintext blocks do not lead to identical ciphertext blocks. I commented on how easy it is to de/encrypt locally, by that I mean how much effort is needed to de/encrypt block i. In OFB you only need data C_i but also i+1 times the encryption of IV which makes this not local.

Some details on DES, 56 bits for the key is not secure enough! First brute force attack was done with "DES Cracker" for 250k USD. In 2006 a team from Bochum and Kiel built COPACOBANA which can break DES in a weak for 8980 EUR (plus some grad-student time). We'll look more into DES on Thursday.

When you submit your homework please use a subject starting with [2WF80] to make it easier for us to find your homework in our inboxes.

Pictures of black boards and videos are here.

07 Dec 2017
Here is the exercise sheet for block 5 and 6: exercise-4.pdf.

More details on DES. S-box is non-linear part and designed to avoid differential attacks. In the exercises we saw that small changes in the input lead to big changes in the output. Quick comment on PRESENT and that it uses a single S-box and on the current standard AES.
Discussion of brute force attacks on DES. Only 2^56 trials for complete key search. 2-DES is only marginally harder to break than DES, taking 2^57 with a divide-and-conquer approach. Still common use is 3-DES with k1=k3 and use k2 with decryption instead of encryption. This needs 2^112 steps to break.
When you use symmetric crypto, make sure to include a MAC!

Public-key signatures have public verification key and private signing key. Anybody can verify the signature (given the message and the public key) but only the owner of the private key can make valid signatures.
RSA signatures: public key for RSA is (n,e), private key is (n,d), where n=pq for two different primes p and q, φ(n)=(p-1)(q-1) and d is the inverse of e modulo φ(n). We showed how to sign and verify and why proper signatures pass verification. Attention, this is schoolbook RSA, do not use this in practice.
If you don't remember how to compute modulo an integer or what φ(n) is, now is a good moment to recover this.

Pictures of black boards and videos are here.

11 Dec 2017
Use cases for signatures, differences between MACs and signatures.

Pubblic-key encryption requires 3 algorithms: Key generation, encryption, and decryption. RSA encryption. Exponentiation by square and multiply, this takes l squarings and as many multiplications as e has bits set to 1; examples; can choose small e but d must be large/random.

Attack on RSA using gcd computation (problem for any way of using key, and this happened for real, see https://factorable.net/). First problem with schoolbook RSA: we can recover a message that is sent to multiple people, if they all use the same small exponent. More issues with schoolbook RSA: can decrypt linearly related messages; RSA is homomorphic.

Pictures of black boards and videos are here.

14 Dec 2017
Here is the exercise sheet for block 5 and 6: exercise-5.pdf.

Security notions and attack definitions (CPA and CCA), semantic security, ciphertext indistinguishability. schoolbook RSA is not CCA-II secure; IND-CCA security as game between attacker and challenger; the attacker should not have higher probability than guessing in deciding which of two messages m0 and m1 was encrypted, given the messages and one ciphertext.

To make RSA a randomized encryption one uses some padding. We discussed PKCS v1.5 as a negative example and looked at Bleichenbacher's attack. Take a look at https://robotattack.org/ for a very recent use of Bleichenbacher's attack in practice. You should be able to understand details of the full paper Return Of Bleichenbacher's Oracle Threat. RSA-OAEP is a better padding scheme.

Factorization methods: trial division, factoring numbers of the form p*nextprime(p+1), Fermat factorization, p-1 method.

Pictures of black boards and videos are here.

The second homework is due on 11 January 2018 at 13:45. Here is the second homework sheet. At this point you can solve the first half of the exercises, you will have all material necessary by next Thursday.
Please remember to submit your homework by encrypted, signed email. Don't forget to include your public key for me to reply.

18 Dec 2017
Fermat Factorization as algorithm, more details and success chances for Pollard's p-1 method. Namedropping of other factorization methods, see also http://facthacks.cr.yp.to/ for descriptions and code snippets. I mentioned that Nadia Heninger put up some nice code for easy factorizations.

Diffie-Hellman key exchange in different groups, including some insecure ones. A good choice is to use the intergers modulo a large prime or elliptic curve crypto (not covered in this class).
CDHP, DDHP, DLP, relations between these problems. Baby-Step Giant-Step algorithm: any system based on DLP has at most squareroot of the group oder hardness of the DLP.

Pictures of black boards and videos are here.

21 Dec 2017
Here is the exercise sheet for block 5 and 6: exercise-6.pdf.

Thanks to Gustavo for covering the first part today while I was at the diploma ceremony. Gustavo repeated BSGS and did a detailed example. DH has a problem with active man-in-the-middle attacks, he used P for the man-in-the-middle (hmm .... evil prof?).

Details of semi-static DH and that this matters for quick session resumption. ElGamal encryption (for historical purposes), this is not CCA secure: asking for (r,2c) can be used to decrypt (r,c). The additive homomorphism might be what you want, but otherwise you should really not use this but use static-DH instead.

Key-Encapsulation Mechanisms (KEMs) and how this fits with modern use of RSA and DH.

ElGamal signatures. For functionality (and security) we want to sign h(m) and not m; this is also important for RSA signatures
We showed how and why ElGamal signatures and why these systems work.

Last year I wrote some slides for this lecture. You might find them interesting as a different way to explain BSGS.

Pictures of black boards and videos are here.

Enjoy your holidays. If you want to do some crypto take a look at the old exams (below). Email me if you have questions or think you have solutions to old exams (= send me scans of your solutions and I'll send comments back).

Remember, there will be no lectures on 08 and 11 Jan.

15 Jan 2017
Needham-Schroeder authentication protocol and why it doesn't actually prove to B that he is talking to A. Triple DH or DH+ signatures achieves authentication and key freshness.
Some summary of what I expect you to know about polynomial factorization and orders of polynomials.
Shamir secret sharing: allows to share a secret in a t-out-of-n fashion so that any set of t people can recover it; works >y simple Lagrange interpolation.
Note that the secret never needs to be re-computed -- for applications in RSA or DH the shares can be applied individually and then only the per-message secrets be combined. Also note that there is no need to ever have the secret -- it can be generated from t shares; these shares are then re-shared in a t-out-of-n fashion.

Pictures of black boards and videos are here.

18 Jan 2018
Here is the exercise sheet for block 5 and 6: exercise-7.pdf.

Old Exams

This course was given for the first time in Q2 of 2014. Here are the exams so far