Introduction to cryptology

2MBD60 – Introduction to cryptology - Winter 2024

Contents

Announcements

Exams

Literature

Videos and slides

Course notes and exercise sheets

Old exams

Tanja Lange
Coding Theory and Cryptology
Eindhoven Institute for the Protection of Systems and Information
Department of Mathematics and Computer Science
Room MF 6.103
Technische Universiteit Eindhoven
P.O. Box 513
5600 MB Eindhoven
Netherlands

Phone: +31 (0) 40 247 4764

The easiest ways to reach me wherever I am:
e-mail:tanja@hyperelliptic.org

This page belongs to course 2MBD60 - Introduction to cryptology. This course is offered at TU/e as part of the bachelor's elective package Security and used to be called 2WF80 - Introduction to Cryptology. For the 2024/25 edition there is no difference between the courses.

Contents
Classical systems (Caesar cipher, Vigenère, Playfair, rotor machines), shift register sequences, DES, RC4, RSA, Diffie-Hellman key exchange, cryptanalysis by using statistics, factorization, attacks on WEP (aircrack).

Some words up front: Crypto is an exciting area of research. Learning crypto makes you more aware of the limitations of security and privacy which might make you feel less secure but that's just a more accurate impression of reality and it a good step to improve your security.
Here is a nice, though somewhat dated, link collection of software to help you stay secure https://prism-break.org/en/ and private https://www.privacytools.io/.

Announcements

If you study mathematics, you should have participated in "2WF50 - Algebra" and "2WF70 - Algorithmic algebra and number theory".
If you study computer science or any other program you should have participated in "2DBI00 - Linear Algebra and Applications", "2IT50 or 2IT80 - Discrete structures", and "2WF90 - Algebra for security" before taking this course.
If not you can find some material in the Literature section but note that you are on your own for learning this.

Lectures take place Mon block 3 and 4 and Thu block 7 and 8. Because more students signed up than the schedulers assumed we have a bit of an odd mix of lecture rooms, please check the online calendar.
We have three locations for the instruction sessions (Thu block 5 and 6), please join the room that's emptier. We might reduce to two rooms if not enough students show up, but I really recommend doing the instruction sessions and attending them live if you can fit them in your agenda. They are more important than the lecture, and thus get the prime spot. We'll have enough instructors in the rooms so that you can ask your questions and check the answers to your exercises with them. We will not hand out solutions.
In 2020 and 2021 the course has been running online with short videos – one video per topic, so several videos per unit. This year two of the Monday lectures will be streamed and recorded because we couldn't get a big enough room, but most lectures happen only in person.

The teaching assistants/instructors for this course are the following 7 people.

Sebastiano Boscardin, s.boscardin (at) tue.nl, PGP key
Emanuele Di Giandomenico, e.di.giandomenico (at) tue.nl, PGP key
Jonathan Levin (grading only), jl_crypto_course (at) riseup.net, PGP key
Alex Pellegrini (grading only), a.pellegrini (at) tue.nl, PGP key
Abishanka Saha, a.saha1 (at) tue.nl, PGP key
Luc Steenbakkers, l.g.a.m.steenbakkers (at) student.tue.nl. PGP key
Tianxin Tang, t.tang2 (at) tue.nl, PGP key

Please always inlcude all 7 of them when you send in your homeworks -- and make sure that you encrypt to all 7 of them and to your team mates. All email systems that I know of will do this automatically if you have all keys in your address book and the people in the To or Cc field your email.
The teaching assistants from 2023 wrote this guide to homework submission for this course.

We will not have a lecture on 6 Jan.

Literature and software

It is not necessary to purchase a book to follow the course.

For some background on algebra see

Tanja Lange "Number Theory and Algebra" ( Chapter of draft book "Discrete Mathematics")
Tanja Lange "Finite Fields" (Chapter of draft book "Discrete Mathematics")

Some nice books on crypto (but going beyond what we need for this course) are

Jean-Philippe Aumasson "Serious Cryptography", No Starch Press, 2017.
Johannes Buchmann "Introduction to Cryptography", Springer, 2004.
Neal Koblitz "A course in Number Theory and Cryptography", Springer, 1994.
Rudolf Lidl and Harald Niederreiter "Introduction to Finite Fields and their Applications", Cambridge University Press, 1994.
Christof Paar and Jan Pelzl "Understanding Cryptography", Springer, 2010
Doug Stinson "Cryptography: Theory and Practice", CRC Press, 1995
Henk van Tilborg "Fundamentals of Cryptology" Kluwer academic Publishers, Boston, 2000.

For easy prototyping of crypto implementations I like the computer algebra system Sage. It is based on python and you can use it online or install it on your computer (in a virtual box in case you're running windows).
I recorded a video to demonstrate how to use Sage https://www.sagemath.org/, covering basics of finite fields and elliptic curves. The latter do not for this course, so watch start till minute 10:50 and then again for about 2 minutes after 19:30.
I also wrote a short ``cheat sheet'' with commands for Sage, see here

For encrypting your homeworks you should use GPG/PGP. If you're running Linux then GNU uPG is easy to install. GPG4win; if you're using MAC-OS you can use GPG Suite (though I'm getting reports that they changed their system and now charge for the full version, the command-line linux version works for sure).
Thunderbird has good integration and I hear that also outlook can work well with the plugins.
Please always inlcude all 7 of them when you send in your homeworks -- and make sure that you encrypt to all 7 of them and to your team mates. All email systems that I know of will do this automatically if you have all keys in your address book and the people in the To or Cc field your email.
The teaching assistants from 2023 wrote this guide to homework submission for this course.
If absolutely not possible otherwise, we are OK with having only the attachment being encrypted and signed, but prefer proper encryption of the whole email. If you end up going for file encryption only you must ensure that all receipients can decrypt and you're missing out on the support that the email system offers you.

Examination

30% of the grade is determined by homeworks. There will be six sets of homework during the quarter. You should hand in your homework in groups of 3 or 4. To make sure that you get used to crypto we require solutions to be sent encrypted and signed with GPG/PGP. Each participant must have communicated with the TAs at least once using GPG/PGP. You can find the keys for the TAs linked above.
If for some reason you need to email me, you can find my public key for tanja@hyperelliptic.org on the key servers and on my homepage.

The exam will probably not take place on 27 January 2025, 09:00 - 12:00, but one week later. The online timetable is not yet updated as of 15 Jan. please double check on the offical TU/e pages before going to the exam.
The retake is likely scheduled for April 7, 18:00 - 21:00. The exam accounts for the remaining 70% of the grade.

Videos and slides

For 2022 and 2023 the lectures were recorded and posted at TU/e's Yuja site. Search for course code 2WF80 (the old name of this course) to see the recordings. Mine are the 2022/23 and 2023/24 versions, not the 2018/19 version which for some reason laods with preview. Click on the three vertical dots on the right end of the course to select "open" and get to the videos.
For the 2020 and 2021 editions of the course I recorded a lot of short videos which you can find on the YouTube Channel which are better for self study or if you want a more direct way to look up a subject. The course page for 2021 has short descriptions of all videos, slides, and no-cookie links to the YouTube videos. Watch them from there if you're on a low-cookie diet.

Class notes and exercise sheets

This section is extended through the course with notes of what happened in class and links to blackboard pictures. You can also find here the exercise sheets for the sessions on Thursdays.

11 Nov 2024
The frist lecture was be given by Monika Trimoska.
Monika presented the requirements that cryptography should achieve, namely to provide confidentiality, integrity, and authenticity, against attackers that have access to all messages sent and who may attempt to inject messages or to modify messages. Then, presented an introduction to public-key cryptography and signatures and ended with historical ciphers
These three sets of slides:

14 Nov 2024
Note that we have excerises in block 5 &6, lectures in block 7 & 8.
Here is the exercise sheet for block 5 and 6: exercise-1.pdf. See also the raw data if paste fails.
If you get stuck on the Hill cipher watch the 2021 video on XGCD under Math background on the 2021 course page.

For most of the exercises the solution is obvious when you have it. If you're not sure, please ask the TA in your room to check you solution. There are many more pages on the web with tools for cryptanalysis of classical ciphers, e.g. https://www.guballa.de/vigenere-solver, https://www.braingle.com/brainteasers/codes/index.php, http://www.cryptool-online.org, http://practicalcryptography.com/ciphers/, https://www.boxentriq.com/code-breaking.
sadly some of these don't offer https version (kind of hilarious, given the topic, but then it's about attacks ....)

Alfter some more annouuncements regarding the homework and PGP, we covered two more historical ciphers: Playfair and column transposition. For Playfair we discussed how it works and why the encoding is needed. We also covvered some slight variations that sender and receiver could agree to use and that we want a square (or almost square) arrangement, so 36 could also work with all 26 letters and 10 digits -- but then make sure to have an alphanumeric keyword.
For column transposition it's important to remove the gaps between the pieces to hide the width of the grid. This is a nice example of a cipher where knowledge of natural languages was very important in the attacks, not just statistics.
We had some rotor machines from the Cryptomuseum on show in the MetaForum and maybe might get some again, but for now we don't have a nice display box. Check out the extensive website of the Cryptomuseum on crypto machines and spy craft.

As a secure historical example we covered the one-time pad and its practical limitations. You can think of it as Viginiere having the keyword length match the message length, but we also need a random key to get the strong security that is offers (information-theoretic security).

Finally we covered stream ciphers as an example of symmetric-key encryption and as a more practical alternative to one-time pads. All security concerns of one-time pads apply – never use them twice – and additionally stream ciphers need to be checked whether they really produce random-looking streams. Basic essentials to avoid the two-time pad issue is to use an IV (initialization vector) to make sure that the stream cipher has a fresh randomized starting point for each message and that the space for the IV is big enough to avoid repetitions.
We'll cover the more formal staements and some examples of stream ciphers on Monday.

I had forgotten to pack my camera, a student was so nice to take these pictures and send them to me.
Pictures of black boards are here.

The first homework is due on 21 November 2024 at 13:30. Here is the first homework sheet.
For submitting your homework you need to generate a keypair (a private and a public key). As the names suggest you should keep the private key private and can publish your public key. The latter is needed to encrypt messages to you and to verify signatures made by you, so that's what you need to send in along with your homework solution and that's also what your team mates need from you in order to communicate securely with you. You can find the keys of the TAs above under Announcements.
Please remember to submit your homework by encrypted, signed email to all the TAs. Don't forget to include your public key and those of your team mates.

20 Nov 2024
I gave more technical details on stream ciphers. Then we covered feedback-shift registers (FSRs), LFSRs and how to represent them via matrices.
(L)FSRs are examples of stream ciphers. The IV is used as the initial state and the key defines the feedback function. Note that this is very much simplifying things and LFSRs alone are really bad stream ciphers as you can recover the state update from the IV and just n-1 more ooutput bits. However, LFSRs are used in bigger constructions because we can reason well about the periods they geerate (without having to try everything).
We started with some theory on periodic and quasi-period sequences and showed that if a sequence repeats every l steps then l is a multiple of the period. We have seen that depending on the function and the initial state, the output sequence can have a very short period. For LFSRs we have established that the output sequence is periodic (not just ultimately periodic) if c0=1 and that the period divides the order of the state-update materix. We also established that the all-zero state leads to the all-zero sequence of period 1 for any LFSR. We covered 2 complete examples for LFSRs with state-size 3 for all states, one had periods 7 and 1, the other had 4,2,1,1. Note that there are 2^n different states, so the sum needs to match 2^n, so 8 in thess examples.
Finally I stated that the characteristic polynomail of the state update matrix is x^n -c_n-1x^n-1 -c_n-2x- .... - c₁x - c₀. Most of the time we consider sequences modulo 2 and then all the - turn into +. Note that the n is the state size and the c_i are the update coefficients (being 1 for a closed wire and 0 for an open one). I will contninue with the proof for the characteristic polynomial on Thusrday,

One of the students was so nice to take pictures of the board, they are here.

21 Nov 2024
Here is the exercise sheet for block 5 and 6: exercise-2.pdf. You should really try to solve these exercises and make some conjectures about how orders, degrees, and periods fit together. You can call over the TAs for checking and I've put a quizz on Canvas so that you can check the numbers yourself -- and ask the TAs if you didn't get the right answer. I do expect that you use a computer for the larger examples.

In class we covered more details on LFSRs, in particular we took some of the conjectures on orders of C and f(x) that you should have found in the exercise session, turned them into theorems, and proved them. Now that they are proven, you can use them as facts, so proofs are useful.
First we showed that the order of the state-update matrix equals the longest period that that LFSR can produce and that it is produced using starting state S₀ of (00....01). Then I covered the proof that I skipped on Monday that the characteristic polynomial equals x^n - Σc_ixⁱ.
I used Rabin's irreducibility test, you should know that from previous lectures and you should be able to recognize some irreducible polynomials of small degree namely x, x+1, x^2+x+1, x^3+x+1, and x^3+x^2+1 as factors of polynomials. I also stated the irreducible polynomials of degree 4 and that at that degree it is no loger sufficient to check for roots. We computed the order of an irreducible polynomial of degree 3 by just observing the 2^3-1 is prime and thus is divisible only by 1 and 7, the order cannot be less than the degree, hence it is 7. For those of degree 4 we needed a bit more work, but only needed to check whether it was 5 or not, after excluding divisors 1 and 3 as being smaller than the degree. In the first case the degree was not 5 and hence had to be 15 (it needs to divide 2^n-1). In the second case it was 5. A polynomial that achieves maximal order is called a primitive polynomial.
For primitive polynomials we have periods 2^n-1 and 1, for irreducible but not primitive we have (2^n-1)/ord(P) disjoint sequences of period ord(P) and one of period 1.We don't know yet how to handle polynomials that are not irreducible but I briefly showed sums of polynomials as a preview for next week and what is happening when we have multiple factors. See here for the slides that I showed of the sums.
Note that P(x)=det(xI-C) implies that P(C) = 0, where you take powers of C in place of powers of C, so C is interchangable with x modulo P(x) and C^r = 1 corresponds to x^r =1 mod P(x), i.e. the order of C equals the order of P(x), which explains why we defined order of P this way. Note that in the instructions you computed the order of C and the orders of the factors of P(x), so you didn't always get the same results -- or rather, you only got the same results for the cases that P was irreducible.

Pictures of black boards are here.

The second homework is due on Thursday 28 November 2024 at 13:30. Here is the second homework sheet. Please remember to submit your homework by encrypted, signed email to the TAs. Don't forget to include your public key and those of your team mates.

25 Nov 2024
We defined a few more propoerties of LFSRs and their characteristic polynomials so that we had more toold available and could tackle understanding what is actually going on. The main result of this lecture is that the characteristic polynomial of the LFSR that one obtains as the sum of two LFSRs equals the lcm of the characteristic polynomials of the LFSRs that are added up. We also proved that for an irreducible characteristic polynomial all non-zero output sequences have the same period length (I had stated this before, now it's proven). Both of these results needed generating functions, which are very useful tools to know anyways.

This gave us a different way to define the characteristic polynomial of a squeence, namely as the polynomial P of degree n so that when one multiplies the generating function by the reciprocal of P one optains a polynomial of degree less than n. Last year I had gotten stuck on that proof, so I did that one using slides. this time.

Generating functions were also helpful to analyse sums of LFSRs. I showed some examples from these slides which made us understand that normally we get a sequence of period the lcm of the periods of the summand sequences, but we also saw that using the same LFSR twice can give period 1 (because they cancel) and is at best as good as just one of them, so not a smart idea to do this, We also had an odd example where we added two sequences and got a period that was shorter than the input. With the proof we now know how to analyze characteristic polynomials that are composite. I wrote the steps on the board.

On the practical side taking sums of LFSRs can be motivated by asking what we can build from given small hardware components, and in fact many designs are actually built from such pieces, but it is also a useful step in. our quest to understand LFSRs which do not have irreducible LFSRs and we got some ideas – and disproved some.

LFSRs are used in practice because they are small and efficient, but they need a non-linear component to be secure. I covered three ciphers from telecommunication (A5/11, A5/2, and SNOW-3G) which all are based on LFSRs with some nonlinear components. See the slides for their definitions and some details.
These ciphers were in every cell phone using GSM 2G and downgrade attacks were possible for a long time. It sounds like the 54 bits were a compromise between countries wanting strong crypto and others wanting weak crypto. One of the first postings on it with some details on the history (note that the original link does not work now) an attack idea for A5/1 by Ross Anderson is from 1994, but many details were missing. The full algorithm descriptions of A5/1 and its purposefully weakened sibling A5/2 were reverse engineered and posted in 1999 by Marc Briceno, Ian Goldberg, and David Wagner. The same group also showed a devastating attack on A5/2, allowing for real-time decryption. Sadly enough, the A5 algorithms allow downgrade attacks, so this is a problem for any phone which has code for it, which is most until recently. Also A5/1 does not offer 2^54 security (54 bits is the effective key length) but only 2^24 (with some precomputation/space). However, A5/2 is broken even worse, in 2^16 computations, with efficient code online, e.g A5/2 Hack Tool.

Further reading on finite fields and the mathematical theory of LFSRs is in Lidl/Niederreiter (see literature section), van Tilborg (see literature section), and David Kohel's lecture notes.

While I was answering some questions the guy who cleans the boards came in and started wiping, so some of what I wrote is lost. I took photos of what was still there but almost 3/4 of the first board are gone. A student was so nice to let me take photos of their handwritten notes and I'm posting those. Please note that these are inlcuing conntent copied from the slides, thus are more complete than what my board pictures cover and have some overlap with what you get from looking at the pdf files. The photos are here.

28 Nov 2024
Here is the exercise sheet for block 5 and 6: exercise-3.pdf.

The cipher we analyzed in the exercise session is RC4. Even as recently as 2013 this was the preferred symmetric cipher for https. (The alternative was a block cipher in CBC mode, which we'll learn about next week). I find it quite surprising that such a widely used cipher exhibits properties we can find in an exercise session (well, knowing where to look, of course).
Note that RC4 was a secret design, available only as black box implementation. Soon after it was leaked as "arcfour" weaknesses were found.
I summarized some weaknesses on the board but for showing some of the biases I used these slides.

I had planned on covering a bit more today, but it was also good to get the feedback (thanks to GEWIS for moderating).
On Monday I showed a table regarding the best known attacks on some real-world stream ciphers. That table is from https://eprint.iacr.org/2017/511, a paper my Alex Biryukov and Leo Perrin. They wrote a nice overview of lightweight ciphers, including more modern and less broken ones.

Since then even more details on GSM have come out and in 2021 details of two more GSM ciphers, GEA-1 and GEA-2, were published and analyzed. I mentioned that I wanted to show some more, take a look at page 7 of the scientific publication about the attack. The attack made news in 2021 because many new cell phones were still supporting GEA-1, even though it was "removed" from the standard by ETSI in 2013, and because it was unusually weak: GEA-1 has a 64-bit key but it is designed such that after initialization there are only 2^40 different states (instead of 2^64 that should be possible). The design was not public, but a cursory look would not reveal this property; this strongly looks like a back door. GEA-2 does not have this problem (and so far is not discontinued) but it also offers far less security than it should. It is unclear whether the techniques used in the attack were known in the 1990s.

If this didn't make you sufficiently skeptical of everything ETSI, here is another one that came out this summer (2024) TETRA:BURST against the Terrestrial Trunked Radio (TETRA) standard, a standard by ETSI which is used for radio communication incl. for critical infrastructture and emergency servvices, and which totally crumbled under analysis. We had Carlos and Wouter give a talk about their work at the last Security in Times of Surveillance.

Better stream ciphers exist, e.g. the final portfolio from the eSTREAM competition has held up well.

Pictures of black boards are here.

The third homework is due on 05 December 2024 at 13:30. Here is the third homework sheet. Please remember to submit your homework by encrypted, signed email to all the 7 TAs (and not to Tanja). Don't forget to include your public key and those of your team mates. Do not submit as a singleton or pair, the minimum group size is 3.

p>02 Dec 2024
Stream ciphers protect the confidentiality of messages but not the integrity. An attacker flipping a ciphertext bits causes the same bit to flip in the plaintext. Integrity protection is achieved using Message Authentication Codes (MACs).in order to defie MACs we need hash functiojns, so we spent a good portion of the lecture on that topic and will finish MACs on Thursday.
Some warning up front:
MACs belong to symmetric-key crypto and veryifying a MAC tag takes the same key as generating one. Hence the receiver is convinced of the authenticity of a message but cannot convince a 3rd party of it. Somebody asked about collision reistance today and I talked about about signatures. Signatures are verified using a public key (when you send us your homework you incluse that key) and a signature can be verified by a third party and links uniquely to you, the owner of the private key matching that public key.
To build MACs we need the definition of cryptographic hash functions. These compress arbitrary-length messages to fixed-length messages. Hash functions find use also outside crypto, e.g. to deduplicate data and as short fingerptints and you might have encountered some in a course on data structures, but we require more properties. Cryptographic hash functions need to provide pre-image resistance, second pre-image resistance, and collision resistance. If the output of the hash function has n bits then finding a collision takes O(2^n/2) trials (use the birthday paradox to see this) and finding a preimage or second preimage takes on average 2ⁿ trials.
MD4 is completely broken; for MD5 it's easy to find collisions and meaningful ones are possible, see MD5 considered harmful today: Creating a rogue CA certificate. First SHA-1 collisions were computed in 2017 (see https://shattered.io/) followed by a way to create meaningful SHA-1 collisions in https://sha-mbles.github.io/. SHA-2 (which includes SHA-256, SHA-384, and SHA-512) and SHA-3 (and the other SHA-3 finalists) are likely to be OK.
We covered design of hash functions using the Merkle-Damgaard construction. The iterative design of the function makes it eaiser to study it. H is collision resistant if C is, so cryptanalysts can focus on C. Note that the IV is fixed in hash functions.

I used these slides to show HMAC and I commmented that that the siple idea of using H(k,c) is vulnerable to length-extension attacks. I'll pick that up on Thursday..

Pictures of black boards are here.

05 Dec 2024
Here is the exercise sheet for block 5 and 6: exercise-4.pdf.

I stated the attack model for MAC (attacker gets pairs of message and valid tag, either on random messages or on messages of their choice) and showed how a length extension attack on the simple MAC(m,k) = H(k,m) works if H is a hash function following the Merkle–Damgaard design. One can aboid this by usding different hash functions (e.g./ SHA-3), forcing m to have fixed length, ot requiring some particular padding in the last block M_t-1. Be aware that it's hard to get padding requirements right and that the attacker might. Using MAC(m,k) = H(m,k) instead is also an option, but then a collision in H can be extended to a forgery (of course, we don't want collisions to be findable in the first place).
As a small anecdote regarding length-extensions for MD hashes I told the story of us competing to find a close preimage. Our writeup is here.

Note that signatures and MACs satisfy different requirements. Signatures can be used to convince a 3rd party of the authenticity of a document while MACs are part of the symmetric-crypto world and anything that A presents as coming from B could have been generated by A herself.

The exercise session today covered DES as an example of a block cipher. DES is a Feistel cipher. I showed the general design and gave some description of how it is possible to decrypt even though the functions f_i are not invertible. The S-box is the non-linear part; the NSA strengthened the S-boxes in the original design (Lucifer) against differential attacks &bdash; but made the keys much shorter, going from 128 to only 56 bits. 56 bits for the key is not secure enough! First brute force attack was done with "DES Cracker" for 250k USD. In 2006 a team from Bochum and Kiel built COPACOBANA which can break DES in a week for 8980 EUR (plus some grad-student time). We'll discuss attack costs on Moday. In the exercises we saw that small changes in the input lead to big changes in the output.

Block ciphers can encrypt data in blocks of fixed length n. If you encrypt each block separately your encryption is vulnerable to statistical attacks, A famous example of how weak this is is the ECB penguin. The name for this approach is ECB (electronic code book) mode because it's just an electronic version of a substitution cipher. The approach means that identical blocks encrypt the same way which often leaks information (the penguin is still recognizable).
To encrypt messages longer than one block you need to use a mode of operation. Instead of ECB we want to use approaches in which repeating plaintext blocks within a message do not lead to repeating ciphertext blocks. We also need to use an IV so that completely identical messages do not lead to identical ciphertexts. Today we covered CBC, more modes to come on Monday. CBC has been around for a long time. We covered how to get encrypt and decrypt and that encryption is inheriently sequential while decryption is local. A transmission error in C_i leads to a change in M_i and M_i+1.
Always make sure to include a MAC!

Pictures of black boards are here.

The fourth homework is due on 12 December 2024 at 13:30. Here is the fourth homework sheet. Please remember to submit your homework by encrypted, signed email to all 7 TAs (and not to Tanja). Don't forget to include your public key and those of your team mates. Do not submit as a singleton or pair, the minimum group size is 3.

09 Dec 2024
Monika Trimoska was so nice to jump in and cover the lecture for me.
She covered methods for computing exponentiation, introduced RSA, and covered a first attack on schoolbook RSA.
For exponentiation she sent me this pdf file showing what she wrote. You can also take a look at these slides. Try computing a^b mod n on your pocket calculator to get immediate feedback on the importance of modular reductions (or see your computer running out of memory when computing m^b for large b).

She covered schoolbook RSA encryption and why it works using these slides. Then she covered one pitfall of schoolbook RSA which happens because of too small exponents and lack of padding. If e receipients all use public exponent e (and their own moduli n_i) then one can recover m^e modulo the product of the moduli which is m^e as an integer (without reduction because n1*n2*....*ne is larger than m^e) from which one can compute the integer e-th root with a normal calculator. For the description and a numerical example for e=3 see these slides.

We'll see more attacks in this scenario on Thursday (exercise and lecture). This is a good moment to brush up some mathematics, we need Fermat's little theorem, XGCD, and the Chinese Remainder theorem.

12 Dec 2024
Here is the exercise sheet for block 5 and 6: exercise-5.pdf.

To finish off symmetric cryptography I explained OFB and CTR mode in detail using these slides.
With only 56 bits in the key an atack takes 2^56 trial encryptions to find the likely candidate key k given a pair (m, Enc_k(m)). 2-DES has 112 bits in the key but with a meet-in-the-middle attack can be broken in 2^57 computation and 2^56 space. Still in common use at banks is 3-DES, sometimes with k1=k3. Full 3-DES needs 2^112 steps to break, there are some more weaknesses in 2-key 3-DES (k1=k3). I used the second half of these slides.

We then did a context switch to public key crypto, starting where Monika had let off on Monday, namely that schoolbook RSA is a bad choice. The second weakness is if there are related messages, such as m_2=am_1 + b for known a and b, we can recover m_1 from expressions in the public values a,b, c_1 and c_2. The proof of the relation between the expressions A and B is on these slides.

To understand how to get out of the problem of studying all kinds of attacks we looked into security notions. Security notions and attack definitions formalize what we consider an attack and what powers the attacker has. See these slides for more precise definitions.

For signatures, the attack goal is to forge signatures, this could mean to generate any valid (m,s) pair (existential forgery) or to generate valid (m,s) for a meaningful message m (universal forgery).
The abilities of the attacker vary; for signatures it might be a key-only attack (KOA), a known message attack (KMA), or a chosen message attack (CMA). In the latter two cases the attacker sees valid signature pairs (m,s); in CMA the attacker can choose for which messages he sees signatures.
For encryption the goal is to recover plaintext from ciphertext;, i.e. to break one-wayness. We typically request that a scheme is so strong that the attacker learns no information about the plaintext from the ciphertext; this is called semantic security. However, this is hard to deal with in practice. An equivalent and more practical security requirement is indistinguishability: the attacker chooses two messages m0 and m1 and is then presented with a ciphertext c which encrypts one of m0 and m1. The attacker wins if he correctly guesses which, i.e., if he can distinguish the ciphertexts. To deal with a 50% chance of guessing, the advantage of the attacker is defined as the extra chance on top of the 50%.
The attacker may do a chosen-plaintext attack (CPA) or a chosen-ciphertext attack (CCA). There are two versions of CCA security: in CCA-I the attacker gets to request decryptions of arbitrary ciphertexts until he sees challenge ciphertext c; in CCA-II the attacker can request decryptions of ciphertexts c' (not equal to c) also after receiving c.

The following uses these slides. We talked about homomopphic encryption and that it might be a desirable feature. However, if the homomorphic property is not used, it should be avoided as it is an extra tool for an attacker. We observe that schoolbook RSA is homomorphic with respect to multiplication. We showed that any homomorphic system, and thus schoolbook RSA in particular, is not OW CCA-II secure, because of the homomorphic property: to decrypt c the attacker can ask for the decryption of c'=c*2^e, obtain m' and get m = m'/2.
Because schoolbook RSA is deterministic, it is not even IND-CPA secure: the attacker can simply encrypt m0 and m1 himself and compare the results to c.

To make RSA a randomized encryption one uses some padding; however this is also error prone. PKCS#1 v1.5 is a negative example which is broken using Bleichenbacher's attack. Take a look at https://robotattack.org/ for a somewhat recent use of Bleichenbacher's attack in practice. You should be able to understand details of the full paper Return Of Bleichenbacher's Oracle Threat. RSA-OAEP is a better padding scheme.
You can find PKCS#1 v1.5 and the basic obsevation by Bleichenbacher on the last page of these slides

The fifth homework is due on 19 December 2024 at 13:30. Here is the homework sheet. Please remember to submit your homework by encrypted, signed email to all 7 TAs. Don't forget to include your public keys.

16 Dec 2024
I suunarized some things about RSA and that RSA is atypical in that decryption matches signing and encyprion matches verification. Normally such operations are very different (apart from the general data flow that the former need the private key and the latter the public key).

I showed a summary of how RSA get broken and then in detail the p-1 method and how and why it works and showed some numerical examples from these slides.

Today we covered the Diffie–Hellman key exchange as interactive protocol and as semi-static DH where one part uses a long-term public-key using these slides.

After discussing some examples of bad groups and good groups (multiplicative group of a finite field or group of points on an elliptic curve) we coverd a generic attack, i.e., an attack that works for any group.
Eve can compute the shared secret if she can compute a from h_A or b from h_B. There might other ways than computing these discrete logarithms but it is important to choose the groups such that these attack are as hard as possible.
The Baby-Sep-Giant-Step (BSGS) algorithm is an algorithm to compute discrete logarithms in a cyclic group with generator g, i.e. given g and h_A =g^a it computes a.
Put m=floor(√l), where l is the order of g. For us that is p-1 for now. BSGS computess all powers of the generator from g^0=1 up to g^(m-1), these are the baby steps (incrementing by 1 in the exponent). Then it computes S=g^(-m) and checks for each h_A * S^j for j = 0,... whether it is in the list of baby steps; these are the giant steps (moving by -m in the exponent). If a match happens, we have g^i=h_A * S^j = g^a*g^(-mj), thus a=i+mj.
The general method is shown on these slides

I showed a numerical example of BSGS from these slides

19 Dec 2024
The 6th exercise sheet exercise-6.pdf is here. Since some of you might not make it to the instruction you can also verify your results with the quiz on Canvas.

The lecture was given by Monika Trimoska.
She repeated the DLP, CDHO, and DDHP problems and then showed a simple attack against the DDHP that works if g generates the whole multiplicative group F_p^*. The attack reuqires computing the (p-1)/2-th power of g^a and g^c to determine if a and/or c are even. If a is odd we also need to compute the parity of b the same way. We get the parity by observing that g^((p-1)/2) = -1 and g^(p-1) = 1; the former holds for all odd powers of g and the latter for all even powers. If c is chosesn randomly then there is only a 50% chance that the parity of ab and c match giving us a high chance of solving DDH. This problem can be avoided by choosing p = 2p' + 1 and g of order (p-1)/2.
Monika used these slides and did some extra explanations regarding the probabilities on the board; pictures of black boards are are here.

Monika showed ElGamal encryption and ElGamal signatures which both use the discrete logarithm using these slides.
ElGamal encryption requires the message m to be an element of the group, so typically it is easier to use DH to generate a share secret and then symmetric crypto to encrypt and authenticate the message. ElGamal encryption is interesting if one wants homomorphic properties (see the earlier discussion abour RSA being homomorphic and that that means that the scheme cannot be OW-CCA-II secure) as the product of two ciphertexts (taken component wise) is an encryption of the product of the messages. ElGamal encryption starts the same as the DH key exchange, with Alice choosing a secret key a and computing and publishing public key h_a = g^a. Then Bob, who wants to encrypt some message m to Alice, chooses a random k and computes r = g^k. He also takes Alices public key and computes (h_a)^k. This is basically the shared key from DH. He then encrypts the message by multiplying with this key: c = m * (h_a)^k. Finally, he sends (r,c) to Alice. Alice can decrypt by computing m = c / r^a.

ElGamal signatures are a little more involved. We will use l = ord(g). Alice again picks random a and computes and publishes h_a = g^a. To sign a message m, she picks a random k, computes r=g^k, and s=k^{-1}(H(m)-ar) mod l, where H is a cryptographic hash function. Every term in s is defined modulo l, except r, which is defined modulo p. Therefore the value that is send for r should be used exactly as is. The signature is then (r,s). It is accepted when g^{H(m)}-r^s(h_a)^r = 0. Anyone not knowing a cannot compute s. Note that computations in the exponent happen automatically modulo l, and that all computations involving group elements (g and h_a and r) happen modulo p. Breaking the system is as hard as the DDH problem. ElGamal signatures serve as a basis for more modern signature systems such as DSA.

As a third topic, Monika discussed the idea of secret sharing using these slides. In many cases the private key should not be in posession of a single user. A nice analogy is a bank vault which requires multiple keys to open it. The Shamir secret sharing scheme is a t-out-of-N threshold system based on polynomials. The idea behind it is that it requires t points to recover a (t-1)-degree polynomial. The sharer generates a random polynomial of degree (t-1), such that f(0)=s. The values at x=1, x=2, etc. are the shares received by the parties. The polynomial, or only the secret value, can be reconstructed from sufficient shares by Lagrange interpolation. t-1 or fewer users learn nothing about the secret (every possible value has the same probability).
In practice nobody should ever know s. In theory a trusted party could compute and use s, then forget it, and only share the result of the required computation. A better solution is that users locally compute the desired output with their shares, and only then combine their shares to reveal the result (but not s). Also, often you want nobody to know the secret (which could be a secret key), not even the sharer. This can be achieved by generating it in a distributed manner: Everybody picks a random integer. The secret is then the sum of all the integers. To compute it every user shares its integer in a t-out-of-N manner (every user should receive the shares corresponding to the same x-value), and then adds all the shares it received (including the share of its own secret). This is basically adding the underlying random polynomials.

Enjoy your holidays. If you want to do some crypto take a look at the old exams (below). The exam for 2MDB60 will be in person, on paper, without laptops, without books. so pick exams no more recent than Jan 2020 and the exams from the last two years (exam dates in 2023 and 2024) for practicing in the same setting. The intermediate two years (2021 and 2022) had different conditions -- but still fun problems.

The sixth (and last) homework is due on 09 January 2025 at 13:30. Here is the homework sheet. Please remember to submit your homework by encrypted, signed email to all 7 TAs. Don't forget to include your public keys. This is your last chance if you still need to do an encrypted round of communication.

06 Jan 2025
No lecture today.

09 Jan 2025
Here is the exercise sheet for block 5 and 6: exercise-7.pdf.

I gave some hints of how to solve the secret sharing exercise for RSA.
I then commented on the problem with ElGamal encryption that m has to be in the group generated by g. This either causes inconvenience, as some integer values between 0 and p-1 cannot be encrypted or one needs a more complicated method to encode and decode, or it causes a security problem if users just encrypt messages outside G anyways, As an example we covered the case that G consists of the squares in F_p^*, yes (m0) is a square and no (m1) is a non-square, permitting the attacker to easily check which one was encrypted because c is a square for m0 and a non-square for m1. Remember that c is a square iff c^((p-1)/2) is 1 and a non-square if that gives -1.
For an interesting case of use for ElGmal encryption where exactly this problem of not restricting to messages in G happened in real, see this research by Pierrick Gaudry here on the Moscow electronic voting system which didn't ensure that all messages were squares while g generated the subgroup of squares. This meant that an attacker can distinguish whether a vote was for a candidate for whom the corresponding m is a square or not.

Motivated by the problems with ElGamal encryption and RSA encryption we covered two solution for how to encrypt messages. Both fall back to using symmetric crypto to encrypt and authenticate the actual message and public-key crypto to achieve that sender and receiver share the symmetric key.
The first option is to use semi-static DH, covered some lectures ago, where A publishes h_A as public key and B picks a one-time secret k to communicate with her. B sends g^k and the symmetric authenicated encryption of his message und a key h, which is Hash(h_A^k). A can compute the same key as Hash(r^a).
Then I showed the KEM-DEM paradigm and how to turn a PKE into a KEM by encrypting a random message from the message space and using Hash(m,c) as key. This nicely avoids all attacks using that m is human readable or tnat it can have some algebraic structure.

Any system based on DLP has at most square root of the group order hardness of the DLP because of BSGS. For elliptic-curve groups that's also the best known attack complexity while the number field sieve is a faster attack on finite-field DLP which reduces the security per bit of p to that of RSA numbers, so fields need to have 2048 - 4096 bits. This attack is subexponential in p but the subgroup in which the DLP needs to be solved can be smaller. The element g is always taken to generate a prime-order subgroup. Choices are that p= 2p'+1 for p' a prime as well and then g generating the subgroup of squares or that g generates a much smaller (256 bits) subgroup.

I had already covered how an active attacker can mount a man-in-the-middle attack on DH by ipersonating Bob to Alice and Alice to Bob while in fact doing separate key exxchanges with them. Today we covered the Needham-Schroeder identification protocol which was meant to ensure authenticity, i.e., that A is sure that she talks to B and that B is sure that he talks to A. The scheme uses PKE (so no signatures) and assumes that A and B know each others public keys. Nevertheless Eve can impersonate Alice to Bob if she can get Alice to start a conversation with her (no impersonation there, A knows she is talking to E). The reason that this works is that Bob's reply is not linked to the scope of his exchange with Eve and Eve can pass it on to Alice as her own message in the conversation Alice–Eve.. It's easy to fix this by including the public keys of all communicating parties in the encrypted message or include the hash of the transcript in the encrypted message (if space is an issue). We'll get back to this on Monday

Pictures of black boards are here.
Thanks to the student who took them for me.

13 Jan 2025
No lecture, see https://www.tue.nl/en/news-and-events/news-overview/12-01-2025-network-tue-is-shut-down-no-education-on-monday. Please watch https://www.youtube-nocookie.com/embed/BSfoSNr1S58 and https://www.youtube-nocookie.com/embed/8GsZkdXONOk. We"ve covered part of the first video on Thu already.

16 Jan 2025
No instruction or lecture tdday, see TU/e's statement.

20 Jan 2025
In the previous lecture we covered how Eve can use Alice as a decryption oracle to impersonate her to Bob in the Needham-Schroeder protocol. This lecture I covered how to avoid this attack and lessons learned. We also discussed how to avoid this attack using signatures or 3-DH with long-term keys and epehemeral keys.

Finally we covered some unusual types of signatures. Blind signatures are used in taler.net and make constructive use of the homomorphic properties of RSA signatures (without hashing). I showed how the scheme works and also commented on what to watch out for, in particular double spending.
Undeniable signatures limit who can verify a signature and require interaction between the signer and the verifier to verify. I showed Chaum's protocol which uses the DLP for 'signatures' and how an interactive protocol shows that the signature is valid – and how Alice can show that she did in fact not sign a message. For a numerical example see the last two pages from these slides. I also covered disavowal using the slides. Judging fron questions after the lecture, this went a bit too fast for the argument why this proves that Alice is honest and really didn't sign. Do note that only an honest reply from Alice permits dividing by g^f to get a pure power of s -- and the same as in the other reponse when dividing by g^t (and raising to the matching exponents e and r).

Pictures of black boards are here.

23 Jan 2025
Both the instruction and the lecture are on exam preparation. For the instructions please prepare and ask about the exam from January 2024: here.
For the lecture slot you can ask me anything incl old exams. Most relevant are the last six exams and the exams of Jan 2020 and eaerlier, i.e., in the handwritten format which matches what you'll encounter on 3 Feb. It's best if you come prepared with questions or even email me them before; I will talk about them in the order in which I get them.
You can see the questions on the pictures below. For LFSRs I covered the April 2024 exam. I tried to document more than neede by the approach and solution.
Other topics were the modes exercise from April 2024 and NTRU from Jan 2024.

Pictures of black boards are here.

3 Feb
Exam! Please check online before the exam for the room info, looks like we're in the sports complex. Bring some extra pullover, in previous years that room was insufficiently heated for the outside weather and sitting still.

Old Exams

This course was given for the first time in Q2 of 2014. Here are my exams so far

Exam from 03 February 2025: here.
Exam from April 2024 here.
Exam from 22 January 2024: here.
Exam from April 2023 here.
Exam from 23 January 2023: here.
Exam from April 2022: here.
Exam from 24 January 2022: here.
Printing from Ans deleted the pictures. Here are the PCBC MAC and two examples of modes CFB|CBC and CBC|CBC.
Exam from April 2021: here. Printing from Ans deleted the pictures. Here are the simple MAC and two examples of modes CRT|CBC and OFB|CBC.
Exam from 25 January 2021: here.
Printing from Ans deleted the pictures. Here are the LFSR schematic and the MAC from exercise .
Exam from 27 January 2020: here.
Exam from 22 January 2018: here.
Note that in the text of exercise 6 the equation for S_1 should use O_1, not O_i.
Exam from 19 January 2017: here.
Exam from 18 January 2016: here.
Exam from 19 January 2015: here.
Practice exam 2014/2015: here. Note: the exercise about breaking RSA when all recipients have the same exponent and get the same message has too large numbers for your calculator.