# Cryptography

# Introduction to Cryptography

Security and privacy are core principles in computing, enabling a wide
range of applications including online commerce, social networking,
wireless communication, and so on. *Cryptography*, which is concerned
with techniques and protocols for secure communication, is fundamental
to building systems that provide security and privacy. In this unit,
we will examine several cryptographic protocols, which address the
following needs:

*authentication*: proving one’s identity*privacy/confidentiality*: ensuring that no one can read the message except the intended receiver*integrity*: guaranteeing that the received message has not been altered in any way

Our standard problem setup is that we have two parties who wish to
communicate, traditionally named *Alice* and *Bob*. However, they are
communicating over an insecure channel, and there is an eavesdropper
*Eve* who can observe all their communication, and in some cases, can
even modify the data in-flight. How can Alice and Bob communicate
while achieving the goals of authentication, privacy, and integrity?

A central goal in designing a cryptosystem is *Kerckhoff’s principle*:

A cryptosystem should be secure even if everything about the system, except the key, is public knowledge.

Thus, we want to ensure that Alice and Bob can communicate securely
even if Eve knows every detail about what protocol they are using,
other than the *key*, a secret piece of knowledge that is never
communicated over the insecure channel.

We refer to the message that Alice and Bob wish to communicate as the
*plaintext*. We want to design a cryptosystem that involves encoding
the message in such a way as to prevent Eve from recovering the
plaintext, even with access to the *ciphertext*, the result of
encoding the message. There are two levels of security around which we
can design a cryptosystem:

*Information-theoretic*, or*unconditional*, security: Eve cannot learn the secret message communicated between Alice and Bob, even with unbounded computational power.*Computational*, or*conditional*, security: to learn any information about the secret message, Eve will have to solve a computationally hard problem.

The cryptosystems we examine all rely to some extent on modular arithmetic. Before we proceed further, we review some basic details about modular arithmetic.

## Review of Modular Arithmetic

Modular arithmetic is a mathematical system that maps the infinitely
many integers to a finite set, those in \(\{0, 1, \dots, n-1\}\) for
some positive *modulus* \(n \in \Z^+\). The core concept in this system
is that of *congruence*: two integers \(a\) and \(b\) are said to be
congruent modulo \(n\), written as

when \(a\) and \(b\) differ by an integer multiple of \(n\):

Note that \(a\) and \(b\) need not be in the range \([0, n)\); in fact, if \(a \ne b\), than at least one must be outside this range for \(a\) and \(b\) to be congruent modulo \(n\). More importantly, for any integer \(i \in \Z\), there is exactly one integer \(j \in \{0, 1, \dots, n-1\}\) such that \(i \equiv j \pmod{n}\). This is because the elements in this set are at most \(n-1\) apart, so no two elements differ by a multiple of \(n\). At the same time, by Euclid’s division lemma, we know that there exist unique integers \(q\) and \(r\) such that

where \(0 \le r < n\). Thus, each integer \(i \in \Z\) is mapped to exactly one integer \(j \in \{0, 1, \dots, n-1\}\) by the congruence relation modulo \(n\).

Formally, we can define a set of *equivalence classes*, denoted by
\(\Z_n\), corresponding to each element in \(\{0, 1, \dots, n-1\}\):

Each class \(\overline{j}\) consists of the integers that are congruent to \(j\) modulo \(n\):

However, the overline notation here is cumbersome, so we usually elide it, making the equivalence classes implicit instead:

We refer to determining the equivalence class of an integer \(i\) modulo
\(n\) as *reducing* it modulo \(n\). If we know what \(i\) is, we need only
compute its remainder when divided by \(n\) to reduce it. More commonly,
we have a complicated expression for \(i\), consisting of additions,
subtractions, multiplications, exponentiations, and so on. Rather than
evaluating the expression directly, we can take advantage of
properties of modular arithmetic to simplify the task of reducing the
expression.

Suppose \(a \equiv a' \pmod{n}\) and \(b \equiv b' \pmod{n}\) for a modulus \(n \ge 1\). Then

and

By definition of congruence, we have \(a - a' = kn\) and \(b - b' = mn\) for some integers \(k\) and \(m\). Then

Since \(a + b\) and \(a' + b'\) differ by an integer (\(k + m\)) multiple of \(n\), we conclude that \(a + b \equiv a' + b' \pmod{n}\). The proof for \(a - b \equiv a' - b' \pmod{n}\) is similar.

Suppose \(a \equiv a' \pmod{n}\) and \(b \equiv b' \pmod{n}\) for a modulus \(n \ge 1\). Then

By definition of congruence, we have \(a - a' = kn\) and \(b - b' = mn\) for some integers \(k\) and \(m\). Then

Since \(ab\) and \(a'b'\) differ by an integer (\(kmn + a'm + b'k\)) multiple of \(n\), we conclude that \(ab \equiv a'b' \pmod{n}\).

Suppose \(a \equiv b \pmod{n}\). Then for any integer \(k \ge 0\),

We can prove Corollary 162 by observing that \(a^k = a \cdot a \cdot \dots \cdot a\) is the product of \(k\) copies of \(a\) and applying Property 160 along with induction over \(k\). We leave the details as an exercise.

The following is an example that applies the properties above to reduce a complicated expression.

Suppose we wish to find an element \(a \in \Z_7\) such that

Note that the properties above do not give us the ability to reduce any of the exponents modulo 7. (Later, we will see Fermat’s little theorem, which does give us a means of simplifying exponents. We also will see fast modular exponentiation, but we will not use that here.) We can use Property 162 once we reduce the base

which we can recursively reduce using the properties above.

Let’s start with \(2^{203}\). Observe that \(2^3 = 8 \equiv 1 \pmod{n}\). Then

Similarly, \(3^3 = 27 \equiv -1 \pmod{7}\), so \(3^6 \equiv 1 \pmod{7}\). This gives us

We also have \(4^3 = 2^6 = (2^3)^2 \equiv 1 \pmod{7}\), so

Combining these using the addition and multiplication properties above, we get

We can now reduce the full expression:

Thus, \(a \equiv 4 \pmod{7}\).

### Fast Modular Exponentiation

There are several ways to compute a modular exponentiation \(a^b \pmod{n}\) efficiently, and we take a look at two here.

The first is to apply a top-down, divide-and-conquer strategy. We have:

This leads to the following algorithm:

This gives us the following recurrence for the number of multiplications and modulo operations:

Applying the Master theorem, we get \(T(b) = \O(\log b)\).

Furthermore, the numbers in this algorithm are always computed modulo \(n\), so they are at most as large as \(n\). Thus, each multiplication and modulo operation can be done efficiently with respect to \(\O(\log n)\), and the algorithm as a whole is efficient.

An alternative method is to apply a bottom-up strategy. Here, we make use of the binary representation of \(b\),

where \(b_i\) is either 0 or 1. Then

Thus, we can compute \(a^{2^i}\) for each \(0 \le i \le r\), where \(r = \floor{\log b}\). We do so as follows:

The operation \(\floor{b/2^i}\) is a right shift on the binary representation of \(b\), and it can be done in linear time. As with the top-down algorithm, we perform \(\O(\log b)\) multiplication and modulo operations, each on numbers that are \(\O(\log n)\) in size. Thus, the runtime is efficient in the size of the input.

### Division and Modular Inverses

We have seen how to do addition, subtraction, multiplication, and
exponentiation in modular arithmetic, as well as several properties
that help in reducing a complication expression using these
operations. What about division? Division is not a closed operation
over the set of integers, so it is perhaps not surprising that
division is not always well-defined in modular arithmetic. However,
for some combinations of \(n\) and \(a \in \Z_n\), we can determine a
*modular inverse* that allows us to divide by \(a\). Recall that in
standard arithmetic, dividing by a number \(x\) is equivalent to
multiplying by the \(x^{-1}\), the multiplicative inverse of \(x\). For
example:

In the same way, division by \(a\) is defined modulo \(n\) exactly when an inverse \(a^{-1}\) exists for \(a\) modulo \(n\).

Let \(n\) be a positive integer and \(a\) be an element of \(\Z_n^+\). An
*inverse* of \(a\) modulo \(n\) is an element \(b \in \Z_n^+\) such that

An inverse of \(a\), denoted as \(a^{-1}\), exists modulo \(n\) if and only if \(a\) and \(n\) are coprime, i.e. \(\gcd(a, n) = 1\).

A modular inverse can be efficiently found using the *extended
Euclidean algorithm*, a modification of Euclid’s algorithm.

The complexity bound is the same as for Euclid’s algorithm – \(\O(\log x)\) number of operations.

Given \(x > y \ge 0\), the extended Euclidean algorithm returns a triple \((g, a, b)\) such that

and \(g = \gcd(x, y)\).

We demonstrate that \(g = ax + by\) by (strong) induction over \(y\):

**Base case:**\(y = 0\). The algorithm returns \((g,a,b) = (x,1,0)\), and we have\[ax + by = 1\cdot x + 0\cdot 0 = x = g\]**Inductive step:**\(y > 0\). Let \(x = qy + r\), so that \(q = \floor{x/y}\) and \(r = x \bmod y < y\). Given arguments \(y\) and \(r\), the recursion produces \((g,a',b')\). By the inductive hypothesis, we assume that\[g = a'y + b'r\]The algorithm computes the return values \(a\) and \(b\) as

\[\begin{split}a &= b'\\ b &= a' - b'q\end{split}\]Then

\[\begin{split}ax + by &= b'x + (a' - b'q)y\\ &= b'(qy + r) + (a' - b'q)y\\ &= b'r + a'y\\ &= g\end{split}\]Thus, we conclude that \(g = ax + by\) as required.

Given \(x\) and \(y\), when \(g = \gcd(x, y) = 1\), we have

which imply that

by definition of modular arithmetic. Thus, the extended Euclidean algorithm computes \(a\) as the inverse of \(x\) modulo \(y\) and \(b\) as the inverse of \(y\) modulo \(x\), when these inverses exist.

We compute the inverse of 13 modulo 21 using the extended Euclidean algorithm. Since 13 and 21 are coprime, such an inverse must exist.

We keep track of the values of \(x\), \(y\), \(g\), \(a\), and \(b\) at each recursive step of the algorithm. First, we trace the algorithm from the initial \(x = 21, y = 13\) down to the base case:

Step |
\(x\) |
\(y\) |
---|---|---|

0 |
\(21\) |
\(13\) |

1 |
\(13\) |
\(8\) |

2 |
\(8\) |
\(5\) |

3 |
\(5\) |
\(3\) |

4 |
\(3\) |
\(2\) |

5 |
\(2\) |
\(1\) |

6 |
\(1\) |
\(0\) |

We can then trace the algorithm back up, computing the return values \(g, a, b\):

Step |
\(x\) |
\(y\) |
\(g\) |
\(a\) |
\(b\) |
---|---|---|---|---|---|

6 |
\(1\) |
\(0\) |
\(1\) |
\(1\) |
\(0\) |

5 |
\(2\) |
\(1\) |
\(1\) |
\(0\) |
\(1-0\cdot\floor{\frac{2}{1}}=1\) |

4 |
\(3\) |
\(2\) |
\(1\) |
\(1\) |
\(0-1\cdot\floor{\frac{3}{2}}=-1\) |

3 |
\(5\) |
\(3\) |
\(1\) |
\(-1\) |
\(1-(-1)\cdot\floor{\frac{5}{3}}=2\) |

2 |
\(8\) |
\(5\) |
\(1\) |
\(2\) |
\(-1-2\cdot\floor{\frac{8}{5}}=-3\) |

1 |
\(13\) |
\(8\) |
\(1\) |
\(-3\) |
\(2-(-3)\cdot\floor{\frac{13}{8}}=5\) |

0 |
\(21\) |
\(13\) |
\(1\) |
\(5\) |
\(-3-5\cdot\floor{\frac{21}{13}}=-8\) |

Thus, we have \(by = -8 \cdot 13 \equiv 1 \pmod{21}\). Translating \(b\) to an element of \(\Z_{21}\), we get \(b = -8 \equiv 13 \pmod{21}\), which means that 13 is its own inverse modulo 21. We can verify this:

The extended Euclidean algorithm is a constructive proof that when \(gcd(a, n) = 1\) for \(n \in \Z^+\) and \(a \in \Z_n^+\), an inverse of \(a\) exists modulo \(n\). Show that no such inverse exists when \(gcd(a, n) > 1\), completing the proof of Theorem 166.

## One-time Pad

We now take a look at the first category of cryptosystems, that of information-theoretic security, which relies on one-time pads.

A *one-time pad* is an encryption technique that relies on a key
with the following properties:

The key is a random string at least as long as the plaintext.

The key is preshared between the communicating parties over some secure channel.

The key is only used once (hence the name

*one-time pad*).

As an example of a scheme that uses a one-time pad, consider a plaintext message

where each symbol \(m_i\) is a lowercase English letter. Let the secret key also be composed of lowercase letters:

Here, the key is the same length as the message. We encrypt the message as

by taking the sum of each plaintext character and the corresponding key character modulo 26:

We map between lowercase characters and integers modulo 26, with 0
taken to be the letter *a*, 1 to be the letter *b*, and so on. Then
decryption is as follows:

Applying both operations results in \(D_k(E_k(m)) = m\), the original plaintext message.

As a concrete example, suppose \(m = \texttt{flower}\) and \(k = \texttt{lafswl}\). Then the ciphertext is

and we have \(D_k(E_k(m)) = \texttt{flower}\).

Observe that Eve has no means of learning any information about \(m\)
from the ciphertext \(\texttt{qltoac}\), other than that the message is
six letters (and we can avoid even that by padding the message with
random additional data). Even if Eve knows that the message is in
English, she has no way of determining which six-letter word it is.
For instance, the ciphertext \(\texttt{qltoac}\) could have been
produced by the plaintext \(\texttt{futile}\) and key \(\texttt{lragpy}\).
In fact, for **any** six-letter sequence \(s\), there is a key \(k\) such
that \(E_k(s) = c\) for any six-letter ciphertext \(c\). Without having
access to either the plaintext or key directly, Eve cannot tell
whether the original message is \(\texttt{flower}\), \(\texttt{futile}\),
or some other six-letter sequence.

Thus, a one-time-pad scheme provides information-theoretic security –
an adversary cannot recover information about the message that they do
not already know. In fact, one-time-pad schemes are the **only**
cryptosystems that provide this level of security. However, there are
significant tradeoffs, which are exactly the core requirements of a
one-time pad:

The key must be preshared between the communicating parties through some other, secure channel.

The key has to be as long as the message, which limits the amount of information that can be communicated given a preshared key.

The key can only be used once.

These limitations make one-time pads costly to use in practice.
Perhaps by relaxing some of these restrictions, we can obtain
“good-enough” security at a lower cost? We first take a look at what
happens when we reduce the key size – in fact, we will take this to
the extreme, reducing our key size to a single symbol. This results in
a scheme known as a *Caesar cipher*.

Let \(m = m_1 m_2 \dots m_n\) be a plaintext message. Let \(s\) be a single symbol to be used as the key. Then in a Caesar cipher, \(m\) is encrypted as

and a ciphertext \(c\) is decrypted as

The result is \(D_s(E_s(m)) = m\).

Observe that a Caesar cipher is similar to a one-time pad, except that the key only has one character of randomness as opposed to \(n\); essentially, we have \(k_i = s\) for all \(i\), i.e. a key where all the characters are the same.

Unfortunately, the Caesar cipher suffers from a fatal flaw in that it
can be defeated by statistical analysis – in particular, the relative
frequencies of letters in the ciphertext allow symbols to be mapped to
the underlying message, using a frequency table of how common
individual letters are in the language in which the message is
written. In English, for instance, the letters *e* and *t* are most
common. A full graph of frequencies of English letters in “average”
English text is as follows:

A Caesar cipher merely does a circular shift of these frequencies, making it straightforward to recover the key as the magnitude of that shift. The longer the message, the more likely the underlying frequencies match the average for the language, and the more easily the scheme is broken.

Suppose the result of applying a Caesar cipher produces the ciphertext

Use frequency analysis to determine both the original message \(m\) and the key \(s\).

Note that another weakness in the Caesar-cipher scheme as described above is that the key is restricted to one of twenty-six possibilities, making it trivial to brute force the mapping. However, the scheme can be tweaked to use a much larger character set, making it harder to brute force but still leaving it open to statistical attacks in the form of frequency analysis.

A scheme that compromises between a one-time pad and Caesar cipher, by making the key larger than a single character but smaller than the message size, still allows information to leak through statistical attacks. Such a scheme is essentially the same as reusing a one-time pad more than once. What can go wrong if we do so?

Suppose we have the following two plaintext messages

where each character is a lowercase English letter. If we encode them both with the same key \(k = k_1 k_2 \dots k_n\), we obtain the ciphertexts

If both these ciphertexts go out over the insecure channel, Eve can observe them both and compute their difference:

Here, all subtractions are done modulo 26. The end result is the character-by-character difference between the two messages (modulo 26). Unless the plaintext messages are random strings, this difference is not random! Again, statistical attacks can be used to obtain information about the original messages.

As a pictorial illustration of how the difference of two messages reveals information, the following is an image encoded with a random key under a one-time pad:

The result appears as just random noise, as we would expect. The following is another image encoded with the same key:

This result too appears as random noise. However it is the **same**
noise, and if we subtract the two ciphertexts, the noise all cancels
out:

The end result clearly reveals information about the original images.

In summary, the only way to obtain information-theoretic security is
by using a one-time-pad scheme, where the key is at least as long as
the message and is truly used only once. A one-time-pad-like scheme
that compromises on these two characteristics opens up the scheme to
statistical attacks. The core issue is that **plaintext messages are
not random** – they convey information between parties by virtue of
not being random. In a one-time pad, the key is the actual source of
the randomness in the ciphertext. And if we weaken the scheme by
reducing the key size or reusing a key, we lose enough randomness to
enable an adversary to learn information about the plaintext messages.

# Diffie-Hellman Key Exchange

Many encryption schemes, including a one-time pad, are *symmetric*,
using the same key for both encryption and decryption. Such a scheme
requires a preshared secret key that is known to both communicating
parties. Ideally, we’d like the two parties to be able to establish a
shared key even if all their communication is over an insecure
channel. This is known as the *key exchange* problem.

One solution to the key-exchange problem is the *Diffie-Hellman*
protocol. The central idea is that each party has its own secret key
– this is a *private key*, since it is never shared with anyone. They
each use their own private key to generate a *public key*, which they
transmit to each other over the insecure channel. A public key is
generated in such a way that recovering the private key would require
solving a computationally hard problem, resulting in computational
rather than information-theoretic security. Finally, each party uses
its own private key and the other party’s public key to obtain a
shared secret.

Before we examine the details of the Diffie-Hellman protocol, we discuss some relevant concepts from modular arithmetic. Let \(\Z_q\) refer to the set of integers between 0 and \(q-1\), inclusive:

We then define a *generator* of \(\Z_p\), where \(p\) is prime, as
follows:

An element \(g \in \Z_p\), where \(p\) is prime, is a *generator* of
\(\Z_p\) if for every nonzero element \(x \in \Z_p\) (i.e. every
element in \(\Z_p^+\)), there exists a number \(i \in \N\) such that:

In other words, \(g\) is an \(i\)th root of \(x\) over \(\Z_p\) for some natural number \(i\).

As an example, \(g = 2\) is a generator of \(\Z_5\):

Similarly, \(g = 3\) is a generator of \(\Z_7\):

On the other hand, \(g = 2\) is not a generator of \(\Z_7\):

We see that the powers of \(g = 2\) are cyclic, without ever generating the elements \(\{3,5,6\} \subseteq \Z_7^+\).

If \(p\) is prime, then \(\Z_p\) has at least one generator.

This theorem was first proved by Gauss. Moreover, \(\Z_p\) has \(\phi(p-1)\) generators when \(p\) is prime, where \(\phi(n)\) is Euler’s totient function, making it relatively easy to find a generator over \(\Z_p\).

We now return to the Diffie-Hellman protocol, which is as follows:

Suppose two parties, henceforth referred to as *Alice* and *Bob*,
wish to establish a shared secret key \(k\) but can only communicate
over an insecure channel. Alice and Bob first establish public
parameters \(p\), a very large prime number, and \(g\), a generator of
\(\Z_p\). Then:

Alice picks a random \(a \in \Z_p^+\) as her private key, computes \(A \equiv g^a \pmod{p}\) as her public key, and sends \(A\) to Bob over the insecure channel.

Bob picks a random \(b \in \Z_p^+\) as his private key, computes \(B \equiv g^b \pmod{p}\) as his public key, and sends \(B\) to Alice over the insecure channel.

Alice computes

\[k \equiv B^a \equiv (g^b)^a \equiv g^{ab} \pmod{p}\]as the shared secret key.

Bob computes

\[k \equiv A^b \equiv (g^a)^b \equiv g^{ab} \pmod{p}\]as the shared secret key.

This protocol is efficient – suitable values of \(p\) and \(g\) can be
obtained from public tables, and Alice and Bob each need to perform
two modular exponentiations, which can be done efficiently. But is it secure? By Kerckhoff’s principle, Eve knows
how the protocol works, and she observes the values \(p\), \(g\), \(A
\equiv g^a \pmod{p}\), and \(B \equiv g^b \pmod{p}\). However, she does
not know \(a\) or \(b\), since those values are never communicated and so
are known only to Alice and Bob, respectively. Can Eve obtain \(k
\equiv g^{ab} \pmod{p}\) from what she knows? The *Diffie-Hellman
assumption* states that she cannot do so efficiently.

There is no efficient algorithm that computes \(g^{ab} \pmod{p}\) given:

a prime \(p\),

a generator \(g\) of \(\Z_p\),

\(g^a \pmod{p}\), and

\(g^b \pmod{p}\).

What if, rather than attempting to recover \(g^{ab} \pmod{p}\) directly,
Eve attempts to recover one of the private keys \(a\) or \(b\)? This
entails solving the *discrete logarithm* problem.

Let \(q\) be a modulus and let \(g\) be a generator of \(\Z_q\). The
*discrete logarithm* of \(x \in \Z_q^+\) with respect to the base \(g\)
is an integer \(i\) such that

The discrete logarithm problem is the task of computing \(i\), given \(q\), \(g\), and \(x\).

If \(q\) is prime and \(g\) is a generator of \(\Z_q\), then every \(x \in \Z_q^+\) has a discrete log \(i\) with respect to base \(g\) such that \(0 \le i < q - 1\) [1]. Thus, a brute-force algorithm to compute the discrete log is to try every possible value in this range. However, such an algorithm requires checking \(\O(q)\) possible values, and this is exponential in the size of the inputs, which take \(\O(\log q)\) bits to represent. The brute-force algorithm is thus inefficient.

Do better algorithms exist for computing a discrete log? Indeed they
do, including the baby-step giant-step algorithm, which requires
\(\O(\sqrt{q})\) multiplications on numbers of size \(\O(\log q)\).
However, as we saw in primality testing,
this is still not polynomial in the input size \(\O(\log q)\). In fact,
the *discrete logarithm assumption* states that no efficient algorithm
exists.

There is no efficient algorithm to compute \(i\) given:

\(q\),

a generator \(g\) of \(\Z_q\), and

\(g^i \pmod{q}\),

where \(0 \le i < q - 1\).

Neither the Diffie-Hellman nor the discrete logarithm assumption have been proven. However, they have both held up in practice, as there is no known algorithm to defeat either assumption on classical computers. As we will briefly discuss later, the assumptions do not hold on quantum computers, but this is not yet a problem in practice as no scalable quantum computer exists.

# RSA

The Diffie-Hellman protocol uses private and public keys for key
exchange, resulting in a shared key known to both communicating
parties, which can then be used in a symmetric encryption scheme.
However, the concept of a *public-key cryptosystem* can also be used
to formulate an *asymmetric* encryption protocol, which uses different
keys for encryption and decryption. The *RSA (Rivest-Shamir-Adleman)*
cryptosystem gives rise to one such protocol that is widely used in
practice.

As with Diffie-Hellman, the RSA system relies on facts about modular
arithmetic. Core to the working of RSA is *Fermat’s little theorem*
and its variants.

Let \(p\) be a prime number. Let \(a\) be any element of \(\Z_p^+\), where

Then \(a^{p-1} \equiv 1 \pmod{p}\).

Let \(p\) be a prime number. Then \(a^p \equiv a \pmod{p}\) for any element \(a \in \Z_p\), where

As an example, let \(p = 7\). Then:

Fermat’s little theorem can be extended to the product of two distinct primes.

Let \(n = pq\) be a product of two distinct primes, i.e. \(p \ne q\). Let \(a\) be an element of \(\Z_n\). Then

where \(k \in \N\) and \(\phi(n)\) is *Euler’s totient function*, whose
value is the number of elements in \(\Z_n^+\) that are coprime to
\(n\). For a product of two distinct primes \(n = pq\),

We prove Theorem 183 using Fermat’s little theorem. First, we show that for any \(a \in \Z_n\) where \(n = pq\) is the product of two distinct primes \(p\) and \(q\), if \(a \equiv b \pmod{p}\) and \(a \equiv b \pmod{q}\), then \(a \equiv b \pmod{n}\). By definition of modular arithmetic, we have

where \(k\) and \(m\) are integers. Then:

Since \(p\) divides \(kp\), \(p\) also divides \(mq\); however, since \(p\) is a distinct prime from \(q\), this implies that \(p\) divides \(m\) (by Euclid’s lemma, which states that if a prime \(p\) divides \(ab\) where \(a,b\in\Z\), then \(p\) must divide at least one of \(a\) or \(b\)). Thus, \(m = rp\) for some integer \(r\), and we have

Now consider \(a^{1 + k\phi(n)}\). If \(a = 0\), we trivially have

If \(a \ne 0\), we have:

Similarly:

Applying our earlier result, we conclude that \(a^{1 + k\phi(n)} \equiv a \pmod{n}\).

As an example, let \(n = 6\). We have \(\phi(n) = 2\). Then:

Fermat’s little theorem has many proofs; we take a look at a proof that relies solely on the existence of modular inverses (Theorem 166).

Let \(p\) be prime, and let \(a \in \Z_p^+\) be an arbitrary nonzero element of \(\Z_p\). Consider the set of elements

These elements are all nonzero and distinct when taken modulo \(p\):

Suppose \(ka \equiv 0 \pmod{p}\). Since \(a \in \Z_p^+\), it has an inverse \(a^{-1}\) modulo \(p\), so we can multiply both sides by \(a^{-1}\) to obtain \(k \equiv 0 \pmod{p}\). Since \(k \not\equiv 0 \pmod{p}\) for all elements in \(\{1, 2, \dots, p-1\}\), \(ka \not\equiv 0 \pmod{p}\) for all elements \(ka \in S\).

Suppose \(ka \equiv ma \pmod{p}\). Multiplying both sides by \(a^{-1}\), we obtain \(k \equiv m \pmod{p}\). Since all pairs of elements in \(\{1, 2, \dots, p-1\}\) are distinct modulo \(p\), all pairs of elements in \(S = \{1a, 2a, \dots, (p-1)a\}\) are also distinct modulo \(p\).

Since there are \(p-1\) elements in \(S\), and they are all nonzero and distinct modulo \(p\), the elements of \(S\) when taken modulo \(p\) are exactly those in \(\Z_p^+\). Thus, the products of the elements in each set are equivalent modulo \(p\):

Since \(p\) is prime, all elements in \(\Z_p^+\) have an inverse modulo \(p\), so we multiply both sides above by \((1^{-1} \times 2^{-1} \times \dots \times (p-1)^{-1})\) to obtain

Now that we have established the underlying mathematical facts, we take a look at the RSA encryption protocol.

Suppose Bob wishes to send a message \(m\) to Alice, but the two can only communicate over an insecure channel.

Alice chooses two very large, distinct primes \(p\) and \(q\). We assume without loss of generality that \(m\), when interpreted as an integer (using the bitstring representation of \(m\)), is smaller than \(n = pq\); otherwise, \(m\) can be divided into smaller pieces that are then sent separately using this protocol.

Alice chooses an element

\[e \in \Z_n^+ \text{ such that } \gcd(e, \phi(n)) = 1\]as her public key, with the requirement that it be coprime with \(\phi(n) = (p-1)(q-1)\). She computes the inverse of \(e\) modulo \(\phi(n)\)

\[d \equiv e^{-1} \pmod{\phi(n)}\]as her private key.

Alice sends \(n\) and \(e\) to Bob.

Bob computes

\[c \equiv m^e \pmod{n}\]as the ciphertext and sends it to Alice.

Alice computes

\[m' \equiv c^d \pmod{n}\]with the result that \(m' = m\).

This protocol is efficient. Alice can find large primes using an efficient primality test such as the Fermat test – in fact, she need only do this once, reusing the same parameters \(n,e,d\) for future communication. Computing the inverse of \(e\) can be done efficiently using the extended Euclidean algorithm. Finally, modular exponentiation can also be done efficiently.

Does the protocol work? We have

Since \(d \equiv e^{-1} \pmod{\phi(n)}\), we have

by definition of modular arithmetic, where \(k \in \N\). By Theorem 183,

Thus, Alice does indeed recover the intended message \(m\).

Finally, is the protocol secure? The public information that Eve can
observe consists of \(n\), \(e\), and \(c \equiv m^e \pmod{n}\). Eve **does
not** know the private parameters \(p,q,d\), which Alice has kept to
herself. Can Eve recover \(m\)? The *RSA assumption* states that she
cannot do so efficiently.

There is no algorithm that efficiently computes \(m\) given:

a product of two distinct primes \(n\),

an element \(e \in \Z_n^+\) such that \(\gcd(e, \phi(n)) = 1\), and

\(m^e \pmod{n}\).

Rather than trying to compute \(m\) directly, Eve could attempt to
compute \(\phi(n)\), which would allow her to compute \(d \equiv e^{-1}
\pmod{\phi(n)}\) and thus decrypt the ciphertext \(c \equiv m^e
\pmod{n}\) using the same process as Alice. However, recovering
\(\phi(n) = (p-1)(q-1)\) is as hard as factoring \(n = pq\), and the
*factorization hardness assumption* states that she cannot do this
efficiently.

There is no efficient algorithm to compute the prime factorization \(p_1, p_2, \dots p_k\) of an integer \(n\), where

for \(a_i \in \Z^+\).

Like the Diffie-Hellman and discrete logarithm assumptions, the RSA and factorization hardness assumptions have not been proven, but they appear to hold in practice. And like the former two assumptions, the latter two do not hold on quantum computers; we will discuss this momentarily.

Suppose that \(n = pq\) is a product of two distinct primes \(p\) and \(q\). Demonstrate that obtaining \(\phi(n)\) is as hard as obtaining \(p\) and \(q\) by showing that given \(\phi(n)\), the prime factors \(p\) and \(q\) can be computed efficiently.

The *ElGamal encryption scheme* relies on the hardness of the
discrete logarithm problem to perform encryption, like
Diffie-Hellman does for key exchange. The following describes how
Bob can send an encrypted message to Alice using ElGamal. Assume
that a large prime \(p\) and generator \(g\) of \(\Z_p\) have already
been established.

Alice chooses a private key \(a \in \Z_p^+\), computes \(A \equiv g^a \pmod{p}\) as her public key, and sends \(A\) to Bob.

Bob chooses a private key \(b \in \Z_p^+\), computes \(B \equiv g^b \pmod{p}\) as his public key, and sends \(B\) to Alice.

Alice and Bob both compute the shared key \(k \equiv g^{ab} \pmod{p}\).

Bob encrypts the message \(m\) as

\[c \equiv m \cdot k \equiv m \cdot g^{ab} \pmod{p}\]and sends the ciphertext \(c\) to Alice.

Show how Alice can recover \(m\) from the information she knows (\(p\), \(g\), \(a\), \(A\), \(B\), \(k\), \(c\)).

Demonstrate that if Eve has an efficient algorithm for recovering \(m\) from what she can observe (\(p\), \(g\), \(A\), \(B\), \(c\)), she also has an efficient method for breaking Diffie-Hellman key exchange.

## RSA Signatures

Thus far, we have focused on the privacy and confidentiality provided
by encryption protocols. We briefly consider authentication and
integrity, in the form of a *signature* scheme using RSA. The goal
here is not to keep a secret – rather, given a message, we want to
verify the identity of the author as well as the integrity of its
contents. The way we do so is to essentially run the RSA encryption
protocol “backwards” – rather than having Bob run the encryption
process on a plaintext message and Alice the decryption on a
ciphertext, we will have Alice apply the decryption function to a
message she wants to sign, and Bob will apply the encryption function
to the resulting signed message.

Suppose Alice wishes to send a message \(m\) to Bob and allow Bob to verify that he receives the intended message. As with the RSA encryption protocol, Alice computes \((e,n)\) as her public key and sends them to Bob, and she computes \(d \equiv e^{-1} \pmod{\phi(n)}\) as her private key. Then:

Alice computes

\[s \equiv m^d \pmod{n}\]and sends \(m\) and \(s\) to Bob.

Bob computes

\[m' \equiv s^e \pmod{n}\]and verifies that the result is equal to \(m\).

The correctness of this scheme follows from the correctness of RSA encryption; we have:

Thus, if \(m' = m\), Bob can be assured that Alice sent the message – only she knows \(d\), so only she can efficiently compute \(m^d \pmod{n}\). In essence, the pair \(m, m^d \pmod{n}\) acts as a certificate that the sender knows the secret key \(d\).

In practice, this scheme needs a few more details to avoid the
possibility of *spoofing*, or having someone else send a message
purporting to be from Alice. We leave these details as an exercise.

The RSA signature scheme as described above is subject to spoofing – it is possible for a third party to produce a pair \(m, s\) such that \(s \equiv m^d \pmod{n}\).

Describe one way in which someone other than Alice can produce a matching pair \(m, m^d \pmod{n}\).

Propose a modification to the scheme that addresses this issue.

**Hint:**Think about adding some form of padding to the message to assist in verifying that Alice was the author.

## Quantum Computers and Cryptography

The abstraction of standard Turing machines
does not seem to quite capture the operation of *quantum computers*,
but there are other models such as quantum Turing machines and quantum
circuits that do so.
The standard Church Turing thesis still applies
– anything that can be computed on a quantum computer can be computed
on a Turing machine. However, a quantum computer is a probabilistic
model, so the extended Church-Turing thesis does not apply – we do not know whether
quantum computers can solve problems more efficiently than so-called
*classical computers*. There are problems that are known to have
efficient probabilistic algorithms on quantum computers but do not
have known, efficient probabilistic algorithms on classical computers.
Discrete logarithm and integer factorization, the problems that are
core to Diffie-Hellman and RSA, are two such problems; they are both
efficiently solvable on quantum computers by Shor’s algorithm.

In practice, scalable quantum computers pose significant implementation challenges, and they are a long way from posing a risk to the security of Diffie-Hellman and RSA. As of this writing, the largest number known to have been factored by Shor’s algorithm on a quantum computer is 35, which was accomplished in 2019. This follows previous records of 21 in 2012 and 15 in 2001. Compare this to factorization on classical computers, where the 829-bit RSA-250 number was factored in 2020. Typical keys currently used in implementations of RSA have at least 2048 bits, and both quantum and classical computers are far from attacking numbers of this size. Regardless, post-quantum cryptograpy is an active area of research, so that if quantum computers do eventually become powerful enough to attack Diffie-Hellman or RSA, other cryptosystems can be used that are more resilient against quantum attacks.

In complexity-theoretic terms, the decision versions of discrete logarithm and integer factorization are in the complexity class \(\BQP\), which is the quantum analogue of \(\BPP\); in other words, \(\BQP\) is the class of languages that have efficient two-sided-error randomized algorithms on quantum computers. We know that

but we do not know whether this containment is strict. And like the
relationship between \(\BPP\) and \(\NP\), most complexity theorists do
not believe that \(\BQP\) contains all of \(\NP\). Neither the decision
versions of discrete logarithm nor of integer factorization are
believed to be \(\NP\)-Complete – they are believed (but not proven) to
be *NP-Intermediate*, i.e. in the class \(\NPI\) of languages that are
in \(\NP\) but neither in \(\P\) nor \(\NP\)-Complete. It is known that if \(\P
\ne \NP\), then the class \(\NPI\) is not empty.

On the other hand, if \(\P = \NP\), then computational security is not possible – such security relies on the existence of hard problems that are efficiently verifiable, and no such problem exists if \(\P = \NP\).