# 「Cryptography-MIT6875」: Lecture 6 - Number Theory

**Topics covered:**

- Groups, Order of a group and the Order of an element, Cyclic Groups.
- The Multiplicative Group $\mathbb{Z}_N^*$ and $\mathbb{Z}_P^*$ for a prime $P$.
- Generators of $\mathbb{Z}_P^*$.
- Primes, Primality Testing.
- The Discrete Logarithm (DLOG) problem and a candidate OWF.
- Diffie-Hellman assumptions: DDH and CDH.

# Groups

An **Abelian group** $\mathbb{G}=(S,\star)$ is a set $S$ together with a operation $\star :S\times S\rightarrow S$ which satisfies

**Identity**: There is an element $\mathcal{I}\in S$ such that for $a\in S$, $a\star \mathcal{I}=\mathcal{I}\star a=a$.**Inverse**: For every $a\in S$, there is an element $b\in S$ such that $a\star b =b\star a=\mathcal{I}$.**Associativity**: For every $a,b,c\in S$, $a\star (b\star c)=(a\star b)\star c$.**Commutativity**: For every $a,b\in S$, $a\star b=b\star a$.

## Order of a group and the order of an element

**The order of a group**is__the number of elements in it__, namely $|S|$.**The order of an element**$g\in S$ is__the minimal number of times__one has to__perform the group operation on $g$ to get to the identity element__$\mathcal{I}$.

That is, $\mathrm{ord}(g)=\min_{i>0}\{g^i=\mathcal{I}\}$.

## Generator of a Group

A **generator** of a group $\mathbb{G}$ is __an element of order__ $|\mathbb{G}|$. In other words,

## Cyclic group

A group $\mathbb{G}$ is called **cyclic** __if it has a generator.__

**Proof**[2]

## Discrete Logarithms

Let $\mathbb{G}$ be a cyclic group. We know that $\mathbb{G}$ has a generator $g$, and that every $h\in \mathbb{G}$ can be written as $h=g^x$ for a **unique** $x\in\{1,2,\dots,|\mathbb{G}|\}$.

We write

$$

x=\operatorname{dlog}_g(h)

$$

to denote the fact that $x$ is the **discrete logarithm** of $h$ to the base $g$.

We will **look for groups** where __computing the group operation is easy__ (namely, polynomial time) but __computing discrete logarithms is hard__ (namely, exponential or sub-exponential time).

Our source for such groups will come from number theory.

Discrete logarithms in $\mathbb{Z}_N$ are, for better or worse, easy.

# Baby (Computational) Number Theory

The complexity of basic operations with numbers. $n$ denotes the input length for each of these operations.

**Greatest Common Divisors:**

The **greatest common divisor** (gcd) of positive integers $a$ and $b$ is __the largest positive integer $d$ that divides both $a$ and $b$.__

$a$ and $b$ are **relatively prime** if their gcd is 1.

# The Multiplicative Group $\mathbb{Z}_N^*$

The **multiplicative group** of numbers mod $N$, denoted $\mathbb{Z}_N^*$, consists of the set

__with multiplication mod $N$__ being the **operation**.

**Some further facts about** $\mathbb{Z}_N^*$

The

**order**of $\mathbb{Z}_N^*$, the number of__positive integers smaller than__$N$ that are__relatively prime to it__, is called the**Euler totient function**of $N$ denoted $\varphi(N)$.$\varphi(p)=p-1$ if $p$ is prime.

$\varphi(p^k)=p^k-p^{k-1}$ if $p$ is prime.

$\varphi(pq)=\varphi(p)\varphi(q)$ if $\operatorname{gcd}(p,q)=1$.

If $N=\prod _i p_i^{\alpha_i}$ is the prime factorization of $N$, then $\varphi(N)=\prod_i p_i^{\alpha_i-1}(pi-1)$.

# The Multiplicative Group $\mathbb{Z}_p^*$

$\mathbb{Z}_p^*$ is Cyclic.The following theorem is a very important property of $\mathbb{Z}_p^*$ when $P$ is prime.

**Proof**[2]

There are several followup questions.

__How many__generators are there for $\mathbb{Z}_p^*$ ?__How to tell (efficiency)__if a given element $g$ is a generator ?__How to sample__a random generator for $\mathbb{Z}_p^*$ ?

## $\mathbb{Z}_p^{*}$ and $\mathbb{Z}_{P-1}$

Before proceeding further, let us note the following structural fact about $\mathbb{Z}_P^*$.

There are two groups are **isomorphic** with an isomorphism $\phi$ that maps $x\in\mathbb{Z}_{P-1}$ to $g^x\in \mathbb{Z}_P^*$. [isomorphic: 同构的]

In particular, consider $\phi(x)=g^x \pmod P$.

We have $\phi(x+y)=\phi(x)\cdot \phi(y)$.

The isomorphism is **efficiently computable** __in the forward direction__ (exponentiation, using the repeated squaring algorithm) but **not known to be efficiently computable** __in the reverse direction.__

For example, consider $\mathbb{Z}_7^*$ and $\mathbb{Z}_6$:

- $\mathbb{Z}_7^* = \{1,2,3,4,5,6\}$ and there is a generator $g=5$.
You can get $\{g,g^2,g^3,g^4,g^5,g^6\} =\{5, 4,6,2,3,1\}$.

When you__perform the multiplication__on $g$, you will**wrap around**the group.

That’s an intuitive reason why discrete logarithm is hard in $\mathbb{Z}_p^*$. - $\mathbb{Z}_6=\{1,2,3,4,5,6\}$ and the generator $g=1$.
The group operation in $\mathbb{Z}_6$ is addition.

You can get $\{g,g^2,g^3,g^4,g^5,g^6\}=\{1,2,3,4,5,6\}$.

When you perform the addition on $g$, you**just walk along**the group. - We can know that there is a one-to-one mapping $\phi$ from $\mathbb{Z}_6$ to $\mathbb{Z}_7^*$, that is $\mathbb{Z}_6\cong \mathbb{Z}_7^*$ .

Here is another quick application of this isomorphism:

**Lemma :**

Let $P$ be an odd prime.

If $g$ is a generator of $\mathbb{Z}_P^*$, then so is $g^x$ as long as $x$ and $P-1$ are __relatively prime.__

**Proof:**

There is a generator $g’$ in $\mathbb{Z}_{P-1}$corresponding to $g$ in $\mathbb{Z}_P^*$ from the isomorphism $\mathbb{Z}_P^*\cong \mathbb{Z}_{P-1}$.

$x$ and $P-1$ are relatively prime, and there are two statements.

In group $\mathbb{Z}_P^*$, if $g$ is a generator, then $g^x$ is a generator.

In group $\mathbb{Z}_{P-1}$, if $g’$ is a generator, then $xg’$ is a generator.

From the **isomorphism**, we can turn the question in $\mathbb{Z}_P^*$ to a relative question in $\mathbb{Z}_{P-1}$.

It’s easy to **prove** that if $x$ is a generator in $\mathbb{Z}_{P-1}$ (which can iterate all elements), then $ax$ is also a generator in $\mathbb{Z}_{P-1}$ (which can also iterate all elements) where $\operatorname{gcd}(a, P-1)=1$.

- The main idea is to suppose for a
**contradiction**that there are two different $x_1$ and $x_2$,i.e. $x_1\ne x_2 \pmod{P-1}$, satisfying $ax_1=ax_2 \pmod{P-1}$. - Then we know $P-1\mid a(x_1-x_2)$.
- From $\operatorname{gcd}(a, P-1)=1$, we know $P-1\mid x_1-x_2$.
- So $x_1= x_2 \pmod{P-1}$. (contradiction)
- QED.

As a **corollary**, we immediately derive the fact __that $\phi(P-1)$ elements__ of $\mathbb{Z}_p^*$ are **generators**.

## $\mathbb{Z}_p^*$ has lots of generators

For the first question, $\mathbb{Z}_p^*$ has lots of generators.

But __how large__ is $\varphi(P-1)$ asymptotically? This is answered by the following classical theorem.

In other words, if you __pick a random element__ of $\mathbb{Z}_p^*$, you will see a generator **with probability**

$$

\varphi(P-1)/(P-1)=\Omega(1/\log\log P)

$$

__which is polynomial__ in $1/\log P$. So, reasonably **often**.[asymptotical: 渐进的]

Difference between Big $\mathcal{O}$ and Big $\Omega$:

Big $\mathcal{O}$ and Big $\Omega$ function are used in computer science todescribe the performance or complexity of an algorithm.

- Big $\mathcal{O}$ is used to describe
the worst caserunning time for an algorithm.- But, Big $\Omega$ notation, on the other hand, is used to describe
the best caserunning time for a given algorithm.

If you want to get a generator in $\mathbb{Z}_p^*$, you will **hit a generator very quickly** just __keeping picking random__ in $\mathbb{Z}_p^*$.

Then it comes to the second question.

## How to tell a generator of $\mathbb{Z}_p^*$ ?

The answer is that you **can tell in poly. time** whether $g\in \mathbb{Z}_p^*$ is a generator __given the factorization of $P-1$.__

### Given the factorization of P-1

At first, we know that a generator is an element of order $P-1$, so it’s easy to check whether $g^{P-1}=1 \pmod P$.

Then we need to **check** if there is __some smaller power__ of $g$ that equals $1$ since the order is the minimal power.

However, there are __a large number of divisors__ of $P-1$, roughly $P^{1/\log \log P}$ which is **not polynomial** (in $\log P$).

It turns out, you **do not** need to __check all divisors,__ **but rather** __only the terminal divisors__ (or, the maximal factors less than $P-1$).

If $P-1=\prod_i q_i^{\alpha_i}$, the t**erminal divisors** are $(P-1)/q_i$ for each $i$.

**For example:**

- If $P-1=p_1^2p_2^3p_3^2$.
- Suppose $g=p_1^2p_2^3$, then $g^{p_1^2p_2^3}=1$.

You can get some powers of $g^{p_1^2p_2^3}$ that equals $1$.

And the power to $g^{p_1^2p_2^3}$ is__the multiplication to the exponent__$p_1^2p_2^3$.- You can get $g^{p_1^2p_2^3p_3}=1$ which is
__smaller power__than $P-1$. - You can also get $g^{p_1^2p_2^3p_3^2}=1$ which is useless since $P-1=p_1^2p_2^3p_3^2$.
- So you only
__hit one terminal divisor__, that is $(P-1)/p_3$.

- You can get $g^{p_1^2p_2^3p_3}=1$ which is
- Suppose $g=p_1p_2^2p_3$, then $g^{p_1p_2^2p_3}=1$.

Similarly, you can get some powers of $g^{p_1p_2^2p_3}$ that equals $1$.- You can get $g^{p_1p_2^3p_3^2}=1$ by power $p_2p_3$.

You hit one terminal divisor $(P-1)/p_1$. - You can get $g^{p_1^2p_2^2p_3^2}=1$ by power $p_1p_3$.

You hit one terminal divisor $(P-1)/p_2$. - You can get $g^{p_1^2p_2^3p_3}=1$ by power $p_1p_2$.

You hit one terminal divisor $(P-1)/p_3$. - So you
__can hit 3 terminal divisors.__

- You can get $g^{p_1p_2^3p_3^2}=1$ by power $p_2p_3$.
- The
**conclusion**is that__if there is a smaller power__of $g$ that equals 1, then you**can definitely**__hit some powers of terminal divisor__that equals 1, which are maximal factors of $P-1$. So you can**only check all the terminal divisiors**.

*Algorithm: *

The following **algorithm** works on $g$ and __the prime factorization__ of $P-1$.

- For each $i$, check if $g^{(P-1)/q_i}\overset{?}= 1\pmod P$.
- If yes for
**any**$i$, say “not a generator”. - Otherwise, say “generator”.

### Given only g and P

That’s nice.

But can one tell if $g$ is a generator __given only $g$ and $P$ ?__

(as __opposed to the prime factorization__ of $P-1$ which is in general hard to compute)

We don’t know, so there are some ways around it.

*Solution 1: *

Pick a random $P$ __together with its prime factorization__ of $P-1$.

This, it turns out, can be done due to a clever algorithm of Kalai [6].

*Solution 2: *

Pick $P=2Q+1$ where $Q$ is prime.

Such primes like $P$ are called **safe primes**, and $Q$ is called a **Sophie-Germain prime** after the famous mathematicians.

While there are infinitely many primes, it has only been **conjectured** that __there are infinitely many Sophie-Germain primes__. This remains **unproven**.

Moreover, the **Sophie-Germain primes** are considered __as the hardest class of primes for discrete logarithms.__

Solution 2 is what people typically use in practice.

**In practice**, you __just pick a random prime__ as $Q$ and **check** whether $P=2Q+1$ is a prime.

It’s efficient **as long as** the __Sophie-Germain primes are in dense distribution.__

The above solution 2 brings about new questions.

How to sample a random prime and how to test primality ?

## Primes

There are some questions about primes.

__How many__primes of $n$-bit length ?__How to test__primality ?__How to sample__a $n$-bit random prime ?

### How many primes

The **prime number theorem** tells us that __there are sufficiently many prime numbers.__

In particular, letting $\pi(N)$ denote __the number of prime numbers less than__ $N$, we know that $\pi(N)=\Omega(N/\log N)$.

Thus, if you __pick a random number smaller than__ $N$, **with probability** $1/\log N$ (which is $1$/**polynomial** in bit-length $n$ in question) you have a prime number at hand.

### Primality Testing

How to recognize that a given number is prime ?

This has been the subject of extensive research in computational number theory with many polynomial-time algorithms, culminating with the **deterministic polynomial-time primality testing algorithm** of Agrawal, Kayal and Saxena[1] (a.k.a. AKS) in 2002.

### Sample a prime

How to sample a $n$-bit random prime ?

The above two facts put together tell us how to __generate a random $n$-bit prime number.__

- Just
__pick a random number less than__$2^n$. - And
__test__if it is prime.

In expected $n$ iterations of this procedure, you will find a $n$-bit prime number, even a random prime number.

# One-way Functions

One-way function is the atom of cryptography.

A one-function is a function $f$ that is __easy to compute but hard to invert__ on average.

How to define a one-way function ?

**Take 1 (Wrong): **

For every p.p.t. algorithm $A$ and __every chosen__ $x$,there is a negligible function $\mu$, s.t.

$$

\operatorname{Pr}[A(f(x))=x]<\mu(n)

$$

It is hard for $A$ to __guess the inverse__ of $f(x)$ __for every chosen__ $x$.

It seems right.

How about the function $f(x)=0$ which maps any $x$ to $0$ ?

The probability of __guessing the inverse of $0$ is negligible__ if the domain size of $x$ is sufficiently large. Because it is essentially __random__.

But $f(x)=0$ is **not** one-way function obviously.

**Take 2: **

For every p.p.t. algorithm $A$ and every chosen $x$, there is a negligible function $\mu$,s.t.

$$

\operatorname{Pr}[A(f(x))\in \tilde{f}(f(x))]<\mu(n)

$$

where $\tilde{f}$ is __the inverse function.__

The **difference** of the new definition is that $A$ is trying to __guess some possible inverses__ of $f(x)$ rather than the exactly chosen $x$.

In this definition, $f(x)=0$ is not one-way function.

## Candidate: DLOG

Let us present an informal one-way function **candidate**.

$$

f(P,g,x)=(P,g,g^x \pmod P)

$$

Computing this function __can be done in time polynomial__ in the input length.

However, **inverting** is the __discrete logarithm problem__ which is conjectured to be hard.

Defined formally below.

In fact, this is not only a one-way function, but can also __be made into a family of one-way permutations.__

More on that later, we will see that **one-way permutations** can be used to build **pseudorandom generators**; and as we saw already, **pseudorandom generators** can be sued to build **pseudorandom functions** and stateless secret-key encryption and authentication.

We can do all the crypto we saw so far __based on the hardness of the discrete logarithm problem.__

Route: OWF→PRG→PRF.

However, going via this route may __not be the most efficient.__

So, we will look at related problems and try to **build more efficient PRGs and PRFs.**

## The Diffie-Hellman Assumptions

There is another route to __build PRGs and PRFs efficiently__, that is **Diffie-Hellman Assumptions.**

Route: DH→PRG.

Route: DH→PRF.

Given $g^x$ and $g^y\mod P$, you can **easily** compute $g^{x+y}=g^x\cdot g^y \pmod P$. But it is **hard** to compute $g^{xy}\pmod P$.

If you can compute discrete logarithms, then you can compute $x$ from $g^x$, and raise $g^y$ to $x$ to get $(g^y)^x=g^{xy} \pmod P$.

But discrete log is hard, so this isn’t an efficient way to solve the problem.

### CDH Assumption

The problem, called the **computational Diffie-Hellman (CDH)** problem, appears to be **computationally hard**, in fact __as hard as computing discrete logarithms!__

Moreover, it appears **hard** to even __tell if you are given the right answer or not!__

### DDH Assumption

But this requires some care to formalize.

At first, one **may** think that given $P,g,g^x,g^y$, it is **hard to distinguish** between the right answer $g^{xy} \pmod P$ versus a random number $u \mod P$.

Let us call the assumption that this **decisional problem** is hard the **decisional Diffie-Hellman (DDH) assumption.**

**Decisional Diffie-Hellman Assumption (first take):**

For a random $n$-bit prime $P$ and random generator $g$ of $\mathbb{Z}_P^*$, and random $x,y\in \mathbb{Z}_{P-1}$ and a random number $u\in \mathbb{Z}_P^*$, there is **no polynomial** (in $n$) time algorithm that __distinguishes__ between $(P,g,g^x\pmod P, g^y \pmod P, g^{xy} \pmod P)$ and $(P,g,g^x\pmod P,g^y \pmod P, u\pmod P)$.

However, the assumption turns out to be false.

### DDH is False in $\mathbb{Z}_P^*$

However, the DDH assumption is false in $\mathbb{Z}_P^*$.

It seems awfully strong on first look.

It says not only it is hard to compute $g^{xy}$ from $g^x$ and $g^y$, but also that __not even a single bit of__ $g^{xy}$ can be computed (with any polynomial advantage beyond trivial guessing).

However, there are __some information about $g^{xy}$ indeed dose leak from $g^x$ and $g^y$.__

Before that, let us take a quick detour in **Quadratic Residues.**

$$

h=g^2 \pmod P

$$

Some facts about Quadratic Residues:

- $h$ is a QR if $h=g^2\pmod P$.
- 1/2 of $\mathbb{Z}_P^*$ are QR.
- We
**can tell efficiently**if $h$ is a QR by evaluating $h^{(P-1)/2}\overset{?}=1 \pmod P$. - $h=g^x$ is a QR
**if and only if**$x$ is__even__.

Use the facts of quadratic residue in $\mathbb{Z}_P^*$, we know the following statements are equivalent.

- $g^{xy}$ is a QR.
- iff $xy$ is even.
- iff $x$ is even or $y$ is even.
- iff $g^x$ is a QR or $g^y$ is a QR.

Now, we can **distinguish** between $(P,g,g^x,g^y,g^{xy})$ and $(P,g,g^x,g^y,u)$.

$g^{xy}$ is a quadratic residue __with probability__ $3/4$ while $u$ is a quadratic residue __with probability__ $1/2$.

So the advantage for distinguishing is $1/4$.

Thus, we need to refine our assumption.

Looking at the **core reason** behind the above attack, we see that __there is a 1/2 chance__ that $g^x$ __falls into a subgroup__ (the subgroup of **quadratic residues**, to be precise) and once that happens__, $g^{xy}$ is also in the subgroup__ no matter what $y$ is.

These properties are furthermore detectable in polynomial-time which led us to attack.

A **solution** is to work with **subgroup of $\mathbb{Z}_P^*$ of prime order.**

In particular, we will take $P=2Q+1$ to __be a safe prime__ and __work with__ $\mathcal{QR}_P$, **the subgroup of quadratic residues** in $\mathbb{Z}_P^*$.

$\mathcal{QR}_P$ has **order** $(P-1)/2=Q$ which __is indeed prime!__

By virtue of this, __every non-identity element__ of $\mathcal{QR}_P$ is its **generator** w.r.t. theorem 2.

With this change, we can state the **following DDH assumption** which __is widely believed to be true.__

We know DLOG→CDH→DDH but no implications are known in the reverse directions.

# References

[1] M. Agrawal, N. Kayal, and N. Saxena. Primes is in p. Annals of mathematics, pages 781–793, 2004.

[2] D. Angluin. Lecture notes on the complexity of some problems est number theory. 1982.

[6] A. Kalai. Generating random factored numbers, easily. Journal of Cryptology, 16(4):287–289, 2003.

「Cryptography-MIT6875」: Lecture 6 - Number Theory