# 「Cryptography-ZKP」: Lec7 Poly-commit based on ECC

**Topics:**

- Poly-commit based on Error-correcting Codes
**Argument for Vector-Matrix Product**- Proximity Test
- Consistency Test

**Linear-time encodable code based on expanders**- Lossless Expander
- Recursive Encoding with constant relative distance

An important component in the common paradigm for efficient SNARK is the polynomial commitment scheme.

In Lecture 6 we introduced the KZG polynomial commitment based on bilinear pairing and other polynomial commitments based on discrete-log.

It is worth noting that prover time is dependent on $O(d)$ exponentiations, which is not strictly linear-time.

Today we are going to present a new class of **polynomial commitments based on error-correcting codes.**

Here are the motivations and drawbacks.

**Motivations:**

- Plausibly post-quantum secure
- No group exponentiations

Prover only uses hashes, additions and multiplications. - Small global parameters

**Drawbacks:**

- Large proof size
- Not homomorphic and hard to aggregate

# Background

## Error-correcting Code

Let’s briefly introduce the error-correcting code, which is allowed to correct errors.

A $[n,k,\Delta]$ code is defined below:

- $Enc(m)$: Encode a message of size $k$ to a codeword of size $n$.
**Minimum distance**(Hamming distance) between any two codewords is $\Delta$.

Note that we omit the alphabet $\Sigma$ (binary or field) here, which is another important parameter in code.

A simple example is the **repetition code**, which just repeat each symbol three times.

Consider the binary alphabet with $k=2$ and $n=6$.

The codewords are Enc(00)=000000, Enc(01)=000111, Enc(10)=111000 and Enc(11)=111111 with minimum distance $\Delta=3$.

This repetition code with minimum distance $\Delta=3$ can correct 1 error duting the transmission.

E.g., suppose the transmission can induce at most 1 error, then the message 010111 received from the sender can be decoded to 01.

It is worth mentioning that we are going to __build poly-commit using error-correcting code without efficient decoding algorithm.__

The truth is that we don’t use the decoding algorithm at all.

We define __rate__ and __relative distance__ over a $[n,k,\Delta]$ code.

The **rate** $\frac{k}{n}$ represents the ratio of the minimal message in the codeword of size $n$. We want the rate close to 1.

The **relative distance** $\frac{\Delta}{n}$ represents the ratio of the different locations between any two codewords.

E.g., repetition code with rate $\frac 1 a$ repeats each symbol $a$ times with $n=ak$, and has $\Delta=a$ and relative distance $\frac 1 k$.

We want rate and relative distance as big as possible, but increasing rate could decrease the relative distance.

The __trade-off between the rate__ and the distance of a code is well studied in code theory.

## Linear Code

The most common type of code is **linear code**.

An important **property** __is any linear combination of codewords is also a codeword.__

It is equivalent to say that __encoding can always be represented as vector-matrix multiplication__ between $m$ (of size $k$) and the generator matrix (of size $k\times n$).

Moreover, the __minimum (Hamming) distance__ is the same as the __codeword with the least number of non-zeros__ (**weight**).

(The weight of a codeword indicats the number of non-zeros.)

The subtraction of any two codewords is also a codeword so the number of the different locations directly implies the weight of another non-zero codeword.

### Reed-Solomon Code

A classical construction of linear code is **Reed-Solomon Code**.

It encodes messages in $\mathbb{F}_p^k$ to codewords in $\mathbb{F}_p^n$.

The **idea of encoding** is __veiwing the message of size $k$ as a unique degree $k-1$ univariate polynomial__ and the **codeword** is the evaluations at $n$ points.

It __treats each symbol of the message as an evaluation at a pre-defined point__ so the polynomial can be uniquely defined by interpolation on the fixed set of public points.

Then we can evaluate $n$ pre-defined public points as the codeword. E.g., $(\omega,\omega^2,\dots, \omega^n)$ for $n$-th root-of-unity $\omega^n=1 \mod p$.

The **minimal distance** is $\Delta=n-k+1$ (indicating the least number of non-zeros) __since a degree $k-1$ polynomial has at most $k-1$ roots__ (indicating the most number of zero evaluations).

E.g., when $n=2k$, rate is $\frac 1 2$ and relative distance is $\frac 1 2$.

It is pretty good in practice and is almost the best we can achieve.

RS code is a linear code that the encoding algorithm can be represented as vector-matrix multiplication where the vector is the message and the __generator matrix can be derived from Fourier matrix.__

The **encoding time** is $O(n\log n)$ __using the fast Fourier transform (FFT).__

# Poly-commit based on error-correcting codes

Recall the polynomial commitment scheme we discussed in previous lectures.

- keygen generates global parameters for $\mathbb{F}_p^{(\le d)}$.
- Prover commits to a univariate polynomial of degree $\le d$ .
- Later verifier requests to evaluate at point $u$.
- Prover opens $v$ with proof that $v=f(u)$ and $f\in \mathbb{F}_p^{(\le d)}$.

## Reduce PCS to Vec-Max Product

In the poly-commit based on error-correcting codes, we __write the polynomial coefficients in a matrix__ of size $\sqrt{d}$ by $\sqrt{d}$.

For simplicity, we assume $d$ is an exact power.

Note that the **vectorization of the above matrix** forms the original vector of polynomial coefficients, that is:

Hence, the **polynomial behind the matrix** can be written with two indices:

Then the **evaluation** of $f(u)$ can be __viewed as two steps__ as follows.

**Two steps of evaluation:**

**(Vecor-Matrix Product)**__Multiply a vector defined by point $u$__with the matrix of coefficients to get a vector of size $\sqrt{d}$.**(Inner Product)**Multiply the vector of size $\sqrt{d}$__with another vector defined by point $u$__to obtain the final evaluation.

With this nice observation, we actually **reduce the poly-commit to an argument for vector-matrix product.**

Roughly speaking, **prover** can __only evaluate the first step__ and sends a vector of size $\sqrt{d}$ as proof.

**Verifier** __checks whether the Vec-Mat product is correct__ using proof system and __evaluates the second step locally__, which is an inner product of the Vec-Mat product and the vector defined by point $u$.

As a result, the argument for Vec-Mat product gives us a polynomial commitment with $\sqrt{d}$ proof size.

## Argument for Vec-Mat Product

Now our **goal** is to design a scheme to __test the Vec-Mat product without sending the matrix directly.__

## Commit

As usual, we need to commit to the polynomial.

Here we instead __commit to an encoded matrix defined by the polynomial.__

We first use a linear code to encode the original matrix defined by the coefficients of polynomial.

Concretely speaking, we **encode each row with a linear code to compute an encoded matrix** of size $\sqrt{d}\times n$ where $n$ is the size of the codeword.

We will __use a linear code with constant rate__ so that the __size of the encoded matrix is asymptotical to $d$.__

Then we can **commit to each column of the encoded matrix using Merkle tree.**

Recall the Merkle tree commitment introduced in [Lecture 4].

The root hash is served as the __commitment to the encoded matrix.__

Then verifier can request to __open each column individually__ and checks whether the opened column is altered.

It is worth noting that the **key generation** for this Merkle tree commitment is only __sampling a hash function__, resulting in a __constant size global parameters with no trusted setup.__

## Eval and Verify

We actually perform the evaluation together with verification.

It consists of two tests, **proximity test** and **consistency check**.

We fisrt consider __how a malicious prover could cheat in the commitment.__

A malicious prover can commit to a matrix of inappropriate size but it can be recognized easily by the Merkle tree proof.

A **malicious prover** can __commit to an abitrary matrix__ of specified size in which __each row is not a valid codeword.__

E.g., a valid RS code is a vector of evaluations of a polynomial specified by the message.

Hence, verifier uses the **proxomity test** to __test if the committed matrix indeed consists of $\sqrt{d}$ codewords.__

Having checked this proximity test, verifier is convinced that the committed matrix is nearly close to the encoded matrix defined by the original matrix of coefficients.

Then verifier can move to the **consistency check** to __compute (and verifiy) the actual evaluation.__

### Step1: Proximity Test

Ligero [AHIV’2017] and [BCGGHJ’2017] are two independent works to introduce the proximity test.

Ligero proposed the so-called interleaved test using Reed-Solomon code with quasi-linear prover time.

[BCGGHJ’2017] instead used a linear-time encodable code to build the ideal linear commitment model, which is the first work to build SNARK with strictly linear prover time.

Note that the proximity test in these two works are proposed to build general-purpose SNARKs. Here we use it to build poly-commit as a specified protocol.

We first present the description of the proximity test as below.

**Verifier**__samples a random vector__$r$ of size $\sqrt{d}$ and sends it to prover.**Prover**returns the vector-matrix product of the random vector $r$ and the__encoded matrix.__**Verifier**requests to__open several random columns__and**prover**reveals them with Merkle tree proof.**Verifier**performs__3 checks__- The returned vector is a
__codeword__ - The
__opened columns are consistent__with the committed Merkle tree. - The
__inner product__between $r$ and the opened column is consistent with the corresponding element of the returned vector.

- The returned vector is a

The **completeness** is evident.

The vec-mat product computed by the **honest prover** is indeed the __linear combination of rows (codewords) specified by the random vector chosen by the verifier.__

Recall the propery of the linear codes that a linear combination of codewords is a codeword.

So these 3 checks will be passed by verifier.

Let’s intuitively give the **proof of soundness.**

Assume for the __contradiction__ that the malicious prover __commits to a fake matrix__, and computes the vec-mat product by this fake matrix.

**Soundness (Intuition):**

- If the
__vector is correctly computed__, under our assumption, the product is__not a codeword__. → check 1 will be failed. - If the
__vector is false__meaning that the prover just returns an arbitrary codeword, there are__many different locations from the correct answer.__- By check 2, columns are as committed.
- Probability of passing check 3 is
__extreamly small.__

Let’s discuss **the second case** where the __vector sent by the prover is false__ and $w=r^TC$ denotes the correct answer.

In the __ formal proof for soundness:__ [AHIV’2017], it defines a parameter $e<\frac{\Delta}{4}$, which is related to the minimal distance $\Delta$, to

__measure the distance between the committed matrix and the codeword space.__

Concretely speaking, $e$ measures the minimal distance between any row (of the committed matrix) and any codeword (in the codeword space).

If the __committed (fake) matrix is $e$-far from any codeword__ for $e<\frac{\Delta}{4}$, then the **probability** that the __vec-mat product $w=r^T C$ is $e$-close to any codeword__ is $\le \frac{e+1}{\mathbb{F}}$, which is extreamly small.

Then we can rule out this case, and the **remaining case** is that the __correct answer $w=r^TC$ is $e$-far from any codeword.__

__Under this condition, we know there are at least $e$ different positions__ between the codeword sent by prover and the correct answer $w$.

Then the probability that check 3 is true for $t$ random columns is bounded by $(1-\frac e n)^t$ where $\frac e n$ is __constant for the linear code with constant relative distance__, e.g. RS code.

Hence, soundness probability can be reduced to __negligible__ probability. That’s why we want linear codes with constant relative distance.

### Optimization for Proximity Test

There is one optimization for the proximity test.

Instead of sending the codeword, **prover** can __send the message behind the codeword to verifier.__

Note that the message is computed by the random vector and the original matrix defined by the polynomial coefficients.

Then **verifier** can __encode the message to obtain the corresponding codeword__ that is supposed to be sent by prover.

This nice optimization __reduces the proof size__ from $n$ to $k$.

Moreover, there is **no need** for verifier to __perform the first check explicitly that the vector is a codeword__ since the encoding is done by the verifier.

We depict the optimized proximity test as below.

**Verifier**samples a random vector $r$ of size $\sqrt{d}$ and sends it to prover.**Prover**returns the vector-matrix product of the random vector $r$ and the__original matrix of coefficients.__**Verifier**__encodes the message to compute the codeword.__**Verifier**requests to open several random columns and**prover**reveals them with Merkle tree proof.**Verifier**performs__2 checks__~~The returned vector is a codeword~~- The opened columns are consistent with the committed Merkle tree.
- The inner product between $r$ and the opened column is consistent with the corresponding element of the returned vector.

### Step2: Consistency Check

With the **proximity test**, the __verifier knows the committed matrix is close to an encoded matrix with overwhelming probability.__

Next we can **perform the consistency check** to really __test the evaluation of vec-mat product__ between the vector defined by point $u$ and the original matrix of size $\sqrt{d}\times \sqrt{d}$.

The consistency check is almost the same as the proximity test with the optimization mentioned above excetp that the vector is defined by point $u$ rather than a random vector $r$.

Likewise, the verifier encodes the message to compute the codeword so the first check can be removed.

Futhermore, the **verifier** can __use the same opened columns in the proximity test to perform the third check.__

The cosistency check is depicted as below where the first two checks can be removed.

**Knowledge Soundness (Intuition):**

In the consistency test, we actually need to prove the **knowledge soundness.**

By the proximity test, the committed matrix $C$ is close to an encoded matrix that can be uniquely decoded to a matrix $F$ defined by polynomial coefficients.

Intuitively speaking, __there exists an extractor that extracts $F$ by Merkle tree commitment and decoding $C$,__ s.t. $\vec{u}\times F=m$ with probability $1-\epsilon$.

## Summary

To put everything together, the poly-commit scheme based on linear code is described as below.

**PCS based on Linear Code:**

**Keygen**: sample a hash function**Commit**:- encode the coefficient matrix $F$ of $f$ row-wise with a linear code
- compute the Merkle tree commitment col-wise

**Eval and Verify:****Proximity test**: random linear combination of all rows, check its consistency with $t$ random columns**Consistency test**: compute $\vec{u}\times F=m$, encode $m$ and check its consistency with $t$ random columns**Evaluate**$f(u)=<m,u’>$ by verifier where $u’$ is another vector defined by point $u$.

An important thing to point out is that the **proximity test is necessary** for evaluation and verification although it is almost the same as the consistency test.

**Suppose** we only perform the consistency test, then the verifier checks consistency of the __inner-product of vector $\vec{u}$ and the random columns.__

But it **dose not work** since vector $\vec{u}$ is defined in a very structured way.

Intuitively speaking, it has to **use random challenges** chosen by the verifier to guarantee the consistency.

**Properties:**

- Keygen: $O(1)$, transparent setup with constant size $gp$.
- Commit:
- Encoding: $O(d\log d)$ field multiplications using RS code, $O(d)$ using linear-time encodable codes.
- Merkle tree: $O(d)$ hashes, $O(1)$ commitment size.

- Eval: $O(d)$ field multiplications
- Proof size: $O(\sqrt{d})$ (several vectors of size $\sqrt{d}$ )
- Verifier time: $O(\sqrt{d})$

Look at the **concrete performance** in [GLSTW’21] with degree $d=2^{25}$ and linear-time encodable code.

- Commit: 36s
- Eval: 3.2s
- Proof size: 49MB
- Verifier time: 0.7s

It is __excellent in practice__ and __significantly faster than PCS based on pairing or discrete-log__ (such as KZG, Bulletproofs) because it only uses linear operations without any exponentiations.

## Related Works

Let’s disscuss the following up-to-date works based on error-correcting codes.

It **proposed** the __tensor query IOP__ $<f,(\vec{u}\otimes \vec{u}’)>$, which evaluates inner-product of vector $f$ of size $\sqrt{d}$ and another vector generated by tensor product between two sub-vectors of size $\sqrt{d}$. (dimentsion 2)

Note that this IOP only works for the product of specific form.

Moreover, it **generalizes to multiple dimentsions** and performs the proximity test and consistency test dimension by dimension __with smaller proof size__ $O(n^\epsilon)$ for constant $\epsilon<1$.

- Brakedown [GLSTW’21]

This work **proposed** the __polynomial commitment based on tensor query.__

As described above, we don’t use decoding algorithm at all in the poly-commit. The prover just sends the message and the verifier encodes the message to get the corresponding codeword. It __gives relaxation on the design of the poly-commit which allows to use linear codes without efficient decoding algorithm.__

**Unfortunately**, when we __prove the knowledge soundness__, it has to extract the matrix of polynomial coefficients from the committed encoded matrix in which __the efficient decoding is required__. If the decoding algorithm is not efficient, the extractor is not polynomial as well.

Back to this work, the **another contribution** __is showing an alternative way to prove knowledge soundness without efficient decoding algorithm.__

As a result, it enables us to build poly-commit using any linear codes without efficient decoding algorithm.

It improves proof size to $\text{poly}\log (n)$ with a proof composition of tensor IOP and PCP of proximity. [Mie’09]

- Orion [Xie-Zhang-Song’22]

It improves the proof size to $O(\log^2 n)$ with a proof composition of the code-switching technique [Ron-Zewi-Rothblum’20]

Concretely, the proof size is 5.7MB for $d=2^{25}$, which is quite large in practice.

# Linear-time encodable code based on expanders

It is noteworthy that the following line of works all **build SNARKs with linear prover** __using the linear-time encodable code with constant relative distance.__

In the last segment, we are going to present the construction of linear-time encodable code based on expanders.

**Linear-time encodable code** is proposed by [Spielman’96] and generalized from binary to field by [Druk-Ishai’14], which __relies on the expander graph.__

## Expander Graph

Look at the following __bipartite graph__ where each node on the left set has __3 outgoing edges__ connecting to the nodes on the right edges and __every two nodes on the left set connect at least 5 nodes on the right set.__

It is a **good expander** since __every two nodes on the left set have at most 6 outgoing edges.__

In terms of __encoding__, consider the __left nodes as message__ where each symbol is put on a node and the __right nodes is the codeword__.

The encoding of the message is to __compute for each right-side node the sum of nodes connected from the left set,__ which can be represented as the vector-matrix multiplication between the message and the adjacent matrix of the graph so it is __a linear code.__

A **nature question** is __why we need such an expander graph to achieve linear codes with a good relative distance.__

Intuitively speaking, with such a good expander, __even a single non-zero node on the left set can be expanded to many non-zero nodes on the right set__, that is, it amplies the number of non-zeros (weight) from the message to the codeword, enabling us to achieve good relative distance.

## Lossless Expander

We are going to describe the lossless expander in a formal way.

Let $|L|$ denote __the number of left nodes__ and __the number of the right nodes__ is $\alpha |L|$ for a constant $\alpha\in (0,1)$.

The __degree of a left node__ is denoted by $g$.

Consider a good (almost perfect) expander that for every subset $S$ of nodes on the left, the __number of neighers__ $|\Tau(S)|=g|S|$ for $|S|\le \frac{\alpha |L|}{g}$, which is bounded by the number of right nodes.

But it is too good to achieve.

We need to __relax__ the equality and the boundary.

Likewise, the encoding on the lossless expander is to sum up the connected nodes from the left nodes for each right node to compute the codeword.

## Recursive Encoding

Then we move to the construction of linear-time encodable codes, which uses the recursive encoding with the lossless expander.

The encoding algorithm is depicted as below.

Let’s elaborate on the detailed procedure of encoding a message $m$ of size $k$ to a codeword of size $4k$ with rate $1/4$.

**Recursive Encoding:**

__Copy the message__as the first part of the final codeword.__Apply the lossless expander__with $\alpha=\frac 1 2$ to compute the codeword $m_1$ of size $k/2$.__Assume we already had an encoding algorithm__for message $m_1$ of size $k/2$ with rate $1/4$ and__good relative distance__$\Delta$, then we apply it to compute the codeword $c_1$ of size $2k$ as the second part of the final codeword.__Apply another lossless expander__with $\alpha =\frac 1 2$ for messages of size $2k$ to compute the codeword $c_2$ of size $k$ as the third part.- The final codeword is the concatenation $c=m|| c_1||c_2$

The **remaining thing** is __how we get the encoding algorithm for messages of size $k/2$ with rate $1/4$ and good relative distance.__

That’s exactly the **recursiving encoding** that we just use the entire encoding algorithm for the message of size $k/2$.

Hence, we __repeate the entire encoding algorithm__ in recursion from $k/2,k/4,\dots$ until a constant size.

Note that the __lossless expanders used in each iteration are different__ since the size of message are different.

Finally we can __use any code with good distance for a constant-size message__. E.g., Reed-Solomon code.

## Distance of the Code

The recursive way of encoding enables to __achieve a constant relative distance:__

where $\Delta$ is the relative distance of the __code used in the middle__ from $k/2$ to $2k$ and $\frac{\delta}{4g}$ depends on the __expander graph.__

__ Proof of relative distance (case by case):__ [Druk-Ishai’14]

- If weight of $m$ is larger than $4k\Delta’$, then the relative distance is larger than $\frac{4k\Delta’}{4k}=\Delta’$.

→ Done. It means that for all messages with large weight, we automatically get codewords with large weight. - If weight of $m\le 4k\Delta'\le \frac{\delta k}{g}$, the condition of the first lossless expander holds. (since $\Delta'\le \frac{\delta}{4g}$ )
- Let $S$ be the set of non-zero nodes in $m$, then we have $|\Tau(S)|\ge (1-\beta)g|S|$.
- We can set $g\ge 10$ and $\beta < 0.1$, then at least $(1-2\beta)g|S| > 8|S|>0$ vertices in Hamming ball have a unique neighbor in $S$.
- Hence, $m_1$ (the output of this lossless expander) is
__non-zero__. - After applying the encoding for $m_1$ of size $k/2$ with relative distance $\Delta$, the wight of $c_1$ $\ge 2k\Delta\ge 2k\Delta’$.

(The second inequality holds by the definition of min).

- If the weight of $c_1$ is larger than $4k\Delta’$, then the relative distance is larger than $\Delta’$.

→Done - Else, weight of $c_1$ is $\le 4k\Delta’<\frac{\delta2k}{g}$, the condition of the second lossless expander holds.
- Let $S’$ be the set of non-zero nodes in $c_1$, then we can show the weight of $c_2$ is at least $|\Tau(S’)|\ge (1-\beta)g|S’|>8|S’|>16k\Delta’ >(4k)\Delta’$.

## Sampling of the Lossless Expander

With lossless expander, we can build the linear-time encodable codes with constant relative distance.

The last piece of the puzzle is __how to construct the lossless expander.__

[Capalbo-Reingold-Vadhan-Widgerson’2002] proposed an __explicit construction__ of lossless expander.

Note that being explicit is __deterministic__.

__Unfortunately, it has large hidden constant__ so the concrete efficiency is not good.

**An alternative way** is __random sampling__ since a random graph is supposed to have good expansion.

Since the sample space is polynomial, there __is a $1/\text{poly}(n)$ failure probability__ instead of negligible probability.

## Improvements of the Code

- Brakedown [GLSTW’21]

Instead of the plain summations when encoding, it uses __random summations__, which assign a random weight for each edge and performs the weighted summation, to significantly boost the distance.

- Orion [Xie-Zhang-Song’22]

It proposes __an expander testing with a negligible failure probability via maximum density of the graph.__

Let’s sum up the pros and cons of the polynomial commitment (and SNARK) based on linear code.

- Pros
- Transparent setup: $O(1)$
- Commit and Prover time: $O(d)$ field additions and multiplications
- Plausibly post-quantum secure
- Field agnostic

It means that we can use any field.

- Cons
- Proof size: $O(\sqrt{d})$, MBs

「Cryptography-ZKP」: Lec7 Poly-commit based on ECC