# SHPlonk

## Introduction

The KZG commitment was introduced in [KZG10](https://www.iacr.org/archive/asiacrypt2010/6477178/6477178.pdf). The time required for commitment is proportional to the degree of the polynomial, and the commitment consists of a single $$\mathbb{G}\_1$$ element, which is $$C = \[P(s)]\_1$$ for given $$P(X)$$.

The opening proof $$W$$ for a point $$x$$ is defined as:

$$
W = \left\[\frac{P(s) - y}{s - x} \right]\_1
$$

Verification is performed using the following equation:

$$
e(W, \[s]\_2 -\[x]\_2) \cdot e(C - \[y]\_1,\[1]\_2) \stackrel{?}= 1
$$

However, when proving over AIR, multiple polynomials need to be opened at $$x$$ and $$\omega \cdot x$$. In PLONK-based schemes that support custom gates, such as Halo2, polynomials must be opened at various opening points in different combinations. One of the efficient methods for proving in batch settings is **SHPlonk**, which was introduced in [BDFG20](https://eprint.iacr.org/2020/081.pdf).

## Background

We'll now see how [GWC19](https://eprint.iacr.org/2019/953.pdf) opens multiple polynomials at multiple points. Let's start with simple cases.

### Opening Multiple Polynomials at a Single Point

Suppose two polynomials, $$P\_0(X)$$ and $$P\_1(X)$$, are opened at $$x$$ with corresponding values $$y\_0$$ and $$y\_1$$.

1. The verifier provides $$\alpha \leftarrow \mathbb{F}$$ to the prover.
2. The prover generates the following opening proof $$W$$:

$$
W = \left\[ \frac{P(s) - y}{s- x} \right]\_1
$$

Where $$P(X) := P\_0(X) + \alpha \cdot P\_1(X)$$ and $$y := y\_0 + \alpha \cdot  y\_1$$.

Verification is performed using the following equation:

$$
e(W, \[s]\_2 - \[x]\_2) \cdot e(C- \[y]\_1, \[-1]\_2)\stackrel{?}=1
$$

Where $$C := C\_0 + \alpha \cdot C\_1$$ and $$C\_0, C\_1$$ are commitments of $$P\_0(X), P\_1(X)$$ respectively.&#x20;

This approach is almost identical to opening a single polynomial at a single point, with the only difference being the random linear combination applied to the polynomials before constructing the proof.

### Opening a Single Polynomial at Multiple Points

Suppose a polynomial $$P(X)$$ is opened at two points, $$x\_0$$ and $$x\_1$$, with corresponding values $$y\_0$$ and $$y\_1$$.

The prover generates the following opening proof $$W$$:

$$
W = \left\[ \frac{P(s) - r(s)}{(s- x\_0)(s - x\_1)} \right]\_1
$$

Where $$r(X)$$ is defined as:

$$
r(X) := \begin{cases}
P(x\_0) & \text{ if } X = x\_0 \\
P(x\_1) & \text{ if } X = x\_1
\end{cases}
$$

Verification is performed using the following equation:

$$
e(W, \[(s- x\_0)(s - x\_1)]\_2) \cdot e(C - \[r(s)]\_1, \[-1]\_2) \stackrel{?}= 1
$$

Where $$C$$ is the commitment of $$P(X)$$.

There are two key issues with this naive approach:

1. **Increasing** $$\mathbb{G}\_2$$ **​ parameters**: As the number of opening points increases, the required $$\mathbb{G}\_2$$ parameters also increase. For instance, in this case, $$\[s^2]\_2$$ is required.&#x20;
2. **Expensive group operation in** $$\mathbb{G}\_2$$: Operations in $$\mathbb{G}\_2$$ are computationally expensive, leading to higher gas consumption in on-chain, and inefficient in recursive proof generation.

To solve these issues, we modify the protocol as follows:

The prover generates 2 opening proofs, $$W\_0$$ and $$W\_1$$:

$$
W\_0 = \left\[ \frac{Q\_0(s)}{s - x\_0} \right], \quad W\_1 = \left\[ \frac{Q\_1(s)}{s - x\_1} \right]
$$

Where $$Q\_0(X) := P(X) - y\_0$$ and $$Q\_1(X) := P(X) - y\_1$$.

The verifier uses $$\alpha \leftarrow \mathbb{F}$$ to verify the following equation:

$$
e(\[Q\_0(s)]\_1 + \alpha \cdot \[Q\_1(s)]\_1 + x\_0 \cdot W\_0 + \alpha \cdot x\_1 \cdot W\_1, \[1]\_2) \cdot e(-W\_0 - \alpha \cdot W\_1, \[s]\_2) \stackrel{?}= 1
$$

The verifier can obtain $$\[Q\_0(s)]\_1, \[Q\_1(s)]\_1$$ by computing $$C - \[y\_0]\_1$$ and $$C - \[y\_1]\_1$$ respectively.

**Proof.**

By extracting the exponent components, it can be confirmed that the verification equation holds.

$$
\begin{align\*}
\&Q\_0(s) + \alpha \cdot Q\_1(s) + (x\_0 - s) \cdot \frac{Q\_0(s)}{s - x\_0} + \alpha \cdot (x\_1 - s) \cdot \frac{Q\_1(s)}{s - x\_1} \\
\= \&Q\_0(s) + \alpha \cdot Q\_1(s) - Q\_0(s) - \alpha \cdot Q\_1(s) \\
\= &0
\end{align\*}
$$

Now the number of required $$\mathbb{G}\_2$$ parameters is now constant, and the group operations in $$\mathbb{G}\_2$$ are eliminated.

### Opening Multiple Polynomials at Multiple Points

In [GWC19](https://eprint.iacr.org/2019/953.pdf), a method that combines the previous techniques is proposed. Consider the following scenario:

* $$P\_0(X)$$ is opened at $$x\_0$$ with value $$y\_0$$.
* $$P\_1(X)$$ is opened at $$x\_0, x\_1$$ with values $$y\_1, y\_2$$.
* $$P\_2(X)$$ is opened at $$x\_0, x\_1$$ with values $$y\_3, y\_4$$.

1. The verifier provides $$\alpha \leftarrow \mathbb{F}$$ to the prover.
2. The prover generates 2 opening proofs, $$W\_0$$ and $$W\_1$$:

$$
W\_0 = \left\[\frac{Q\_0(s)}{s - x\_0} \right]\_1, \quad W\_1 = \left\[ \frac{Q\_1(s)}{s - x\_1} \right]\_1
$$

Where $$Q\_0(X) := P\_0(X) - y\_0 + \alpha(P\_1(X) - y\_1) + \alpha^2(P\_2(X) - y\_3)$$ and $$Q\_1(X) := P\_1(X) - y\_2 + \alpha(P\_2(X) - y\_4)$$.

Verification is performed with $$\beta \leftarrow \mathbb{F}$$ using the following equation:

$$
e(\[Q\_0(s)]\_1 + \beta \cdot \[Q\_1(s)]\_1 + x\_0 \cdot W\_0 + \beta \cdot x\_1 \cdot W\_1, \[1]\_2) \cdot e(-W\_0 - \beta \cdot W\_1, \[s]\_2) \stackrel{?}= 1
$$

The verifier can obtain $$\[Q\_0(s)]\_1, \[Q\_1(s)]\_1$$ by computing $$\alpha(\alpha(C\_2 - \[y\_3]\_1) + C\_1 - \[y\_1]\_1) + C\_0 - \[y\_0]\_1$$ and $$\alpha(C\_2 - \[y\_4]\_1) + C\_1 - \[y\_2]\_1$$ respectively, where $$C\_0, C\_1, C\_2$$ are the commitments of $$P\_0(X), P\_1(X), P\_2(X)$$.

If there are $$n$$ polynomials of degree $$d$$, and each polynomial is opened at $$t$$ points, the following holds:

* $$\mathsf{srs}$$ **size**: $$(d + 1) \mathbb{G}\_1, 2\mathbb{G}\_2$$
  * $$\mathsf{srs}$$ stands for "structure reference string".
* **Prover work**: $$O( d \cdot t)\mathbb{G}\_1, O(t \cdot d \log d) \mathbb{F}$$
  * To compute each division in $$\frac{Q\_i(X)}{X - x\_i}$$, it takes $$O(d \log d)  \mathbb{F}$$ operations.
  * To compute each proof $$W\_i$$, it takes $$O(d) \mathbb{G}\_1$$ operations.
* **Proof length**: $$t\mathbb{G}\_1$$
* **Verifier work**: $$O(n \cdot t) \mathbb{G}\_1, 2 \mathsf{P}$$
  * To compute all of $$\[Q\_i(s)]\_1$$, it takes $$O(n \cdot t) \mathbb{G}\_1$$ operations.

Here, $$\mathbb{G}\_i$$ represents scalar multiplication in $$\mathbb{G}\_i$$​, $$\mathbb{F}$$ represents addition or multiplication in $$\mathbb{F}$$, and $$\mathsf{P}$$ denotes a pairing operation.

SHPlonk enables a smaller proof length and faster verifier work for this case.

## Protocol Explanation

### Version 1

To illustrate this scheme, let's use the same example as before. If we group the opening points, we can form two sets: $$S\_0 = {x\_0}$$ and $$S\_1 = {x\_0, x\_1}$$. The complete set of opening points is then $$T = {x\_0, x\_1}$$.&#x20;

The zero polynomial $$Z\_{S}(X)$$ on $$S = {x\_0, \dots, x\_{n-1}}$$ is defined as:

$$
Z\_S(X) := (X - x\_0) \cdots (X - x\_{n-1})
$$

The opening proof $$W$$ is constructed as follows:

1. The verifier provides $$\alpha, \beta \leftarrow \mathbb{F}$$ to the prover.
2. The prover computes $$W = \[h(s)]\_1$$ as follows:

$$
h(X) := \frac{Q\_0(X)}{Z\_{S\_0}(X)} + \beta \cdot \frac{Q\_1(X)}{Z\_{S\_1}(X)}
$$

Where $$Q\_0(X) := P\_0(X) - y\_0, Q\_1(X) := P\_1(X) - r\_0(X) + \alpha(P\_2(X) - r\_1(X)), r\_0(X),$$$$r\_0(X)$$ and $$r\_1(X)$$ are defined as follows:

$$
r\_0(X) := \begin{cases}
P\_1(x\_0) & \text{ if } X = x\_0 \\
P\_1(x\_1) & \text{ if } X = x\_1
\end{cases},
\quad
r\_1(X) := \begin{cases}
P\_2(x\_0) & \text{ if } X = x\_0 \\
P\_2(x\_1) & \text{ if } X = x\_1
\end{cases}
$$

To verify the opening proof $$W$$, the following steps are performed:

1. Compute $$Z\_0$$ and $$Z\_1$$ using $$Z\_i := \[Z\_{T \setminus S\_i}(s)]\_2$$:

$$
Z\_0 = \[s]\_2 - \[x\_1]\_2, \quad Z\_1 =\[1]\_2
$$

2. Verify using the following equation:

$$
e(\[Q\_0(s)]\_1, Z\_0)\cdot e(\beta \cdot \[Q\_1(s)]\_1, Z\_1) \stackrel{?}= e(W, \[Z\_T(s)]\_2)
$$

The verifier can obtain $$\[Q\_0(s)]\_1, \[Q\_1(s)]\_1$$ by computing $$C\_0 - \[y\_0]\_1$$ and $$\alpha(C\_2 - \[r\_1(s)]\_1) + C\_1 - \[r\_0(s)]\_1$$ respectively, where $$C\_0, C\_1, C\_2$$ are commitments of $$P\_0(X), P\_1(X), P\_2(X)$$.

**Proof.**

Extracting the exponent components from the above equation, we obtain:

$$
\begin{align\*}
\&Q\_0(s) \cdot Z\_{T \setminus S\_0}(s) + \beta \cdot Q\_1(s) \cdot Z\_{T \setminus S\_1}(s) \\
\=&\left(\frac{Q\_0(s)}{Z\_{S\_0}(s)} + \beta \cdot \frac{Q\_1(s)}{Z\_{S\_1}(s)}\right) \cdot Z\_T(s) \\
\=\&h(s) \cdot Z\_T(s)
\end{align\*}
$$

Thus, the verification equation holds.

If there are $$n$$ polynomials of degree $$d$$, and each polynomial is opened at $$t$$ points, the computational costs are as follows:

* **The number of opening sets** $$s$$: $$1 \le s \le n$$
* $$\mathsf{srs}$$ **size**: $$(d + 1) \mathbb{G}\_1, (t + 1)\mathbb{G}\_2$$
* **Prover work**: $$O( d )\mathbb{G}\_1, O(d \cdot n + s \cdot d \log d) \mathbb{F}$$
  * This doesn't account for computation of each $$r\_i(X)$$, since they are normally small operations.
  * To compute all of  $$Q\_i(X)$$, it takes $$O(d \cdot n) \mathbb{F}$$ operations.
  * To compute each $$\frac{Q\_i(X)}{Z\_{S\_i}(X)}$$, it takes $$O(d \log d) \mathbb{F}$$ operations.
  * To compute a proof $$W$$, it takes $$O(d) \mathbb{G}\_1$$ operations.
* **Proof length**: $$1\mathbb{G}\_1$$
* **Verifier work**: $$O(n ) \mathbb{G}\_1, O(s \cdot t) \mathbb{G}\_2, (s+1 )\mathsf{P}$$
  * To compute all of $$\[Q\_i(s)]\_1$$, it takes $$O(n)\mathbb{G}\_1$$ operations.
  * To compute $$\[Z\_T(s)]\_2$$, it takes $$O(t)\mathbb{G}\_2$$ operations.
  * To compute each $$Z\_i$$, it takes $$O(t)\mathbb{G}\_2$$ operations.

Here, $$\mathbb{G}\_i$$ represents scalar multiplication in $$\mathbb{G}\_i$$​, $$\mathbb{F}$$ represents addition or multiplication in $$\mathbb{F}$$, and $$\mathsf{P}$$ denotes a pairing operation.

Now the same issues mentioned in [Opening a Single Polynomial at Multiple Points](#opening-a-single-polynomial-at-multiple-points) arise again. Since $$\[Z\_T(s)]\_2$$ is the primary cause of these issues, we need to devise a more efficient approach—one that treats the opening as if it were occurring at a single point rather than $$t$$ points.

During the presentation, [BATZORIG ZORIGOO](https://app.gitbook.com/u/lqk5Tx9zY4XYVRfF3ReRDiOhbxG3 "mention") asked a great question: *"What happens if all polynomials are opened at* $$T$$*?"* Since this simplifies $$s$$ to 1, both prover and verifier work can be reduced. However, there are two main issues:

1. **Prover Complexity:** The prover must evaluate the polynomials at every point in $$T$$, which increases the number of opening points to $$n \cdot t$$.
2. **ZK Proof Size:** The ZK proof size increases because it must include all the opened values.

It's important to distinguish between two types of proofs here. The **Proof** mentioned earlier refers specifically to proving polynomial openings. However, in this context, the **ZK Proof** not only includes the opening proof but also contains commitments and the opened values themselves.

### Version 2

Version 2 addresses the issues from the previous approach by introducing one additional proof.

The opening proofs $$W , W'$$ are constructed as follows:

1. The verifier provides $$\alpha, \beta \leftarrow \mathbb{F}$$ to the prover.
2. The prover computes $$W = \[h(s)]\_1$$ as follows:

$$
h(X) := \frac{f(X)}{Z\_T(X)}
$$

Where $$f(X)$$ is defined as:

$$
f(X) := Z\_{T \setminus S\_0}(X) \cdot Q\_0(X) + \beta \cdot Z\_{T \setminus S\_1}(X) \cdot Q\_1(X)
$$

where $$Q\_0(X) := P\_0(X) - y\_0, Q\_1(X) := P\_1(X) - r\_0(X) + \alpha(P\_2(X) - r\_1(X)), r\_0(X),$$$$r\_0(X)$$ and $$r\_1(X)$$ are defined as follows:

$$
r\_0(X) := \begin{cases}
P\_1(x\_0) & \text{ if } X = x\_0 \\
P\_1(x\_1) & \text{ if } X = x\_1
\end{cases},
\quad
r\_1(X) := \begin{cases}
P\_2(x\_0) & \text{ if } X = x\_0 \\
P\_2(x\_1) & \text{ if } X = x\_1
\end{cases}
$$

3. The verifier provides $$z \leftarrow \mathbb{F}$$ to the prover.
4. The prover computes $$W'$$ as follows:

$$
W' = \left\[ \frac{L(s)}{s- z} \right]\_1
$$

Where $$L(X)$$ is defined as:

$$
L(X) := f\_z(X) - Z\_T(z) \cdot h(X)
$$

Here, $$f\_z(X)$$ is defined as:

$$
f\_z(X) := Z\_{T \setminus S\_0}(z)(P\_0(X) - y\_0) + \beta \cdot Z\_{T \setminus S\_1}(z) (P\_1(X) - r\_0(z) + \alpha(P\_2(X) - r\_1(z)))
$$

The polynomial $$L(X)$$ is crucial in solving the issues from the previous approach. Unlike $$Q\_i(X)$$, which vanishes at multiple points, this $$L(X)$$ now vanishes at single point at $$z$$ because:&#x20;

$$
L(z) = f(z) - Z\_T(z) \cdot h(z) = 0
$$

To verify the opening proofs, the verifier checks the following equation:

$$
e(\[f\_z(s)]\_1 - Z\_T(z) \cdot W, \[1]\_2) \stackrel{?}= e(W', \[s]\_2 - \[z]\_2)
$$

The verifier can obtain $$\[f\_z(s)]\_1$$ as follows:

$$
\[f\_z(s)]*1 = \beta \cdot Z*{T \setminus S\_1}(z) (\alpha(C\_2 - \[r\_1(z)]\_1) + C\_1 - \[r\_0(z)]*1) + Z*{T \setminus S\_0}(z)(C\_0 - \[y\_0]\_1)
$$

Where $$C\_0, C\_1, C\_2$$ are commitments of $$P\_0(X), P\_1(X), P\_2(X)$$.

**Proof.**

Extracting the exponent terms from the above equation, we obtain:

$$
\begin{align\*}
\&f\_z(s) - Z\_T(z) \cdot h(s)  \\
\=& L(s) \\
\=&\frac{L(s)}{s-z} \cdot (s - z)
\end{align\*}
$$

Thus, the verification equation holds.

If there are $$n$$ polynomials of degree $$d$$, and each polynomial is opened at most $$t$$ points, the computational costs are as follows:

* **The number of opening sets** $$s$$: $$1 \le s \le n$$
* $$\mathsf{srs}$$ **size**: $$(d+1) \mathbb{G}\_1, 2\mathbb{G}\_2$$
* **Prover work**: $$O( d )\mathbb{G}\_1, O(d \cdot n + s \cdot d \log d) \mathbb{F}$$
  * To compute all of  $$Q\_i(X)$$, it takes $$O(d \cdot n) \mathbb{F}$$ operations.
  * To compute each $$Z\_{T\setminus S\_{i}} \cdot Q\_i(X)$$, it takes $$O(d \log d)\mathbb{F}$$ operations.
  * To compute $$\frac{f(X)}{Z\_T(X)}$$ and $$\frac{L(X)}{X - z}$$, it takes $$O(d \log d) \mathbb{F}$$ operations.
  * To compute $$L(X)$$, it takes $$O(d \cdot n)$$ operations.
  * To compute a proof $$W, W'$$, it takes $$O(d) \mathbb{G}\_1$$ operations.
* **Proof length**: $$2\mathbb{G}\_1$$
* **Verifier work**: $$O(n ) \mathbb{G}\_1, O(1)\mathbb{G}\_2,2 \mathsf{P}$$
  * To compute all of $$\[f(z)]\_1$$, it takes $$O(n)\mathbb{G}\_1$$ operations.

Here, $$\mathbb{G}\_i$$ represents scalar multiplication in $$\mathbb{G}\_i$$​, $$\mathbb{F}$$ represents addition or multiplication in $$\mathbb{F}$$, and $$\mathsf{P}$$ denotes a pairing operation.

#### Optimization on Version 2.

When constructing $$W'$$, it can be "**normalized**", to save one of the scalar multiplications like below.

$$
W' = \left\[ \frac{L(s)}{Z\_{T \setminus S\_0}(z) \cdot (s- z)} \right]\_1
$$

Then, the verification logic is changed accordingly:

$$
e(\[\frac{f\_z(s)}{Z\_{T \setminus S\_0}(z)}]*1 - \frac{Z\_T(z)}{Z*{T \setminus S\_0}(z)} \cdot W, \[1]\_2) \stackrel{?}= e(W', \[s]\_2 - \[z]\_2)
$$

The verifier can obtain $$\frac{\[f\_z(s)]*1}{Z*{T \setminus S\_0}(z)}$$ as follows:

$$
\frac{\[f\_z(s)]*1}{Z*{T \setminus S\_0}(z)} = \beta \cdot \frac{Z\_{T \setminus S\_1}(z)}{Z\_{T \setminus S\_0}(z)} (\alpha(C\_2 - \[r\_1(z)]\_1) + C\_1 - \[r\_0(z)]\_1) +(C\_0 - \[y\_0]\_1)
$$

Notice that the rightmost part of the equation doesn't have scalar multiplication any more.

## Conclusion

By summarizing the key metrics in the following table, we can see that SHPlonk demonstrates superior efficiency compared to GWC19.

|                    | SRS size                                       | Prover work                                                         | Prover length      | Verifier work                                                         |
| ------------------ | ---------------------------------------------- | ------------------------------------------------------------------- | ------------------ | --------------------------------------------------------------------- |
| GWC19              | $$(d + 1) \mathbb{G}\_1, 2\mathbb{G}\_2$$      | $$O(d \cdot t)\mathbb{G}\_1, O(t \cdot d \log d) \mathbb{F}$$       | $$t\mathbb{G}\_1$$ | $$O(n \cdot t) \mathbb{G}\_1, 2 \mathsf{P}$$                          |
| SHPlonk version 1. | $$(d + 1) \mathbb{G}\_1,(t + 1)\mathbb{G}\_2$$ | $$O( d )\mathbb{G}\_1, O(d \cdot n + s \cdot d \log d) \mathbb{F}$$ | $$1\mathbb{G}\_1$$ | $$O(n ) \mathbb{G}\_1, O(s \cdot t) \mathbb{G}\_2, (s+1) \mathsf{P}$$ |
| SHPlonk version 2. | $$(d+1) \mathbb{G}\_1, 2\mathbb{G}\_2$$        | $$O( d )\mathbb{G}\_1, O(d \cdot n + s \cdot d \log d) \mathbb{F}$$ | $$2\mathbb{G}\_1$$ | $$O(n ) \mathbb{G}\_1, O(1)\mathbb{G}\_2, \2 \mathsf{P}$$             |

SHPlonk offers a more efficient proof system, balancing prover and verifier costs while keeping proof sizes minimal. Depending on the use case, Version 1 is preferable for minimizing proof length, whereas Version 2 provides a more scalable solution with constant $$\mathbb{G}\_2$$ parameters and a fixed number of pairing operations while removing scalar multiplications in $$\mathbb{G}\_2$$.

## References

* <https://eprint.iacr.org/2020/081.pdf>

> Written by [Ryan Kim](https://app.gitbook.com/u/cPk8gft4tSd0Obi6ARBfoQ16SqG2 "mention") of Fractalyze
