Sumcheck
Overview
The sumcheck protocol is a cryptographic proof protocol designed to efficiently prove and verify whether the sum of a multivariate polynomial over a specified domain equals a claimed value, given :
A multivariate polynomial g(X1,X2,…,Xv)
A finite domain Bv for any B⊆F
A claimed sum
or in short C1:=∑b∈Bvg(b). We denote the true correct sum by H.
Throughout this article, we assume that B={0,1} which is the case where the evaluation domain is a boolean hypercube. degj(g) denotes the degree of g in variable Xj. We denote the prover and the verifier by P and V.
The goal is for P to convince V that the claimed C1=H, without having V evaluate g on all {0,1}v elements (which would be computationally expensive).
Protocol
In the first round, P sends the univariate polynomial g1(X1) and claims it equals the following:
Note that g1(0)+g1(1)=C1 if P is honest.
V checks if g1(0)+g1(1)=C1. If not, V rejects.
V checks if deg(g1)≤deg1(g). If not, V rejects.
V chooses a random r1∈F and sends it to P. Both V and P set C2:=g1(r1).
Now in the jth round, for 1<j<v, P sends the univariate polynomial gj(Xj) and claims it equals the following:
Note that gj(0)+gj(1)=Cj if P is honest.
V checks if gj(0)+gj(1)=Cj. If not, V rejects.
V checks if deg(gj)≤degj(g). If not, V rejects.
V chooses a random rj∈F and sends it to P. Both V and P set Cj+1:=gj(rj).
This process continues iteratively, reducing the dimension at each step until reaching a single variable.
In round v, P sends gv(Xv) to V claiming it equals:
V checks if gv(0)+gv(1)=Cv. If not, V rejects.
V chooses a random rv∈F and evaluates g(r1,…,rv) with a single oracle query to g. V checks that gv(rv)=g(r1,…,rv). If not, V rejects.
Completeness
Let si(X) be the polynomial that P would have sent in round i if it were honest :
If C1 and g1(X) that P provided in the first round is equal to H and s1(X), respectively, then g1(0)+g1(1) is equal to C1. V checks if this holds, then randomly samples r1 and sends it to P. The only remaining part is merely a recursive invocation of another sumcheck protocol on s1(r1) where the claimed value is C2:=g1(r1). At the bottom of the recursion, which is the last round of the protocol, P passes the verification as gv(rv)=sv(rv)=g(r1,…,rv). Thus, the sumcheck protocol has perfect completeness.
Soundness
What is the soundness error, that is, what is the probability that P passes the protocol even though the first value given in the first step is wrong, i.e. C1=H?
If the following occurs at any round of the protocol, P can pass the verification even if the first claimed value is wrong:
For the challenge ri←$F sampled by V, luckily gi(ri)=si(ri) for the gi=si that P had sent in the previous round.
Once the scenario outlined above occurs, P can proceed onto the remaining rounds with a valid transcript to deceive V. The soundness error δs is bounded by vd/∣F∣, which can be calculated by the Schwartz-Zippel lemma. In other words, V can be convinced that gi(X)=si(X) if gi(ri)=si(ri) for a random sampled ri.
Cost

We denote the cost of evaluation oracle query to g by T. In reality, T is equal to the cost to evaluate g at a single input in Fv.
Communication cost.
At each round, P generates an univariate polynomial gi(X) whose degree is at most degi(g). To uniquely satisfy such polynomial, P needs to send degi(g) evaluations on g.
Prover cost.
The cost of computing such prescribed messages at each round is equal to evaluating gi(X) at degi(g) points. Letting the oracle query cost be T and assuming degi(g)=O(1), the prover's runtime cost at round i is O(2v⋅T).
Verifier cost.
Also assuming degi(g)=O(1), the total verifier cost is dominated by T; the cost of evaluating g only once.
Combining with Fiat-Shamir transform, we have successfully reduced a complicated interactive proof into a simple evaluation task for V. Unfortunately, in many real world problems, V does not have access to a constant cost oracle query to g, or even a single evaluation is prohibitively expensive.
The key part of sumcheck-based protocols is about how we will grant V oracle access. There should be either a way V can efficiently evaluate g, or we should run another protocol for the value g(r) (refer to polynomial commitment scheme).
Example
#SAT
For a n-variate boolean formula ϕ, whose size S is poly(n), calculate
The fastest algorithm ever known to solve #SAT is exponential, which means it's not much better than brute-forcing it.
In addition to that, there is no polynomial time deterministic & non-interactive algorithm V. This is because even if P provides every valid input it knows as a witness, there is no way V can check in polynomial time whether there are more valid inputs. #SAT∈/NP.
However there exists an interactive proof for this, which enables V to verify if the claimed value is correct within polynomial time. #SAT∈IP.
This implies that the interactive proof is more powerful than merely proving an NP statement. It is known that NP⊆PSPACE=IP, where PSPACE denotes the set of all problems that can be solved by Turing machines with polynomial space complexity.
Protocol
Transform ϕ into an arithmetic circuit ψ(x). AND gate can be extended to y⋅z, OR gate to y+z−y⋅z, NOT gate to 1−y. Now extend circuit ψ to a n-variate polynomial g, which means that ∑x∈{0,1}ng(x)=∑x∈{0,1}nϕ(x).
Degree of g is poly(S). Run the sumcheck protocol on g.
References
Written by Ryan Kim and Carson Lee of Fractalyze
Last updated