Pippenger's Algorithm
Reduction of multi-scalar multiplication (MSM) into double and add operations
AKA "Bucket" method
Motivation
Improve upon the double-and-add approach by reducing the total number of point doubling and addition operations needed for MSM.
Algorithm Process
Introduction
Multi-Scalar Multiplication (MSM) refers to the efficient computation of expressions of the following form:

$$Q = \sum_{i=1}^{n} k_i P_i$$

$k_i$: the $i$-th $\lambda$-bit scalar
$P_i$: the $i$-th point on an elliptic curve

Pippenger's algorithm speeds up MSM by dividing each scalar $k_i$ into multiple $s$-bit chunks, and then grouping and processing the chunks by their corresponding bit positions (windows).
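For comparison, here is a minimal sketch of the double-and-add baseline that Pippenger's algorithm improves upon. The `Point` type and `naive_msm` name are illustrative assumptions: plain 64-bit integers stand in for curve points, so "point addition" is ordinary addition and "doubling" is multiplication by 2, but the control flow mirrors the real algorithm.

```cpp
#include <cstdint>
#include <vector>

// Stand-in for an elliptic-curve point: with plain integers, "point addition"
// is ordinary addition, so results can be checked with integer arithmetic.
using Point = std::uint64_t;

// Naive MSM baseline: one independent double-and-add per (scalar, point) pair.
Point naive_msm(const std::vector<std::uint64_t>& k, const std::vector<Point>& P) {
    Point Q = 0;                                  // identity element
    for (std::size_t i = 0; i < k.size(); ++i) {
        Point acc = 0;
        Point base = P[i];
        for (std::uint64_t e = k[i]; e != 0; e >>= 1) {
            if (e & 1) acc = acc + base;          // conditional point add
            base = base + base;                   // point double
        }
        Q = Q + acc;
    }
    return Q;
}
```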
1. Scalar Decomposition
First, we decompose each scalar $k_i$ into $s$-bit windows:

$$k_i = \sum_{j=1}^{\lceil \lambda / s \rceil} k_{i,j} \cdot 2^{s(j-1)}, \qquad k_{i,j} \in [0, 2^s - 1]$$

$s$: the number of bits per window
$\lceil \lambda / s \rceil$: the total number of windows

Through this definition, we can rewrite the original MSM summation as follows:

$$Q = \sum_{i=1}^{n} k_i P_i = \sum_{i=1}^{n} \sum_{j=1}^{\lceil \lambda / s \rceil} k_{i,j} \, 2^{s(j-1)} P_i = \sum_{j=1}^{\lceil \lambda / s \rceil} 2^{s(j-1)} \underbrace{\left( \sum_{i=1}^{n} k_{i,j} P_i \right)}_{W_j}$$

Note that $W_j = \sum_{i=1}^{n} k_{i,j} P_i$ is the sum for the $j$-th window.
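A minimal sketch of the decomposition, assuming scalars fit in a single 64-bit word (real implementations operate on multi-limb field elements); `window_digit` and `decompose` are illustrative helper names, not a specific library API.

```cpp
#include <cstdint>
#include <vector>

// Extract the j-th s-bit window digit k_{i,j} of a scalar, j = 1 .. ceil(lambda/s).
std::uint64_t window_digit(std::uint64_t k, unsigned s, unsigned j) {
    std::uint64_t mask = (s >= 64) ? ~0ULL : ((1ULL << s) - 1);
    return (k >> (s * (j - 1))) & mask;           // shift the window down, mask off s bits
}

// Decompose a scalar into its ceil(lambda/s) window digits, least significant first.
std::vector<std::uint64_t> decompose(std::uint64_t k, unsigned s, unsigned lambda) {
    unsigned num_windows = (lambda + s - 1) / s;  // ceil(lambda / s)
    std::vector<std::uint64_t> digits(num_windows);
    for (unsigned j = 1; j <= num_windows; ++j)
        digits[j - 1] = window_digit(k, s, j);
    return digits;
}
```

For instance, `decompose(12, 2, 4)` yields the digits `{0, 3}`, matching the worked example later in this page.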
2. Bucket Accumulation
Given our $W_j$, we denote the buckets $B_{\ell,j}$ as follows:

$$B_{\ell,j} = \sum_{\substack{1 \le i \le n \\ k_{i,j} = \ell}} P_i$$

Or in other words, $B_{\ell,j}$ refers to the $\ell$-th bucket in window $W_j$, consisting of the sum of all points $P_i$ that share the same window digit $k_{i,j} = \ell$.

Note that $k_{i,j}$ must be within the range $[0, 2^s - 1]$; however, if $k_{i,j}$ is $0$, then, as $k_{i,j}$ is the scalar multiplier, the bucket contributes nothing to $W_j$. Therefore, the practical range of $\ell$ is $[1, 2^s - 1]$.
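A minimal sketch of bucket accumulation for a single window $j$, under the same single-word-scalar and integer-point assumptions as above; `accumulate_buckets` is an illustrative name.

```cpp
#include <cstdint>
#include <vector>

using Point = std::uint64_t;   // integer stand-in: point addition == integer addition

// Accumulate the points of window j into buckets B[1 .. 2^s - 1]:
// bucket B[l] collects every P_i whose window digit k_{i,j} equals l.
// B[0] is never needed, since a digit of 0 contributes nothing to W_j.
std::vector<Point> accumulate_buckets(const std::vector<std::uint64_t>& k,
                                      const std::vector<Point>& P,
                                      unsigned s, unsigned j) {
    std::vector<Point> B(1ULL << s, 0);                      // B[0] stays unused
    for (std::size_t i = 0; i < k.size(); ++i) {
        std::uint64_t digit = (k[i] >> (s * (j - 1))) & ((1ULL << s) - 1);
        if (digit != 0) B[digit] = B[digit] + P[i];           // one point add per nonzero digit
    }
    return B;
}
```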
3. Bucket Reduction
Thus, in total, a window $W_j$ is re-structured as follows:

$$W_j = \sum_{i=1}^{n} k_{i,j} P_i = \sum_{\ell=1}^{2^s - 1} \ell \cdot B_{\ell,j}$$

Note that the terms $\ell \cdot B_{\ell,j}$ are computed through a running sum of successive partial sums, not through further scalar multiplications: starting from the top bucket $B_{2^s-1,j}$ and walking down to $B_{1,j}$, we keep a running sum $S \leftarrow S + B_{\ell,j}$ and accumulate it into a total $T \leftarrow T + S$ at each step, so that $T = \sum_{\ell=1}^{2^s - 1} \ell \cdot B_{\ell,j} = W_j$ after the final step.

Complexity Cost:
Original: $(2^s - 1) \times \text{ScalarMul} + (2^s - 2) \times \text{PointAdd}$
Bucketed: $(2^{s+1} - 2) \times \text{PointAdd}$
C++ Implementation
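A minimal sketch of the bucket reduction step, under the same integer stand-in for curve points; `reduce_buckets` is an illustrative name, not a specific library API.

```cpp
#include <cstdint>
#include <vector>

using Point = std::uint64_t;   // integer stand-in for a curve point

// Bucket reduction: compute W_j = sum over l of l * B[l] with at most
// 2 * (2^s - 1) point additions and no scalar multiplications.
// Walking from the top bucket down, running_sum = B[m] + B[m-1] + ... + B[l],
// and adding it once per step contributes B[l] exactly l times in total.
Point reduce_buckets(const std::vector<Point>& B) {
    Point running_sum = 0;
    Point window_sum  = 0;
    for (std::size_t l = B.size() - 1; l >= 1; --l) {
        running_sum = running_sum + B[l];        // point add
        window_sum  = window_sum + running_sum;  // point add
    }
    return window_sum;                           // this is W_j
}
```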
Example
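For instance, with $s = 2$ (so three buckets $B_{1,j}, B_{2,j}, B_{3,j}$ in some window $j$), the running sum $S$ and accumulator $T$ unfold as:

$$
\begin{aligned}
S &\leftarrow B_{3,j}, & T &\leftarrow S = B_{3,j},\\
S &\leftarrow S + B_{2,j}, & T &\leftarrow T + S = 2B_{3,j} + B_{2,j},\\
S &\leftarrow S + B_{1,j}, & T &\leftarrow T + S = 3B_{3,j} + 2B_{2,j} + B_{1,j} = W_j
\end{aligned}
$$

so $T$ ends at $1 \cdot B_{1,j} + 2 \cdot B_{2,j} + 3 \cdot B_{3,j} = W_j$ after at most $2^{s+1} - 2 = 6$ point additions (counting additions to the identity) and no scalar multiplications.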
4. Window Reduction - Final MSM Result
With the individual window computations finished, we return to the total MSM summation and add all windows together:

$$Q = \sum_{j=1}^{\lceil \lambda / s \rceil} 2^{s(j-1)} W_j$$

In practice, the computation is evaluated Horner-style from $j = \lceil \lambda / s \rceil$ down to $j = 1$, like so:

$$Q = \Big( \cdots \big( W_{\lceil \lambda / s \rceil} \cdot 2^s + W_{\lceil \lambda / s \rceil - 1} \big) \cdot 2^s + \cdots \Big) \cdot 2^s + W_1$$

so each step consists of $s$ point doublings followed by one point addition.

Complexity Cost
$\lceil \lambda / s \rceil \times (s \times \text{PointDouble} + \text{PointAdd})$
C++ Implementation
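A minimal sketch of the window reduction, again with integers standing in for curve points (so doubling is multiplication by 2); `reduce_windows` is an illustrative name. Each iteration performs $s$ doublings followed by one addition, matching the cost above; the doublings on the very first iteration act on the identity and are kept for simplicity.

```cpp
#include <cstdint>
#include <vector>

using Point = std::uint64_t;   // integer stand-in: doubling == multiplying by 2

// Combine the per-window results W[0] = W_1, ..., W[m-1] = W_{ceil(lambda/s)}
// Horner-style: start from the most significant window, then repeatedly apply
// s point doublings followed by one point addition of the next window.
Point reduce_windows(const std::vector<Point>& W, unsigned s) {
    Point Q = 0;
    for (std::size_t j = W.size(); j >= 1; --j) {
        for (unsigned d = 0; d < s; ++d)
            Q = Q + Q;                 // s point doublings: Q <- 2^s * Q
        Q = Q + W[j - 1];              // add window W_j
    }
    return Q;
}
```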
Complexity Analysis
Bucket accumulation for each $W_j$ requires at most $n \times \text{PointAdd}$, so over all windows: $\lceil \lambda / s \rceil \times (n \times \text{PointAdd})$
Bucket reduction for each $W_j$ needs $(2^{s+1} - 2) \times \text{PointAdd}$, so over all windows: $\lceil \lambda / s \rceil \times ((2^{s+1} - 2) \times \text{PointAdd})$
The final window reduction for $Q$ needs: $\lceil \lambda / s \rceil \times (s \times \text{PointDouble} + \text{PointAdd})$

Total complexity:

$$\lceil \lambda / s \rceil \times \big( (n + 2^{s+1} - 1) \times \text{PointAdd} + s \times \text{PointDouble} \big)$$
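As a sanity check (assuming the table below counts point additions only, since the roughly $\lambda$ doublings are negligible), evaluating the total for $s = 8$, $n = 2^{20}$, $\lambda = 256$ gives:

$$\left\lceil \frac{256}{8} \right\rceil \cdot \left( 2^{20} + 2^{9} - 1 \right) = 32 \cdot 1{,}049{,}087 = 33{,}570{,}784$$

which is the corresponding entry in the table below.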
Optimization Table (when $\lambda = 256$, $n = 2^{20}$)

| $s$ | Total point additions |
| --- | --- |
| 2 | 134,218,624 |
| 4 | 67,110,848 |
| 8 | 33,570,784 |
| 16 | 18,874,352 |
| 32 | 68,727,865,336 |
| 64 | 147,573,952,589,680,607,228 |
| 128 | 1,361,129,467,683,753,853,853,498,429,727,074,942,974 |
🔍 This clearly shows that selecting an optimal s value is critical for performance.
Example
Let's try to run Pippenger's algorithm on the following simple example:

$$Q = 12 P_1 + 9 P_2 + 13 P_3 \qquad (k_1 = 12,\; k_2 = 9,\; k_3 = 13)$$

1. Scalar Decomposition

Let's define our bits per window $s$ to be $2$. In this case, since the maximum number of bits needed to represent the values $12$, $9$, and $13$ is $4$, we take $\lambda = 4$ and get $\lceil \lambda / s \rceil = 4 / 2 = 2$ windows.

Note that in binary we have:

$$12 = 1100_2, \qquad 9 = 1001_2, \qquad 13 = 1101_2$$

Decomposing our $k_1$, $k_2$, and $k_3$ results in the following window digits:

$$k_1 = 12 = \underbrace{11}_{k_{1,2} = 3}\,\underbrace{00}_{k_{1,1} = 0}, \qquad k_2 = 9 = \underbrace{10}_{k_{2,2} = 2}\,\underbrace{01}_{k_{2,1} = 1}, \qquad k_3 = 13 = \underbrace{11}_{k_{3,2} = 3}\,\underbrace{01}_{k_{3,1} = 1}$$
2+3. Bucket Accumulation + Reduction
Given our $s$ of $2$, we now have buckets $[B_{1,j}, B_{2,j}, B_{3,j}]$ for a given window $W_j$. Our windows are now re-structured into buckets like so:

$$W_j = 1 \cdot B_{1,j} + 2 \cdot B_{2,j} + 3 \cdot B_{3,j}$$

From the bottom up, in our bucket accumulation stage we compute the non-empty buckets:

$$B_{1,1} = P_2 + P_3, \qquad B_{2,2} = P_2, \qquad B_{3,2} = P_1 + P_3$$

(all other buckets are empty)
...and in our bucket reduction stage we compute the windows:

$$W_1 = 1 \cdot B_{1,1} = P_2 + P_3, \qquad W_2 = 2 \cdot B_{2,2} + 3 \cdot B_{3,2} = 2P_2 + 3(P_1 + P_3)$$
4. Window Reduction - Final MSM Result
Putting all the windows together for the MSM sum, we get the following:

$$Q = W_1 + 2^2 \cdot W_2 = (P_2 + P_3) + 4 \cdot \big( 2P_2 + 3(P_1 + P_3) \big) = 12P_1 + 9P_2 + 13P_3$$

as expected.
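Putting the pieces together, here is a self-contained sketch that runs the whole pipeline on this example, with arbitrary integers standing in for $P_1, P_2, P_3$ so the result can be checked against plain integer arithmetic; `pippenger_msm` is an illustrative name, not a specific library API, and real implementations would use multi-limb scalars and actual curve arithmetic.

```cpp
#include <cstdint>
#include <iostream>
#include <vector>

// Integer stand-in for curve points: "point addition" is ordinary addition,
// so the MSM result can be verified by direct multiplication.
using Point = std::uint64_t;

Point pippenger_msm(const std::vector<std::uint64_t>& k,
                    const std::vector<Point>& P,
                    unsigned s, unsigned lambda) {
    unsigned num_windows = (lambda + s - 1) / s;          // ceil(lambda / s)
    std::uint64_t mask = (1ULL << s) - 1;

    std::vector<Point> W(num_windows, 0);
    for (unsigned j = 1; j <= num_windows; ++j) {
        // Bucket accumulation for window j.
        std::vector<Point> B(1ULL << s, 0);
        for (std::size_t i = 0; i < k.size(); ++i) {
            std::uint64_t digit = (k[i] >> (s * (j - 1))) & mask;
            if (digit != 0) B[digit] = B[digit] + P[i];
        }
        // Bucket reduction via running sum: W_j = sum_l l * B[l].
        Point running_sum = 0, window_sum = 0;
        for (std::size_t l = B.size() - 1; l >= 1; --l) {
            running_sum = running_sum + B[l];
            window_sum  = window_sum + running_sum;
        }
        W[j - 1] = window_sum;
    }

    // Window reduction: Q = sum_j 2^{s(j-1)} * W_j, evaluated Horner-style.
    Point Q = 0;
    for (std::size_t j = num_windows; j >= 1; --j) {
        for (unsigned d = 0; d < s; ++d) Q = Q + Q;       // s doublings
        Q = Q + W[j - 1];
    }
    return Q;
}

int main() {
    // The worked example: k1 = 12, k2 = 9, k3 = 13, with arbitrary integer
    // "points" standing in for P1, P2, P3.
    std::vector<std::uint64_t> k = {12, 9, 13};
    std::vector<Point> P = {1000003, 2000003, 3000017};

    Point q = pippenger_msm(k, P, /*s=*/2, /*lambda=*/4);
    Point expected = 12 * P[0] + 9 * P[1] + 13 * P[2];

    std::cout << "Pippenger: " << q << "\nExpected:  " << expected << "\n";
    return q == expected ? 0 : 1;
}
```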
Written by Ryan Kim & Ashley Jeong of Fractalyze