Verkle Trees

Verkle trees represent a breakthrough in cryptographic data structures, solving a fundamental limitation that has plagued blockchain and verification systems for years: proof size scalability.

Scaling benefits

The "V" in verkle stands for "vector," which hints at the deeper innovation: instead of proving a path through the tree (like in Merkle trees), you're proving an index within a vector commitment.

This shift is profound—you only need the specific data you're trying to prove, nothing else. No sibling hashes, no complementary paths, just your leaf and a witness.

Unlike traditional Merkle trees, where inclusion proofs grow logarithmically with the dataset size (requiring ~log(n) sibling hashes), Verkle trees achieve constant-sized proofs. Whether you're proving inclusion in a database of 3,000 entries or 500 trillion, verification always takes the same time and space.

This isn't just a performance improvement; it's a privacy win too, since you never reveal information about other entries in the dataset during verification.
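To put rough numbers on the proof-size comparison, here is a minimal sketch. It assumes a binary Merkle tree with 32-byte sibling hashes and a constant 48-byte witness (the single-proof size quoted later on this page). The function name and constants are illustrative, not part of any MetaPoll API.

```python
HASH_BYTES = 32            # assumption: one 32-byte hash per sibling
CONSTANT_PROOF_BYTES = 48  # assumption: one compressed group element as witness


def merkle_proof_bytes(n: int) -> int:
    """Bytes of sibling hashes for one leaf in a binary Merkle tree of n leaves."""
    if n < 1:
        raise ValueError("n must be positive")
    depth = (n - 1).bit_length()  # equals ceil(log2(n)) for n >= 1
    return depth * HASH_BYTES


if __name__ == "__main__":
    for n in (3_000, 500_000_000_000_000):
        print(f"n={n}: Merkle {merkle_proof_bytes(n)} bytes, "
              f"constant-size witness {CONSTANT_PROOF_BYTES} bytes")
```

Under these assumptions the Merkle proof grows from 12 hashes to 49 hashes between the two dataset sizes, while the witness size does not change.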

The K factor

Verkle trees have another fascinating design parameter that traditional Merkle trees lack: the branching factor K, which determines how many children each internal node can have.

This seemingly simple choice creates a fundamental tradeoff that's worth understanding deeply. When you increase K, you're making the tree "wider" and "shallower"—construction becomes faster since you're doing fewer levels of polynomial interpolation, but proving becomes slower because each proof must handle larger polynomial degrees.

Conversely, decreasing K makes the tree "narrower" and "deeper," speeding up individual proofs at the cost of longer construction time.

This flexibility is genuinely powerful: you can tune K based on your application's specific needs.

For a voting system with infrequent elections but many verification requests, you'd optimize for faster proving; for a high-frequency trading system, you might prioritize construction speed. This polynomial foundation also enables powerful batching—you can prove multiple leaves simultaneously with a multi-proof that's far smaller than individual proofs combined.
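The sketch below illustrates the "wider and shallower" effect of K. It does not measure construction or proving time. It only computes the two quantities the tradeoff depends on: the number of levels, and the degree of the polynomial committed at each node (assumed here to be K - 1, one evaluation per child). The function names are hypothetical.

```python
def tree_depth(n_leaves: int, k: int) -> int:
    """Smallest depth d such that k**d >= n_leaves."""
    if n_leaves < 1 or k < 2:
        raise ValueError("need n_leaves >= 1 and k >= 2")
    depth, capacity = 0, 1
    while capacity < n_leaves:
        capacity *= k
        depth += 1
    return depth


def tree_shape(n_leaves: int, k: int) -> dict:
    """Depth and per-node polynomial degree for branching factor k."""
    return {"k": k, "depth": tree_depth(n_leaves, k), "poly_degree": k - 1}


if __name__ == "__main__":
    for k in (2, 16, 256):
        print(tree_shape(1_000_000, k))
```

For one million leaves, moving from K = 2 to K = 256 cuts the depth from 20 levels to 3, while the per-node polynomial degree rises from 1 to 255.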

MetaPoll application

For MetaPoll's VDIP system, this means voters can verify their ballots were included without downloading massive election databases, and election officials can provide cryptographic transparency without overwhelming bandwidth or storage requirements.

The result is a verification system that scales from local school board elections to national referendums with identical efficiency guarantees.

Additional Resources:

Getting into the technical math of Verkle Trees

Instead of storing raw data, we commit to a polynomial $P(x)$ where $P(i)$ equals the vote at position $i$:

P(x) = \sum_{i=0}^{n-1} v_i \cdot L_i(x)

Where:

$P(x)$ = polynomial encoding all votes
$v_i$ = vote value at position $i$
$L_i(x)$ = Lagrange basis polynomial for position $i$
$n$ = total number of voters
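As a small illustration of the interpolation formula above, the following sketch evaluates $P(x) = \sum_i v_i \cdot L_i(x)$ over a toy prime field and confirms that $P(i) = v_i$. The modulus and the function name are assumptions chosen for readability. A real deployment would use the scalar field of the pairing curve.

```python
P_MOD = 2**31 - 1  # assumption: small prime field, for illustration only


def lagrange_eval(votes: list[int], x: int, p: int = P_MOD) -> int:
    """Evaluate P(x) = sum_i v_i * L_i(x) over GF(p), with positions 0..n-1."""
    n = len(votes)
    total = 0
    for i, v in enumerate(votes):
        num, den = 1, 1
        for j in range(n):
            if j != i:
                num = num * (x - j) % p   # numerator of L_i(x)
                den = den * (i - j) % p   # denominator of L_i(x)
        total = (total + v * num * pow(den, -1, p)) % p
    return total


if __name__ == "__main__":
    votes = [1, 0, 1, 1]
    print([lagrange_eval(votes, i) for i in range(len(votes))])
```

Evaluating at each position returns the original vote, which is the property the commitment relies on.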

Polynomial Commitment:

Using elliptic curve pairings, we can create a commitment $C$ for some secret $s$, where the brackets denote elliptic curve scalar multiplication:

C = [P(s)]_1

Where:

$C$ = polynomial commitment
$s$ = secret evaluation point (from trusted setup)
$[·]_1$ = elliptic curve scalar multiplication in group $G_1$

Quotient Polynomial for Inclusion Proof:

When you want to prove your vote at position $i$ was included, the system generates a quotient polynomial:

Q(x) = \frac{P(x) - v_i}{x - i}

Where:

$Q(x)$ = quotient polynomial proving vote $v_i$ at position $i$
$i$ = voter's position in the commitment
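The division above is exact only when the claimed vote really is $P(i)$. A minimal sketch of that check uses synthetic division over a toy prime field, with $P(x)$ given in coefficient form (lowest degree first). The modulus and helper names are assumptions for illustration.

```python
P_MOD = 2**31 - 1  # assumption: toy prime field


def divide_by_linear(coeffs: list[int], i: int, p: int = P_MOD):
    """Divide sum(coeffs[k] * x**k) by (x - i) over GF(p).

    Returns (quotient_coeffs, remainder); the remainder equals the
    dividend evaluated at x = i.
    """
    quotient = [0] * (len(coeffs) - 1)
    carry = 0
    for k in range(len(coeffs) - 1, 0, -1):
        carry = (coeffs[k] + carry * i) % p
        quotient[k - 1] = carry
    remainder = (coeffs[0] + carry * i) % p
    return quotient, remainder


def quotient_for_claim(coeffs: list[int], i: int, v: int, p: int = P_MOD):
    """Compute Q(x) = (P(x) - v) / (x - i) and the leftover remainder."""
    shifted = [(coeffs[0] - v) % p] + list(coeffs[1:])
    return divide_by_linear(shifted, i, p)


if __name__ == "__main__":
    poly = [3, 2, 1]                          # P(x) = 3 + 2x + x^2, so P(2) = 11
    print(quotient_for_claim(poly, 2, 11))    # honest claim: remainder is 0
    print(quotient_for_claim(poly, 2, 10))    # false claim: nonzero remainder
```

A nonzero remainder means no valid quotient polynomial exists for the claimed value, so no valid witness can be formed from it.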

Witness Generation:

The quotient polynomial is then committed to in the same way as $P(x)$, providing a witness $W$:

W = [Q(s)]_1

Where:

$W$ = cryptographic witness (the actual proof)

Pairing-Based Verification:

Verification requires just a single pairing check:

e([P(s) - v_i]_1, [1]_2) \stackrel{?}{=} e(W, [s - i]_2)

Where:

$e(⋅,⋅)$ = bilinear pairing function
$[1]_2$ = generator of elliptic curve group $G_2$
$\stackrel{?}{=}$ = verification check (must be equal)
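To see why the pairing check works, it helps to expand both sides using bilinearity. Writing $g_1$ and $g_2$ for the generators of $G_1$ and $G_2$ (notation introduced here for the derivation only):

```latex
\begin{aligned}
e([P(s) - v_i]_1, [1]_2) &= e(g_1, g_2)^{P(s) - v_i} \\
e(W, [s - i]_2) &= e([Q(s)]_1, [s - i]_2) = e(g_1, g_2)^{Q(s)\,(s - i)}
\end{aligned}
```

The two sides agree exactly when $P(s) - v_i = Q(s)\,(s - i)$, which is the definition of the quotient polynomial evaluated at $x = s$.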

Multi proofs

However, each of these proofs maps to just one voter. With 100 million voters, publishing 100 million separate proofs would be messy and impractical. Enter the multi-proof, a way to combine all the proofs into a single succinct proof. It's kind of like having a folder where all the proofs are stored together, so anyone can look up their own proof quickly and easily.

It's more complicated than the single proof, but let's explore the math of how this works.

Multi-Proof for Multiple Votes:

For proving multiple votes at positions $I = \{i_1, i_2, \ldots, i_n\}$, we need:

Z_I(x) = \prod_{j \in I} (x - j)

Where:

$Z_I(x)$ = vanishing polynomial for all queried positions
$I$ = set of voter positions being proved
$n$ = number of votes being proved simultaneously

Multi-Proof Quotient Polynomial:

Q(x) = \frac{P(x) - R(x)}{Z_I(x)}

Where $R(x)$ is the remainder polynomial:

R(x) = \sum_{j \in I} v_j \cdot \frac{Z_I(x)}{x - j} \cdot \left(\frac{Z_I(x)}{x - j}\right)^{-1} \bigg|_{x=j}

Simplified as:

R(x) = \sum_{j \in I} v_j \cdot L_j^I(x)

Where:

$R(x)$ = interpolation polynomial through the points $(j, v_j)$ for $j \in I$
$L_j^I(x)$ = Lagrange basis polynomial for position $j$ within set $I$
$v_j$ = vote value at position $j$
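The relationship between $Z_I(x)$, $R(x)$ and $Q(x)$ can be checked with plain polynomial arithmetic. The sketch below works over a toy prime field with coefficients stored lowest degree first. It builds $Z_I$, interpolates $R$, and confirms that $P(x) - R(x)$ divides exactly by $Z_I(x)$. All helper names and the modulus are assumptions for illustration, and no elliptic curve operations are involved.

```python
P_MOD = 2**31 - 1  # assumption: toy prime field


def poly_mul(a, b, p=P_MOD):
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] = (out[i + j] + x * y) % p
    return out


def poly_eval(a, x, p=P_MOD):
    acc = 0
    for c in reversed(a):          # Horner's rule
        acc = (acc * x + c) % p
    return acc


def poly_divmod(a, b, p=P_MOD):
    """Long division of a by b over GF(p); returns (quotient, remainder)."""
    a = [c % p for c in a]
    quot = [0] * max(len(a) - len(b) + 1, 1)
    inv_lead = pow(b[-1], -1, p)
    for k in range(len(a) - len(b), -1, -1):
        factor = a[k + len(b) - 1] * inv_lead % p
        quot[k] = factor
        for j, c in enumerate(b):
            a[k + j] = (a[k + j] - factor * c) % p
    return quot, a[: len(b) - 1]


def vanishing(positions, p=P_MOD):
    """Z_I(x) = product over j in I of (x - j)."""
    z = [1]
    for j in positions:
        z = poly_mul(z, [-j % p, 1], p)
    return z


def interpolate(positions, values, p=P_MOD):
    """R(x) = sum over j in I of v_j * L_j^I(x), in coefficient form."""
    z = vanishing(positions, p)
    r = [0] * len(positions)
    for j, v in zip(positions, values):
        basis, _ = poly_divmod(z, [-j % p, 1], p)       # Z_I(x) / (x - j)
        scale = v * pow(poly_eval(basis, j, p), -1, p) % p
        for k, c in enumerate(basis):
            r[k] = (r[k] + scale * c) % p
    return r


def multi_quotient(poly, positions, p=P_MOD):
    """Return (Q, remainder, R) for Q(x) = (P(x) - R(x)) / Z_I(x)."""
    values = [poly_eval(poly, j, p) for j in positions]
    r = interpolate(positions, values, p)
    size = max(len(poly), len(r))
    diff = [((poly[k] if k < len(poly) else 0)
             - (r[k] if k < len(r) else 0)) % p for k in range(size)]
    q, rem = poly_divmod(diff, vanishing(positions, p), p)
    return q, rem, r


if __name__ == "__main__":
    poly = [5, 0, 3, 1, 2]               # P(x) = 5 + 3x^2 + x^3 + 2x^4
    q, rem, r = multi_quotient(poly, [1, 3])
    print("Q =", q, "remainder =", rem)
```

In the example, $P(1) = 11$ and $P(3) = 221$, so $R(x) = 105x - 94$ and the division by $Z_I(x) = (x - 1)(x - 3)$ leaves a zero remainder.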

Multi-Proof Witness:

W = [Q(s)]_1

Multi-Proof Verification:

e([P(s)]_1 - [R(s)]_1, [1]_2) \stackrel{?}{=} e(W, [Z_I(s)]_2)

Batch Verification Optimization:

\sum_{t=1}^{m} \gamma^t \cdot e([P(s)]_1 - [R_t(s)]_1, [1]_2) \stackrel{?}{=} \sum_{t=1}^{m} \gamma^t \cdot e(W_t, [Z_{I_t}(s)]_2)

Where:

$γ$ = random challenge from Fiat-Shamir transform
$t$ = index of each multi-proof being batch verified
$I_t$ = set of positions for proof $t$
$W_t$ = witness for proof $t$

Verkle Multi-Proof Efficiency Gains:

Proof Size Efficiency:

Single proof size: $|W| = 48$ bytes

Multi-proof for $n$ votes: $|W| + |R(s)| = 48 + 32n$ bytes

Traditional approach: $n \times 48 = 48n$ bytes

The percentage efficiency gain depends on how many votes you're proving simultaneously:

n=2 votes: -16.67% (worse than individual proofs)
n=5 votes: 13.33% savings
n=10 votes: 23.33% savings
n=25 votes: 29.33% savings
n=100 votes: 32.33% savings
n=1,000 votes: 33.23% savings
n=10,000 votes: 33.32% savings
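The percentages above follow directly from the byte counts stated earlier (48 bytes per witness, 32 bytes per proved value). A short sketch that reproduces them, using a hypothetical function name:

```python
WITNESS_BYTES = 48  # size of one witness, as stated above
VALUE_BYTES = 32    # per-vote term in the multi-proof size, as stated above


def savings_percent(n: int) -> float:
    """Percent saved by one multi-proof versus n individual proofs."""
    multi = WITNESS_BYTES + VALUE_BYTES * n
    individual = WITNESS_BYTES * n
    return 100 * (1 - multi / individual)


if __name__ == "__main__":
    for n in (2, 5, 10, 25, 100, 1_000, 10_000):
        print(f"n={n}: {savings_percent(n):.2f}%")
```

With these sizes the savings simplify to $1/3 - 1/n$, so they are zero at $n = 3$ and approach one third as $n$ grows.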

Key Insights:

Break-even point: Multi-proofs match individual proofs at $n = 3$ votes and become beneficial at $n ≥ 4$
Practical sweet spot: Around 25-100 votes gives ~30% savings
Theoretical maximum: Approaches 33.33% savings as $n$ approaches infinity
Diminishing returns: Most gains achieved by n=100; further increases yield minimal improvement


Last updated 8 months ago
