Breaking Tree Monocultures with Verkle Trees

Breaking Tree Monocultures with Verkle Trees

We are surrounded by trees. Not normal trees but Merkle trees. These forests are very ubiquitous and are produced by virtually all blockchain network. But we know from the biology lessons that monocultures are not great for the ecology. Therefore, from totally not related reasons, the Polymer Labs are moving towards planting a new type of a tree, a Verkle tree. You can read more about Polymer Labs and their IBC infrastructure in our previous post.

In this post we are going to explain what is so interesting in Merkle trees, and what the Verkle trees are bringing to the game.

Proving transaction

If we would like to check correctness (integrity) of transactions in a particular block a naive approach would guide us towards having a set of checksums (hashes) added to each of transaction (and ideally each cryptographically signed). This looks reasonable when we want to quickly verify a particular transaction, we can do it immediately. But it is at the same time inefficient due to storage requirements. Meaning, for each transaction it requires storing an additional signed checksum, which is an additional cost if we take into an account that such block needs to be propagated on the network. Just imagine a block consisting of 1 million of transactions grows by additional 1 million signed checksums — and each signed checksum might even be larger (in terms of bytes) than the transaction itself. Here you can see the inefficiency of the naive approach.

Merkle trees to the rescue

Merkle Tree example.

Hence the Merkle tree (or a hash tree), a construction that enables a more efficient solution for the above problem. By applying a Merkle tree on top of a set of transactions we can decrease the number of signed checksums to just a single one. Meaning that even if we have 1 million of transactions we need to store only a single checksum to verify the correctness of all of these transactions.

That is an efficient on storage solution. But it also comes at a price, computational price. In the naive approach we could verify correctness of a single transaction immediately just by looking at the attached checksum. With Merkle trees it requires more effort, we need to compute a Merkle tree on top of all transactions and compare the root checksum, if it matches with the checksum in the block then we know that that particular transaction (and all others) are correct.

Merkle Proof

Merkle Tree Proof example, where greens are given and blues are calculated.

Now let us think about how to prove to someone that our transaction is included in the block. We can simply point him to the block, show our transaction and ask to construct a Merkle tree and compare its root with the checksum included in the block. Sounds like a bit of work and data to sent.

Thankfully, due to the way the Merkle Tree is constructed, such inclusion proof can be done much more efficient. Instead of the aforementioned naive approach, we can present only a path on the tree to successfully convince the verifier that our transaction is indeed included in that block. By path we mean only a transaction and all neighboring hashes on the way to the root tree, as presented on the above picture. This optimization is logarithmic in its nature, which means that instead of constructing the Merkle tree in we can show a proof of O(log_2(n)) size where n is the number of leafs (transactions in our example).

Verkle Trees

Verkle Tree example, where Cn is a Vector Commitment Pn is a proof.

The Verkle trees are improving the proof size of the Merkle trees by a factor of O(log_2(k)). In other words, they can convince the verifier about the including of our transaction with even fewer data. They are achieving this thanks two tricks.

First, the Merkle tree is a binary tree, meaning that starting from the root of the tree, down to the leafs, each node in the tree has only 2 children. While the Verkle tree is k-ary tree, a tree that is branching k-times instead of 2-times like the Merkle tree.

Second, instead of using hash functions like in Merkle tree, the Verkle tree is using Vector Commitments. We are not going to go into the details about Vector Commitments in this article, but we will only mention that a Vector Commitment includes a proof that all of the neighbors of our leaf are also correct. Hence, the proof can be limited only to the strict path from the leaf (our transaction) to the root. Example below.

Verkle Tree Proof example, path is marked as green.

There are no free dinners, which means that there is a price of such improvement which is paid by multiplying the construction complexity by a branching factor of the Verkle Tree. Meaning, instead of O(n) construction time of Merkle Tree a Verkle Tree is constructed in O(kn) time, where k is the number of branches each node has.

The benefits of having shorter proofs are paramount and there are many interesting applications of Verkle Trees like in transaction inclusion proofing, or zero-knowledge constructions. Definitely, we will see more Verkle Trees planted around us, thanks to projects like Polymer Labs.

Till the next one!

Read more