Convert rvgo's memory implementation to radix trie #83

Merged · 15 commits · Oct 18, 2024

Conversation

@mininny (Collaborator) commented Oct 3, 2024

Background

Currently, Asterisc's memory layout is constructed as a merkle tree represented by a hashmap, where each key is a generalized index and each value is the merkle root of the corresponding sub-tree or node.

This design is identical to Cannon's and leaves some room for improvement. To make Asterisc more memory efficient, we propose modifying the memory layout as follows.

Proposed Change

  1. Modify Asterisc to use a radix trie instead of a hashmap.
    Currently, the memory of Asterisc is laid out as follows:

nodes map[uint64]*[32]byte

We can modify it to something like:

type RadixNode struct {
	Children   [1 << 12]*RadixNode
	Hashes     [1 << 12][32]byte
	HashExists [(1 << 12) / 64]uint64
	HashValid  [(1 << 12) / 64]uint64
}
  2. Tune the radix trie's branching factor.
    In order to better accommodate the typical memory layout, we can use a different branching factor at each level of the trie.

Because upper memory levels are sparser and pages tend to be adjacent, we can use a lower branching factor at upper trie levels and a higher branching factor at lower trie levels. While this is subject to benchmarking and testing, something like the following would be possible:

level 1: 10 bits
level 2: 10 bits
level 3: 10 bits
level 4: 10 bits
level 5: 12 bits

Page offset: 12 bits

The tradeoff will also have to take into account the complexity of managing different branching factors.

  3. Implement merkleization of the trie.
  4. Leverage the radix trie for better cache invalidation.
    Currently, we invalidate a node by traversing from the leaf node to the root node and nullifying each node along the way.
    With a radix trie of fixed branching factors, we can directly access each level and nullify it. Because there are fewer levels to traverse, this is also more efficient than a binary tree.

Expected Result

With the proposed changes, we can see the following improvements to Asterisc.

  1. More implementation diversity between Asterisc and Cannon. Currently, Cannon and Asterisc use nearly the same code for their Go VM memory. Introducing a different memory management scheme to Asterisc gives us a more thoroughly diversified fault proof system.
  2. Increased memory usage as a tradeoff: compared to a binary tree, a radix trie will hold a larger number of intermediate nodes.
  3. Improved performance, which can be benchmarked:
    1. Avoiding sparse objects. Because the radix trie has multiple levels, sparse memory regions are simply not allocated.
    2. Every access to a Go hashmap hashes the key. In a radix trie, we can reach each child node directly through its pointer.

Impact on binary

  • This change is a breaking change to the Go implementation of the RISC-V VM.
  • This change has no impact on the proof system.
  • This change has no impact on the Solidity implementation of the RISC-V VM.

Detailed implementation

Consider a memory where

  • PageAddrSize = 12
  • PageKeySize = 52
  • BranchFactor = [10,10,10,10,12] // branching factor at each level of the radix trie

For an address 0x1234567890123ABC, we use 0x1234567890123 as the page key and 0xABC as the page address. Since we’re using radix tries to represent each page key, we can lay out the radix trie as follows:

0x1234567890123 = 0b0001001000110100010101100111100010010000000100100011

root
  |
0x48 (binary 0001 0010 00)
  |
0x345 (binary 1101 0001 01)
  |
0x19E (binary 0110 0111 10)
  |
0x90 (binary 0010 0100 00)
  | 
0x123 (binary 0001 0010 0011)
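This split of a page key into per-level child indices can be sketched as follows (a minimal sketch: the level widths [10,10,10,10,12] are the proposed values, and the function name branchPath is hypothetical):

```go
package main

import "fmt"

// branchFactors are the proposed per-level bit widths; they sum to the
// 52-bit page key (PageKeySize).
var branchFactors = [5]uint64{10, 10, 10, 10, 12}

// branchPath splits a 52-bit page key into one child index per trie level,
// most-significant bits first. (Hypothetical helper for illustration.)
func branchPath(pageKey uint64) [5]uint64 {
	var path [5]uint64
	remaining := uint64(52)
	for i, bf := range branchFactors {
		remaining -= bf
		path[i] = (pageKey >> remaining) & ((1 << bf) - 1)
	}
	return path
}

func main() {
	// Page key from the worked example above.
	fmt.Printf("%#x\n", branchPath(0x1234567890123))
}
```

Running this reproduces the child indices shown in the diagram: 0x48, 0x345, 0x19e, 0x90, 0x123.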

Each node can have up to 2^branchFactor children. All of the children are merkleized together to form the root hash of the node.

For example, radix node 0x123 can have up to 2^12 = 4096 children. From those 4096 items we must build a binary merkle tree, including its intermediate nodes: 4096 leaves plus 4095 intermediate nodes, for a total of 8191 merkle hashes in one radix node.

These intermediate merkle hashes will be used in proof generation, and thus need to be cached. The following is the definition of a radix node.

To determine whether the hash at a specific generalized index (gindex) has been generated, we keep HashExists, a list of bits in which a bit is set when a value exists for that node. HashValid is a similar list of bits in which a bit is set when a valid hash has been computed at that position, and cleared when the node is invalidated.

type RadixNode struct {
	Children   [1 << 12]*RadixNode // 2 ^ branchFactor
	Hashes     [1 << 12][32]byte
	HashExists [(1 << 12) / 64]uint64
	HashValid  [(1 << 12) / 64]uint64
}

AllocPage

func (m *Memory) AllocPage(pageIndex uint64) *CachedPage {
	p := &CachedPage{Data: new(Page)}
	m.pages[pageIndex] = p

	// allocate radix nodes related to this page
	branchPaths := m.gindexToBranchPath(pageIndex)
	current := m.radix

	for _, branch := range branchPaths {
		if current.Children[branch] == nil {
			current.Children[branch] = NewRadixNode()
		}
		current = current.Children[branch]
	}

	return p
}

When a page is allocated, we can create a new radix node for every path from the root node to the leaf radix node.

Any other radix branch that is not initialized will be nil, and we can replace any nil branch with a pre-computed zeroHash.
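A minimal sketch of how such a zeroHashes table could be precomputed (assuming a sha256-based HashPair purely for illustration; the VM's actual hash function may differ):

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

// HashPair combines two child hashes into their parent hash.
// (sha256 is an assumption for this sketch.)
func HashPair(a, b [32]byte) [32]byte {
	return sha256.Sum256(append(a[:], b[:]...))
}

// zeroHashes[i] is the root of a fully-zero subtree of height i, so any
// nil radix branch can be answered without ever allocating it.
var zeroHashes [64][32]byte

func init() {
	for i := 1; i < 64; i++ {
		zeroHashes[i] = HashPair(zeroHashes[i-1], zeroHashes[i-1])
	}
}

func main() {
	fmt.Printf("%x\n", zeroHashes[3][:4]) // first bytes of the height-3 zero root
}
```

Because the table is computed once at startup, answering a query for an unallocated branch is a single array read.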

Optimizing hash caches

Previously, we used nodes map[uint64]*[32]byte, a hashmap with a pointer value type, which let us encode three states:

  a) address not set = map key absent
  b) address set, but hash not calculated = map value is nil
  c) address set and hash calculated = map value is non-nil
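As a small illustration of the three map states above:

```go
package main

import "fmt"

func main() {
	// The previous design: one map encodes three states per gindex.
	nodes := map[uint64]*[32]byte{}

	nodes[5] = nil // address set, hash not yet calculated
	h := [32]byte{0x01}
	nodes[9] = &h // address set, hash calculated

	v, ok := nodes[5]
	fmt.Println(ok, v == nil) // true true: set, but no hash yet
	_, ok = nodes[7]
	fmt.Println(ok) // false: address not set at all
}
```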

However, with a plain [32]byte cache array, we need a separate structure to remember which hashes are set and which are valid.

Here, we have

	HashExists [(1 << 12) / 64]uint64
	HashValid  [(1 << 12) / 64]uint64

where

  • HashExists covers case b) above. When the address is set through AllocPage, the corresponding bit is turned on.
  • HashValid covers case c) above. When the hash is calculated, the corresponding bit is turned on.

In order to save space, these fields have type [(1 << 12) / 64]uint64: the (1 << 12) per-node bits are packed into 64 uint64 words, with each node identified by a unique bit.

Since 1 << 12 divides evenly by 64, we can use the following calculations:

hashIndex := gindex >> 6 // index of the uint64 word
hashBit := gindex & 63   // bit position within that word
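These two calculations amount to standard bitset indexing; a self-contained sketch:

```go
package main

import "fmt"

// bitset packs one flag per gindex into uint64 words, just as
// HashExists/HashValid do: (1 << 12) bits in (1 << 12)/64 = 64 words.
type bitset [(1 << 12) / 64]uint64

func (b *bitset) set(gindex uint64)   { b[gindex>>6] |= 1 << (gindex & 63) }
func (b *bitset) clear(gindex uint64) { b[gindex>>6] &^= 1 << (gindex & 63) }
func (b *bitset) get(gindex uint64) bool {
	return b[gindex>>6]&(1<<(gindex&63)) != 0
}

func main() {
	var exists bitset
	exists.set(0x123)
	fmt.Println(exists.get(0x123), exists.get(0x124)) // true false
}
```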

MerkleizeNode

The overall method of merkleizing the node doesn’t change: we merkleize the tree with respect to a specific gindex.

We start from gindex=1 (the top of the tree) and descend toward the bottom, doubling the gindex at each step. This allows us to traverse the trie only once vertically.

Our merkle trie is composed of 5 separate levels of radix nodes.

Each level may have a different branching factor and therefore a different number of children/hashes. This results in a statically determined node type per level, and each level gets its own merkleizeNode function.

const (
	// Branching factor at each level
	BF1 = 10
	BF2 = 10
	BF3 = 10
	BF4 = 10
	BF5 = 12
)

func (m *Memory) MerkleizeNodeLevel1(node *RadixNode, addr, gindex uint64) [32]byte {
	depth := uint64(bits.Len64(gindex))

	// The node is inside this trie level
	if depth <= BF1 {
		hashIndex := gindex >> 6
		hashBit := gindex & 63

		if (node.HashExists[hashIndex] & (1 << hashBit)) != 0 {
			if (node.HashValid[hashIndex] & (1 << hashBit)) != 0 {
				return node.Hashes[gindex] // Valid hash is set
			} else {
				 // Within the radix node, traverse through the node via gindex to create the hash
				left := m.MerkleizeNodeLevel1(node, addr, gindex<<1)
				right := m.MerkleizeNodeLevel1(node, addr, (gindex<<1)|1)

				r := HashPair(left, right)
				node.Hashes[gindex] = r
				node.HashValid[hashIndex] |= 1 << hashBit
				return r
			}
		} else {
		  // The address is not used, so we can use pre-calculated zeroHash
			return zeroHashes[64-5+1-depth]
		}
	}

	// We are at the bottom of this radix level. The children of the bottom nodes are the radix nodes at the next level, or pages.
	// At this point, we need the merkle root of each child, or the merkle root of the page.
	childIndex := gindex - 1<<BF1
	if node.Children[childIndex] == nil { // Branch not allocated is nil
		return zeroHashes[64-5+1-depth]
	}
	
	// When traversing through the radix trie, 
	// the children index are added up to create the address
	addr <<= BF1
	addr |= childIndex
	
	return m.MerkleizeNodeLevel2(node.Children[childIndex], addr, 1)
}

We can traverse down the radix nodes by creating a statically deterministic branch path from that address.

  • If the radix node does not exist, we can return zeroHashes for that given gindex.
  • If the final radix node exists, we can look for the hashIndex which corresponds to the intermediate merkle hash in the binary tree.
    • If it exists, return the hash
    • If not, traverse further via gindex, create the merkle hashes, and save them to node.Hashes.

At the final radix level, the leaf nodes represent the pages.

func (m *Memory) MerkleizeNodeLevel5(node *RadixNodeLevel5, addr, gindex uint64) [32]byte {
	depth := uint64(bits.Len64(gindex))

  // The leaf nodes of the trie are actual pages.
  // We can get the pageIndex by combining the address accumulated so far with this gindex.
	if gindex >= (1 << BF5) {
		pageIndex := (addr << BF5) | (gindex - (1 << BF5))
		if p, ok := m.pages[pageIndex]; ok {
			return p.MerkleRoot()
		} else {
			return zeroHashes[64-5+1-(depth+40)]
		}
	}

	if node.HashCache[gindex] {
		if node.Hashes[gindex] == [32]byte{} {
			return zeroHashes[64-5+1-depth]
		} else {
			return node.Hashes[gindex]
		}
	}

	left := m.MerkleizeNodeLevel5(node, addr, gindex<<1)
	right := m.MerkleizeNodeLevel5(node, addr, (gindex<<1)|1)
	r := HashPair(left, right)
	node.Hashes[gindex] = r
	node.HashCache[gindex] = true
	return r
}

MerkleProof

When creating a proof for 0x1234567890123ABC the flow is as follows:

  • There are 5 levels of merkle trie, and the page addr is 0xABC

At each level of the merkle trie, we walk through the binary tree of hashes and collect the sibling hash at every step:

func (m *Memory) GenerateProof1(node *RadixNodeLevel1, addr, target uint64) [][32]byte {
	var proofs [][32]byte

  // idx starts at target + (1 << BF1), the leaf's gindex within this radix node,
  // then walks up toward the root by halving idx at each step
	for idx := target + 1<<BF1; idx > 1; idx /= 2 {
		sibling := idx ^ 1
		proofs = append(proofs, m.MerkleizeNodeLevel1(node, addr, sibling))
	}

	return proofs
}

An analogous function is run once for each of the 5 radix levels, so each level's binary tree is walked only once per proof.
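To sanity-check such a proof, the collected sibling list can be folded back up into a root. A minimal sketch (verifyProof is a hypothetical helper, and sha256 stands in for the VM's actual hash function):

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

// HashPair combines two child hashes into their parent hash.
// (sha256 is an assumption for this sketch.)
func HashPair(a, b [32]byte) [32]byte {
	return sha256.Sum256(append(a[:], b[:]...))
}

// verifyProof recomputes the root from a leaf and its sibling hashes,
// ordered bottom-up as the GenerateProof* loop collects them.
func verifyProof(leaf [32]byte, index uint64, proof [][32]byte) [32]byte {
	node := leaf
	for _, sib := range proof {
		if index&1 == 0 {
			node = HashPair(node, sib)
		} else {
			node = HashPair(sib, node)
		}
		index >>= 1
	}
	return node
}

func main() {
	// A 4-leaf tree built by hand, then a proof for leaf 2.
	leaves := [4][32]byte{{1}, {2}, {3}, {4}}
	n01 := HashPair(leaves[0], leaves[1])
	n23 := HashPair(leaves[2], leaves[3])
	root := HashPair(n01, n23)

	proof := [][32]byte{leaves[3], n01} // sibling at each level, bottom-up
	fmt.Println(verifyProof(leaves[2], 2, proof) == root) // true
}
```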

Invalidation

For invalidation, if address 0x1234567890123ABC is invalidated, we need to invalidate the 5 radix nodes along its path, one at each level, as well as all of the intermediate merkle hashes they have cached.

func (n *RadixNodeLevel1) invalidateHashes(branch uint64) {
	branch = (branch + 1<<BF1) / 2
	for index := branch; index > 0; index >>= 1 {
		hashIndex := index >> 6
		hashBit := index & 63
		n.HashExists[hashIndex] |= 1 << hashBit
		n.HashValid[hashIndex] &= ^(1 << hashBit)
	}
}

Tests and Benchmarks

This change should not break any of the existing tests.

We must benchmark for the following items:

  • Memory Usage
  • Time Complexity
  • Cache Invalidation
  • Merkleization

See https://github.com/ethereum-optimism/asterisc/blob/6229f246175130e537a8c7fd1ac63b7a9bac303e/docs/radix-memory.md for detailed note on testing radix trie performance.

@codecov-commenter commented Oct 3, 2024

Codecov Report

Attention: Patch coverage is 89.38547% with 38 lines in your changes missing coverage. Please review.

Project coverage is 62.70%. Comparing base (e1a5b01) to head (e8a0ac4).

Files with missing lines Patch % Lines
rvgo/fast/radix.go 90.85% 20 Missing and 10 partials ⚠️
rvgo/fast/page.go 65.21% 8 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master      #83      +/-   ##
==========================================
+ Coverage   62.05%   62.70%   +0.65%     
==========================================
  Files          26       27       +1     
  Lines        3236     4108     +872     
==========================================
+ Hits         2008     2576     +568     
- Misses       1112     1408     +296     
- Partials      116      124       +8     


@mininny force-pushed the feature/mininny/rvgo-radix-memory branch from c5a040b to f263b31 on October 5, 2024 22:40
@mininny force-pushed the feature/mininny/rvgo-radix-memory branch from f263b31 to 6229f24 on October 15, 2024 21:56
@mininny force-pushed the feature/mininny/rvgo-radix-memory branch from 6229f24 to 17d7ec2 on October 15, 2024 22:01
@ImTei (Collaborator) left a comment:

Great work! It would be better if we have a design and implementation guide docs in the repo, including visualized example if possible. but it should not be in this PR.

@mininny mininny added this pull request to the merge queue Oct 18, 2024
Merged via the queue into master with commit f6bcdeb Oct 18, 2024
8 checks passed