Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: AVM uses native checkpointing trees instead of TS ephemeral forest #11955

Merged
merged 34 commits into from
Feb 21, 2025

Conversation

dbanks12
Copy link
Collaborator

@dbanks12 dbanks12 commented Feb 12, 2025

Note: in a later PR, we need to iron out a better WorldStateDB interface because it's kind of a sham at the moment.

Performance improvements

For execution of 100 public transfers in the public processor we went from ~11.5s down to ~6.9 seconds (~40% improvement).

Before:
image

After:
image

Changes

AVM State Manager

  1. Deleted Ephemeral forest
  2. State manager performs all merkle accesses directly on the native world state
  3. Forking the state manager calls db.createCheckpoint(). Reverting calls db.revertCheckpoint. Returning/merging calls db.commitCheckpoint()`.
  4. Added more tree insertion tests to avm_simulator.test.ts

Public processor

  1. No longer re-performs merkle insertions for txs that have public calls.
  2. Calls db.createCheckpoint() before processing of a TX. Calls db.revertCheckpoint() or db.commitCheckpoint() afterwards based on the result of execution.
  3. Since the AVM performs merkle insertions itself, the public processor just trusts the state after AVM execution.
  4. If a TX doesn't have public calls, the public processor still performs all merkle insertions.
  5. Post-processing TX validation can't check for duplicate nullifiers anymore if there are public calls because the nullifiers have already been inserted into the tree by the AVM. Duplicate nullifiers will be caught by the AVM / public tx simulator.

WorldStateDB

  1. Removed unused functions
  2. AvmStateManager directly calls the underlying merkle trees often because it needs to. We probably need to iron out a better WorldStateDB interface since it's not sufficient if we need to directly call the underlying MerkleTreeWriteOperations interface often.

TXE

  1. TXE does db.createCheckpoint() before calling PublicTxSimulator and then always does db.revertCheckpoint() afterwards because right now we execute just a single enqueued call at a time. Later during commitState(), the TXE re-applies all state changes from txEffects.
  2. We will iron out a better way to handle public execution in the TXE in the coming days.

@dbanks12 dbanks12 requested a review from fcarreiro as a code owner February 12, 2025 18:47
@dbanks12 dbanks12 changed the title feat: AVM uses native checkpointing trees instead of TS ephemeral forest [WIP] feat: AVM uses native checkpointing trees instead of TS ephemeral forest Feb 12, 2025
@dbanks12 dbanks12 removed the request for review from fcarreiro February 12, 2025 18:47
@dbanks12 dbanks12 changed the base branch from master to pw/checkpointing February 12, 2025 18:48
Base automatically changed from pw/checkpointing to master February 12, 2025 19:37
@dbanks12 dbanks12 added the e2e-all CI: Deprecated, use ci-full. Enable all master checks. label Feb 13, 2025
@dbanks12 dbanks12 force-pushed the db/avm-uses-checkpointing branch from 8460485 to d9defc5 Compare February 19, 2025 17:02
@dbanks12 dbanks12 changed the title [WIP] feat: AVM uses native checkpointing trees instead of TS ephemeral forest feat: AVM uses native checkpointing trees instead of TS ephemeral forest Feb 20, 2025
Comment on lines -319 to -356

/**
* Commit the pending public changes to the DB.
* @returns Nothing.
*/
commit(): Promise<void> {
for (const [k, v] of this.publicCheckpointedWriteCache) {
this.publicCommittedWriteCache.set(k, v);
}
// uncommitted writes take precedence over checkpointed writes
// since they are the most recent
for (const [k, v] of this.publicUncommittedWriteCache) {
this.publicCommittedWriteCache.set(k, v);
}
return this.rollbackToCommit();
}

/**
* Rollback the pending public changes.
* @returns Nothing.
*/
async rollbackToCommit(): Promise<void> {
await this.rollbackToCheckpoint();
this.publicCheckpointedWriteCache = new Map<bigint, Fr>();
return Promise.resolve();
}

checkpoint(): Promise<void> {
for (const [k, v] of this.publicUncommittedWriteCache) {
this.publicCheckpointedWriteCache.set(k, v);
}
return this.rollbackToCheckpoint();
}

rollbackToCheckpoint(): Promise<void> {
this.publicUncommittedWriteCache = new Map<bigint, Fr>();
return Promise.resolve();
}
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No longer need caching in this class. AvmStateManager handles caching when merkle ops are off. When merkle ops are on, checkpointed trees are used which is slower than "caching", but necessary. We can probably turn merkle ops off for block building eventually, but that will require quite a bit of work.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Test doesn't make sense anymore with caching removed from db

Comment on lines -563 to +567
const siloedNullifiers = [new Fr(10000), new Fr(20000), new Fr(30000), new Fr(40000), new Fr(50000)];
const siloedNullifiers = [new Fr(0x10000), new Fr(0x20000), new Fr(0x30000), new Fr(0x40000), new Fr(0x50000)];
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just for readability of logs

Comment on lines -39 to -49
override checkpoint(): Promise<void> {
return Promise.resolve();
}
override rollbackToCheckpoint(): Promise<void> {
throw new Error('Cannot rollback');
}
override commit(): Promise<void> {
return Promise.resolve();
}
override rollbackToCommit(): Promise<void> {
throw new Error('Cannot rollback');
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

txe actually uses the underlying checkpoint functions now so we don't override

Comment on lines -56 to +55
public async checkExists(
siloedNullifier: Fr,
): Promise<[/*exists=*/ boolean, /*isPending=*/ boolean, /*leafIndex=*/ Fr]> {
public async checkExists(siloedNullifier: Fr): Promise<{ exists: boolean; cacheHit: boolean }> {
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Object is cleaner than array for return type

it('Can get undefined contract instance', async () => {
it('Can get undefined bytecode', async () => {
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

wrong description

Comment on lines -17 to -30
const nullifiers = tx instanceof Tx ? tx.data.getNonEmptyNullifiers() : tx.txEffect.nullifiers;
// Don't need to check for duplicate nullifiers if the tx has public calls
// because the AVM will perform merkle insertions as it goes and will fail on
// duplicate nullifier. In fact we CANNOT check here because the nullifiers
// have already been inserted, and so they will exist in nullifierSource.
if (!hasPublicCalls(tx)) {
const nullifiers = tx instanceof Tx ? tx.data.getNonEmptyNullifiers() : tx.txEffect.nullifiers;

// Ditch this tx if it has repeated nullifiers
const uniqueNullifiers = new Set(nullifiers);
if (uniqueNullifiers.size !== nullifiers.length) {
this.#log.warn(`Rejecting tx ${await Tx.getHash(tx)} for emitting duplicate nullifiers`);
return { result: 'invalid', reason: ['Duplicate nullifier in tx'] };
}
// Ditch this tx if it has repeated nullifiers
const uniqueNullifiers = new Set(nullifiers);
if (uniqueNullifiers.size !== nullifiers.length) {
this.#log.warn(`Rejecting tx ${await Tx.getHash(tx)} for emitting duplicate nullifiers`);
return { result: 'invalid', reason: ['Duplicate nullifier in tx'] };
}

if ((await this.#nullifierSource.nullifiersExist(nullifiers.map(n => n.toBuffer()))).some(Boolean)) {
this.#log.warn(`Rejecting tx ${await Tx.getHash(tx)} for repeating a nullifier`);
return { result: 'invalid', reason: ['Existing nullifier'] };
if ((await this.#nullifierSource.nullifiersExist(nullifiers.map(n => n.toBuffer()))).some(Boolean)) {
this.#log.warn(`Rejecting tx ${await Tx.getHash(tx)} for repeating a nullifier`);
return { result: 'invalid', reason: ['Existing nullifier'] };
}
}

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If public calls are present, can't check that nullifiers in txEffects don't already exist in the trees because the AVM now inserts them into the trees. Still need these checks for private-only txs.

Copy link
Contributor

@sklppy88 sklppy88 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Checking off on TXe, thanks for the lovely comments !

@@ -909,10 +916,14 @@ export class TXE implements TypedOracle {
globalVariables.blockNumber = new Fr(this.blockNumber);
globalVariables.gasFees = new GasFees(1, 1);

// Checkpoint here so that we can revert merkle ops after simulation.
// See note at revert below.
await db.createCheckpoint();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we use a try/finally here to ensure the checkpoint is always reverted?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't want to put a revert in finally because we don't want to revert it on success, right?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh sorry, this is in the TXE. I thought this was in the public processor. Yes, this should be in a try/finally, good call!

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Moved into a try/finally!

} else {
// Otherwise, perform all tree insertions for side effects from private
await this.commitTxState(processedTx);
}
nullifierCache?.addNullifiers(processedTx.txEffect.nullifiers.map(n => n.toBuffer()));
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just wonder if this section should be in a try/catch so we don't hit the revert after having already run either commit or revert.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point! Will fix.

Copy link
Collaborator Author

@dbanks12 dbanks12 Feb 21, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This whole block is already in a try/catch. I changed the commitCheckpoint to always happen on success, even for private-only txs. And I moved the commitCheckpoint to the very end of the try block. Does this make sense?
image'

Also I renamed the old commitTxState to doTreeInsertionsForPrivateOnlyTx since that's what it actually does now. I thought I should leave that call where it is so that we don't update totalBlockGas/etc before we do the private-only-tx insertions that could theoretically blow up (although i don't think they should since TX post-processing should do duplicate-nullifier checks).

await this.worldStateDB.commitCheckpoint();
} else {
// Otherwise, perform all tree insertions for side effects from private
await this.commitTxState(processedTx);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this commitTxState perform a commit or revert?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It doesn't look like it does. Doesn't this mean we never commit/revert the checkpoint in this case?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right! We would never commit or revert it in this case and that's a bug.

@dbanks12 dbanks12 requested a review from PhilWindle February 21, 2025 02:20
@dbanks12
Copy link
Collaborator Author

Going to merge. In a follow-up PR, I'm going to add some sort of "fork counting" to the checkpointing to prevent someone from trying to revert a parent/grandparent that still has unresolved children.

@dbanks12 dbanks12 enabled auto-merge (squash) February 21, 2025 12:30
@dbanks12 dbanks12 merged commit 1cc5ed1 into master Feb 21, 2025
9 checks passed
@dbanks12 dbanks12 deleted the db/avm-uses-checkpointing branch February 21, 2025 12:55
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
e2e-all CI: Deprecated, use ci-full. Enable all master checks.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants