Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: hash contract class logs with poseidon + add preimage to blob #12061

Merged
merged 29 commits into from
Feb 27, 2025

Conversation

MirandaWood
Copy link
Contributor

@MirandaWood MirandaWood commented Feb 17, 2025

Quick and dirty Somewhat clean implementation which:

  • Converts our SHA based contract class LogHashes with poseidon hashes and constrain them in the registerer.
  • Verifies this ^ hash in the base rollup and appends to the blob as fields, eliminating the need for any calldata.
  • Treats the logs as fields in ts instead of bytes.

EDIT: Added log validation to tx in this commit: cb739a2

EDIT 2: We may not need to constrain the hash in the registerer, so can revert to the oracle hashing the log and leaving the constraining to the base rollup in a future PR.

@MirandaWood MirandaWood marked this pull request as ready for review February 18, 2025 11:32
Comment on lines 224 to 229
// TODO(MW): use chunked poseidon if cheaper? Use variable poseidon with length of non-0 fields?
let log_hash =
unsafe { emit_contract_class_unencrypted_log_private(contract_address, log, counter) };
std::hash::poseidon2::Poseidon2::hash(log_to_emit, CONTRACT_CLASS_LOG_SIZE_IN_FIELDS);
// Safety: TODO(MW): constraining below length in the base rollup. Is this good/safe?
let length = unsafe { find_index_hint_from_end(log_to_emit, |elem| !is_empty(elem)) };
// Safety: the below only exists to broadcast the raw log, so we can provide it to the base rollup later to be constrained.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Two Qs here:

  • I'm just hashing the entire (padded) log here because it's easy and clean. I could hash just the N input fields. This is a bit more expensive in the circuits (not by much, I'd use the chunk variable hash) but might be a lot faster elsewhere. I already have code to hash N of M fields, and check the remaining M - N fields are all zero.
  • I'm carrying the log length (N) throughout the kernels to track gas since we are only adding N things to the blob. I'm not sure this is even required atm because logs are set to cost 0? I'm also not constraining this length until the base rollup when we verify the hash. I'm not sure if this is ok.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd potentially go with the unpadded approach, since our recent discussions (via that hackmd) suggest that eventually (probably in a separate PR), we don't actually need to hash in this circuit at all.

We probably do need to track the length, because ultimately these logs should cost a relative fortune. And eventually (as per prev paragraph) only the Base will need to constrain this length (because we'll remove constrained hashing of this log from private-land).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

because we'll remove constrained hashing of this log from private-land

Shall I just do this now, and make the oracle hash again? I don't remember now - but wasn't there some reason we weren't happy with the original oracle hash even though we verified the hash later?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for the delay; I need to sort out my notification settings.

Shall I just do this now, and make the oracle hash again?

Can do! It can either be an oracle, or just make the function unconstrained? But you messaged 14 hours ago, so maybe you've already chosen an approach, which I'll find out when I read further down this page :)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

wasn't there some reason we weren't happy with the original oracle hash even though we verified the hash later?

I can't remember 😬

@iAmMichaelConnor
Copy link
Contributor

👀

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry - this file is changed by yarn format and I forgot to remove from staged - nothing to do with PR. If the next CI run passes, I'll remove it.

Copy link
Contributor

@iAmMichaelConnor iAmMichaelConnor left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wooooh I read it all! Thanks for doing all this! It's quite scary how much of the codebase needs to be touched to change a very very thin "slice" of functionality. It's essentially just changing a public input of the private context, and it blows up into thousands of lines.
Anyway, I've written comments, and a lot of them will be because this is the first time I'm looking at most of these files. :)

let log_hash = poseidon2_hash_subarray_variable(log_to_emit, length);
// Safety: the below only exists to broadcast the raw log, so we can provide it to the base rollup later to be constrained.
unsafe {
emit_contract_class_unencrypted_log_private(contract_address, log_to_emit, counter);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Forewarning, I still don't really know my way around the codebase or our naming conventions.
Forewarning, sometimes my review comments are pedantic.
I think I've seen a naming pattern for other oracles which just inform typescript about something (e.g. when lining up a hint for some later kernel / rollup circuit): the word "notify" is used. Perhaps this oracle function should be called notify_....

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've not really touched the oracle that much, but since we actually add state in ts which is different to the state tracked here (raw log vs hashed log), maybe emit still stands. I think the notify is different but unsure?

@@ -217,13 +220,18 @@ pub contract ContractClassRegisterer {
// cannot prove non-registration. Therefore, it is possible that a malicious oracle might prevent sequencers
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think some of this comment is out of date now, since your special poseidon2_hash_subarray_variable function is able to hash the whole thing in-circuit, so the purpose of the oracle is now just to inform typescript, rather than its previous purpose as a hack to compute the log_hash.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

True, I was hoping the reviewer would give me some guidance on this. If we move the hash to the oracle / unconstrained as planned this comment is still correct isn't it? Shall I change it then change it back?

Comment on lines 67 to 74
// TODO(MW): Lazily used /2 instead of CONTRACT_CLASS_LOG_DATA_SIZE_IN_FIELDS, because this keeps overfilling block blobs
// Below line gives error 'Type instantiation is excessively deep and possibly infinite. ts(2589)'
// makeTuple(CONTRACT_CLASS_LOG_DATA_SIZE_IN_FIELDS, Fr.random);
const fields = Array.from({ length: Math.ceil(CONTRACT_CLASS_LOG_DATA_SIZE_IN_FIELDS / 2) }, () => Fr.random());
return new ContractClassLog(
await AztecAddress.random(),
fields.concat(Array.from({ length: Math.floor(CONTRACT_CLASS_LOG_DATA_SIZE_IN_FIELDS / 2) }, () => Fr.ZERO)),
);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe comment something like "Only half of the contract class log's fields get filled with random values here (ideally all of the fields would be filled with random values), because...". It took my brain a sec to understand this.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we not ignore the error, like I saw you do elsewhere?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No - it's a nightmare because they are tuples, not arrays. I couldn't get tuples defined in our usual way working at all

}

async hash() {
return await poseidon2Hash(this.getEmittedFields());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe comment that this is one of two approaches to hashing this data, to disambiguate from the weird AVM way of hashing the data.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think these logs are ever hashed in the AVM way at all? Information about functions is hashed in the AVM way, but not these logs AFAIK

@@ -170,6 +171,40 @@ export class Tx extends Gossipable {
return logsSource.getPublicLogs({ txHash: await this.getTxHash() });
}

/**
* Gets either revertible or non revertible contract class logs emitted by this tx.
* @param revertible - true for revertible only logs, false for non reverible only logs.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh gosh, I didn't even pay attention to any logic that discerned between rev and non-rev in the .nr files...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't need to change it because before this PR we use a log hash in the kernels. After this PR we... use a log hash in the kernels :)

Copy link
Collaborator

@LeilaWang LeilaWang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree with some of Mike's comments, but maybe they can be addressed in followup prs. It looks great to me!

@LeilaWang LeilaWang enabled auto-merge (squash) February 27, 2025 18:18
@LeilaWang LeilaWang merged commit 44c07a3 into master Feb 27, 2025
9 checks passed
@LeilaWang LeilaWang deleted the mw/cc-logs-pos-hash branch February 27, 2025 19:34
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants