Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat!: Use BigInt instead of FieldElement in the compiler to be able to detect overflows/underflows #5338

Closed

Conversation

asterite
Copy link
Collaborator

@asterite asterite commented Jun 26, 2024

Description

Problem

Resolves #4631

Summary

Throughout the compiler, integer values are represented using FieldElement, starting at the tokenization level. If an integer value is larger than modulus, it's silently truncated (or, well, wrapped, I guess). To be able to detect that this happens we can't keep FieldElement at this level: we have to carry the unmodified integer up until we do the type range check. So, this PR changes tokens to carry a BigInt value. This BigInt is then passed to HirLiteral and even reaching the Literal enum in the monomorphization phase. Eventually, the BigInt is turned into a FieldElement right when it's needed (right after we do the type range check).

Then, this PR also enhances the type range check logic: now that we can have values that overflow the largest Field's value, we check that. But also, the previous code only checked for maximum positive values. For example, for u8 it would check that the value was less than 256. But for i8 it would check the same thing. This PR changes that check to verify it's in -128..=127 (and similarly for other integer types).

Additional Context

None.

Documentation*

Check one:

  • No documentation needed.
  • Documentation included in this PR.
  • [For Experimental Features] Documentation to be submitted in a separate PR.

PR Checklist*

  • I have tested the changes locally.
  • I have formatted the changes with Prettier and/or cargo fmt on default settings.

@asterite asterite force-pushed the ab/use-big-int-instead-of-field-element-for-literals branch from 79de236 to 9dcda72 Compare June 26, 2024 13:24
@asterite asterite changed the title Use BigInt instead of FieldElement in noirc_frontend feat: Use BigInt instead of FieldElement in noirc_frontend Jun 26, 2024
@asterite asterite changed the title feat: Use BigInt instead of FieldElement in noirc_frontend feat: Use BigInt instead of FieldElement in the compiler to be able to detect overflows/underflows Jun 26, 2024
Comment on lines +202 to +203
let str = value.abs().to_str_radix(16);
AbiValue::Integer { value: str, sign: value.is_negative() }
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💭 Is this correct?

Comment on lines +39 to +41
let min: BigInt = -BigInt::from_i128(1_i128 << (bit_size - 1)).unwrap();
let max: BigInt = BigInt::from_i128(1_i128 << (bit_size - 1)).unwrap() - 1;
min <= *value && *value <= max
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💭 I wonder if there's a way to do this without allocating the min and max big ints. BigInt can only be compared with other big ints...

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One quick solution would be to use a OnceCell so that we only need to allocate a single bigint and reuse that for all other integers of the same bitsize

}
NumericType::NativeField => {
let modulus = FieldElement::modulus();
let modulus_big_int = BigInt::from_biguint(Sign::Plus, modulus);
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here, I wonder if there's a way to avoid creating this BigInt.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This can be a simple global for the time being.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the way to declare global variables in Rust?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

const MODULUS: BigInt = blah

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

I tried it:

    const MODULUS: BigInt = BigInt::from_biguint(Sign::Plus, FieldElement::modulus());

but I get this error:

cannot call non-const fn `num_bigint::BigInt::from_biguint` in constants
calls in constants are limited to constant functions, tuple structs and tuple variants

@@ -46,6 +47,12 @@ pub enum ExpressionKind {
Error,
}

impl ExpressionKind {
pub fn zero() -> Self {
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let me know if it's okay to add these helper functions. We could also inline them.

None => value.to_string(),
};
let msg = format!("{int} is outside the range of the {typ} type");
let msg = format!("{value} is outside the range of the {typ} type");
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably not for this PR, but there's a similar error that triggers in another part of the compiler that looks like this:

The literal 2⁸ cannot fit into u8 which has range 0..=255

This is really nice because you immediately know what are the min/max values. We could do the same here too.

let fe_i = crate::utils::big_int_to_field_element(value_i);
let i_neg = value_i.is_negative();

// TODO: check if there's a way to turn BigInt into i128 without going through FieldElement
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let me know if there's a way to do this in another way. If not, we can remove this TODO.

Comment on lines -421 to -433
if sign {
match typ {
ast::Type::Field => Literal(Integer(-value, typ, location)),
ast::Type::Integer(_, bit_size) => {
let bit_size: u32 = bit_size.into();
let base = 1_u128 << bit_size;
Literal(Integer(FieldElement::from(base) - value, typ, location))
}
_ => unreachable!("Integer literal must be numeric"),
}
} else {
Literal(Integer(value, typ, location))
}
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is gone from here, but a similar check/conversion is done later on.

@@ -0,0 +1,17 @@
use acvm::{AcirField, FieldElement};
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm never too sure about introducing an utils module. It's probably better to put functions next to their types, but I wasn't sure if adding them to where FieldElement is was okay. There's also a function that operates on BigInt and we don't own that one. That said, I also see some other utils.rs files in the repo, so maybe it's fine :-)

Comment on lines +15 to +16
// TODO: use a more efficient way to do this
big_int_to_field_element(big_int).to_u128()
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm sure there's a better way to do this, probably iterating the be/le bytes... but I wanted to wait for a review before changing this.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pub fn truncate_big_int_to_u128(big_int: &BigInt) -> u128 {
    let bigint_bytes = big_uint.to_bytes_le();
    let u128_bytes: [u8; 16] = bigint_bytes[0..16].try_into().unwrap();
    u128::from_le_bytes(u128_bytes)
}

This should work (up to dealing with endianness issues)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alternatively

pub fn truncate_big_int_to_u128(big_int: &BigInt) -> u128 {
    let bigint_u64_le_iter = big_uint.iter_u64_digits();
    let low = bigint_u64_le_iter.next().unwrap_or(0) as u128;
    let high = bigint_u64_le_iter.next().unwrap_or(0) as u128;
    high << 64 + low
}

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried both approached but they fail the test I just wrote. I'll use those as a base code, I'm sure it's almost there (but I'll switch to the lsp PR now)

Comment on lines -15 to +16
if 21888242871839275222246405745257275088548364400416034343698204186575808495617 == 0 {
if compat::is_bn254() {
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess this PR is a breaking change... but I guess a good one?

Copy link
Contributor

@vezenovm vezenovm Jun 26, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah even though we refactored is_bn254 to not rely on p == 0 other programs may rely on this functionality.

@asterite asterite marked this pull request as ready for review June 26, 2024 20:51
@asterite asterite requested a review from a team June 26, 2024 20:52
@vezenovm vezenovm changed the title feat: Use BigInt instead of FieldElement in the compiler to be able to detect overflows/underflows feat!: Use BigInt instead of FieldElement in the compiler to be able to detect overflows/underflows Jun 26, 2024
Copy link
Member

@TomAFrench TomAFrench left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have spotted a couple of cases where we're mapping from bigints to fields a bit strangely. Will do another deeper review in a bit.

Comment on lines +39 to +41
let min: BigInt = -BigInt::from_i128(1_i128 << (bit_size - 1)).unwrap();
let max: BigInt = BigInt::from_i128(1_i128 << (bit_size - 1)).unwrap() - 1;
min <= *value && *value <= max
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One quick solution would be to use a OnceCell so that we only need to allocate a single bigint and reuse that for all other integers of the same bitsize

}
NumericType::NativeField => {
let modulus = FieldElement::modulus();
let modulus_big_int = BigInt::from_biguint(Sign::Plus, modulus);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This can be a simple global for the time being.

Comment on lines +50 to +54
if value.is_positive() {
value < &modulus_big_int
} else {
value >= &(-modulus_big_int)
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The canonical field representation is the range [0,p) so it seems odd to allow negative values. Will come back to this.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. However, there's some code that uses negative field values, like this one:

So I was thinking we can at least allow negative values if they are in the range [-p, 0] (or [-p + 1, 0]?). But anything outside [-p, p] is a clear error (maybe?)

Comment on lines +15 to +16
// TODO: use a more efficient way to do this
big_int_to_field_element(big_int).to_u128()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pub fn truncate_big_int_to_u128(big_int: &BigInt) -> u128 {
    let bigint_bytes = big_uint.to_bytes_le();
    let u128_bytes: [u8; 16] = bigint_bytes[0..16].try_into().unwrap();
    u128::from_le_bytes(u128_bytes)
}

This should work (up to dealing with endianness issues)

Comment on lines +15 to +16
// TODO: use a more efficient way to do this
big_int_to_field_element(big_int).to_u128()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alternatively

pub fn truncate_big_int_to_u128(big_int: &BigInt) -> u128 {
    let bigint_u64_le_iter = big_uint.iter_u64_digits();
    let low = bigint_u64_le_iter.next().unwrap_or(0) as u128;
    let high = bigint_u64_le_iter.next().unwrap_or(0) as u128;
    high << 64 + low
}

@@ -26,3 +26,4 @@ rust-embed.workspace = true
tracing.workspace = true

aztec_macros = { path = "../../aztec_macros" }
num-traits = "0.2.19"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
num-traits = "0.2.19"
num-traits.workspace = true

We've defined a version to use across the workspace in the root Cargo.toml. Can you replace all the other instances with this?

Comment on lines 4 to 7
pub fn big_int_to_field_element(big_int: &BigInt) -> FieldElement {
let big_uint = big_int.magnitude();
FieldElement::from_be_bytes_reduce(&big_uint.to_bytes_be())
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is going to map the integer -1 to the field element 1 when it should instead be mapped to p-1.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, good catch. I pushed a commit to fix this.

@@ -19,6 +19,7 @@ pub mod monomorphization;
pub mod node_interner;
pub mod parser;
pub mod resolve_locations;
pub mod utils;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
pub mod utils;
pub(crate) mod utils;

I don't think we need to make this public.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The issue is that those are used in aztec_macros and noirc_evaluator.

@asterite asterite closed this Jun 28, 2024
@asterite asterite reopened this Jun 28, 2024
@asterite
Copy link
Collaborator Author

I've been thinking more about this PR and how to approach this problem.

I think that, at the lexer level, if an integer token is larger than the maximum field element, we can give a compile error. That's assuming field elements will always be bigger than the maximum value of u64. Then we can keep storing values as FieldElement. We still have to remember if the value was positive or negative, but we are already doing that.

So, I'll close the PR and later next week open another PR, but let me know otherwise!

@asterite asterite closed this Jun 29, 2024
@asterite
Copy link
Collaborator Author

asterite commented Jul 1, 2024

I just checked in Rust, and if you have a program like this:

fn main() {
    let x = 1243982734987234328947293847298347982374982374982374987324;
}

the error is:

integer literal is too large
value exceeds limit of 340282366920938463463374607431768211455

that's 2**128, so I think they do this too: at the lexer level they already limit the maximum integer, regardless of its destination. I'll do the same :-)

@TomAFrench
Copy link
Member

Hmm, yeah that makes sense. That's a nice simplification.

@asterite
Copy link
Collaborator Author

asterite commented Jul 1, 2024

Related: #449

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Too-large field literals silently truncated
3 participants