-
Notifications
You must be signed in to change notification settings - Fork 250
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat!: Use BigInt instead of FieldElement in the compiler to be able to detect overflows/underflows #5338
feat!: Use BigInt instead of FieldElement in the compiler to be able to detect overflows/underflows #5338
Conversation
79de236
to
9dcda72
Compare
let str = value.abs().to_str_radix(16); | ||
AbiValue::Integer { value: str, sign: value.is_negative() } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
💭 Is this correct?
let min: BigInt = -BigInt::from_i128(1_i128 << (bit_size - 1)).unwrap(); | ||
let max: BigInt = BigInt::from_i128(1_i128 << (bit_size - 1)).unwrap() - 1; | ||
min <= *value && *value <= max |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
💭 I wonder if there's a way to do this without allocating the min
and max
big ints. BigInt
can only be compared with other big ints...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One quick solution would be to use a OnceCell so that we only need to allocate a single bigint and reuse that for all other integers of the same bitsize
} | ||
NumericType::NativeField => { | ||
let modulus = FieldElement::modulus(); | ||
let modulus_big_int = BigInt::from_biguint(Sign::Plus, modulus); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same here, I wonder if there's a way to avoid creating this BigInt
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This can be a simple global
for the time being.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's the way to declare global variables in Rust?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
const MODULUS: BigInt = blah
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks!
I tried it:
const MODULUS: BigInt = BigInt::from_biguint(Sign::Plus, FieldElement::modulus());
but I get this error:
cannot call non-const fn `num_bigint::BigInt::from_biguint` in constants
calls in constants are limited to constant functions, tuple structs and tuple variants
@@ -46,6 +47,12 @@ pub enum ExpressionKind { | |||
Error, | |||
} | |||
|
|||
impl ExpressionKind { | |||
pub fn zero() -> Self { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let me know if it's okay to add these helper functions. We could also inline them.
None => value.to_string(), | ||
}; | ||
let msg = format!("{int} is outside the range of the {typ} type"); | ||
let msg = format!("{value} is outside the range of the {typ} type"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably not for this PR, but there's a similar error that triggers in another part of the compiler that looks like this:
The literal
2⁸
cannot fit intou8
which has range0..=255
This is really nice because you immediately know what are the min/max values. We could do the same here too.
let fe_i = crate::utils::big_int_to_field_element(value_i); | ||
let i_neg = value_i.is_negative(); | ||
|
||
// TODO: check if there's a way to turn BigInt into i128 without going through FieldElement |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let me know if there's a way to do this in another way. If not, we can remove this TODO.
if sign { | ||
match typ { | ||
ast::Type::Field => Literal(Integer(-value, typ, location)), | ||
ast::Type::Integer(_, bit_size) => { | ||
let bit_size: u32 = bit_size.into(); | ||
let base = 1_u128 << bit_size; | ||
Literal(Integer(FieldElement::from(base) - value, typ, location)) | ||
} | ||
_ => unreachable!("Integer literal must be numeric"), | ||
} | ||
} else { | ||
Literal(Integer(value, typ, location)) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is gone from here, but a similar check/conversion is done later on.
@@ -0,0 +1,17 @@ | |||
use acvm::{AcirField, FieldElement}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm never too sure about introducing an utils
module. It's probably better to put functions next to their types, but I wasn't sure if adding them to where FieldElement
is was okay. There's also a function that operates on BigInt
and we don't own that one. That said, I also see some other utils.rs
files in the repo, so maybe it's fine :-)
// TODO: use a more efficient way to do this | ||
big_int_to_field_element(big_int).to_u128() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm sure there's a better way to do this, probably iterating the be/le bytes... but I wanted to wait for a review before changing this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
pub fn truncate_big_int_to_u128(big_int: &BigInt) -> u128 {
let bigint_bytes = big_uint.to_bytes_le();
let u128_bytes: [u8; 16] = bigint_bytes[0..16].try_into().unwrap();
u128::from_le_bytes(u128_bytes)
}
This should work (up to dealing with endianness issues)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Alternatively
pub fn truncate_big_int_to_u128(big_int: &BigInt) -> u128 {
let bigint_u64_le_iter = big_uint.iter_u64_digits();
let low = bigint_u64_le_iter.next().unwrap_or(0) as u128;
let high = bigint_u64_le_iter.next().unwrap_or(0) as u128;
high << 64 + low
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tried both approached but they fail the test I just wrote. I'll use those as a base code, I'm sure it's almost there (but I'll switch to the lsp PR now)
test_programs/compile_failure/unsigned_integer_literal_overflow/src/main.nr
Show resolved
Hide resolved
if 21888242871839275222246405745257275088548364400416034343698204186575808495617 == 0 { | ||
if compat::is_bn254() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess this PR is a breaking change... but I guess a good one?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah even though we refactored is_bn254
to not rely on p == 0
other programs may rely on this functionality.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Have spotted a couple of cases where we're mapping from bigints to fields a bit strangely. Will do another deeper review in a bit.
let min: BigInt = -BigInt::from_i128(1_i128 << (bit_size - 1)).unwrap(); | ||
let max: BigInt = BigInt::from_i128(1_i128 << (bit_size - 1)).unwrap() - 1; | ||
min <= *value && *value <= max |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One quick solution would be to use a OnceCell so that we only need to allocate a single bigint and reuse that for all other integers of the same bitsize
} | ||
NumericType::NativeField => { | ||
let modulus = FieldElement::modulus(); | ||
let modulus_big_int = BigInt::from_biguint(Sign::Plus, modulus); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This can be a simple global
for the time being.
if value.is_positive() { | ||
value < &modulus_big_int | ||
} else { | ||
value >= &(-modulus_big_int) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The canonical field representation is the range [0,p)
so it seems odd to allow negative values. Will come back to this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point. However, there's some code that uses negative field values, like this one:
let x = -1; |
So I was thinking we can at least allow negative values if they are in the range [-p, 0]
(or [-p + 1, 0]
?). But anything outside [-p, p]
is a clear error (maybe?)
// TODO: use a more efficient way to do this | ||
big_int_to_field_element(big_int).to_u128() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
pub fn truncate_big_int_to_u128(big_int: &BigInt) -> u128 {
let bigint_bytes = big_uint.to_bytes_le();
let u128_bytes: [u8; 16] = bigint_bytes[0..16].try_into().unwrap();
u128::from_le_bytes(u128_bytes)
}
This should work (up to dealing with endianness issues)
// TODO: use a more efficient way to do this | ||
big_int_to_field_element(big_int).to_u128() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Alternatively
pub fn truncate_big_int_to_u128(big_int: &BigInt) -> u128 {
let bigint_u64_le_iter = big_uint.iter_u64_digits();
let low = bigint_u64_le_iter.next().unwrap_or(0) as u128;
let high = bigint_u64_le_iter.next().unwrap_or(0) as u128;
high << 64 + low
}
compiler/noirc_driver/Cargo.toml
Outdated
@@ -26,3 +26,4 @@ rust-embed.workspace = true | |||
tracing.workspace = true | |||
|
|||
aztec_macros = { path = "../../aztec_macros" } | |||
num-traits = "0.2.19" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
num-traits = "0.2.19" | |
num-traits.workspace = true |
We've defined a version to use across the workspace in the root Cargo.toml
. Can you replace all the other instances with this?
compiler/noirc_frontend/src/utils.rs
Outdated
pub fn big_int_to_field_element(big_int: &BigInt) -> FieldElement { | ||
let big_uint = big_int.magnitude(); | ||
FieldElement::from_be_bytes_reduce(&big_uint.to_bytes_be()) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is going to map the integer -1
to the field element 1
when it should instead be mapped to p-1
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, good catch. I pushed a commit to fix this.
@@ -19,6 +19,7 @@ pub mod monomorphization; | |||
pub mod node_interner; | |||
pub mod parser; | |||
pub mod resolve_locations; | |||
pub mod utils; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
pub mod utils; | |
pub(crate) mod utils; |
I don't think we need to make this public.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The issue is that those are used in aztec_macros
and noirc_evaluator
.
I've been thinking more about this PR and how to approach this problem. I think that, at the lexer level, if an integer token is larger than the maximum field element, we can give a compile error. That's assuming field elements will always be bigger than the maximum value of u64. Then we can keep storing values as So, I'll close the PR and later next week open another PR, but let me know otherwise! |
I just checked in Rust, and if you have a program like this: fn main() {
let x = 1243982734987234328947293847298347982374982374982374987324;
} the error is:
that's 2**128, so I think they do this too: at the lexer level they already limit the maximum integer, regardless of its destination. I'll do the same :-) |
Hmm, yeah that makes sense. That's a nice simplification. |
Related: #449 |
Description
Problem
Resolves #4631
Summary
Throughout the compiler, integer values are represented using
FieldElement
, starting at the tokenization level. If an integer value is larger than modulus, it's silently truncated (or, well, wrapped, I guess). To be able to detect that this happens we can't keepFieldElement
at this level: we have to carry the unmodified integer up until we do the type range check. So, this PR changes tokens to carry aBigInt
value. ThisBigInt
is then passed toHirLiteral
and even reaching theLiteral
enum in the monomorphization phase. Eventually, theBigInt
is turned into aFieldElement
right when it's needed (right after we do the type range check).Then, this PR also enhances the type range check logic: now that we can have values that overflow the largest Field's value, we check that. But also, the previous code only checked for maximum positive values. For example, for
u8
it would check that the value was less than 256. But fori8
it would check the same thing. This PR changes that check to verify it's in-128..=127
(and similarly for other integer types).Additional Context
None.
Documentation*
Check one:
PR Checklist*
cargo fmt
on default settings.