Property testing with quickcheck #1159
This PR represents an attempt to address issue #970. It also represents a portion of the meta issue for fuzzing #972.

The code base reflected here uses quickcheck to generate C headers that include a variety of types including basic types, structs, unions, function prototypes and function pointers. The headers generated by quickcheck are passed to the `csmith-fuzzing/predicate.py` script. Examples of headers generated by this iteration of the tooling can be viewed [here](https://gist.github.com/snewt/03ce934f35c5b085807d2d5cf11d1d5c).

At the top of each header are two simple struct definitions, `whitelistable` and `blacklistable`. Those types are present in the vector of otherwise primitive types used for generation. They represent a naive approach to exposing custom types without having to intuit generated type names like `struct_21_8`, though _any actual whitelisting logic isn't implemented here_.

Test success is measured by the success of the `csmith-fuzzing/predicate.py` script. This means that for a test to pass the following must be true:

- bindgen doesn't panic
- the resulting bindings compile
- the resulting bindings layout tests pass

#### Usage

```bash
cd tests/property_test
cargo test
```

Some things I'm unsure of:

#### Where should this feature live?

At the moment it lives in `tests/property_test` but isn't run when `cargo test` is invoked from bindgen's cargo manifest directory.

#### What's an acceptable amount of time for these tests to take?

At this point, the source is generated in ~1 second but the files are large enough that it takes the `predicate.py` script ~30 seconds to run through each one. In order for the tests to run in under a minute only 2 are generated by quickcheck by default. This can be changed in the `test_bindgen` function of the `tests/property_test/tests/fuzzed-c-headers.rs` file.

#### How do we expose the generated code for easy inspection?

For now the `run_predicate_script` function in the `tests/property_test/tests/fuzzed-c-headers.rs` file contains a commented block that will copy generated source into the `tests/property_test/tests` directory. Should it be easier?

#### Special casing

There is some logic in the fuzzer that disallows 0 sized arrays because tests will regularly fail due to issues documented in #684 and #1153. Should this be special casing?

#### Does the fuzzer warrant its own crate?

After any iterations the reviewers are interested in that are required to make this a functional testing tool, should/could the fuzzing library be made into its own crate? I didn't move in that direction yet because having it all in one place seemed like the best way to figure out what works and doesn't, but I'm interested in whether it might be useful as a standalone library.

#### What does it look like to expose more useful functionality?

I'm looking forward to feedback on how to make this a more useful tool and one that provides the right configurability.

Thanks!

r? @fitzgen
Thanks for the pull request, and welcome! The Servo team is excited to review your changes, and you should hear from @fitzgen (or someone else) soon.

@Snewt thanks very much for this PR! And thanks for your patience -- I've been on vacation the last week, which is why you haven't heard back from me. I'll try to take a look at the code and answer your open questions today, but it might get pushed back to tomorrow.
Overall, this looks awesome -- thank you! Very close to what I was hoping for.
I left a lot of nitpick-y comments on the PR; this is not intended as harsh
criticism, just trying to bring us towards our code ideals.
I am very excited for this!
> At the top of each header are two simple struct definitions, `whitelistable` and `blacklistable`. Those types are present in the vector of otherwise primitive types used for generation. They represent a naive approach to exposing custom types without having to intuit generated type names like `struct_21_8`, though any actual whitelisting logic isn't implemented here.
I think instead of generating types named `whitelistable` or `blacklistable`, we
should track the names of types and variables that we generated and then
randomly choose some to whitelist (or none).
We should avoid blacklisting (at least for now) because it intentionally creates
bindings that won't compile, and it is the users' responsibility to provide
alternative definitions instead.
We should however also randomly mark types as opaque.
As far as tracking the names that we've generated, we can either do that:

- On-the-fly as we generate types. Perhaps by introducing an `ArbitraryWithCurrentScope` trait that is the same as `quickcheck::Arbitrary` but with a "current scope" parameter containing all the names of things we've defined thus far, and implementing that for everything rather than `Arbitrary` directly.
- Or we can do it in a secondary pass over the generated AST, a la `MakeUnique`.

We can do all of this whitelisting and scope tracking in a follow up PR, so we
don't need to dig in too deep right now.
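
As a rough illustration of the first option, here is a minimal sketch of what an `ArbitraryWithCurrentScope` trait could look like. Everything except the trait name is an assumption made for the example: `TinyGen` is a hypothetical stand-in for `quickcheck::Gen` so the snippet builds without dependencies, and `Scope`, `fresh_name`, `arbitrary_in_scope`, and the stripped-down `StructDeclarationC` are illustrative names rather than code from this PR.

```rust
use std::collections::BTreeSet;

/// Hypothetical stand-in for a random source. A real implementation would
/// take a `quickcheck::Gen` instead.
pub struct TinyGen {
    state: u64,
}

impl TinyGen {
    pub fn new(seed: u64) -> Self {
        TinyGen { state: seed }
    }

    /// One linear congruential step. Fine for a sketch, not for real fuzzing.
    pub fn next_u64(&mut self) -> u64 {
        self.state = self
            .state
            .wrapping_mul(6364136223846793005)
            .wrapping_add(1442695040888963407);
        self.state >> 33
    }
}

/// The names of everything defined so far in the header being generated.
#[derive(Default)]
pub struct Scope {
    names: BTreeSet<String>,
}

impl Scope {
    /// Picks a name that is not in scope yet and records it.
    pub fn fresh_name(&mut self, prefix: &str, g: &mut TinyGen) -> String {
        loop {
            let candidate = format!("{}_{}", prefix, g.next_u64());
            // `insert` returns false when the name was already present.
            if self.names.insert(candidate.clone()) {
                return candidate;
            }
        }
    }

    pub fn contains(&self, name: &str) -> bool {
        self.names.contains(name)
    }

    pub fn len(&self) -> usize {
        self.names.len()
    }
}

/// Like `Arbitrary`, but threaded with the names defined so far.
pub trait ArbitraryWithCurrentScope: Sized {
    fn arbitrary_in_scope(g: &mut TinyGen, scope: &mut Scope) -> Self;
}

/// Simplified, hypothetical declaration type that carries only a name.
pub struct StructDeclarationC {
    pub name: String,
}

impl ArbitraryWithCurrentScope for StructDeclarationC {
    fn arbitrary_in_scope(g: &mut TinyGen, scope: &mut Scope) -> Self {
        StructDeclarationC {
            name: scope.fresh_name("struct", g),
        }
    }
}
```

With the scope recorded this way, a later step could pick entries from `Scope` at random to whitelist or mark opaque, without having to guess generated names.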
> Test success is measured by the success of the `csmith-fuzzing/predicate.py` script. This means that for a test to pass the following must be true:
>
> - bindgen doesn't panic
> - the resulting bindings compile
> - the resulting bindings layout tests pass

Perfect!
> #### Where should this feature live?
>
> At the moment it lives in `tests/property_test` but isn't run when `cargo test` is invoked from bindgen's cargo manifest directory.

It's fine for this to be a separate crate that is invoked separately. To get
really nitpick-y, I'd probably name this crate "quickchecking" rather than
"property_test".

We should add a new CI job that checks that this crate continues to build, at
minimum. Once `bindgen` is pretty reliably passing the property tests we can
start running them in CI as well.
To add the new CI job:

First, add a row to the `env.matrix` in `.travis.yml`:

    - LLVM_VERSION="4.0.0" BINDGEN_JOB="quickchecking"

Second, add a new case to `ci/script.sh`:

    "quickchecking")
        cd ./tests/quickchecking
        # TODO: Actually run quickchecks once `bindgen` is reliable enough.
        cargo check
        ;;

> #### What's an acceptable amount of time for these tests to take?
>
> At this point, the source is generated in ~1 second but the files are large enough that it takes the `predicate.py` script ~30 seconds to run through each one. In order for the tests to run in under a minute only 2 are generated by quickcheck by default. This can be changed in the `test_bindgen` function of the `tests/property_test/tests/fuzzed-c-headers.rs` file.

This is probably because `predicate.py` does `cargo run` with `bindgen`, so the
first time it's called, it needs to build `bindgen`. We can explicitly do a
`cargo build` of bindgen before we begin quickchecking. Maybe there is another
workaround as well.

Backing up a bit: it would be kind of cool if this whole quickchecking crate was
a `[[bin]]` target rather than (or in addition to?) a `[lib]` target. Then we
could punt all these questions to CLI flags, and allow people to fuzz overnight,
for example.
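
To make the CLI idea a little more concrete, here is a small sketch of argument handling for such a binary using only the standard library. The `--count` flag, the `Settings` struct, and `parse_settings` are hypothetical names chosen for illustration. The default of 2 mirrors the number of tests the PR description says are generated by default.

```rust
/// Settings a hypothetical `quickchecking` binary might accept.
#[derive(Debug, PartialEq)]
pub struct Settings {
    /// Number of headers to generate and check.
    pub count: usize,
}

/// Parses `--count N` from the arguments (program name already removed).
pub fn parse_settings<I>(args: I) -> Result<Settings, String>
where
    I: IntoIterator<Item = String>,
{
    // Default taken from the PR description: only 2 tests are generated.
    let mut settings = Settings { count: 2 };
    let mut args = args.into_iter();
    while let Some(arg) = args.next() {
        match arg.as_str() {
            "--count" => {
                let value = args.next().ok_or("--count needs a value")?;
                settings.count = value
                    .parse()
                    .map_err(|e| format!("invalid --count {:?}: {}", value, e))?;
            }
            other => return Err(format!("unknown argument: {}", other)),
        }
    }
    Ok(settings)
}
```

A `main` function could call `parse_settings(std::env::args().skip(1))` and pass `count` on to the test runner, so an overnight run would only need a larger `--count`.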
> #### How do we expose the generated code for easy inspection?
>
> For now the `run_predicate_script` function in the `tests/property_test/tests/fuzzed-c-headers.rs` file contains a commented block that will copy generated source into the `tests/property_test/tests` directory. Should it be easier?

Sounds good, we just need to make sure we don't clobber any existing failing
test case in that directory. We don't want to lose valuable test cases!
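
One way to avoid clobbering, sketched here under assumptions: create each saved file with `OpenOptions::create_new`, which fails if the path already exists, and retry with a numeric suffix. The function name `save_without_clobbering` and the `stem-N.h` naming scheme are made up for this example and are not part of the PR.

```rust
use std::fs::OpenOptions;
use std::io::{self, Write};
use std::path::{Path, PathBuf};

/// Writes `contents` to `dir/stem.h`, or to `dir/stem-1.h`, `dir/stem-2.h`,
/// and so on if earlier names are taken. Existing files are never overwritten.
pub fn save_without_clobbering(dir: &Path, stem: &str, contents: &str) -> io::Result<PathBuf> {
    let mut suffix = 0u32;
    loop {
        let name = if suffix == 0 {
            format!("{}.h", stem)
        } else {
            format!("{}-{}.h", stem, suffix)
        };
        let path = dir.join(name);
        // `create_new(true)` makes the open fail when the file already exists.
        match OpenOptions::new().write(true).create_new(true).open(&path) {
            Ok(mut file) => {
                file.write_all(contents.as_bytes())?;
                return Ok(path);
            }
            Err(ref e) if e.kind() == io::ErrorKind::AlreadyExists => suffix += 1,
            Err(e) => return Err(e),
        }
    }
}
```

Because the existence check and the file creation happen in a single call, two test runs saving into the same directory should not overwrite each other's failing cases.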
> #### Special casing
>
> There is some logic in the fuzzer that disallows 0 sized arrays because tests will regularly fail due to issues documented in #684 and #1153. Should this be special casing?

This is the pragmatic approach.
It would be kind of nice to have cargo features control this, and by default we
would not generate code that is already known to be super problematic.
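
A minimal sketch of how a cargo feature could gate this, assuming a hypothetical feature named `zero-size-arrays` that would have to be declared in the crate's `Cargo.toml`. The helper names are also illustrative.

```rust
/// Smallest array length the generator may emit. Zero is only allowed when
/// the (hypothetical) `zero-size-arrays` cargo feature is enabled.
#[allow(unexpected_cfgs)]
pub fn min_array_len() -> usize {
    if cfg!(feature = "zero-size-arrays") {
        0
    } else {
        1
    }
}

/// Raises a requested array length to the smallest currently allowed one.
pub fn clamp_array_len(requested: usize) -> usize {
    requested.max(min_array_len())
}
```

By default the known-problematic case stays off, and someone hunting for those bugs could opt in with `cargo test --features zero-size-arrays`.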
> #### Does the fuzzer warrant its own crate?
>
> After any iterations the reviewers are interested in that are required to make this a functional testing tool, should/could the fuzzing library be made into its own crate? I didn't move in that direction yet because having it all in one place seemed like the best way to figure out what works and doesn't, but I'm interested in whether it might be useful as a standalone library.

Maybe eventually?
> #### What does it look like to expose more useful functionality?
>
> I'm looking forward to feedback on how to make this a more useful tool and one that provides the right configurability.

The big next piece is whitelisting and marking types opaque.
Also bitfields.
Generating some C++ and templates and inheritance further down the line.
In general, if you go through the issue tracker (particularly I-bogus-codegen
issues) you can try and get a sense of what kinds of interactions and constructs
are tripping up `bindgen` and then think about how to add support for generating
those things and similar interactions.

> Thanks!

Thank you!
Very excited to see the next iteration of this PR!
tests/property_test/src/fuzzers.rs

    1 => DeclarationC::FunctionPtrDecl(FunctionPointerDeclarationC::arbitrary(g)),
    2 => DeclarationC::StructDecl(StructDeclarationC::arbitrary(g)),
    3 => DeclarationC::UnionDecl(UnionDeclarationC::arbitrary(g)),
    _ => DeclarationC::VariableDecl(BasicTypeDeclarationC::arbitrary(g)),

To be more clear of our intent with this match, let's do:

    4 => DeclarationC::VariableDecl(...),
    _ => unreachable!(),

tests/property_test/src/fuzzers.rs

    }

    trait MakeUnique {
        fn make_unique(&mut self, stamp: usize);

I'm not clear on what abstraction this trait represents. In general, some doc comments in this file would be super useful. I like to turn on `#![deny(missing_docs)]` as a forcing function for myself, and I think that would be beneficial here as well.

tests/property_test/src/fuzzers.rs

    impl MakeUnique for DeclarationC {
        fn make_unique(&mut self, stamp: usize) {
            match self {
                &mut DeclarationC::FunctionDecl(ref mut d) => d.make_unique(stamp),

It is slightly more idiomatic (and slightly shorter) to do

    match *self {
        DeclarationC::FunctionDecl(ref mut d) => ...,
        ...
    }

tests/property_test/src/fuzzers.rs

    impl Arbitrary for DeclarationC {
        fn arbitrary<G: Gen>(g: &mut G) -> DeclarationC {
            match usize::arbitrary(g) % 5 {

https://docs.rs/quickcheck/0.4.2/quickcheck/trait.Rng.html#method.gen_range is going to have a proper uniform distribution, where mod won't unless `n` is a divisor of the number of possible `usize` values (`usize::MAX + 1`), which it isn't in this case (`5`).
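
The bias is easier to see at a smaller scale. The sketch below (an illustration added here, not code from the PR) counts how often each residue occurs when every possible `u8` value is reduced with `% n`. There are 256 possible values and 256 = 51 * 5 + 1, so for `n = 5` one residue gets an extra hit. The function name `residue_counts` is made up for the example.

```rust
/// Counts how often each residue in `0..n` occurs when every `u8` value is
/// reduced with `% n`. `n` must be non-zero.
pub fn residue_counts(n: u8) -> Vec<u32> {
    let mut counts = vec![0u32; n as usize];
    for value in 0..=u8::MAX {
        counts[(value % n) as usize] += 1;
    }
    counts
}
```

`residue_counts(5)` yields `[52, 51, 51, 51, 51]`, so residue 0 is slightly favored, while `residue_counts(4)` yields 64 for every residue because 4 divides 256. The same effect applies to `usize`, only with a much smaller relative skew.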
tests/property_test/src/fuzzers.rs

        "whitelistable",
        "blacklistable",
    ];
    match base_type.iter().nth(usize::arbitrary(g) % base_type.len()) {

https://docs.rs/quickcheck/0.4.2/quickcheck/trait.Rng.html#method.choose

    g.choose(&base_type).cloned().unwrap()

tests/property_test/src/fuzzers.rs

    for _ in 1..dimensions {
        def += &format!("[{}]", (usize::arbitrary(g) % 15) + 1);
    }
    ArrayDimensionC { def: def }

Any time we have `foo: foo` inside a struct literal, we can use this shorthand:

    ArrayDimensionC { def }

tests/property_test/src/fuzzers.rs

    impl MakeUnique for BasicTypeDeclarationC {
        fn make_unique(&mut self, stamp: usize) {
            self.ident_id += &format!("_{}", stamp);

Ahhh ok so this trait is to make unique identifiers! Yeah, doc comments on the trait's definition would have been super helpful to me as a reader.
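
For illustration, this is roughly what the trait might look like with doc comments. The trait signature and the `ident_id` suffixing follow the diff context above. The doc wording, the single-field `BasicTypeDeclarationC`, and the `make_all_unique` helper are assumptions added for the sketch.

```rust
/// Makes the identifiers in a generated declaration unique, so that several
/// declarations can live in one header without name clashes.
pub trait MakeUnique {
    /// Appends `stamp` to the identifier(s) this declaration introduces.
    fn make_unique(&mut self, stamp: usize);
}

/// Simplified stand-in for the PR's `BasicTypeDeclarationC`, with only the
/// identifier field.
pub struct BasicTypeDeclarationC {
    /// The identifier of the declared variable.
    pub ident_id: String,
}

impl MakeUnique for BasicTypeDeclarationC {
    fn make_unique(&mut self, stamp: usize) {
        self.ident_id += &format!("_{}", stamp);
    }
}

/// Hypothetical helper: stamps each declaration with its index.
pub fn make_all_unique<T: MakeUnique>(decls: &mut [T]) {
    for (stamp, decl) in decls.iter_mut().enumerate() {
        decl.make_unique(stamp);
    }
}
```

Two declarations that both start out as `ident` come back as `ident_0` and `ident_1`.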
tests/property_test/src/fuzzers.rs

    impl Arbitrary for BasicTypeDeclarationC {
        fn arbitrary<G: Gen>(g: &mut G) -> BasicTypeDeclarationC {
            BasicTypeDeclarationC {
                type_qualifier: TypeQualifierC::arbitrary(g).def,

Is there some intention behind pulling out the strings early, rather than making `BasicTypeDeclarationC`'s `type_qualifier` be a `TypeQualifierC`, etc? Was it just easiest? Perhaps to break type recursion?

What I would naively expect would be having structured types instead of strings, well placed `Box`es to break type recursion, and the `Display` implementations handling all stringification. For example, I'd expect a `StructDeclarationC` to have a `Vec` of `StructFieldC` rather than a string concatenation of its fields, and with something like `PointerLevelC`, I wouldn't expect it to contain a `String`, but instead just a number, and then its `Display` would do

    for _ in 0..self.level {
        write!(f, "*")?;
    }

In general, leveraging types and the type system as far as it will go and as long as we can before devolving into string concatenation (bash programming amirite?) is a good rule of thumb. It gets the compiler to double check that we don't do things like provide a `PointerLevelC` where we expect a `TypeQualifierC`; with strings the compiler can't help us here.

So, unless there is some underlying reason why we can't abide by this rule in this case, I think we should move to more structured types instead of strings for all of the various `BlahC` definitions.
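
A compact sketch of the structured approach, for illustration only. The type names come from the comment above, but the field names (`level`, `type_name`, `pointer_level`, `ident`, `fields`) and the exact output format are assumptions, not the PR's actual definitions.

```rust
use std::fmt;

/// Pointer depth stored as a number, rendered as that many `*` characters.
pub struct PointerLevelC {
    pub level: usize,
}

impl fmt::Display for PointerLevelC {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        for _ in 0..self.level {
            write!(f, "*")?;
        }
        Ok(())
    }
}

/// One struct field, kept as structured data until it is printed.
pub struct StructFieldC {
    pub type_name: String,
    pub pointer_level: PointerLevelC,
    pub ident: String,
}

impl fmt::Display for StructFieldC {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        write!(f, "{} {}{};", self.type_name, self.pointer_level, self.ident)
    }
}

/// A struct declaration that owns its fields instead of a concatenated string.
pub struct StructDeclarationC {
    pub ident: String,
    pub fields: Vec<StructFieldC>,
}

impl fmt::Display for StructDeclarationC {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        write!(f, "struct {} {{ ", self.ident)?;
        for field in &self.fields {
            write!(f, "{} ", field)?;
        }
        write!(f, "}};")
    }
}
```

With this layout, passing a `PointerLevelC` where a `StructFieldC` is expected is a compile error, and all string formatting lives in the `Display` implementations.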
tests/property_test/src/fuzzers.rs

    fn arbitrary<G: Gen>(g: &mut G) -> StructDeclarationC {
        let mut fields_string = String::new();
        // reduce generator size as a method of putting a bound on recursion.
        // when size < 1 the empty list is generated.

Capitalize the first word in a sentence.

    .output()?)

    // omit close, from tempdir crate's docs:
    // "Closing the directory is actually optional, as it would be done on drop."

Can remove this comment.

@fitzgen thanks for the awesome and thorough feedback, those are exactly the sort of things I hoped / needed to hear. I'll keep you posted on progress or questions. Ha also I'm all about "nitpick-y" comments, I really appreciate getting familiar with and having the opportunity to live up to this project's code ideals!
Address requested changes to quickchecking crate.
For reference, code that was generated after PR change requests were addressed is here.
- Remove `whitelistable` and `blacklistable` types.
- Rename test crate directory from `property_test` to `quickchecking`.
- Add new CI job that checks that this crate continues to build.
- Revise matching logic to be more idiomatic.
- Phase out modular arithmetic in favor of `gen_range`.
- Incorporate `unreachable!` into match statements.
- Revise logic for accessing random element of vector, favor `choose` over `nth`.
- Proper punctuation and capitalization in comments.
- Use actual structures rather than converting everything to strings in order to leverage the type system.
- Add `#![deny(missing_docs)]` and fill in the documentation required for the project to build again.
- Add special case logic so we don't generate structs with `long double` fields, as they will cause tests to fail until issue #550 is resolved.

Note on making sure we don't lose test cases we're interested in preserving:

We're copying the directories `TempDir` makes so we get things like this:

```
├── bindgen_prop.1WYe3F5HZU1c
│   └── prop_test.h
├── bindgen_prop.H4SLI1JX0jd8
│   └── prop_test.h
```

I'm not sure that `TempDir` makes any claims about uniqueness, so collisions probably aren't impossible. I'm up for any suggestions on a more bulletproof solution.

_Tasks not addressed by this PR:_

* TODO: Add `cargo features` logic to allow generating problematic code.
* TODO: Make a `[[bin]]` target with CLI to manage test settings.
* TODO: Whitelisting and opaque types.
* TODO: Generate bitfields, C++, I-bogus-codegen cases.

Figured this would be a good point to update the PR but if any of the above TODO items should be incorporated before moving forward I'm up for it!

Thanks for taking another look!

r? @fitzgen
This seems good enough to me.

🎉 🎉 🎉
This is awesome!!
Care to file follow up issues for the remaining TODO items?
Thanks 💯 @Snewt !
@bors-servo r+

📌 Commit 2aa9b1d has been approved by
@fitzgen ah awesome! yeah I can definitely file the follow up todos. Thanks!!

☀️ Test successful - status-travis