Merge pull request #301 from fandango-fuzzer/andreas Tutorial for regular expressions #305

Merged 31 commits on Feb 5, 2025
1c9ef54
Merge pull request #300 from fandango-fuzzer/main
joszamama Feb 3, 2025
4a5c82b
New: a chapter on regexes
andreas-zeller Feb 4, 2025
8c35f70
Doc fix
andreas-zeller Feb 4, 2025
ee20f1a
Fix: bad admonition
andreas-zeller Feb 4, 2025
0e148a2
Doc update
andreas-zeller Feb 4, 2025
fb246da
New: support binary regexes
andreas-zeller Feb 4, 2025
3a04088
Fix: bad refs
andreas-zeller Feb 5, 2025
f6151a1
New: for future citations
andreas-zeller Feb 5, 2025
0846944
Doc update
andreas-zeller Feb 5, 2025
27278f6
Improved diagnostics when checking for failed assertions
andreas-zeller Feb 5, 2025
c062a12
Fix: be sure to have only bytes
andreas-zeller Feb 5, 2025
49fe5c5
Improved diagnostics for mismatched bytes
andreas-zeller Feb 5, 2025
6ea232b
Fix: better check of bytes against bytes
andreas-zeller Feb 5, 2025
7fe4b9b
New: use regexes by default
andreas-zeller Feb 5, 2025
aaa8a57
New: redefined `<byte>` as regexp, dramatically speeding up parsing
andreas-zeller Feb 5, 2025
3c97cb7
Improved diagnostics when printing parser states
andreas-zeller Feb 5, 2025
d744d0e
Doc update
andreas-zeller Feb 5, 2025
3d480c7
Fix: `fuzz` command failed validation
andreas-zeller Feb 5, 2025
2278cfe
New: also use regexes for printable, punctuation, ascii and utf8 chars
andreas-zeller Feb 5, 2025
dc556cb
Always report file positions with four hex digits
andreas-zeller Feb 5, 2025
1b9d5fe
Fix: enable parsing GIF again (but still no validation)
andreas-zeller Feb 5, 2025
66070b3
Merge branch 'main' into andreas
joszamama Feb 5, 2025
c0cbb2a
Fix: parsing bits yielded random results
andreas-zeller Feb 5, 2025
9071a07
Added `--validate` flag for better consistency checking
andreas-zeller Feb 5, 2025
dbd64c0
Added more tests
andreas-zeller Feb 5, 2025
76b1566
Added check for `rgb.fan` parse test
andreas-zeller Feb 5, 2025
5a624a3
Fix: better encoding of mixed quotes
andreas-zeller Feb 5, 2025
b26b2da
Fix: skip alternatives only if regexes are used
andreas-zeller Feb 5, 2025
d6fd902
New: `--format=repr` outputs tree in internal representation (for uni…
andreas-zeller Feb 5, 2025
e48b272
New: more parsing tests
andreas-zeller Feb 5, 2025
b125887
Merge pull request #301 from fandango-fuzzer/andreas
joszamama Feb 5, 2025
2 changes: 1 addition & 1 deletion Makefile
@@ -90,7 +90,7 @@ osascript -e 'tell application "Safari" to set URL of document of window 1 to UR
VIEW_PDF = open $(PDF_TARGET)

# Command to check docs for failed assertions
CHECK_DOCS = grep -l AssertionError $(DOCS)/_build/html/*.html; if [ $$? == 0 ]; then false; else true; fi
CHECK_DOCS = grep -l AssertionError $(DOCS)/_build/html/*.html; if [ $$? == 0 ]; then echo 'Check the above files for failed assertions'; false; else true; fi


# Targets.
33 changes: 29 additions & 4 deletions docs/Binary.md
@@ -83,7 +83,7 @@ $ fandango fuzz -f credit_card.fan -n 10

```{code-cell}
:tags: ["remove-input"]
!fandango fuzz -f credit_card.fan -n 10
!fandango fuzz -f credit_card.fan -n 10 --validate
assert _exit_code == 0
```

@@ -164,6 +164,31 @@ The default is `fuzz --file=mode=auto` (default), which will use `binary` or `te
Avoid mixing non-ASCII strings with bits and bytes in a single grammar.
:::

(sec:byte-regexes)=
### Bytes and Regular Expressions

Fandango also supports [regular expressions](Regexes.md) over bytes.
To obtain a regular expression over a byte string, use both `r` and `b` prefixes.
This is especially useful for character classes.

Here is an example: [`binfinity.fan`](binfinity.fan) produces strings of five bytes _outside_ the range `\x80-\xff`:

```{code-cell}
:tags: ["remove-input"]
!cat binfinity.fan
```

This is what we get:

```shell
$ fandango fuzz -f binfinity.fan -n 10
```

```{code-cell}
:tags: ["remove-input"]
!fandango fuzz -f binfinity.fan -n 10 --validate
assert _exit_code == 0
```
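
To see what combining the `r` and `b` prefixes means outside of Fandango, here is a minimal sketch using Python's own `re` module. The pattern and the helper name `is_low_byte_string` are assumptions made for illustration; this is not the content of `binfinity.fan`, only a plain-Python analogue of "five bytes outside the range `\x80-\xff`".

```python
import re

# Hypothetical pattern: exactly five bytes, none of them in \x80-\xff.
# The r prefix keeps the backslashes for the regex engine;
# the b prefix makes the pattern match bytes rather than str.
PATTERN = re.compile(rb"[^\x80-\xff]{5}")


def is_low_byte_string(data: bytes) -> bool:
    """Return True if data is exactly five bytes, each below 0x80."""
    return PATTERN.fullmatch(data) is not None


if __name__ == "__main__":
    print(is_low_byte_string(b"hello"))     # five ASCII bytes
    print(is_low_byte_string(b"h\xe9llo"))  # contains a byte >= 0x80
    print(is_low_byte_string(b"hi"))        # too short
```

A negated character class is used here because it expresses "outside the range" directly; the same set could also be written as `[\x00-\x7f]`.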


## Length Encodings
@@ -221,12 +246,12 @@ Again, all of this goes into a single `.fan` file: [`binary.fan`](binary.fan) ho
Let us produce a single output using `binary.fan` and view its (binary) contents, using `od -c`:

```shell
$ fandango fuzz -n 1 -f binary.fan | od -c
$ fandango fuzz -n 1 -f binary.fan -o - | od -c
```

```{code-cell}
:tags: ["remove-input"]
! fandango fuzz -n 1 -f binary.fan | od -c
! fandango fuzz -n 1 -f binary.fan -o - | od -c
```

The hexadecimal dump shows that the first two bytes encode the length of the string of digits that follows.
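
The two-byte length prefix can be reproduced in plain Python with the `struct` module. This is only a sketch: the helper names are hypothetical, and the little-endian unsigned 16-bit layout (`"<H"`) is an assumption, since the byte order used by `binary.fan` is not shown in this excerpt.

```python
import struct


def encode_length_prefixed(digits: str) -> bytes:
    """Prefix an ASCII digit string with its length as two bytes.

    Assumption: little-endian unsigned 16-bit length ("<H").
    """
    payload = digits.encode("ascii")
    return struct.pack("<H", len(payload)) + payload


def decode_length_prefixed(data: bytes) -> str:
    """Read the two-byte length, then return that many payload bytes."""
    (length,) = struct.unpack_from("<H", data, 0)
    payload = data[2:2 + length]
    if len(payload) != length:
        raise ValueError("payload shorter than announced length")
    return payload.decode("ascii")


if __name__ == "__main__":
    blob = encode_length_prefixed("12345")
    print(blob)
    print(decode_length_prefixed(blob))
```

Piping `blob` through `od -c` would show the two length bytes first, followed by the digits.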
@@ -248,7 +273,7 @@ and obtain the same result:

```{code-cell}
:tags: ["remove-input"]
!fandango fuzz -n 1 -f binary-pack.fan | od -c
!fandango fuzz -n 1 -f binary-pack.fan -o - --validate | od -c
assert _exit_code == 0
```

10 changes: 5 additions & 5 deletions docs/Bits.md
@@ -54,7 +54,7 @@ $ fandango fuzz --format=bits -f bits.fan -n 1 --start-symbol='<format_flag>'

```{code-cell}
:tags: ["remove-input"]
!fandango fuzz --format=bits -f bits.fan -n 1 --start-symbol='<format_flag>'
!fandango fuzz --format=bits -f bits.fan -n 1 --start-symbol='<format_flag>' --validate
assert _exit_code == 0
```

@@ -71,7 +71,7 @@ $ fandango fuzz --format=bits -f bits.fan -n 10 -c '<italic> == "\x01" and <bold

```{code-cell}
:tags: ["remove-input"]
!fandango fuzz --format=bits -f bits.fan -n 10 -c '<italic> == "\x01" and <bold> == "\x00"'
!fandango fuzz --format=bits -f bits.fan -n 10 -c '<italic> == "\x01" and <bold> == "\x00"' --validate
assert _exit_code == 0
```

@@ -83,7 +83,7 @@ $ fandango fuzz --format=bits -f bits.fan -n 1 -c '<italic> == chr(1) and <bold>

```{code-cell}
:tags: ["remove-input"]
!fandango fuzz --format=bits -f bits.fan -n 1 -c '<italic> == chr(1) and <bold> == chr(0)'
!fandango fuzz --format=bits -f bits.fan -n 1 -c '<italic> == chr(1) and <bold> == chr(0)' --validate
assert _exit_code == 0
```

@@ -95,7 +95,7 @@ $ fandango fuzz --format=bits -f bits.fan -n 1 -c '<format_flag> == chr(0b111100

```{code-cell}
:tags: ["remove-input"]
!fandango fuzz --format=bits -f bits.fan -n 1 -c '<format_flag> == chr(0b11110000)'
!fandango fuzz --format=bits -f bits.fan -n 1 -c '<format_flag> == chr(0b11110000)' --validate
assert _exit_code == 0
```
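
The constraints above compare symbols against `chr(...)` values and escaped characters. As a small plain-Python sketch of how these literals relate (the helper `bits_of` is a hypothetical name, and nothing here depends on how `bits.fan` orders its fields):

```python
def bits_of(ch: str) -> str:
    """Render a single character's code point as eight binary digits."""
    return format(ord(ch), "08b")


if __name__ == "__main__":
    flag = chr(0b11110000)
    print(flag == "\xf0")     # chr() of a binary literal equals the hex escape
    print(bits_of(flag))      # the eight bits of the flag byte
    print(chr(1) == "\x01")   # the two spellings used in the constraints above
    print(chr(0) == "\x00")
```

Either spelling can be used in a constraint; `chr(0b...)` tends to be easier to read when individual bits matter.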

@@ -115,7 +115,7 @@ $ fandango fuzz --format=bits -f bits.fan -n 1 -c 'ord(str(<brightness>)) > 10'

```{code-cell}
:tags: ["remove-input"]
!fandango fuzz --format=bits -f bits.fan -n 10 -c 'ord(str(<brightness>)) > 10'
!fandango fuzz --format=bits -f bits.fan -n 10 -c 'ord(str(<brightness>)) > 10' --validate
assert _exit_code == 0
```

10 changes: 5 additions & 5 deletions docs/Constraints.md
@@ -69,7 +69,7 @@ $ fandango fuzz -f persons.fan -n 10

```{code-cell}
:tags: ["remove-input"]
!fandango fuzz -f persons.fan -n 10 -c 'int(<age>) < 50'
!fandango fuzz -f persons.fan -n 10 -c 'int(<age>) < 50' --validate
assert _exit_code == 0
```

@@ -92,15 +92,15 @@ and we obtain these inputs:

```{code-cell}
:tags: ["remove-input"]
!fandango fuzz -f persons.fan -n 10 -c '25 <= int(<age>) and int(<age>) <= 45'
!fandango fuzz -f persons.fan -n 10 -c '25 <= int(<age>) and int(<age>) <= 45' --validate
assert _exit_code == 0
```

Start with [`persons.fan`](persons.fan) and add a constraint such that we generate people whose age is a multiple of 7, as in

```{code-cell}
:tags: ["remove-input"]
!fandango fuzz -f persons.fan -n 10 -c 'int(<age>) % 7 == 0'
!fandango fuzz -f persons.fan -n 10 -c 'int(<age>) % 7 == 0' --validate
assert _exit_code == 0
```
(Hint: The modulo operator in Python is `%`).
@@ -201,7 +201,7 @@ $ fandango -v fuzz -f persons.fan -n 10 -c 'int(<age>) % 7 == 0'

```{code-cell}
:tags: ["remove-input", "scroll-output"]
!fandango -v fuzz -f persons.fan -n 10 -c 'int(<age>) % 7 == 0'
!fandango -v fuzz -f persons.fan -n 10 -c 'int(<age>) % 7 == 0' --validate
assert _exit_code == 0
```

@@ -226,7 +226,7 @@ $ fandango -v fuzz -f persons.fan -n 10 -c 'False' -N 50

```{code-cell}
:tags: ["remove-input", "scroll-output"]
!fandango -v fuzz -f persons.fan -n 10 -c 'False' -N 50
!fandango -v fuzz -f persons.fan -n 10 -c 'False' -N 50 --validate
assert _exit_code == 0
```

2 changes: 1 addition & 1 deletion docs/Fuzzing.md
@@ -66,7 +66,7 @@ Your output will look like this:

```{code-cell}
:tags: ["remove-input"]
!fandango fuzz -f persons.fan -n 10
!fandango fuzz -f persons.fan -n 10 --validate
assert _exit_code == 0
```

16 changes: 8 additions & 8 deletions docs/Generators.md
@@ -52,7 +52,7 @@ then we can have Fandango create names such as

```{code-cell}
:tags: ["remove-input"]
!fandango fuzz -f persons-nat.fan -n 10
!fandango fuzz -f persons-nat.fan -n 10 --validate
assert _exit_code == 0
```

@@ -127,7 +127,7 @@ This is what the output of the above spec looks like:

```{code-cell}
:tags: ["remove-input"]
!fandango fuzz -f persons-faker.fan -n 10
!fandango fuzz -f persons-faker.fan -n 10 --validate
assert _exit_code == 0
```

@@ -161,7 +161,7 @@ The resulting [Fandango spec file](persons-faker-age.fan) produces the desired r

```{code-cell}
:tags: ["remove-input"]
!fandango fuzz -f persons-faker-age.fan -n 10
!fandango fuzz -f persons-faker-age.fan -n 10 --validate
assert _exit_code == 0
```

@@ -178,7 +178,7 @@ These are the ages we get this way:

```{code-cell}
:tags: ["remove-input"]
!fandango fuzz -f persons-faker-gauss.fan -n 10
!fandango fuzz -f persons-faker-gauss.fan -n 10 --validate
assert _exit_code == 0
```

@@ -203,7 +203,7 @@ With this, both random names (`<name>`) and natural names (`<natural_name>`) wil

```{code-cell}
:tags: ["remove-input"]
!fandango fuzz -f persons-faker50.fan -n 10
!fandango fuzz -f persons-faker50.fan -n 10 --validate
assert _exit_code == 0
```

@@ -236,7 +236,7 @@ and we get

```{code-cell}
:tags: ["remove-input"]
!fandango fuzz -f persons-faker.fan -c '<last_name>.startswith("S")' -n 10
!fandango fuzz -f persons-faker.fan -c '<last_name>.startswith("S")' -n 10 --validate
assert _exit_code == 0
```

@@ -259,7 +259,7 @@ In case this should work, this is only through some internal Fandango optimizati
Unfortunately, this does not work.
% ```{code-cell}
% :tags: ["remove-input"]
% !fandango fuzz -f persons-faker.fan -c '<first_name> == fake.first_name()' -n 10
% !fandango fuzz -f persons-faker.fan -c '<first_name> == fake.first_name()' -n 10 --validate
% assert _exit_code == 0
% ```
The reason is that the faker returns _a different value_ every time it is invoked, making it hard for Fandango to solve the constraint.
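
The effect can be illustrated with a toy model in plain Python. This is a sketch under stated assumptions: `fake_first_name` is a hypothetical stand-in built on `random.choice`, not the Faker API, and the loop is not how Fandango evaluates constraints. It only shows that comparing a value against a freshly drawn random value rarely succeeds.

```python
import random

# Hypothetical name pool standing in for a faker's data.
NAMES = ["Alice", "Bob", "Carol", "Dave", "Eve", "Frank", "Grace", "Heidi"]


def fake_first_name(rng: random.Random) -> str:
    """Stand-in generator: returns a different random name on each call."""
    return rng.choice(NAMES)


def satisfied_fraction(trials: int, seed: int = 0) -> float:
    """Fraction of trials in which `candidate == fake_first_name()` holds."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        candidate = fake_first_name(rng)  # value playing the role of <first_name>
        target = fake_first_name(rng)     # fresh value drawn when the check runs
        hits += candidate == target
    return hits / trials


if __name__ == "__main__":
    # With 8 equally likely names, a match is expected in about 1 of 8 trials.
    print(satisfied_fraction(10_000))
```

With a realistic pool of thousands of names, the chance of a match per evaluation becomes correspondingly smaller, and the target moves on every check.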
@@ -275,7 +275,7 @@ $ fandango fuzz -f persons-faker.fan -c 'str(<last_name>).startswith("S")' -c 'i
This would work:
```{code-cell}
:tags: ["remove-input"]
!fandango fuzz -f persons-faker.fan -c 'str(<last_name>).startswith("S")' -c 'int(<age>) >= 25 and int(<age>) <= 35' -n 10
!fandango fuzz -f persons-faker.fan -c 'str(<last_name>).startswith("S")' -c 'int(<age>) >= 25 and int(<age>) <= 35' -n 10 --validate
assert _exit_code == 0
```

3 changes: 2 additions & 1 deletion docs/Gif.md
@@ -24,5 +24,6 @@ We start with a very short GIF to keep things simple ([source](http://probablypr
We can parse this file using Fandango:

```{code-cell}
!fandango parse -f gif89a.fan tinytrans.gif -o - --format=grammar
!fandango parse -f gif89a.fan tinytrans.gif -o - --format=grammar --validate
assert _exit_code == 0
```
16 changes: 8 additions & 8 deletions docs/ISO8601.md
@@ -112,10 +112,10 @@ iso8601lib += make_rule("iso8601calendardate",
iso8601lib += make_rule("iso8601year", ["('+'|'-')? <digit>{4}"])
```

And yes, we need digits for specifying a year:
```{code-cell}
iso8601lib += make_rule("digit", [f"'{digit}'" for digit in range(0, 10)])
```
% And yes, we need digits for specifying a year:
% ```{code-cell}
% iso8601lib += make_rule("digit", [f"'{digit}'" for digit in range(0, 10)])
% ```


### Months
@@ -370,7 +370,7 @@ Let us write it into a `.fan` file, so we can use it for fuzzing:
open('ISO8601.fan', 'w').write(iso8601lib);
```

Here comes [`iso9601.fan`](iso9601.fan) in all its glory:
Here comes [`iso8601.fan`](iso8601.fan) in all its glory:

```{code-cell}
:tags: ["remove-input"]
@@ -385,7 +385,7 @@ $ fandango fuzz -f iso8601.fan -n 10

```{code-cell}
:tags: ["remove-input"]
!fandango fuzz -f ISO8601.fan -n 10
!fandango fuzz -f ISO8601.fan -n 10 --validate
assert _exit_code == 0
```

@@ -396,7 +396,7 @@ $ fandango fuzz -f ISO8601.fan -n 10 -c 'int(<iso8601year>) > 1950 and int(<iso8

```{code-cell}
:tags: ["remove-input"]
!fandango fuzz -f ISO8601.fan -n 10 -c 'int(<iso8601year>) > 1950 and int(<iso8601year>) < 2000'
!fandango fuzz -f ISO8601.fan -n 10 -c 'int(<iso8601year>) > 1950 and int(<iso8601year>) < 2000' --validate
assert _exit_code == 0
```

@@ -412,4 +412,4 @@ assert _exit_code == 0
```

Try out more constraints for yourself!
The generated [`ISO9601.fan`](ISO9601.fan) file is available for download.
The generated [`ISO8601.fan`](ISO8601.fan) file is available for download.
2 changes: 1 addition & 1 deletion docs/Invoking.md
@@ -38,7 +38,7 @@ And this is what we get:

```{code-cell}
:tags: ["remove-input"]
!fandango fuzz -f digits.fan -n 10
!fandango fuzz -f digits.fan -n 10 --validate
```

Success! We have created 10 random sequences of digits.
2 changes: 1 addition & 1 deletion docs/Parsing.md
@@ -93,7 +93,7 @@ assert _exit_code == 0

We see that input and output are identical (as should always be with parsing and unparsing).

:::{info}
:::{tip}
As it comes to producing and storing outputs, the `parse` command has the same options as the `fuzz` command.
:::

16 changes: 8 additions & 8 deletions docs/Paths.md
@@ -235,7 +235,7 @@ $ fandango fuzz -f persons.fan -n 10 -c '<first_name>[0].endswith("x")'

```{code-cell}
:tags: ["remove-input"]
!fandango fuzz -f persons.fan -n 10 -c '<first_name>[0].endswith("x")'
!fandango fuzz -f persons.fan -n 10 -c '<first_name>[0].endswith("x")' --validate
assert _exit_code == 0
```

@@ -274,7 +274,7 @@ $ fandango fuzz -f persons.fan -n 10 -c '<first_name>.<name>.endswith("x")'

```{code-cell}
:tags: ["remove-input"]
!fandango fuzz -f persons.fan -n 10 -c '<first_name>.<name>.endswith("x")'
!fandango fuzz -f persons.fan -n 10 -c '<first_name>.<name>.endswith("x")' --validate
assert _exit_code == 0
```

@@ -324,7 +324,7 @@ $ fandango fuzz -f persons.fan -n 10 -c '<first_name>..<ascii_uppercase_letter>

```{code-cell}
:tags: ["remove-input"]
!fandango fuzz -f persons.fan -n 10 -c '<first_name>..<ascii_uppercase_letter> == "X"'
!fandango fuzz -f persons.fan -n 10 -c '<first_name>..<ascii_uppercase_letter> == "X"' --validate
assert _exit_code == 0
```

@@ -350,7 +350,7 @@ $ fandango fuzz -f persons.fan -n 10 -c '<start>[0].<last_name>..<ascii_lowercas

```{code-cell}
:tags: ["remove-input"]
!fandango fuzz -f persons.fan -n 10 -c '<start>[0].<last_name>..<ascii_lowercase_letter> == "x"'
!fandango fuzz -f persons.fan -n 10 -c '<start>[0].<last_name>..<ascii_lowercase_letter> == "x"' --validate
assert _exit_code == 0
```

@@ -420,7 +420,7 @@ $ fandango fuzz -f persons.fan -n 10 -c 'any(n.startswith("A") for n in *<name>)

```{code-cell}
:tags: ["remove-input"]
!fandango fuzz -f persons.fan -n 10 -c 'exists <name> in <start>: <name>.startswith("A")'
!fandango fuzz -f persons.fan -n 10 -c 'exists <name> in <start>: <name>.startswith("A")' --validate
assert _exit_code == 0
```

@@ -457,7 +457,7 @@ $ fandango fuzz -f persons.fan -n 10 -c 'all(c == "a" for c in *<first_name>..<a

```{code-cell}
:tags: ["remove-input"]
!fandango fuzz -f persons.fan -n 10 -c '<first_name>..<ascii_lowercase_letter> == "a"'
!fandango fuzz -f persons.fan -n 10 -c '<first_name>..<ascii_lowercase_letter> == "a"' --validate
assert _exit_code == 0
```

@@ -503,7 +503,7 @@ $ fandango fuzz -f persons.fan -n 10 -c 'int(<age>) > 30 -> <first_name>.startsw

```{code-cell}
:tags: ["remove-input"]
!fandango fuzz -f persons.fan -n 10 -c 'int(<age>) > 30 -> <first_name>.startswith("A")'
!fandango fuzz -f persons.fan -n 10 -c 'int(<age>) > 30 -> <first_name>.startswith("A")' --validate
assert _exit_code == 0
```

@@ -516,7 +516,7 @@ $ fandango fuzz -f persons-faker-gauss.fan -n 10 -c 'int(<age>) > 30 -> <first_n

```{code-cell}
:tags: ["remove-input"]
!fandango fuzz -f persons-faker-gauss.fan -n 10 -c 'int(<age>) > 30 -> <first_name>.startswith("A")'
!fandango fuzz -f persons-faker-gauss.fan -n 10 -c 'int(<age>) > 30 -> <first_name>.startswith("A")' --validate
assert _exit_code == 0
```
