From 4a5c82b105e2160278b800e2ee0212fb4f673065 Mon Sep 17 00:00:00 2001 From: Andreas Zeller Date: Tue, 4 Feb 2025 10:09:05 +0100 Subject: [PATCH 01/28] New: a chapter on regexes --- docs/Regexes.md | 188 ++++++++++++++++++++++++++++++++++++++++++++++ docs/_toc.yml | 1 + docs/finity.fan | 1 + docs/infinity.fan | 1 + 4 files changed, 191 insertions(+) create mode 100644 docs/Regexes.md create mode 100644 docs/finity.fan create mode 100644 docs/infinity.fan diff --git a/docs/Regexes.md b/docs/Regexes.md new file mode 100644 index 00000000..870a62f4 --- /dev/null +++ b/docs/Regexes.md @@ -0,0 +1,188 @@ +--- +jupytext: + formats: md:myst + text_representation: + extension: .md + format_name: myst +kernelspec: + display_name: Python 3 + language: python + name: python3 +--- + +(sec:regexes)= +# Regular Expressions + +Although the Fandango grammars cover a wide range of input language features, there are situations where they may be a bit cumbersome to work with. +Consider specifying _every digit except for zeros_: this requires you to enumerate all the other digits `1`, `2`, and so on. +This is why Fandango also supports _regular expressions_, which allow you to use a concise syntax for character ranges, repeated characters and more. +Specifying all digits from `1` to `9`, for instance, becomes the short regular expression `r'[1-9]'`. + + +## About Regular Expressions + +Regular expressions form a language on their own and come with several useful features. +To get an introduction to the regular expressions Fandango uses, read the Python [Regular Expression HOWTO](https://docs.python.org/3/howto/regex.html) and check out the Python [Regular Expression Syntax](https://docs.python.org/3/library/re.html#regular-expression-syntax) for a complete reference. + +In Fandango, regular expressions are used for two purposes: + +* When _producing_ inputs, a regular expression is instantiated into a random string that matches the expression. 
+* When _parsing_ inputs, a regular expression is used to _parse_ and _match_ inputs. + + +## Writing Regular Expressions + +:::{margin} +For Python aficionados: this is actually a Python "raw string" +::: + +In Fandango, a regular expression comes as a string, prefixed with an `r` character. +To express that a digit can have the values `0` to `9`, instead of + +``` + ::= "0" | "1" | "2" | "3" | "4" | "5" | "6" | "7" | "8" | "9" +``` + +you can write + +``` + ::= r'[0-9]' +``` + +which is much more concise. + +Likewise, to match a sequence of characters that ends in `;`, you can write + +``` + ::= r'[^;]+;' +``` + +Besides indicating a regular expression, the `r` prefix also makes the string a _raw_ string. +This means that backslashes are treated as _literal characters_. +The regular expression `\d`, for instance, matches a Unicode digit, which includes `[0-9]`, and also [many other digit characters](https://en.wikipedia.org/wiki/Numerals_in_Unicode). +To include `\d` in a regular expression, write it _as is_; do not escape the backslash with another backslash (as you would do in a regular string): + +:::{margin} +The expression `r'\\d'` would actually match a backslash, followed by a `d` character. +::: + +``` + ::= r'\d' +``` + +:::{warning} +Be aware of the specific syntax of `r`-strings when it comes to backslashes. +::: + + +## Fine Points about Regular Expressions + +For parsing inputs, Fandango uses the Python [`re`](https://docs.python.org/3/library/re.html) module for matching strings against regular expressions; +for producing inputs, Fandango uses the Python [`exrex`](https://github.com/asciimoo/exrex) module for generating strings that match regular expressions. +All the `re` and `exrex` capabilities and limitations thus extend to Fandango. + +Most notably, `exrex` imposes a _repetition limit_ of 20 on generated strings that in principle can have arbitrary length; a `+` or `*` operator will not expand to more than 20 repetitions. 
+Thus, a grammar [`infinity.fan`](infinity.fan) + +```{code-cell} +:tags: ["remove-input"] +!cat infinity.fan +``` + +that, in principle, could produce arbitrarily long sequences `abcabcabcabc...` will be limited to 20 repetitions at most: + +```shell +$ fandango fuzz -f infinity.fan -n 10 +``` + +```{code-cell} +:tags: ["remove-input"] +!fandango fuzz -f infinity.fan -n 10 +assert _exit_code == 0 +``` + +To precisely control the number of repetitions, use the regular expression `{m,n}` construct, limiting the number of repetitions from `m` to `n`. +Let us limit the number of repetitions to the range 1..5: + +```{code-cell} +:tags: ["remove-input"] +!cat finity.fan +``` + +This is what we get: + +```shell +$ fandango fuzz -f finity.fan -n 10 +``` + +```{code-cell} +:tags: ["remove-input"] +!fandango fuzz -f finity.fan -n 10 +assert _exit_code == 0 +``` + +:::{tip} +Remember that _grammars_ also have operators `+`, `*`, `?`, and `{N,M}` which apply to the preceding grammar element, and work like their _regular expression_ counterparts. +Using these, we could also write the above as +``` + ::= "abc"+ +``` +and +``` + ::= "abc"{1,5} +``` +respectively. +::: + + +## Regular Expressions vs. Grammars + +:::{margin} +If it weren't for some regular expression features such as _backreferences_ (check out what `(?P=name)` does), the context-free grammars would be a strict superset of regular expressions - anything that can be expressed in a regular expression can also be expressed in an equivalent grammar. +::: + +In many cases, a grammar can be replaced by a regular expression and vice versa. +This raises the question: When should one use a regular expression, and when a grammar? +Here are some points to help you decide. + +* Regular expressions are often more _concise_ (but arguably harder to read) than grammars. +* If you want to _reference_ individual elements of a string (say, as part of a constraint now or in the future), use a _grammar_. 
+* Since their underlying model is simpler, regular expressions are _faster_ to generate, and _much faster_ to [parse](Parsing.md) than grammars. +* If your underlying language separates lexical and syntactical processing, use + - _regular expressions_ for specifying _lexical_ parts such as tokens and fragments; + - a _grammar_ for the _syntax_; and + - [constraints](Constraints.md) for _semantic_ properties. + + +:::{warning} +Do not use regular expressions for inputs that are [recursive](Recursive.md). +Languages like HTML, XML, even e-mail addresses or URLs, are much easier to capture as grammars. +::: + + +## Regular Expressions as Equivalence Classes + +The choice of grammars vs. regular expressions also affects the Fandango generation algorithm. +Generally speaking, Fandango attempts to cover all alternatives of a grammar. +If, say, `` is specified as + +``` + ::= "0" | "1" | "2" | "3" | "4" | "5" | "6" | "7" | "8" | "9" +``` + +then Fandango will attempt to produce every digit at least once, and also try to cover digit _combinations_ up to a certain depth. +This is useful if you want to specifically test digit processing, or if each of the digits causes a different behavior that needs to be covered. + +If, however, you specify `` as + +``` + ::= r'[0-9]' +``` + +then Fandango will treat this as a _single_ alternative (with all expansions considered semantically equivalent), which once expanded into (some) digit will be considered as covered. + +:::{tip} +* If you do want or need to _differentiate_ between individual elements of a set (because they would be treated differently), consider _grammar alternatives_. +* If you do _not_ want or need to differentiate between individual elements of a set (because they would all be treated the same), consider a _regular expression_. 
+::: + diff --git a/docs/_toc.yml b/docs/_toc.yml index 41d9f97b..6a7a1ac8 100644 --- a/docs/_toc.yml +++ b/docs/_toc.yml @@ -15,6 +15,7 @@ parts: - file: Constraints - file: Shell - file: Generators + - file: Regexes - file: Recursive - file: Paths - file: ISO8601 diff --git a/docs/finity.fan b/docs/finity.fan new file mode 100644 index 00000000..c2352778 --- /dev/null +++ b/docs/finity.fan @@ -0,0 +1 @@ + ::= r"(abc){1,5}" \ No newline at end of file diff --git a/docs/infinity.fan b/docs/infinity.fan new file mode 100644 index 00000000..50632180 --- /dev/null +++ b/docs/infinity.fan @@ -0,0 +1 @@ + ::= r"(abc)+" \ No newline at end of file From 8c35f707803ab3f208eb106e53631afad91ff2ef Mon Sep 17 00:00:00 2001 From: Andreas Zeller Date: Tue, 4 Feb 2025 10:22:16 +0100 Subject: [PATCH 02/28] Doc fix --- docs/Regexes.md | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/docs/Regexes.md b/docs/Regexes.md index 870a62f4..24eec758 100644 --- a/docs/Regexes.md +++ b/docs/Regexes.md @@ -138,7 +138,8 @@ respectively. ## Regular Expressions vs. Grammars :::{margin} -If it weren't for some regular expression features such as _backreferences_ (check out what `(?P=name)` does), the context-free grammars would be a strict superset of regular expressions - anything that can be expressed in a regular expression can also be expressed in an equivalent grammar. +In theory, context-free grammars are a strict _superset_ of regular expressions - any language that can be expressed in a regular expression can also be expressed in an equivalent grammar. +Practical implementations of regular expressions break this hierarchy by introducing some features such as _backreferences_ (check out what `(?P=name)` does), which cannot be expressed in grammars. ::: In many cases, a grammar can be replaced by a regular expression and vice versa. @@ -151,7 +152,8 @@ Here are some points to help you decide. 
* If your underlying language separates lexical and syntactical processing, use - _regular expressions_ for specifying _lexical_ parts such as tokens and fragments; - a _grammar_ for the _syntax_; and - - [constraints](Constraints.md) for _semantic_ properties. + - [constraints](Constraints.md) for any semantic properties. +* Prefer grammars and constraints over overly complex regular expressions. :::{warning} From ee20f1a8163f3321e029ced98f0a36961276a283 Mon Sep 17 00:00:00 2001 From: Andreas Zeller Date: Tue, 4 Feb 2025 10:28:15 +0100 Subject: [PATCH 03/28] Fix: bad admonition --- docs/Parsing.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/Parsing.md b/docs/Parsing.md index 3750c1a3..ec4f72ae 100644 --- a/docs/Parsing.md +++ b/docs/Parsing.md @@ -93,7 +93,7 @@ assert _exit_code == 0 We see that input and output are identical (as should always be with parsing and unparsing). -:::{info} +:::{tip} As it comes to producing and storing outputs, the `parse` command has the same options as the `fuzz` command. ::: From 0e148a2bb56a182b3450d00093522aca0f929e8f Mon Sep 17 00:00:00 2001 From: Andreas Zeller Date: Tue, 4 Feb 2025 10:28:47 +0100 Subject: [PATCH 04/28] Doc update --- docs/Stdlib.md | 20 +++++++++----------- 1 file changed, 9 insertions(+), 11 deletions(-) diff --git a/docs/Stdlib.md b/docs/Stdlib.md index afa5182f..23f89f9c 100644 --- a/docs/Stdlib.md +++ b/docs/Stdlib.md @@ -37,6 +37,15 @@ Symbols starting with an underscore must _not_ be redefined. from fandango.language import stdlib ``` +## Characters + +A `` represents any Unicode character, including newline. + +```{code-cell} +:tags: ["remove-input"] +print(stdlib.any_char) +``` + ## Printable Characters These symbols mimic the [string constants from the Python `string` module](https://docs.python.org/3/library/string.html). @@ -89,17 +98,6 @@ print(stdlib.bytes) ``` -## Characters - -A `` is any Unicode character. - -```{error} -`` is currently not defined. 
-Use ``, ``, or `` instead. -``` -% We need charset or regex specs for this: ::= /./ | '\n' - - ## UTF-8 characters A `` is a UTF-8 encoding of a character, occupying one (``) to four (` Date: Tue, 4 Feb 2025 15:51:59 +0100 Subject: [PATCH 05/28] New: support binary regexes --- docs/Binary.md | 31 ++++++++++++++++++++++++++++--- docs/Regexes.md | 8 ++++++++ docs/binfinity.fan | 1 + src/fandango/language/grammar.py | 5 +++++ 4 files changed, 42 insertions(+), 3 deletions(-) create mode 100644 docs/binfinity.fan diff --git a/docs/Binary.md b/docs/Binary.md index 13ddbaa1..c15670d6 100644 --- a/docs/Binary.md +++ b/docs/Binary.md @@ -164,6 +164,31 @@ The default is `fuzz --file=mode=auto` (default), which will use `binary` or `te Avoid mixing non-ASCII strings with bits and bytes in a single grammar. ::: +(sec:byte-regexes)= +### Bytes and Regular Expressions + +Fandango also supports [regular expressions](Regexes.md) over bytes. +To obtain a regular expression over a byte string, use both `r` and `b` prefixes. +This is especially useful for character classes. + +Here is an example: [`binfinity.fan`](binfinity.fan) produces strings of five bytes _outside_ the range `\x80-\xff`: + +```{code-cell} +:tags: ["remove-input"] +!cat binfinity.fan +``` + +This is what we get: + +```shell +$ fandango fuzz -f binfinity.fan -n 10 +``` + +```{code-cell} +:tags: ["remove-input"] +!fandango fuzz -f binfinity.fan -n 10 +assert _exit_code == 0 +``` ## Length Encodings @@ -221,12 +246,12 @@ Again, all of this goes into a single `.fan` file: [`binary.fan`](binary.fan) ho Let us produce a single output using `binary.fan` and view its (binary) contents, using `od -c`: ```shell -$ fandango fuzz -n 1 -f binary.fan | od -c +$ fandango fuzz -n 1 -f binary.fan -o - | od -c ``` ```{code-cell} :tags: ["remove-input"] -! fandango fuzz -n 1 -f binary.fan | od -c +! 
fandango fuzz -n 1 -f binary.fan -o - | od -c ``` The hexadecimal dump shows that the first two bytes encode the length of the string of digits that follows. @@ -248,7 +273,7 @@ and obtain the same result: ```{code-cell} :tags: ["remove-input"] -!fandango fuzz -n 1 -f binary-pack.fan | od -c +!fandango fuzz -n 1 -f binary-pack.fan -o - | od -c assert _exit_code == 0 ``` diff --git a/docs/Regexes.md b/docs/Regexes.md index 24eec758..24c0023e 100644 --- a/docs/Regexes.md +++ b/docs/Regexes.md @@ -81,6 +81,9 @@ For parsing inputs, Fandango uses the Python [`re`](https://docs.python.org/3/li for producing inputs, Fandango uses the Python [`exrex`](https://github.com/asciimoo/exrex) module for generating strings that match regular expressions. All the `re` and `exrex` capabilities and limitations thus extend to Fandango. + +### Repetition Limits + Most notably, `exrex` imposes a _repetition limit_ of 20 on generated strings that in principle can have arbitrary length; a `+` or `*` operator will not expand to more than 20 repetitions. Thus, a grammar [`infinity.fan`](infinity.fan) @@ -134,6 +137,11 @@ and respectively. ::: +### Regular Expressions over Bytes + +Regular expressions can also be formed over bytes. +See [Bytes and Regular Expressions](sec:byte-regexes) for details. + ## Regular Expressions vs. 
Grammars diff --git a/docs/binfinity.fan b/docs/binfinity.fan new file mode 100644 index 00000000..132efd0b --- /dev/null +++ b/docs/binfinity.fan @@ -0,0 +1 @@ + ::= rb"[^\x80-\xff]{5}" \ No newline at end of file diff --git a/src/fandango/language/grammar.py b/src/fandango/language/grammar.py index 9ded398d..6d836ec7 100644 --- a/src/fandango/language/grammar.py +++ b/src/fandango/language/grammar.py @@ -256,6 +256,11 @@ def __init__(self, symbol: Terminal): def fuzz(self, grammar: "Grammar", max_nodes: int = 100) -> List[DerivationTree]: if self.symbol.is_regex: + if isinstance(self.symbol.symbol, bytes): + # Exrex can't do bytes, so we decode to str and back + instance = exrex.getone(self.symbol.symbol.decode('iso-8859-1')) + return [DerivationTree(Terminal(instance.encode('iso-8859-1')))] + instance = exrex.getone(self.symbol.symbol) return [DerivationTree(Terminal(instance))] From 3a040885a15dee78065c23458d7f9457dfd985e7 Mon Sep 17 00:00:00 2001 From: Andreas Zeller Date: Wed, 5 Feb 2025 08:41:02 +0100 Subject: [PATCH 06/28] Fix: bad refs --- docs/ISO8601.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/docs/ISO8601.md b/docs/ISO8601.md index 8409c53f..20773a4d 100644 --- a/docs/ISO8601.md +++ b/docs/ISO8601.md @@ -370,7 +370,7 @@ Let us write it into a `.fan` file, so we can use it for fuzzing: open('ISO8601.fan', 'w').write(iso8601lib); ``` -Here comes [`iso9601.fan`](iso9601.fan) in all its glory: +Here comes [`iso8601.fan`](iso8601.fan) in all its glory: ```{code-cell} :tags: ["remove-input"] @@ -412,4 +412,4 @@ assert _exit_code == 0 ``` Try out more constraints for yourself! -The generated [`ISO9601.fan`](ISO9601.fan) file is available for download. \ No newline at end of file +The generated [`ISO8601.fan`](ISO8601.fan) file is available for download. 
\ No newline at end of file From f6151a122864760107f7ec3fc33a980292defa0b Mon Sep 17 00:00:00 2001 From: Andreas Zeller Date: Wed, 5 Feb 2025 08:41:22 +0100 Subject: [PATCH 07/28] New: for future citations --- docs/fandango.bib | 0 1 file changed, 0 insertions(+), 0 deletions(-) create mode 100644 docs/fandango.bib diff --git a/docs/fandango.bib b/docs/fandango.bib new file mode 100644 index 00000000..e69de29b From 0846944c54257cafc85b492164fafd997b6c136e Mon Sep 17 00:00:00 2001 From: Andreas Zeller Date: Wed, 5 Feb 2025 08:41:31 +0100 Subject: [PATCH 08/28] Doc update --- docs/Regexes.md | 11 +++++++++++ 1 file changed, 11 insertions(+) diff --git a/docs/Regexes.md b/docs/Regexes.md index 24c0023e..dd1acbe1 100644 --- a/docs/Regexes.md +++ b/docs/Regexes.md @@ -81,6 +81,17 @@ For parsing inputs, Fandango uses the Python [`re`](https://docs.python.org/3/li for producing inputs, Fandango uses the Python [`exrex`](https://github.com/asciimoo/exrex) module for generating strings that match regular expressions. All the `re` and `exrex` capabilities and limitations thus extend to Fandango. +:::{tip} +For regex shortcuts, the `exrex` producer only produces characters in the range `\x00` to `\xff`: + +* for digits (`\d`), the characters `[0-9]` +* for whitespace (`\s`), the characters `[ \t\n\r\f\v]` +* for words (`\w`), the characters `[a-zA-Z0-9_]` +* for non-words (`\W`), the character range `[^a-zA-Z0-9_]` + +To produce Unicode characters, make them part of an explicit range (e.g. `[äöüÄÖÜß]`). 
+::: + ### Repetition Limits From 27278f62aee645a8497f89fad83f323ddcaa1885 Mon Sep 17 00:00:00 2001 From: Andreas Zeller Date: Wed, 5 Feb 2025 08:41:48 +0100 Subject: [PATCH 09/28] Improved diagnostics when checking for failed assertions --- Makefile | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/Makefile b/Makefile index 6da241a7..216dca3a 100644 --- a/Makefile +++ b/Makefile @@ -90,7 +90,7 @@ osascript -e 'tell application "Safari" to set URL of document of window 1 to UR VIEW_PDF = open $(PDF_TARGET) # Command to check docs for failed assertions -CHECK_DOCS = grep -l AssertionError $(DOCS)/_build/html/*.html; if [ $$? == 0 ]; then false; else true; fi +CHECK_DOCS = grep -l AssertionError $(DOCS)/_build/html/*.html; if [ $$? == 0 ]; then echo 'Check the above files for failed assertions'; false; else true; fi # Targets. From c062a121b5aae35a51fd94d2f64573dfbc1a584a Mon Sep 17 00:00:00 2001 From: Andreas Zeller Date: Wed, 5 Feb 2025 08:54:49 +0100 Subject: [PATCH 10/28] Fix: be sure to have only bytes --- docs/gif89a.fan | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/gif89a.fan b/docs/gif89a.fan index 26489b14..d9c58720 100644 --- a/docs/gif89a.fan +++ b/docs/gif89a.fan @@ -36,7 +36,7 @@ where .. 
== "89a" ::= 0 0 0 ::= 0 0 0 - ::= '\x00' '\x00' '\x00' + ::= b'\x00' b'\x00' b'\x00' ::= b'\x02' b'\x02' b'L' ::= b'\x00' From 49fe5c56d5491f4c92441169c22a3e8a55eaf87f Mon Sep 17 00:00:00 2001 From: Andreas Zeller Date: Wed, 5 Feb 2025 08:55:41 +0100 Subject: [PATCH 11/28] Improved diagnostics for mismatched bytes --- src/fandango/cli/__init__.py | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-) diff --git a/src/fandango/cli/__init__.py b/src/fandango/cli/__init__.py index af611850..2768f135 100644 --- a/src/fandango/cli/__init__.py +++ b/src/fandango/cli/__init__.py @@ -791,9 +791,10 @@ def report_syntax_error( if position >= len(individual): return f"{filename!r}: missing input at end of file" - mismatch = repr(individual[position]) + mismatch = individual[position] if binary: - return f"{filename!r}, position {hex(position)} ({position}): mismatched input 0x{mismatch}" + assert isinstance(mismatch, int) + return f"{filename!r}, position {hex(position)} ({position}): mismatched input {mismatch.to_bytes()!r}" line = 1 column = 1 @@ -803,7 +804,7 @@ def report_syntax_error( column = 1 else: column += 1 - return f"{filename!r}, line {line}, column {column}: mismatched input {mismatch}" + return f"{filename!r}, line {line}, column {column}: mismatched input {mismatch!r}" def validate(individual, tree, *, filename=""): From 6ea232babda9aa64809208a1682720649cd4ee23 Mon Sep 17 00:00:00 2001 From: Andreas Zeller Date: Wed, 5 Feb 2025 08:56:15 +0100 Subject: [PATCH 12/28] Fix: better check of bytes against bytes --- src/fandango/language/symbol.py | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/src/fandango/language/symbol.py b/src/fandango/language/symbol.py index 099067b8..206066f2 100644 --- a/src/fandango/language/symbol.py +++ b/src/fandango/language/symbol.py @@ -108,15 +108,15 @@ def check(self, word: str | int) -> tuple[bool, int]: # LOGGER.debug(f"Checking {self.symbol!r} against {word!r}") symbol = self.symbol - if 
isinstance(self.symbol, bytes) and isinstance(word, str): + if isinstance(symbol, bytes) and isinstance(word, str): assert isinstance(symbol, bytes) symbol = symbol.decode("iso-8859-1") - if isinstance(self.symbol, str) and isinstance(word, bytes): + if isinstance(symbol, str) and isinstance(word, bytes): assert isinstance(word, bytes) word = word.decode("iso-8859-1") - assert isinstance(symbol, str) - assert isinstance(word, str) + assert ((isinstance(symbol, str) and isinstance(word, str)) + or (isinstance(symbol, bytes) and isinstance(word, bytes))) if self.is_regex: match = re.match(symbol, word) From 7fe4b9b4ab3c87098263b872e795893cb5dc3b4a Mon Sep 17 00:00:00 2001 From: Andreas Zeller Date: Wed, 5 Feb 2025 09:07:29 +0100 Subject: [PATCH 13/28] New: use regexes by default --- docs/gif.fan | 4 ++-- utils/bt2fan/bt2fan.py | 11 +++++------ 2 files changed, 7 insertions(+), 8 deletions(-) diff --git a/docs/gif.fan b/docs/gif.fan index b94e7158..95d0474d 100644 --- a/docs/gif.fan +++ b/docs/gif.fan @@ -53,7 +53,7 @@ ::= ()* ::= ::= - ::= (b'\x01' | b'\x02' | b'\x03' | b'\x04' | b'\x05' | b'\x06' | b'\x07' | b'\x08' | b'\t' | b'\n' | b'\x0b' | b'\x0c' | b'\r' | b'\x0e' | b'\x0f' | b'\x10' | b'\x11' | b'\x12' | b'\x13' | b'\x14' | b'\x15' | b'\x16' | b'\x17' | b'\x18' | b'\x19' | b'\x1a' | b'\x1b' | b'\x1c' | b'\x1d' | b'\x1e' | b'\x1f' | b' ' | b'!' | b'"' | b'#' | b'$' | b'%' | b'&' | b"'" | b'(' | b')' | b'*' | b'+' | b',' | b'-' | b'.' | b'/' | b'0' | b'1' | b'2' | b'3' | b'4' | b'5' | b'6' | b'7' | b'8' | b'9' | b':' | b';' | b'<' | b'=' | b'>' | b'?' 
| b'@' | b'A' | b'B' | b'C' | b'D' | b'E' | b'F' | b'G' | b'H' | b'I' | b'J' | b'K' | b'L' | b'M' | b'N' | b'O' | b'P' | b'Q' | b'R' | b'S' | b'T' | b'U' | b'V' | b'W' | b'X' | b'Y' | b'Z' | b'[' | b'\\' | b']' | b'^' | b'_' | b'`' | b'a' | b'b' | b'c' | b'd' | b'e' | b'f' | b'g' | b'h' | b'i' | b'j' | b'k' | b'l' | b'm' | b'n' | b'o' | b'p' | b'q' | b'r' | b's' | b't' | b'u' | b'v' | b'w' | b'x' | b'y' | b'z' | b'{' | b'|' | b'}' | b'~' | b'\x7f' | b'\x80' | b'\x81' | b'\x82' | b'\x83' | b'\x84' | b'\x85' | b'\x86' | b'\x87' | b'\x88' | b'\x89' | b'\x8a' | b'\x8b' | b'\x8c' | b'\x8d' | b'\x8e' | b'\x8f' | b'\x90' | b'\x91' | b'\x92' | b'\x93' | b'\x94' | b'\x95' | b'\x96' | b'\x97' | b'\x98' | b'\x99' | b'\x9a' | b'\x9b' | b'\x9c' | b'\x9d' | b'\x9e' | b'\x9f' | b'\xa0' | b'\xa1' | b'\xa2' | b'\xa3' | b'\xa4' | b'\xa5' | b'\xa6' | b'\xa7' | b'\xa8' | b'\xa9' | b'\xaa' | b'\xab' | b'\xac' | b'\xad' | b'\xae' | b'\xaf' | b'\xb0' | b'\xb1' | b'\xb2' | b'\xb3' | b'\xb4' | b'\xb5' | b'\xb6' | b'\xb7' | b'\xb8' | b'\xb9' | b'\xba' | b'\xbb' | b'\xbc' | b'\xbd' | b'\xbe' | b'\xbf' | b'\xc0' | b'\xc1' | b'\xc2' | b'\xc3' | b'\xc4' | b'\xc5' | b'\xc6' | b'\xc7' | b'\xc8' | b'\xc9' | b'\xca' | b'\xcb' | b'\xcc' | b'\xcd' | b'\xce' | b'\xcf' | b'\xd0' | b'\xd1' | b'\xd2' | b'\xd3' | b'\xd4' | b'\xd5' | b'\xd6' | b'\xd7' | b'\xd8' | b'\xd9' | b'\xda' | b'\xdb' | b'\xdc' | b'\xdd' | b'\xde' | b'\xdf' | b'\xe0' | b'\xe1' | b'\xe2' | b'\xe3' | b'\xe4' | b'\xe5' | b'\xe6' | b'\xe7' | b'\xe8' | b'\xe9' | b'\xea' | b'\xeb' | b'\xec' | b'\xed' | b'\xee' | b'\xef' | b'\xf0' | b'\xf1' | b'\xf2' | b'\xf3' | b'\xf4' | b'\xf5' | b'\xf6' | b'\xf7' | b'\xf8' | b'\xf9' | b'\xfa' | b'\xfb' | b'\xfc' | b'\xfd' | b'\xfe' | b'\xff') # not b'\x00' + ::= br'[^\x00]' ::= * # len() == ord(str()); see below ::= b'\x00' ::= @@ -105,7 +105,7 @@ ::= ::= ::=