diff --git a/_config.yml b/_config.yml index a49c0ef..fbffd83 100644 --- a/_config.yml +++ b/_config.yml @@ -34,6 +34,10 @@ plugins: permalink: pretty +kramdown: + syntax_highlighter_opts: + default_lang: dafny + # Exclude from processing. # The following items will not be processed, by default. # Any item listed under the `exclude:` key here will be automatically added to diff --git a/_posts/2023-07-14-types-and-programming-languages.markdown b/_posts/2023-07-14-types-and-programming-languages.markdown index e65d9d5..920a77e 100644 --- a/_posts/2023-07-14-types-and-programming-languages.markdown +++ b/_posts/2023-07-14-types-and-programming-languages.markdown @@ -46,7 +46,7 @@ Writing a type-checker that guarantees that evaluation of a term won't get stuck First, let's define the term language used in the Blockly interface above. Note that the logic of the Blockly workspace above uses that exact code written on this page. Yeah, Dafny compiles to JavaScript too! -{% highlight dafny %} +``` datatype Term = | True | False @@ -57,33 +57,33 @@ datatype Term = | Double(e: Term) | Add(left: Term, right: Term) | If(cond: Term, thn: Term, els: Term) -{% endhighlight %} +``` #### The type language Let's also add the two types our expressions can have. -{% highlight dafny %} +``` datatype Type = | Bool | Int -{% endhighlight %} +``` #### The type checker We can now write a _type checker_ for the terms above. In our case, a type checker will take a term, and return a type if the term has that type. We will not dive into error reporting in this blog post. First, because a term may or may not have a type, we want an `Option` type like this: -{% highlight dafny %} +``` datatype Option = Some(value: A) | None -{% endhighlight %} +``` Now, we can define a function that computes the type of a term, if it has one. For example, in a conditional term, the condition has to be a boolean, while we only require the "then" and "else" part to have the same, defined type. 
In general, computing types is a task linear in the size of the code, whereas evaluating the code could have any complexity. This is why type checking is an efficient way of preventing obvious mistakes. -{% highlight dafny %} +``` function GetType(term: Term): Option { match term { case True => Some(Bool) @@ -128,15 +128,15 @@ function GetType(term: Term): Option { None } } -{% endhighlight %} +``` A well-typed term is one for which a type exists. -{% highlight dafny %} +``` predicate WellTyped(term: Term) { GetType(term) != None } -{% endhighlight %} +``` #### The evaluator and the progress check @@ -146,7 +146,7 @@ At first, we can define the notion of evaluating a term. We can evaluate a term There are terms where no replacement is possible: value terms. Here is what we want them to look like: either booleans, zero, positive integers, or negative integers. -{% highlight dafny %} +``` predicate IsSuccs(term: Term) { term == Zero || (term.Succ? && IsSuccs(term.e)) } @@ -157,11 +157,11 @@ predicate IsFinalValue(term: Term) { term == True || term == False || term == Zero || IsSuccs(term) || IsPreds(term) } -{% endhighlight %} +``` Now, we can write our one-step evaluation method. As a requirement, we add that the term must be well-typed and nonfinal. -{% highlight dafny %} +``` function OneStepEvaluate(e: Term): (r: Term) requires WellTyped(e) && !IsFinalValue(e) { @@ -222,7 +222,7 @@ function OneStepEvaluate(e: Term): (r: Term) If(OneStepEvaluate(cond), thn, els) } } -{% endhighlight %} +``` The interesting points to note about the function above, in a language like Dafny where every pattern must be exhaustive, are the following: @@ -238,14 +238,14 @@ That concludes the _progress_ part of soundness checking: whenever a term type-c Soundness has another aspect, preservation, as stated in the intro. It says that, when evaluating a well-typed term, the evaluator will not get stuck and the result will have the same type as the original term. 
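For readers who want to experiment outside Dafny, the same `GetType` logic can be sketched in a few lines of Python. The tuple-based term encoding and all names here are ours, not part of the post's Dafny development, and only a fragment of the term language is covered:

```python
# A small Python sketch of GetType for a fragment of the term language.
# Terms are tuples: ("true",), ("false",), ("zero",), ("succ", e), ("if", c, t, e).
def get_type(term):
    tag = term[0]
    if tag in ("true", "false"):
        return "Bool"
    if tag == "zero":
        return "Int"
    if tag == "succ":
        # Succ only types when its argument is an Int
        return "Int" if get_type(term[1]) == "Int" else None
    if tag == "if":
        _, cond, thn, els = term
        # the condition must be Bool; both branches must share a defined type
        if get_type(cond) == "Bool":
            ty = get_type(thn)
            if ty is not None and ty == get_type(els):
                return ty
        return None
    return None

print(get_type(("if", ("true",), ("zero",), ("succ", ("zero",)))))  # Int
print(get_type(("if", ("zero",), ("true",), ("false",))))           # None
```

As in the Dafny version, `None` plays the role of the `Option` type's `None`: a term either has a type or it does not.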
Dafny can also prove it for our language, out of the box. Well done, that means our language and evaluator make sense together! -{% highlight dafny %} +``` lemma OneStepEvaluateWellTyped(e: Term) requires WellTyped(e) && !IsFinalValue(e) ensures GetType(OneStepEvaluate(e)) == GetType(e) { // Amazing: Dafny can prove this soundness property automatically } -{% endhighlight %} +``` #### Conclusion @@ -281,7 +281,7 @@ Note that these terms omit `Double` and `Add` above. This means we cannot state We can write the inductive definition above in Dafny too: -{% highlight dafny %} +``` ghost const AllTermsInductively: iset ghost predicate InductionCriteria(terms: iset) { @@ -296,7 +296,7 @@ lemma {:axiom} InductiveAxioms() ensures InductionCriteria(AllTermsInductively) ensures forall setOfTerms: iset | InductionCriteria(setOfTerms) :: AllTermsInductively <= setOfTerms -{% endhighlight %} +``` The second definition for the set of all terms in section 3.2.3 is done constructively. We first define a set $$S_i$$ for each natural number $$i$$, as follows @@ -305,7 +305,7 @@ $$\begin{aligned}S_{i+1} = && && \{\texttt{true}, \texttt{false}, 0\} \\ && \big This we can enter in Dafny too: -{% highlight dafny %} +``` ghost function S(i: nat): iset { if i == 0 then iset{} @@ -317,7 +317,7 @@ ghost function S(i: nat): iset { + (iset t1 <- S(i-1), t2 <- S(i-1), t3 <- S(i-1) :: If(t1, t2, t3)) } ghost const AllTermsConstructively: iset := iset i: nat, t <- S(i) :: t -{% endhighlight %} +``` But now, we are left with the existential question: are these two sets the same? We rush in Dafny and write a lemma ensuring `AllTermsConstructively == AllTermsInductively` by invoking the lemma `InductiveAxioms()`, but... Dafny can't prove it. @@ -342,7 +342,7 @@ between any two sets. We use the annotation `{:vcs_split_on_every_assert}` which makes Dafny verify each assertion independently, which, in this example, helps the verifier. 
Yes, [helping the verifier](https://dafny.org/dafny/DafnyRef/DafnyRef#sec-verification-debugging-slow) is something we must occasionally do in Dafny. To further control the situation, we use the annotation `{:induction false}` to ensure Dafny does not try to prove induction hypotheses by itself, which gives us control over the proof. Otherwise, Dafny can both automate the proof a lot (which is great!) and sometimes time out because automation is stuck (which is less great!). I left assertions in the code so that not only Dafny, but you too can understand the proof. -{% highlight dafny %} +``` lemma {:vcs_split_on_every_assert} {:induction false} SiAreCumulative(i: nat) ensures S(i) <= S(i+1) { @@ -395,7 +395,7 @@ lemma SiAreIncreasing(i: nat, j: nat) } } } -{% endhighlight %} +``` ## 1. Smallest inductive set contained in constructive set @@ -403,7 +403,7 @@ After proving that intermediate sets form an increasing sequence, we want to pro Note that I use the annotation `{:rlimit 4000}` which is only a way for Dafny to say that every [assertion batch](https://dafny.org/dafny/DafnyRef/DafnyRef#sec-assertion-batches) should verify using less than 4 million resource units (unit provided by the underlying solver), which reduces the chances of proof variability during development. -{% highlight dafny %} +``` lemma {:rlimit 4000} {:vcs_split_on_every_assert} AllTermsConstructivelySatisfiesInductionCriteria() ensures InductionCriteria(AllTermsConstructively) @@ -442,14 +442,14 @@ lemma {:rlimit 4000} {:vcs_split_on_every_assert} } InductiveAxioms(); } -{% endhighlight %} +``` ## 2. Intermediate constructive sets are included in every set that satisfy the induction criteria Now we want to prove that every `S(i)` is included in every set that satisfies the induction criteria. That way, their union, the constructive set, will also be included in any set that satisfies the induction criteria. 
The proof works by remarking that every element of `S(i)` is built from elements of `S(i-1)`, so if these elements are in the set satisfying the induction criteria, so is the element by induction. I intentionally detailed the proof so that you can understand it, but if you run it yourself, you might see that you can remove a lot of the proof and Dafny will still figure it out. -{% highlight dafny %} +``` lemma {:induction false} InductionCriteriaHasConcreteIAsSubset( i: nat, someset: iset @@ -504,13 +504,13 @@ lemma {:induction false} } } } -{% endhighlight %} +``` ## 3. The constructive set is included in the smallest inductive set that satisfies the induction criteria We can deduce from the previous result that the constructive definition of all terms is also included in any set of term that satisfies the induction criteria. From this we can deduce automatically that the constructive definition of all terms is included in the smallest inductive set satisfying the induction criteria. -{% highlight dafny %} +``` // 3.2.6.b.1 AllTermsConstructively is a subset of any set satisfying the induction criteria (hence AllTermsConstructively <= AllTermsInductively) lemma AllTermsConstructivelyIncludedInSetsSatisfyingInductionCriteria( terms: iset @@ -529,13 +529,13 @@ lemma AllTermsConstructivelyIncludedInAllTermsInductively() InductiveAxioms(); AllTermsConstructivelyIncludedInSetsSatisfyingInductionCriteria(AllTermsInductively); } -{% endhighlight %} +``` ## 4. Conclusion with the equality Because we have `<=` and `>=` between these two sets, we can now prove equality. 
-{% highlight dafny %} +``` lemma InductionAndConcreteAreTheSame() ensures AllTermsConstructively == AllTermsInductively { @@ -543,7 +543,7 @@ lemma InductionAndConcreteAreTheSame() AllTermsConstructivelySatisfiesInductionCriteria(); AllTermsConstructivelyIncludedInAllTermsInductively(); } -{% endhighlight %} +``` ## Bonus Conclusion @@ -552,35 +552,35 @@ As stated in the introduction, having multiple definitions of a single infinite - If a term is in the constructive set, then it cannot be constructed with `Add` for example, because it would need to be in some `S(i)` and none of the `S(i)` define `Add`. This can be illustrated in Dafny with the following lemma: -{% highlight dafny %} +``` lemma {:induction false} CannotBeAdd(t: Term) requires t in AllTermsConstructively ensures !t.Add? { // InductiveAxioms(); // Uncomment if you use AllTermsInductively } -{% endhighlight %} +``` which Dafny can verify pretty easily. However, if you put `AllTermsInductively` instead of `AllTermsConstructively`, Dafny would have a hard time figuring it out. - If `x` is in the inductive set, then `Succ(x)` is in the inductive set as well. Dafny can figure it out by itself using the `AllTermsInductively` definition, but won't be able to do it with `AllTermsConstructively` without a rigorous proof. -{% highlight dafny %} +``` lemma {:induction false} SuccIsInInductiveSet(t: Term) requires t in AllTermsInductively ensures Succ(t) in AllTermsInductively { InductiveAxioms(); } -{% endhighlight %} +``` This could be useful for a rewriter or an optimizer to ensure the elements it writes are in the same set. All that said, everything above can be a bit heavyweight for regular Dafny users. In practice, you're better off writing the inductive predicate explicitly as a function rather than an infinite set with a predicate, so that you get both inductive and constructive axioms that enable you to prove something similar to the two results above. 
-{% highlight dafny %} +``` predicate IsAdmissible(t: Term) { match t { case True => true @@ -603,7 +603,7 @@ lemma {:induction false} SuccIsInInductiveSet2(t: Term) ensures IsAdmissible(Succ(t)) { } -{% endhighlight %} +``` This above example illustrates what Dafny does best: it automates all the hard work under the hood so that you can focus on what is most interesting to you, and even better, it ensures you don't need to define `{:axiom}` yourself in this case. diff --git a/_posts/2023-11-08-cracking-the-coding-interview-in-dafny-permutations.markdown b/_posts/2023-11-08-cracking-the-coding-interview-in-dafny-permutations.markdown index 5725dc9..99858d5 100644 --- a/_posts/2023-11-08-cracking-the-coding-interview-in-dafny-permutations.markdown +++ b/_posts/2023-11-08-cracking-the-coding-interview-in-dafny-permutations.markdown @@ -21,7 +21,7 @@ The coding interview problem is described to us in natural language. As we would Intuitively, a *permutation* of a sequence is a second sequence that contains the same elements as the first one, but potentially in a different order. In other words, after *one forgets about the order* of the elements, both structures are identical. For example, the sequence `[1,0,0]` is a permutation of the sequence `[0,0,1]`. In Dafny, a sequence without an order can be modeled by the type [`multiset`](https://dafny.org/dafny/DafnyRef/DafnyRef#sec-multisets). Multisets are generalisations of sets in the way that they can contain elements *multi*ple times, instead of just once. The multiset that models both of the above sequences is `multiset{0,0,1}` . Note that this multiset is different from the set `set{0,0,1} == set{0,1}`. You may think of multisets that contain elements of type `T` as functions `f: T -> nat` , where `f(t)` specifies how many occurrences of the element `t` the multiset contains. In Dafny, the conversion of a sequence `s` into a multiset is denoted by `multiset(s)` . 
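Outside Dafny, a multiset is exactly what Python's `collections.Counter` models: a map from elements to occurrence counts. A quick sketch of the permutation test (the function name is ours, for illustration only):

```python
from collections import Counter

# Two sequences are permutations of each other iff their multisets agree,
# i.e. every element occurs the same number of times in both.
def is_permutation_of(p, s):
    return Counter(p) == Counter(s)

print(is_permutation_of([1, 0, 0], [0, 0, 1]))  # True
print(is_permutation_of([0, 1], [0, 1, 1]))     # False
```

Note that `Counter`, unlike a set, distinguishes `[0, 0, 1]` from `[0, 1]`, which is precisely the multiset behaviour we need.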
We define a sequence `p` to be a permutation of a sequence `s` (both with elements of a generic type `T`) if they satisfy the following predicate: -``` dafny +``` predicate IsPermutationOf(p: seq, s: seq) { multiset(p) == multiset(s) } @@ -29,7 +29,7 @@ predicate IsPermutationOf(p: seq, s: seq) { Note that a predicate is a function that returns a `bool`ean — that is, either `true` or `false`. To compare two multisets for equality, we need to be able to compare their elements for equality. For this reason, we require the type `T` to have the characteristic [`(==)`](https://dafny.org/dafny/DafnyRef/DafnyRef#sec-equality-supporting). Given above, the naive definition of the set that contains all permutations of a sequence `s` is immediate: -``` dafny +``` ghost function AllPermutationsOf(s: seq): iset> { iset p | IsPermutationOf(p, s) } @@ -39,13 +39,13 @@ Note that `AllPermutationsOf` returns a potentially *infinite* set of sequences. We are now able to formally state the problem task: Define the body of a function with the signature -``` dafny +``` function CalculateAllPermutationsOf(s: seq): set> ``` and show that it is *correct*, in the sense that there exists a proof for the lemma -``` dafny +``` lemma Correctness(s: seq) ensures (iset p | p in CalculateAllPermutationsOf(s)) == AllPermutationsOf(s) ``` @@ -54,7 +54,7 @@ lemma Correctness(s: seq) There are many potential solutions to the problem. Here, we propose a purely functional approach that defines the set of all unique permutations of a sequence recursively. In the base case, when the sequence `s` has length zero, we simply return the singleton set that contains `s`. In the case that `s` is of length greater than zero, we first recursively calculate the set of all the permutations of the subsequence of `s` that starts at the second element, before inserting, via a separate function `InsertAt`, the first element of `s` into all the permutations in that set, at every possible position. 
-``` dafny +``` function CalculateAllPermutationsOf(s: seq): set> { if |s| == 0 then {s} @@ -72,7 +72,7 @@ function InsertAt(s: seq, x: T, i: nat): seq For example, to compute the set of all permutations of the sequence `s := [0,0,1]` , we first compute the set of all unique permutations of the subsequence `s[1..] == [0,1]` , which leads to `{[0,1], [1,0]}`. Then we insert the first element `s[0] == 0` into all the sequences of that set, at every possible position. In this case, the resulting set is `{[0,0,1], [0,1,0], [1,0,0]}`, as indicated below: -``` dafny +``` { [0] + [0,1], [0] + [0] + [1], [0,1] + [0], // inserting 0 into [0,1] [0] + [1,0], [1] + [0] + [0], [1,0] + [0] // inserting 0 into [1,0] @@ -85,7 +85,7 @@ During a coding interview, the focus would now typically shift to an analysis of To establish correctness, we need to prove, for all sequences `s`, the equality of the two (infinite) sets `A(s) := iset p | p in CalculateAllPermutationsOf(s)` and `B(s) := AllPermutationsOf(s)`. That is, we have to establish the validity of two implications, for all sequences `p`: i) if `p` is in `A(s)` , then it also is in `B(s)`; and ii) if `p` is in `B(s)` , then it also is in `A(s)`. We prove the two cases in separate lemmas `CorrectnessImplicationOne(s, p)` and `CorrectnessImplicationTwo(s, p)`, respectively: -``` dafny +``` lemma Correctness(s: seq) ensures (iset p | p in CalculateAllPermutationsOf(s)) == AllPermutationsOf(s) { @@ -118,7 +118,7 @@ The first implication is arguably the more intuitive or simpler one to prove. It As is typical for Dafny, the proof of the lemma follows the structure of the function that it reasons about. In this case, we mimic the function `CalculateAllPermutationsOf` by mirroring its recursive nature as a proof via induction. In particular, this means that we copy its case-split. 
In the case that `s` is of length zero (the induction base), Dafny is able to prove the claim automatically: -``` dafny +``` lemma CorrectnessImplicationOne(s: seq, p: seq) ensures p in CalculateAllPermutationsOf(s) ==> p in AllPermutationsOf(s) { @@ -150,7 +150,7 @@ lemma CorrectnessImplicationOne(s: seq, p: seq) In the case that `s` is of length greater than zero (the induction step), we need to help Dafny a bit. Assuming that `p` is an element in `CalculateAllPermutationsOf(s)`, we aim to show that `multiset(p)` and `multiset(s)` are equal. What information about `p` do we have to derive such an equality? A close inspection of the recursive call in `CalculateAllPermutationsOf` indicates the existence of a second sequence `p’` that lives in `CalculateAllPermutationsOf(s[1..])` and an index `i` of `p’` such that `p == InsertAt(p', s[0], i)` . To transform `multiset(p) == multiset(InsertAt(p', s[0], i))` into something that’s closer to `multiset(s)`, we establish the following additional lemma: -``` dafny +``` lemma MultisetAfterInsertAt(s: seq, x: T, i: nat) requires i <= |s| ensures multiset(InsertAt(s, x, i)) == multiset([x]) + multiset(s) @@ -178,7 +178,7 @@ The second implication states that any permutation `p` of `s` is an element of ` As with the first implication, the structure of the proof for `CorrectnessImplicationTwo` mimics the definition of `CalculateAllPermutationsOf`. In the case that `s` is of length zero (the induction base), Dafny is again able to prove the claim automatically. 
In the case that `s` is of length greater than zero (the induction step), we aim to establish that `p` is an element in `CalculateAllPermutationsOf(s)`, if it is a permutation of `s`: -``` dafny +``` lemma CorrectnessImplicationTwo(s: seq, p: seq) ensures p in CalculateAllPermutationsOf(s) <== p in AllPermutationsOf(s) { @@ -207,7 +207,7 @@ lemma CorrectnessImplicationTwo(s: seq, p: seq) An inspection of the definition of `CalculateAllPermutationsOf` shows that this means we have to construct a permutation `p'` that is in `CalculateAllPermutationsOf(s[1..])` and some index `i` of `p'` , such that `p == InsertAt(p’, s[0], i)`. How do we find `p'` and `i`? Intuitively, the idea is as follows: we define `i` as the index of the first occurrence of `s[0]` in `p` (which we know exists, since `p`, as permutation, contains the same elements as `s`) and `p'` as the sequence that one obtains when deleting the `i`-th element of `p`. To formalise this idea, we introduce the following two functions: -``` dafny +``` function FirstOccurrence(p: seq, x: T): (i: nat) requires x in multiset(p) ensures i < |p| @@ -230,7 +230,7 @@ To conclude the proof, it remains to establish that i) `p' in CalculateAllPermut For i), the idea is to apply the induction hypothesis `CorrectnessImplicationTwo(s[1..], p')`, which yields the desired result if `p'` is a permutation of `s[1..]`. Since `p` was a permutation of `s`, and `p'` was obtained from `p` by deleting its first element, the statement is intuitively true. 
Formally, we establish it with the call `PermutationBeforeAndAfterDeletionAt(p, s, i, 0)` to the following, slightly more general, lemma: -``` dafny +``` lemma PermutationBeforeAndAfterDeletionAt(p: seq, s: seq, i: nat, j: nat) requires IsPermutationOf(p, s) requires i < |p| @@ -263,7 +263,7 @@ The proof of `PermutationBeforeAndAfterDeletionAt` is mostly a direct consequenc To establish ii), that is, the equality `p == InsertAt(p’, s[0], i)`, we recall that by construction `p' == DeleteAt(p, i)`. Substituting `p’` for its definition and using the postcondition `p[i] == s[0]` of `FirstOccurrence` thus leads us to the following lemma, which states that deleting an element of a sequence, before inserting it again at the same position, leaves the initial sequence unchanged: -``` dafny +``` lemma InsertAfterDeleteAt(s: seq, i: nat) requires i < |s| ensures s == InsertAt(DeleteAt(s, i), s[i], i) diff --git a/_posts/2023-12-06-automated-test-generation-chess-puzzles-with-dafny.markdown b/_posts/2023-12-06-automated-test-generation-chess-puzzles-with-dafny.markdown index a7d3bb8..3084887 100644 --- a/_posts/2023-12-06-automated-test-generation-chess-puzzles-with-dafny.markdown +++ b/_posts/2023-12-06-automated-test-generation-chess-puzzles-with-dafny.markdown @@ -28,7 +28,7 @@ Now that we have definitions for all relevant chess rules, let us generate some Let us first define the method we want to test, called `Describe`, which prints whether or not a check or a checkmate has occurred. I annotate this method with the `{:testEntry}` attribute to mark it as an entry point for the generated tests. This method receives as input an arbitrary configuration of pieces on the board. In order to satisfy the chosen coverage criterion, Dafny will have to come up with such configurations of pieces when generating tests.
-{% highlight dafny %} +``` // {:testEntry} tells Dafny to use this method as an entry point method {:testEntry} Describe(board: ValidBoard) { var whiteKing := board.pieces[0]; @@ -44,7 +44,7 @@ method {:testEntry} Describe(board: ValidBoard) { print "No checkmate yet\n"; } } -{% endhighlight %} +```
Note that I also added an `expect` statement in the code. An `expect` statement is checked to be true at runtime and triggers a runtime exception if the check fails. In our case, the `expect` statement is trivially true (a checkmate always implies a check) and every test we generate passes it. In fact, if you were to turn this `expect` statement into an assertion, Dafny would prove it. Not every `expect` statement can be proven, however, especially in the presence of external libraries. You can read more about `expect` statements
here. @@ -60,7 +60,7 @@ The `Block` keyword in the command tells Dafny to generate tests that together c In particular, running the command above results in the following two Dafny tests (formatted manually for this blog post). Don’t read the code too closely just yet; we will visualize these tests soon. My point here is that Dafny produces tests written in Dafny itself and they can be translated to any of the languages the Dafny compiler supports.
-{% highlight dafny %} +``` include "chess.dfy" module Tests { import opened Chess @@ -81,7 +81,7 @@ module Tests { Describe(board); } } -{% endhighlight %} +```
Note also that Dafny annotates the two test methods with the `{:test}` attribute. This means that after saving the tests to a file (`tests.dfy`), you can then use the `dafny test tests.dfy` command to compile and execute them. The default compilation target is C# but you can pick a different one using the `--target` option - read more about the `dafny test` command [here](https://dafny.org/dafny/DafnyRef/DafnyRef#sec-dafny-test). @@ -116,12 +116,12 @@ The first two tests are essentially equivalent to the ones we got before. Only t By default, Dafny test generation only guarantees coverage of statements or paths **within** the method annotated with the `{:testEntry}` attribute. This means, in our case, that test generation would not differentiate between a check delivered by a pawn and one delivered by a knight, since the distinction between the two cases is hidden inside the `CheckedByPlayer` function. In order to cover these additional cases that are hidden within the callees of the test entry method, you need to *inline* them using the `{:testInline}` attribute as in the code snippet below.
-{% highlight dafny %} +``` predicate {:testInline} CheckedByPlayer(board: ValidBoard, king: Piece, byPlayer: Color) { || CheckedByPiece(board, king, Knight(byPlayer)) || CheckedByPiece(board, king, Pawn(byPlayer)) } -{% endhighlight %} +```
Adding this annotation and running test generation while targeting path coverage wins us two more tests on top of the three we already had. We now have two checkmates, one for each piece leading the attack, and two checks. @@ -161,7 +161,7 @@ Large Dafny programs often make use of quantifiers, recursion, or loops. These c To illustrate these rules, let us condense a part of the Dafny chess model by making use of quantifiers. As a reminder, here is the unnecessarily verbose definition of the `BoardIsValid` predicate we have been using so far to specify what kind of chess boards we are interested in:
-{% highlight dafny %} +``` datatype Board = Board(pieces: seq) predicate BoardIsValid(board: Board) { // See Section 4 for how we can simplify this // We want boards with specific pieces on it: @@ -175,13 +175,13 @@ predicate BoardIsValid(board: Board) { // See Section 4 for how we can simplify && board.pieces[2].at != board.pieces[3].at && board.pieces[2].at != board.pieces[4].at && board.pieces[3].at != board.pieces[4].at } -{% endhighlight %} +```
There is a lot of repetition in the code above. In order to forbid two pieces from sharing the same square, we enumerate all 10 pairs of pieces! Worse, if we wanted to change the number of pieces on the board, we would have to rewrite the `BoardIsValid` predicate from scratch. A much more intuitive approach would be to use a universal quantifier over all pairs of pieces:
-{% highlight dafny %} +``` datatype Board = Board(pieces: seq) predicate BoardIsValid(board: Board) { // No two pieces on a single square forall i: nat, j: nat :: @@ -189,33 +189,33 @@ predicate BoardIsValid(board: Board) { // No two pieces on a single square board.pieces[i].at != board.pieces[j].at } type ValidBoard = board: Board | BoardIsValid(board) witness Board([]) -{% endhighlight %} +```
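The quantified formulation translates directly into executable code in most languages. For intuition, here is a Python rendition, with the board modeled as a plain list of squares (the encoding and names are ours):

```python
# "No two pieces on a single square" as a universal quantifier over index
# pairs, mirroring the quantified formulation of the validity predicate.
def board_is_valid(squares):
    return all(squares[i] != squares[j]
               for i in range(len(squares))
               for j in range(i + 1, len(squares)))

print(board_is_valid([(0, 0), (3, 4), (7, 7)]))  # True
print(board_is_valid([(0, 0), (3, 4), (0, 0)]))  # False: two pieces on (0, 0)
```

The version with an explicit pair enumeration would need one conjunct per pair, while this one works unchanged for any number of pieces.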
Similarly, we can use an existential quantifier within the body of the `CheckedByPiece` predicate, which returns `true` if the king is checked by a piece of a certain kind:
-{% highlight dafny %} +``` predicate CheckedByPiece(board: ValidBoard, king: Piece, byPiece: Kind) { exists i: int :: && 0 <= i < |board.pieces| && board.pieces[i].kind == byPiece && board.pieces[i].Threatens(king.at) } -{% endhighlight %} +```
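The existential reads the same way: some index witnesses the property. In Python this is `any`; the piece encoding and the toy threat relation below are ours, purely for illustration:

```python
# Existential in the style of CheckedByPiece: does some piece of the given
# kind threaten the king's square under the supplied threat relation?
def checked_by_piece(pieces, king_at, by_kind, threatens):
    return any(kind == by_kind and threatens(at, king_at)
               for kind, at in pieces)

# Toy threat relation for the example: a piece "threatens" every square
# in its own row (real chess rules live in the Dafny model).
same_row = lambda a, b: a[0] == b[0]

print(checked_by_piece([("knight", (0, 1)), ("pawn", (3, 3))],
                       (0, 4), "knight", same_row))  # True
```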
If we want to require our board to have a king, two knights, and two pawns, like we did before, we can now separate this constraint into a separate predicate `BoardPreset` and require it to be true at the entry to the `Describe` method:
-{% highlight dafny %} +``` predicate BoardPreset(board: Board) { && |board.pieces| == 5 && board.pieces[0].kind == King(White) && board.pieces[1].kind == Knight(Black) && board.pieces[2].kind == Knight(Black) && board.pieces[3].kind == Pawn(Black) && board.pieces[4].kind == Pawn(Black) } -{% endhighlight %} +```
This definition plays one crucial role that might not be immediately apparent. It explicitly enumerates all elements within the `pieces` sequence, thereby *triggering* the quantifiers in the `BoardIsValid` and `CheckedByPiece` predicates above. In other words, we tell Dafny that we know for a fact there are elements with indices `0`, `1`, etc. in this sequence and force the verifier to substitute these elements in the quantified axioms. The full theory of triggers and quantifiers is beyond the scope of this post, but if you want to combine test generation and quantifiers in your code, you must understand this point. I recommend reading [this part of the Dafny reference manual](https://dafny.org/dafny/DafnyRef/DafnyRef#sec-trigger) and/or [this FAQ](https://github.com/dafny-lang/dafny/wiki/FAQ#how-does-dafny-handle-quantifiers-ive-heard-about-triggers-what-are-those) that discusses the trigger selection process in Dafny. @@ -223,7 +223,7 @@ While Dafny can compile a subset of quantified expressions, it does not currently support inlining of such expressions for test generation purposes. This presents a challenge, as it means that we cannot immediately inline the `CheckedByPiece` predicate above. In order to inline such functions, we have to provide them with an alternative implementation, e.g. by turning the function into a [function-by-method](https://dafny.org/dafny/DafnyRef/DafnyRef#sec-function-by-method) and using a loop, like so:
-{% highlight dafny %} +``` predicate CheckedByPiece(board: ValidBoard, king: Piece, byPiece: Kind) { exists i: int :: && 0 <= i < |board.pieces| @@ -240,7 +240,7 @@ predicate CheckedByPiece(board: ValidBoard, king: Piece, byPiece: Kind) { } return false; } -{% endhighlight %} +```
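The function-by-method construct obliges Dafny to prove that the loop computes the same value as the existential. We can mimic that contract in Python by writing both versions and checking that they agree (a sketch with names of our own choosing):

```python
def checked_declarative(kinds, by_kind):
    # the "exists" formulation: some piece has the given kind
    return any(k == by_kind for k in kinds)

def checked_loop(kinds, by_kind):
    # the imperative by-method body: scan the sequence and return early
    for k in kinds:
        if k == by_kind:
            return True
    return False

# Spot-check that the two implementations agree on a few inputs; in Dafny,
# the verifier proves this equivalence for all inputs.
for kinds in ([], ["pawn"], ["pawn", "knight"]):
    assert checked_declarative(kinds, "knight") == checked_loop(kinds, "knight")
```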
Alternatively, we could have rewritten `CheckedByPiece` as a recursive function and put a `{:testInline 6}` annotation on it to unroll the recursion 6 times (6 is the maximum recursion depth for our example because it is one more than the number of pieces on the board). The test generation engine will then perform bounded model checking to produce system-level tests. In general, reasoning about recursion and loops in a bounded model checking context is known to be difficult, and so, while you have access to these "control knobs" that let you unroll the recursion in this manner (or unroll loops via the `--loop-unroll` command line option), I would be cautious when combining these features with test generation. You can read [the paper](https://link.springer.com/chapter/10.1007/978-3-031-33170-1_24) on the test generation toolkit to get an idea about some of the challenges. Another consideration in deciding whether or not to unroll recursion and loops is your time budget. Unrolling increases the number of paths through the program and gives you better coverage but it also increases the time it takes to generate any given test. In the end, you will likely need to experiment with these settings to figure out what works best.
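To see why a depth of one more than the number of pieces suffices, note that each recursive call consumes one piece, so on a five-piece board every execution bottoms out within six calls. Here is a Python sketch of such a depth-limited scan (ours, not the test generator's actual mechanism):

```python
# Depth-limited recursion: once depth exceeds the sequence length,
# the bound can never be hit and the result is exact.
def checked_bounded(kinds, by_kind, depth):
    if depth == 0 or not kinds:
        return False
    if kinds[0] == by_kind:
        return True
    return checked_bounded(kinds[1:], by_kind, depth - 1)

five_pieces = ["king", "knight", "knight", "pawn", "pawn"]
print(checked_bounded(five_pieces, "pawn", 6))  # True
print(checked_bounded(five_pieces, "pawn", 3))  # False: bound hit too early
```

With too small a bound the scan gives up before reaching the pawns, which is exactly the incompleteness one trades for tractability when unrolling.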