Skip to content

Latest commit

 

History

History
828 lines (714 loc) · 30.5 KB

05_Interacting_with_Lean.org

File metadata and controls

828 lines (714 loc) · 30.5 KB

Theorem Proving in Lean

Interacting with Lean

You are now familiar with the fundamentals of dependent type theory, both as a language for defining mathematical objects and a language for constructing proofs. The one thing you are missing is a mechanism for defining new data types. We will fill this gap in the next chapter, which introduces the notion of an inductive data type. But first, in this chapter, we take a break from the mechanics of type theory to explore some pragmatic aspects of interacting with Lean.

Displaying Information

There are a number of ways in which you can query Lean for information about its current state and the objects and theorems that are available in the current context. You have already seen two of the most common ones, check and eval. Remember that check is often used in conjunction with the @ operator, which makes all of the arguments to a theorem or definition explicit. In addition, you can use the print command to get information about any identifier. If the identifier denotes a definition or theorem, Lean prints the type of the symbol, and its definition; if it is a constant or axiom, Lean indicates that fact, and shows the type.

import data.nat

-- examples with equality
check eq
check @eq
check eq.symm
check @eq.symm

print eq.symm

-- examples with and
check and
check and.intro
check @and.intro

-- examples with addition
open nat
check add
check @add
eval add 3 2
print definition add

-- a user-defined function
definition foo {A : Type} (x : A) : A := x

check foo
check @foo
eval foo
eval (foo @nat.zero)
print foo

There are other useful print commands:

print notation               : display all notation
print notation <tokens>      : display notation using any of the tokens
print axioms                 : display assumed axioms
print options                : display options set by user or emacs mode
print prefix <namespace>     : display all declarations in the namespace
print coercions              : display all coercions
print coercions <source>     : display only the coercions from <source>
print classes                : display all classes
print instances <class name> : display all instances of the given class
print fields <structure>     : display all "fields" of a structure

We will discuss classes, instances, and structures in Chapter Type Classes. Here are examples of how the print commands are used:

import standard algebra.ring
open prod sum int nat algebra

print notation
print notation + * -
print axioms
print options
print prefix nat
print prefix nat.le
print coercions
print coercions num
print classes
print instances ring
print fields ring

Another useful command, although the implementation is still rudimentary at this stage, is the find decl command. This can be used to find theorems whose conclusion matches a given pattern. The syntax is as follows:

find_decl <pattern>  [, filter]*

where <pattern> is an expression with “holes” (underscores), and a filter is of the form

+ id  (id is a substring of the declaration)
- id  (id is not a substring of the declaration)
  id  (id is a substring of the declaration)

For example:

import data.nat
open nat

find_decl ((_ * _) = (_ * _))
find_decl (_ * _) = _, +assoc
find_decl (_ * _) = _, -assoc

find_decl _ < succ _, +imp, -le

Setting Options

Lean maintains a number of internal variables that can be set by users to control its behavior. The syntax for doing so is as follows:

set_option <name> <value>

One very useful family of options controls the way Lean’s pretty- printer displays terms. The following options take an input of true or false:

pp.implicit  : display implicit arguments
pp.universes : display hidden universe parameters
pp.coercions : show coercions
pp.notation  : display output using defined notations
pp.beta      : beta reduce terms before displaying them

In Lean, coercions can be inserted automatically to cast an element of one data type to another, for example, to cast an element of nat to an element of int. We will say more about them later in this chapter. This list is not exhaustive; you can see a complete list by typing set_option pp. and then using tab-completion in the Emacs mode for Lean, also discussed below.

As an example, the following settings yield much longer output:

import data.nat
open nat

set_option pp.implicit true
set_option pp.universes true
set_option pp.notation false
set_option pp.numerals false

check 2 + 2 = 4
eval (λ x, x + 2) = (λ x, x + 3)

set_option pp.beta true
check (λ x, x + 1) 1

Pretty printing additional information is often very useful when you are debugging a proof, or trying to understand a cryptic error message. Too much information can be overwhelming, though, and Lean’s defaults are generally sufficient for ordinary interactions.

Using the Library

To use Lean effectively you will inevitably need to make use of definitions and theorems in the library. Recall that the import command at the beginning of a file imports previously compiled results from other files, and that importing is transitive; if you import foo and foo imports bar, then the definitions and theorems from bar are available to you as well. But the act of opening a namespace — which provides shorter names, notations, rewrite rules, and more — does not carry over. In each file, you need to open the namespaces you wish to use.

The command import standard imports the essential parts of the standard library, and by now you have seen many of the namespaces you will need. For example, you should open nat for notation when you are working with the natural numbers, and open int when you are working with the integers. In general, however, it is important for you to be familiar with the library and its contents, so you know what theorems, definitions, notations, and resources are available to you. Below we will see that Lean’s Emacs mode can also help you find things you need, but studying the contents of the library directly is often unavoidable.

Lean has two libraries. Here we will focus on the standard library, which offers a conventional mathematical framework. We will discuss the library for homotopy type theory in a later chapter.

There are a number of ways to explore the contents of the standard library. You can find the file structure online, on github:

https://github.com/leanprover/lean/tree/master/library

You can see the contents of the directories and files using github’s browser interface. If you have installed Lean on your own computer, you can find the library in the lean folder, and explore it with your file manager. Comment headers at the top of each file provide additional information.

Alternatively, there are “markdown” files in the library that provide links to the same files but list them in a more natural order, and provide additional information and annotations.

https://github.com/leanprover/lean/blob/master/library/library.md

You can again browse these through the github interface, or with a markdown reader on your computer.

Lean’s library developers follow general naming guidelines to make it easier to guess the name of a theorem you need, or to find it using tab completion in Lean’s Emacs mode, which is discussed in the next section. To start with, common “axiomatic” properties of an operation like conjunction or multiplication are put in a namespace that begins with the name of the operation:

import standard algebra.ordered_ring
open nat algebra

check and.comm
check mul.comm
check and.assoc
check mul.assoc
check @mul.left_cancel -- multiplication is left cancelative

In particular, this includes intro and elim operations for logical connectives, and properties of relations:

import standard algebra.ordered_ring
open nat algebra

-- BEGIN
check and.intro
check and.elim
check or.intro_left
check or.intro_right
check or.elim

check eq.refl
check eq.symm
check eq.trans
-- END

For the most part, however, we rely on descriptive names. Often the name of theorem simply describes the conclusion:

import standard algebra.ordered_ring
open nat algebra

-- BEGIN
check succ_ne_zero
check @mul_zero
check @mul_one
check @sub_add_eq_add_sub
check @le_iff_lt_or_eq
-- END

If only a prefix of the description is enough to convey the meaning, the name may be made even shorter:

import standard algebra.ordered_ring
open nat algebra

-- BEGIN
check @neg_neg
check pred_succ
-- END

Sometimes, to disambiguate the name of theorem or better convey the intended reference, it is necessary to describe some of the hypotheses. The word “of” is used to separate these hypotheses:

import standard algebra.ordered_ring
open nat algebra

-- BEGIN
check lt_of_succ_le
check @lt_of_not_ge
check @lt_of_le_of_ne
check @add_lt_add_of_lt_of_le
-- END

Sometimes abbreviations or alternative descriptions are easier to work with. For example, we use pos, neg, nonpos, nonneg rather than zero_lt, lt_zero, le_zero, and zero_le.

import standard algebra.ordered_ring
open nat algebra

-- BEGIN
check @mul_pos
check @mul_nonpos_of_nonneg_of_nonpos
check @add_lt_of_lt_of_nonpos
check @add_lt_of_nonpos_of_lt
-- END

Sometimes the word “left” or “right” is helpful to describe variants of a theorem.

import standard algebra.ordered_ring
open nat algebra

-- BEGIN
check @add_le_add_left
check @add_le_add_right
check @le_of_mul_le_mul_left
check @le_of_mul_le_mul_right
-- END

Lean’s Emacs Mode

This tutorial is designed to be read alongside Lean’s web-browser interface, which runs a Javascript-compiled version of Lean inside your web browser. But there is a much more powerful interface to Lean that runs as a special mode in the Emacs text editor. Our goal in this section is to consider some of the advantages and features of the Emacs interface.

If you have never used the Emacs text editor before, you should spend some time experimenting with it. Emacs is an extremely powerful text editor, but it can also be overwhelming. There are a number of introductory tutorials on the web. See, for example:

You can get pretty far simply using the menus at the top of the screen for basic editing and file management. Those menus list keyboard-equivalents for the commands. Notation like “C-x”, short for “control x,” means “hold down the control key while typing x.” The notation “M-x”, short for “Meta x,” means “hold down the Alt key while typing x,” or, equivalently, “press the Esc key, followed by x.” For example, the “File” menu lists “C-c C-s” as a keyboard-equivalent for the “save file” command.

There are a number of benefits to using the native version of Lean instead of the web interface. Perhaps the most important is file management. The web interface imports the entire standard library internally, which is why some examples in this tutorial have to put examples in a namespace, “hide,” to avoid conflicting with objects already defined in the standard library. Moreover, the web interface only operates on one file at a time. Using the Emacs editor, you can create and edit Lean theory files anywhere on your file system, as with any editor or word processor. From these files, you can import pieces of the library at will, as well as your own theories, defined in separate files.

To use the Emacs with Lean, you simply need to create a file with the extension “.lean” and edit it. (For files that should be checked in the homotopy type theory framework, use “.hlean” instead.) For example, you can create a file by typing emacs my_file.lean in a terminal window, in the directory where you want to keep the file. Assuming everything has been installed correctly, Emacs will start up in Lean mode, already checking your file in the background.

You can then start typing, or copy any of the examples in this tutorial. (In the latter case, make sure you include the import and open commands that are sometimes hidden in the text.) Lean mode offers syntax highlighting, so commands, identifiers, and so on are helpfully color-coded. Any errors that Lean detects are subtly underlined in red, and the editor displays an exclamation mark in the left margin. As you continue to type and eliminate errors, these annotations magically disappear.

If you put the cursor on a highlighted error, Emacs displays the error message in at the bottom of the frame. Alternatively, if you type C-c ! l while in Lean mode, Emacs opens a new window with a list of compilation errors. Lean relies on an Emacs mode, Flycheck, for this functionality, as evidenced by the letters “FlyC” that appear in the Emacs information line. An asterisk next to these letters indicates that Flycheck is actively checking the file, using Lean. Flycheck offers a number of commands that begin with C-c !. For example, C-c ! n moves the cursor to the next error, and C-c ! p moves the cursor to the previous error. You can get to a help menu that lists these key bindings by clicking on the “FlyC” tag.

It may be disconcerting to see a perfectly good proof suddenly “break” when you change a single character. Moreover, changes can introduce errors downstream. But the error messages vanish quickly when correctness is restored. Lean is quite fast and caches previous work to speed up compilation, and changes you make are registered almost instantaneously.

The Emacs Lean mode also maintains a continuous dialog with a background Lean process and uses it to present useful information to you. For example, if you put your cursor on any identifier — a theorem name, a defined symbol, or a variable — Emacs displays its type in the information line at the bottom. If you put the cursor on the opening parenthesis of an expression, Emacs displays the type of the expression.

This works even for implicit arguments. If you put your cursor on an underscore symbol, then, assuming Lean’s elaborator was successful in inferring the value, Emacs shows you that value and its type. Typing “C-c C-f” replaces the underscore with the inferred value. In cases where Lean is unable to infer a value of an implicit argument, the underscore is highlighted, and the error message indicates the type of the “hole” that needs to be filled. This can be extremely useful when constructing proofs incrementally. One can start typing a “proof sketch,” using either sorry or an underscore for details you intend to fill in later. Assuming the proof is correct modulo these missing pieces of information, the error message at an unfilled underscore tells you the type of the term you need to construct, typically an assertion you need to justify.

The Lean mode supports tab completion. In a context where Lean expects an identifier (e.g. a theorem name or a defined symbol), if you start typing and then hit the tab key, a popup window suggests possible matches or near-matches for the expression you have typed. This helps you find the theorems you need without having to browse the library. You can also press tab after an import command, to see a list of possible imports, or after the set_option command, to see a list of options.

If you put your cursor on an identifier and type “C-c C-p”, Lean prints the definition of that identifier in a separate buffer. If you put your cursor on an identifier that is defined in Lean’s library and hit “M-.”, Emacs will take you to the identifier’s definition in the library file itself. This works even in an autocompletion popup window: if you start typing an identifier, press the tab key, choose a completion from the list of options, and press “M-.”, you are taken to the symbol’s definition. When you are done, pressing “M-*” takes you back to your original position.

There are other useful tricks. If you see some notation in a Lean file and you want to know how to enter it from the keyboard, put the cursor on the symbol and type “C-c C-k”. You can set common Lean options with “C-c C-o”, and you can execute a Lean command using “C-c C-e”. These commands and others are summarized in the online documentation:

https://github.com/leanprover/lean/blob/master/src/emacs/README.md

If for some reason the Lean background process does not seem to be responding (for example, the information line no longer shows you type information), type “C-c C-r”, or “M-x lean-server-restart-process”, or choose “restart lean process” from the Lean menu, and with luck that will set things right again.

This is a good place to mention another trick that is sometimes useful when editing long files. In Lean, the exit command halts processing of the file abruptly. If you are making changes at the top of a long file and want to defer checking of the remainder of the file until you are done making those changes, you can temporarily insert an exit.

Projects

At this point, it will be helpful to convey more information about the inner workings of Lean. A .lean file (or .hlean file, if you are working on homotopy type theory) consists of instructions that tell Lean how to construct formal terms in dependent type theory. “Processing” this file is a matter of filling in missing or implicit information, constructing the relevant terms, and sending them to the type checker to confirm that they are well-formed and have the specified types. This is analogous to the compilation process for a programming language: the .lean or .hlean file contains the source code that is then compiled down to machine representations of the desired formal objects. Lean stores the output of the compilation process in files with the extension “.olean”, for “object Lean”.

It is these files that are loaded by the import command. When Lean processes an import command, it looks for the relevant .olean files in standard places. By default, the search path consists of the root of the standard library (or the hott library, if the file is a .hlean file) and the current directory. You can specify subdirectories using periods in the module name: for example, import foo.bar.baz looks for the file “foo/bar/baz.olean” relative to any of the locations listed in the search path. A leading period, as in import .foo.bar, indicates that the .olean file in question is specified relative to the current directory. Two leading periods, as in import ..foo.bar, indicates that the address is relative to the parent directory, and so on.

If you enter the command lean -o foo.olean foo.lean from the command line, Lean processes foo.lean and, if it compiles successfully, it stores the output in foo.olean. The result is that another file can then import foo.

When you are editing a single file with either the web interface or the Emacs Lean mode, however, Lean only checks the file internally, without saving the .olean output. Suppose, then, you wish to build a project that has multiple files. What you really want is that Lean’s Emacs mode will build all the relevant .olean files in the background, so that you can import those files freely.

The Emacs mode makes this easy. To start a project that may potentially involve more than one file, choose the folder where you want the project to reside, open an initial file in Emacs, choose “create a new project” from the Lean menu, and press the “open” button. This creates a file, .project, which instructs a background process to ensure that whenever you are working on a file in that folder (or any subfolder thereof), compiled versions of all the modules it depends on are available and up to date.

Suppose you are editing foo.lean, which imports bar. You can switch to bar.lean and make additions or corrections to that file, then switch back to foo and continue working. The process linja, based on the ninja build system, ensures that bar is recompiled and that an up-to-date version is available to foo.

Incidentally, outside of Emacs, from a terminal window, you can type linja anywhere in your project folder to ensure that all your files have compiled .olean counterparts, and that they are up to date.

Notation and Abbreviations

Lean’s parser is an instance of a Pratt parser, a non-backtracking parser that is fast and flexible. You can read about Pratt parsers in a number of places online, such as here:

http://en.wikipedia.org/wiki/Pratt_parser http://eli.thegreenplace.net/2010/01/02/top-down-operator-precedence-parsing

Identifiers can include any alphanumeric characters, including Greek characters (other than Π , Σ , and λ , which, as we have seen, have a special meaning in the dependent type theory). They can also include subscripts, which can be entered by typing \_ followed by the desired subscripted character.

Lean’s parser is moreover extensible, which is to say, we can define new notation.

import data.nat
open nat

notation `[` a `**` b `]` := a * b + 1

definition mul_square (a b : ℕ) := a * a * b * b

infix `<*>`:50 := mul_square

eval [2 ** 3]
eval 2 <*> 3

In this example, the notation command defines a complex binary notation for multiplying and adding one. The infix command declares a new infix operator, with precedence 50, which associates to the left. (More precisely, the token is given left-binding power 50.) The command infixr defines notation which associates to the right, instead.

If you declare these notations in a namespace, the notation is only available when the namespace is open. You can declare temporary notation using the keyword local, in which case the notation is available in the current file, and moreover, within the scope of the current namespace or section, if you are in one.

import data.nat
open nat

-- BEGIN
local notation `[` a `**` b `]` := a * b + 1
local infix `<*>`:50 := λ a b : ℕ, a * a * b * b
-- END

The file reserved_notation.lean in the init folder of the library declares the left-binding powers of a number of common symbols that are used in the library.

https://github.com/leanprover/lean/blob/master/library/init/reserved_notation.lean

You are welcome to overload these symbols for your own use, but you cannot change their right-binding power.

Remember that you can direct the pretty-printer to suppress notation with the command set_option pp.notation false. You can also declare notation to be used for input purposes only with the [parsing_only] attribute:

import data.nat
open nat

notation [parsing_only] `[` a `**` b `]` := a * b + 1

variables a b : ℕ
check [a ** b]

The output of the check command displays the expression as a * b + 1. Lean also provides mechanisms for iterated notation, such as [a, b, c, d, e] to denote a list with the indicated elements. See the discussion of list in the next chapter for an example.

Notation in Lean can be overloaded, which is to say, the same notation can be used for more than one purpose. In that case, Lean’s elaborator will try to disambiguate based on context. For example, we have already seen that with the eq.ops namespace open, the inverse symbol can be used to denote the symmetric form of an equality. It can also be used to denote the multiplicative inverse:

import data.rat
open rat eq.ops

variable r : ℚ

check r⁻¹             --
check (eq.refl r)⁻¹   -- r = r

Insofar as overloads produce ambiguity, they should be used sparingly. We avoid the use of overloads for arithmetic operations like +, *, -, and / by using type classes, as described in Chapter Type Classes. In the following, the addition operation denotes a general algebraic operation that can be instantiated to nat or int as required:

import data.nat data.int
open algebra nat int

variables a b : int
variables m n : nat

check a + b    --
check m + n    --
print notation +

This is sometimes called parametric polymorphism, in contrast to ad hoc polymorphism, which we are considering here. For example, the notation ++ is used to concatenate both lists and vectors:

import data.list data.tuple
open list tuple

variables (A : Type) (m n : ℕ)
variables (v : tuple A m) (w : tuple A n) (s t : list A)

check s ++ t
check v ++ w

Where it is necessary to disambiguate, Lean allows you to precede an expression with the notation #<namespace> to specify the namespace in which notation is to be interpreted.

import data.list data.tuple
open list tuple

variables (A : Type) (m n : ℕ)
variables (v : tuple A m) (w : tuple A n) (s t : list A)

-- BEGIN 
check (#list λ x y, x ++ y)
check (#tuple λ x y, x ++ y)
-- END

Lean provides an abbreviation mechanism that is similar to the notation mechanism.

import data.nat
open nat

abbreviation double (x : ℕ) : ℕ := x + x

theorem foo (x : ℕ) : double x = x + x := rfl
check foo

An abbreviation is a transient form of definition that is expanded as soon as an expression is processed. As with notation, however, the pretty-printer re-constitutes the expression and prints the type of foo as double x = x + x. As with notation, you can designate an abbreviation to be [parsing-only], and you can direct the pretty-printer to suppress their use with the command set_option pp.notation false. Finally, again as with notation, you can limit the scope of an abbreviation by prefixing the declarations with the local modifier.

As the name suggests, abbreviations are intended to be used as convenient shorthand for long expressions. One common use is to abbreviate a long identifier:

definition my_long_identity_function {A : Type} (x : A) : A := x
local abbreviation my_id := @my_long_identity_function

Coercions

Lean also provides mechanisms to automatically insert coercions between types. These are user-defined functions between datatypes that make it possible to “view” one datatype as another. For example, any natural number can be coerced to an integer.

import data.nat data.int
open nat int

variables a b : int
variables m n : nat

-- BEGIN
check m + n          -- m + n : ℕ
check a + n          -- a + n : ℤ
check n + a          -- n + a : ℤ
check (m + n : ℤ)    -- m + n : ℤ

set_option pp.coercions true

check m + n          -- m + n : ℕ
check a + n          -- a + of_nat n : ℤ
check n + a          -- of_nat n + a : ℤ
check (m + n : ℤ)    -- of_nat (m + n) : ℤ
-- END

Setting the option pp.coercions to true makes the coercions explicit. Coercions that are declared in a namespace are only available to the system when the namespace is opened. The notation (t : T) constrains Lean to find an interpertation of t which gives it a type that is definitionally equal to T, thereby allowing you to specify the interpretation of t you have in mind. Thus checking (m + n : ℤ) forces the insertion of a coercion.

Here is an example of how we can define a coercion from the booleans to the natural numbers.

import data.bool data.nat
open bool nat

definition bool.to_nat [coercion] (b : bool) : nat :=
bool.cond b 1 0

eval 2 + ff
eval 2 + tt
eval tt + tt + tt + ff

print coercions        -- show all coercions
print coercions bool   -- show all coercions from bool

The tag “coercion” is an attribute that is associated with the symbol bool.to_nat. It does not change the meaning of bool.to_nat. Rather, it associates additional information to the symbol that informs Lean’s elaboration algorithm, as discussed in Section Elaboration and Unification. We could also declare bool.to_nat to be a coercion after the fact as follows:

import data.bool data.nat
open bool nat

-- BEGIN
definition bool.to_nat (b : bool) : nat :=
bool.cond b 1 0

attribute bool.to_nat [coercion]
-- END
eval 2 + ff
eval 2 + tt
eval tt + tt + tt + ff

In both cases, the scope of the coercion is the current namespace, so the coercion will be in place whenever the module is imported and the namespace is open. Sometimes it is useful to assign an attribute only temporarily. The local modifier ensures that the declaration is only in effect in the current file, and within the current namespace or section:

import data.bool data.nat
open bool nat

-- BEGIN
definition bool.to_nat (b : bool) : nat :=
bool.cond b 1 0

local attribute bool.to_nat [coercion]
-- END

Overloads and coercions introduce “choice points” in the elaboration process, forcing the elaborator to consider multiple options and backtrack appropriately. This can slow down the elaboration process. What is more problematic is that it can make error messages less informative: Lean only reports the result of the last backtracking path, which means the failure that is reported to the user may be due to the wrong interpretation of an overload or coercion. This is why Lean provides mechanism for namespace management: parsing and elaboration go more smoothly when we only import the notation that we need.

Nonetheless, overloading is quite convenient, and often causes no problems. There are various ways to manually disambiguate an expression when necessary. One is to precede the expression with the notation #<namespace>, to specify the namespace in which notation is to be interpreted. Another is to replace the notation with an explicit function name. Yet a third is to use the (t : T) notation to indicate the intended type.

#