Computation in Coq
- Recursive functions
- Proving things about recursive functions
- Lists and nested recursive functions
- Trees and more complicated recursive functions
- Program Extraction
- Using recursive functions to prove things (a.k.a. reflection)
- Efficiency
- Endnotes
First example
Require Import Bool Arith.
Definition nonzero (n : nat) : bool :=
match n with
| 0 ⇒ false
| S m ⇒ true
end.
In an interactive session, the following would display the
result of "running" this function.
"Computation" in Coq is simply the process of using definitional
equality to simplify a term until it is in normal form, e.g.
nonzero 3 ⇒ nonzero (S (S (S 0))) ⇒ true.
Note that match is a primitive concept in Coq, which when combined with
Fixpoint is equivalent to the induction principles associated to inductive types.
Thus, they can be used to make recursive functions.
Recursive functions
Fixpoint factorial (n : nat) : nat :=
match n with
| 0 ⇒ 1
| S m ⇒ n × factorial m
end.
Compute factorial 7. (* 5040 : nat *)
Definition factorial_ex1 : factorial 7 = 7×6×5×4×3×2×1 := eq_refl.
match n with
| 0 ⇒ 1
| S m ⇒ n × factorial m
end.
Compute factorial 7. (* 5040 : nat *)
Definition factorial_ex1 : factorial 7 = 7×6×5×4×3×2×1 := eq_refl.
It is essential for logical consistency that all functions terminate
on all inputs. For example, the following would type check, but doesn't
terminate, so Coq doesn't accept the definition.
Fail Fixpoint loop (n : nat): Empty_set := loop n.
(* Error: Recursive call to loop has principal argument equal to
"n" instead of a subterm of "n". *)
(* Error: Recursive call to loop has principal argument equal to
"n" instead of a subterm of "n". *)
If it were accepted, then loop 0 would be of type Empty_set. Oops.
For Coq to accept a definition using Fixpoint, there must
be an argument of your function so that all of your recursive
calls pass in a proper subterm of that argument.
For example, m is a proper subterm of S m.
We'll get to more advanced examples of recursion soon.
Coq is in general lazy about computation. It does it when needed
to check the type of a term, e.g. checking that eq_refl has type
factorial 7 = 7*6*5*4*3*2*1, but otherwise leaves things alone. This laziness
is crucial for efficiency.
For example, factorial 9 causes a stack overflow on my machine, but
the following examples type check without a problem.
Aside: Laziness
Definition factorial_ex2 : factorial 100 = factorial 100 := eq_refl.
Definition factorial_ex3 : nat := factorial 100.
Definition factorial_ex3 : nat := factorial 100.
Proving things about recursive functions
We ask Coq to create an induction principle for this function.
Functional Scheme div2_ind := Induction for div2 Sort Prop.
We use this to prove a Lemma. The cases match the cases in div2.
We could also use ordinary induction, twice.
Lemma div2_le : ∀ n, div2 n × 2 ≤ n.
Proof.
intro n.
functional induction div2 n.
- (* Goal: 0 × 2 ≤ 0 *)
auto.
- (* Goal: 0 × 2 ≤ 1 *)
auto.
- (* Goal: (S (div2 p)) × 2 ≤ S (S p) *)
(* Given: IHN0 : div2 p × 2 ≤ p *)
simpl.
(* Goal: S (S (div2 p × 2)) ≤ S (S p) *)
auto with arith.
Qed.
Proof.
intro n.
functional induction div2 n.
- (* Goal: 0 × 2 ≤ 0 *)
auto.
- (* Goal: 0 × 2 ≤ 1 *)
auto.
- (* Goal: (S (div2 p)) × 2 ≤ S (S p) *)
(* Given: IHN0 : div2 p × 2 ≤ p *)
simpl.
(* Goal: S (S (div2 p × 2)) ≤ S (S p) *)
auto with arith.
Qed.
Inductive mylist (A : Type) :=
| mynil : mylist A
| mycons : A → mylist A → mylist A.
Definition mylist_ex1 : mylist nat := mycons nat 0 (mycons nat 1 (mynil nat)).
| mynil : mylist A
| mycons : A → mylist A → mylist A.
Definition mylist_ex1 : mylist nat := mycons nat 0 (mycons nat 1 (mynil nat)).
We'll use Coq's list type, which comes with nice notations.
Require Import List.
Import ListNotations.
Definition list_ex1 : list nat := [1; 2; 4; 3].
Definition list_ex2 : list nat := 1 :: 2 :: 4 :: 3 :: nil.
Definition list_ex_eq : list_ex1 = list_ex2 := eq_refl.
Fixpoint last {A : Type} (default : A) (lst : list A) : A :=
match lst with
| nil ⇒ default
| hd :: nil ⇒ hd
| hd :: tail ⇒ last default tail
end.
Compute last 0 list_ex1. (* 3 : nat *)
Import ListNotations.
Definition list_ex1 : list nat := [1; 2; 4; 3].
Definition list_ex2 : list nat := 1 :: 2 :: 4 :: 3 :: nil.
Definition list_ex_eq : list_ex1 = list_ex2 := eq_refl.
Fixpoint last {A : Type} (default : A) (lst : list A) : A :=
match lst with
| nil ⇒ default
| hd :: nil ⇒ hd
| hd :: tail ⇒ last default tail
end.
Compute last 0 list_ex1. (* 3 : nat *)
last is inductively defined by structural recursion on the argument lst.
tail is a proper subterm of the list head :: tail.
Now suppose I want to write a function that merges two sorted lists
of natural numbers into a new sorted list. A naive try would be:
Nested structural recursion
Fail Fixpoint merge (l1 l2 : list nat) : list nat :=
match l1 with
| nil ⇒ l2
| h1 :: t1 ⇒ match l2 with
| nil ⇒ l1
| h2 :: t2 ⇒
if leb h1 h2 then
h1 :: merge t1 l2
else
h2 :: merge l1 t2
end
end.
(* Error: Cannot guess decreasing argument of fix. *)
match l1 with
| nil ⇒ l2
| h1 :: t1 ⇒ match l2 with
| nil ⇒ l1
| h2 :: t2 ⇒
if leb h1 h2 then
h1 :: merge t1 l2
else
h2 :: merge l1 t2
end
end.
(* Error: Cannot guess decreasing argument of fix. *)
Here's one way to accomplish this.
Fixpoint merge (l1 l2 : list nat) : list nat :=
match l1 with
| nil ⇒ l2
| h1 :: t1 ⇒ (fix merge_aux (l2' : list nat) :=
match l2' with
| nil ⇒ l1
| h2 :: t2 ⇒
if leb h1 h2 then
h1 :: merge t1 l2'
else
h2 :: merge_aux t2
end) l2
end.
match l1 with
| nil ⇒ l2
| h1 :: t1 ⇒ (fix merge_aux (l2' : list nat) :=
match l2' with
| nil ⇒ l1
| h2 :: t2 ⇒
if leb h1 h2 then
h1 :: merge t1 l2'
else
h2 :: merge_aux t2
end) l2
end.
The inner recursive function merge_aux is the "merge with l1" function.
It is created each time we reach that spot, adapted to the particular l1 we
have then. It calls the outer merge and itself recursively. The outer
merge has l1 decreasing, and the inner merge_aux has l2' decreasing.
Compute merge [1; 3; 3; 6] [0; 1; 2; 5; 7].
(* [0; 1; 1; 2; 3; 3; 5; 6; 7] : list nat *)
Trees
Inductive tree :=
node : nat → list tree → tree.
One parent node, with three children nodes.
Definition tree_ex1 : tree :=
node 7 [node 6 nil; node 3 nil; node 3 nil].
node 7 [node 6 nil; node 3 nil; node 3 nil].
Node 4 has two children, and the second of those has three children.
Definition tree_ex2 : tree :=
node 4 [node 2 nil; tree_ex1].
node 4 [node 2 nil; tree_ex1].
Return label of the "leftmost" leaf node.
Fixpoint leftmost_leaf (t : tree) : nat :=
match t with
| node n nil ⇒ n
| node n (child::rest) ⇒ leftmost_leaf child
end.
match t with
| node n nil ⇒ n
| node n (child::rest) ⇒ leftmost_leaf child
end.
Compute leftmost_leaf tree_ex1. (* 6 : nat *)
Compute leftmost_leaf tree_ex2. (* 2 : nat *)
This function searches through the tree for a node with a particular label.
Fixpoint occur (m : nat) (t : tree) : bool :=
match t with
| node n children ⇒
if beq_nat m n then true
else (fix occur_list (m' : nat) (l : list tree) : bool :=
match l with
| nil ⇒ false
| hd::tl ⇒ orb (occur m' hd) (occur_list m' tl)
end) m children
end.
match t with
| node n children ⇒
if beq_nat m n then true
else (fix occur_list (m' : nat) (l : list tree) : bool :=
match l with
| nil ⇒ false
| hd::tl ⇒ orb (occur m' hd) (occur_list m' tl)
end) m children
end.
We need nested recursion, because we need to search the tail list.
One could try calling occur m (node n tl), but node n tl isn't a subterm.
Compute occur 2 tree_ex2. (* true : bool *)
Compute occur 9 tree_ex2. (* false : bool *)
Another approach is a recursive function defined from the start to
look through a list of trees.
Fail Fixpoint occur_list (m : nat) (l : list tree) {struct l} : bool :=
match l with
| nil ⇒ false
| (node n children)::tl ⇒ beq_nat m n || occur_list m children || occur_list m tl
end.
(* Error: Recursive call to occur_list has principal argument equal to
"children" instead of "tl". *)
match l with
| nil ⇒ false
| (node n children)::tl ⇒ beq_nat m n || occur_list m children || occur_list m tl
end.
(* Error: Recursive call to occur_list has principal argument equal to
"children" instead of "tl". *)
I don't know why Coq accepts child as a subterm of node n (child::rest)
but not children as a subterm of (node n children)::tl. If it had worked,
one would then define:
Fail Definition occur' (m : nat) (t : tree) : bool := occur_list m [t].
In situations like this, it is sometimes fruitful to change the definition of the type.
Inductive tree' :=
node' : nat → forest → tree'
with forest :=
| fnil : forest
| fcons : tree' → forest → forest.
node' : nat → forest → tree'
with forest :=
| fnil : forest
| fcons : tree' → forest → forest.
We have defined mutually inductive types, with forest essentially the
same as list tree.
Now Coq will accept:
Fixpoint occur_forest (m : nat) (l : forest) : bool :=
match l with
| fnil ⇒ false
| fcons (node' n children) tl ⇒ beq_nat m n || occur_forest m children || occur_forest m tl
end.
Definition occur' (m : nat) (t :tree') : bool := occur_forest m (fcons t fnil).
match l with
| fnil ⇒ false
| fcons (node' n children) tl ⇒ beq_nat m n || occur_forest m children || occur_forest m tl
end.
Definition occur' (m : nat) (t :tree') : bool := occur_forest m (fcons t fnil).
One can also define mutually recursive functions over mutually inductive types.
The above two could be better organized in this way.
Sometimes recursion isn't structural. Then you need to prove to Coq that
your function terminates. One way is to specify an argument X of type T
and a measure function m : T → nat such that m X decreases on each
recursive call.
More complicated recursion
Require Import Recdef.
Function log2 (n : nat) {measure (fun m ⇒ m) n} : nat :=
match n with
| S (S n') ⇒ S (log2 (div2 n))
| _ ⇒ 0
end.
Proof.
(* Goal: ∀ n n0 n' : nat, n0 = S n' → n = S (S n') → div2 (S (S n')) < S (S n') *)
intros n n0 n'.
intros H1 H2.
(* Goal: div2 (S (S n')) < S (S n') *)
auto with arith. (* I'm surprised this works! *)
Defined.
Program Fixpoint is another tool for defining a function using a
measure. It is more flexible, in that the measure can depend on several
arguments. (It's also an expression, rather than a function.) And the
code it produces is more concise and faster, at least in one example I
tested (game_half_sum).
Function doesn't allow nested recursion, mutual recursion, dependent types,
or higher order functions.
Program Fixpoint allows most or all of these, but doesn't generate
as many inductive principles.
Once you've defined a function (and possibly proved it correct), you can
extract code in a variety of languages. The extraction procedure is smart
enough to throw away proofs that don't affect the result.
Code extraction
One can also extract to Haskell and Scheme. An entire C compiler,
called CompCert, has been written in Coq, proved to be correct relative to
a specification, and extracted to Ocaml!
(* Added later: See
https://www.irif.univ-paris-diderot.fr/~letouzey/download/letouzey_extr_cie08.pdf
for more about extraction (and recursion in general). And see the game_sum example.
Note that the extracted code is generally much faster than running the code in Coq. *)
Proof using reflection
Inductive Even : nat → Prop :=
| Even_O : Even 0
| Even_SS : ∀ n, Even n → Even (S (S n)).
Goal Even 0.
exact Even_O.
Qed.
Goal Even 4.
exact (Even_SS 2 (Even_SS 0 Even_O)).
Qed.
We can even use repeat constructor to prove that large numbers are Even.
Lemma by_constructor_1024 : Even 1024.
Proof.
repeat constructor. (* 1.5s *)
Defined. (* 1.6s *)
The time is roughly quadratic in the value! For Even 2n, we get
n nested occurrences of Even_SS, but each of those requires an argument m
which in unary requires m constructors.
An alternative approach.
Fixpoint is_even (n : nat) : bool :=
match n with
| 0 ⇒ true
| 1 ⇒ false
| S (S n') ⇒ is_even n'
end.
match n with
| 0 ⇒ true
| 1 ⇒ false
| S (S n') ⇒ is_even n'
end.
Soundness theorem. Proved using match, to mirror the function.
Fixpoint is_even_sound (n : nat) : is_even n = true → Even n :=
match n return is_even n = true → Even n with
| 0 ⇒ fun _ ⇒ Even_O
| 1 ⇒ fun pf : false = true ⇒ match pf with eq_refl ⇒ tt end
| S (S n') ⇒ fun pf : is_even n' = true ⇒ Even_SS _ (is_even_sound n' pf)
end.
match n return is_even n = true → Even n with
| 0 ⇒ fun _ ⇒ Even_O
| 1 ⇒ fun pf : false = true ⇒ match pf with eq_refl ⇒ tt end
| S (S n') ⇒ fun pf : is_even n' = true ⇒ Even_SS _ (is_even_sound n' pf)
end.
Notice how we are using computational properties of is_even to make
the types match up. This is why decidability of definitional equality
is essential for type checking.
(The second line is pretty confusing, as is the proof
that you can Print. I can explain on the board later if there is time.)
Lemma by_reflection_1024 : Even 1024.
Proof.
apply is_even_sound.
(* Goal: is_even 1024 = true *)
reflexivity. (* 0.005s *)
Defined. (* 0.02s *)
Lemma by_reflection_big : Even 16384.
Proof.
apply is_even_sound.
reflexivity. (* 0.09s *)
Defined. (* 0.27s *)
The speed is because the proof term is concise.
Print by_reflection_1024. (* is_even_sound 1024 eq_refl : Even 1024 *)
Coq doesn't need to fully evaluate that proof term to check its type. (Laziness!)
But if it did, it would obtain the same proof term we got the other way.
Goal by_reflection_1024 = by_constructor_1024.
reflexivity.
Qed.
reflexivity.
Qed.
A more substantial example of reflection: commutative monoids
Section cmonoid.
Variable A : Set.
Variable e : A.
Variable m : A → A → A.
Infix "+" := m.
Hypothesis identl : ∀ a, e + a = a.
Hypothesis identr : ∀ a, a + e = a.
Hypothesis assoc : ∀ a b c, (a + b) + c = a + (b + c).
Hypothesis comm : ∀ a b, a + b = b + a.
Consider a theorem such as:
Theorem monoid_ex1 (a b c : A) : (a + e) + (e + (b + (c + b))) = (e + (b + b) + (c + (a + e))).
Abort.
Abort.
The proof would be a bit tedious.
To reason about such equalities, we create a syntactic representation of them
that we can manipulate in gallina. We use natural numbers to index variables.
Inductive mexp : Type :=
| E : mexp
| Var : nat → mexp
| Add : mexp → mexp → mexp.
| E : mexp
| Var : nat → mexp
| Add : mexp → mexp → mexp.
For example, the following represent the above expressions, with different variables.
Definition LHS : mexp := (Add (Add (Var 0) E) (Add E (Add (Var 1) (Add (Var 2) (Var 1))))).
Definition RHS : mexp := (Add (Add E (Add (Var 1) (Var 1))) (Add (Var 2) (Add (Var 0) E))).
Definition RHS : mexp := (Add (Add E (Add (Var 1) (Var 1))) (Add (Var 2) (Add (Var 0) E))).
Variable var : nat → A.
Fixpoint mdenote (me : mexp) : A :=
match me with
| E ⇒ e
| Var v ⇒ var v
| Add me1 me2 ⇒ mdenote me1 + mdenote me2
end.
Compute mdenote LHS. (* var 0 + e + (e + (var 1 + (var 2 + var 1))) : A *)
Compute mdenote RHS. (* e + (var 1 + var 1) + (var 2 + (var 0 + e)) : A *)
Fixpoint mdenote (me : mexp) : A :=
match me with
| E ⇒ e
| Var v ⇒ var v
| Add me1 me2 ⇒ mdenote me1 + mdenote me2
end.
Compute mdenote LHS. (* var 0 + e + (e + (var 1 + (var 2 + var 1))) : A *)
Compute mdenote RHS. (* e + (var 1 + var 1) + (var 2 + (var 0 + e)) : A *)
We will normalize expressions by first flattening them into lists (dropping E),
which takes care of associativity and identity, and then sorting the result, to take
care of commutativity.
Fixpoint flatten (me : mexp) : list nat :=
match me with
| E ⇒ nil
| Var n ⇒ n :: nil
| Add me1 me2 ⇒ flatten me1 ++ flatten me2
end.
Require Import Sorting.
Import NatSort.
Definition sort_and_flatten (me : mexp) : list nat :=
sort (flatten me).
match me with
| E ⇒ nil
| Var n ⇒ n :: nil
| Add me1 me2 ⇒ flatten me1 ++ flatten me2
end.
Require Import Sorting.
Import NatSort.
Definition sort_and_flatten (me : mexp) : list nat :=
sort (flatten me).
Given a list of variable indices, we can get also interpret it.
Fixpoint mldenote (ls : list nat) : A :=
match ls with
| nil ⇒ e
| n :: tail ⇒ var n + mldenote tail
end.
Compute mldenote (sort_and_flatten LHS). (* var 0 + (var 1 + (var 1 + (var 2 + e))) : A *)
Compute mldenote (sort_and_flatten RHS). (* var 0 + (var 1 + (var 1 + (var 2 + e))) : A *)
match ls with
| nil ⇒ e
| n :: tail ⇒ var n + mldenote tail
end.
Compute mldenote (sort_and_flatten LHS). (* var 0 + (var 1 + (var 1 + (var 2 + e))) : A *)
Compute mldenote (sort_and_flatten RHS). (* var 0 + (var 1 + (var 1 + (var 2 + e))) : A *)
This might be a bit tricky to prove.
Theorem sort_and_flatten_correct : ∀ me, mdenote me = mldenote (sort_and_flatten me).
Admitted.
Admitted.
Now it is easy to prove the main soundness result.
Theorem cmonoid_reflect : ∀ me1 me2,
mldenote (sort_and_flatten me1) = mldenote (sort_and_flatten me2)
→ mdenote me1 = mdenote me2.
Proof.
intros; repeat rewrite sort_and_flatten_correct; assumption.
Qed.
mldenote (sort_and_flatten me1) = mldenote (sort_and_flatten me2)
→ mdenote me1 = mdenote me2.
Proof.
intros; repeat rewrite sort_and_flatten_correct; assumption.
Qed.
Now look at the motivating example again:
Theorem cmonoid_ex1 : (var 0 + e) + (e + (var 1 + (var 2 + var 1)))
= ((e + (var 1 + var 1)) + (var 2 + (var 0 + e))).
Proof.
exact (cmonoid_reflect LHS RHS eq_refl).
Qed.
= ((e + (var 1 + var 1)) + (var 2 + (var 0 + e))).
Proof.
exact (cmonoid_reflect LHS RHS eq_refl).
Qed.
It is still a bit tedious to write down LHS and RHS, but it's a
purely mechanical process to translate the expressions in the
statement into the syntactic type we created. In fact, one can create
a tactic that does this for you. See the references at the end, and
also the tactics built into Coq, like ring and omega.
Reflection is a very powerful tool. It is the basis for the SSReflect
library, which was developed for the proof of the Four Colour Theorem, by
Gonthier's team.
End cmonoid.
Efficiency (only if I have time to spare)
Fixpoint divonacci (n : nat) : nat :=
match n with
| 0 ⇒ 2
| 1 ⇒ 2
| S ((S n'') as n') ⇒ divonacci n' + div2 (divonacci n'')
end.
Compute map divonacci [0; 1; 2; 3; 4; 5; 6; 7; 8; 9; 10].
(* [2; 2; 3; 4; 5; 7; 9; 12; 16; 22; 30] : list nat *)
Compute divonacci 34. (* 53134 : nat, 5.0s *)
Fixpoint divonacci_aux (n : nat) (dn' dn'' : nat): nat :=
match n with
| 0 ⇒ dn''
| 1 ⇒ dn'
| S n' ⇒ divonacci_aux n' (dn' + div2 dn'') dn'
end.
Definition divonacci_fast (n : nat) : nat := divonacci_aux n 2 2.
Compute map divonacci_fast [0; 1; 2; 3; 4; 5; 6; 7; 8; 9; 10].
(* [2; 2; 3; 4; 5; 7; 9; 12; 16; 22; 30] : list nat *)
Compute divonacci_fast 34. (* 53134 : nat, 0.2s *)
match n with
| 0 ⇒ dn''
| 1 ⇒ dn'
| S n' ⇒ divonacci_aux n' (dn' + div2 dn'') dn'
end.
Definition divonacci_fast (n : nat) : nat := divonacci_aux n 2 2.
Compute map divonacci_fast [0; 1; 2; 3; 4; 5; 6; 7; 8; 9; 10].
(* [2; 2; 3; 4; 5; 7; 9; 12; 16; 22; 30] : list nat *)
Compute divonacci_fast 34. (* 53134 : nat, 0.2s *)
More generally, recursive calls can often be eliminated using
an appropriate data structure to record past or pending work.
Endnotes
- There are a variety of evaluation techniques, such as Eval compute,
Eval vm_compute, Eval lazy, etc. See Section 8.7 of the reference manual.
Compute is short for Eval vm_compute in.
- There are many tutorials about recursive functions in Coq, but some
have questionable examples. I borrowed from one due to Benjamin Grégoire:
http://www.di.ens.fr/~zappa/teaching/coq/ecole11/summer/lectures/lec9.pdf
- My example of reflection is based on a blog post of Gregory Malecha:
https://gmalecha.github.io/reflections/2015/10/03/computational-reflection-primer/.
I also combined ideas from http://poleiro.info/posts/2015-04-13-writing-reflective-tactics.html and http://adam.chlipala.net/cpdt/html/Reflection.html. - Good tutorials on recursive functions and proving their correctness, with
sorting as important examples:
Eric Kidd, blog post,
Ömer Ağacan, blog post,
Coq in a Hurry, Bertot,
Altenkirch, Lecture 7.
Sorting is also part of the Coq standard library.
- Another good reference on Inductive types, induction principles, recursive functions, etc. is the Inductive Types chapter of Adam Chlipala's online book.