Computation in Coq

Dan Christensen, UWO, June 7, 2016
Outline:
  • Recursive functions
  • Proving things about recursive functions
  • Lists and nested recursive functions
  • Trees and more complicated recursive functions
  • Program Extraction
  • Using recursive functions to prove things (a.k.a. reflection)
  • Efficiency
  • Endnotes
This web page is generated from a Coq source file using coqdoc. You can view this file and the Coq source file via http://jdc.math.uwo.ca/HoTT. Direct link: computation.v. Tested with Coq 8.4.

First example


Require Import Bool Arith.

Definition nonzero (n : nat) : bool :=
  match n with
  | 0 ⇒ false
  | S mtrue
  end.

In an interactive session, the following would display the result of "running" this function.
Compute nonzero 0. (* false : bool *)
Compute nonzero 3. (* true : bool *)

"Computation" in Coq is simply the process of using definitional equality to simplify a term until it is in normal form, e.g. nonzero 3 nonzero (S (S (S 0))) true.

Recursive functions

Note that match is a primitive concept in Coq, which when combined with Fixpoint is equivalent to the induction principles associated to inductive types. Thus, they can be used to make recursive functions.
Fixpoint factorial (n : nat) : nat :=
  match n with
  | 0 ⇒ 1
  | S mn × factorial m
  end.

Compute factorial 7. (* 5040 : nat *)

Definition factorial_ex1 : factorial 7 = 7×6×5×4×3×2×1 := eq_refl.

It is essential for logical consistency that all functions terminate on all inputs. For example, the following would type check, but doesn't terminate, so Coq doesn't accept the definition.
Fail Fixpoint loop (n : nat): Empty_set := loop n.
(* Error: Recursive call to loop has principal argument equal to 
   "n" instead of a subterm of "n". *)


If it were accepted, then loop 0 would be of type Empty_set. Oops.
For Coq to accept a definition using Fixpoint, there must be an argument of your function so that all of your recursive calls pass in a proper subterm of that argument.
For example, m is a proper subterm of S m. We'll get to more advanced examples of recursion soon.

Aside: Laziness

Coq is in general lazy about computation. It does it when needed to check the type of a term, e.g. checking that eq_refl has type factorial 7 = 7*6*5*4*3*2*1, but otherwise leaves things alone. This laziness is crucial for efficiency.
For example, factorial 9 causes a stack overflow on my machine, but the following examples type check without a problem.
Definition factorial_ex2 : factorial 100 = factorial 100 := eq_refl.

Definition factorial_ex3 : nat := factorial 100.

Proving things about recursive functions

Here's another recursive function, which divides by 2 and rounds down. Note that it calls itself on a sub-sub-term of its input.
Fixpoint div2 (n : nat) :=
  match n with
  | 0 ⇒ 0
  | S 0 ⇒ 0
  | S (S p) ⇒ S (div2 p)
  end.

We ask Coq to create an induction principle for this function.
Functional Scheme div2_ind := Induction for div2 Sort Prop.

We use this to prove a Lemma. The cases match the cases in div2. We could also use ordinary induction, twice.
Lemma div2_le : n, div2 n × 2 ≤ n.
Proof.
  intro n.
  functional induction div2 n.
  - (* Goal: 0 × 2 0 *)
    auto.
  - (* Goal: 0 × 2 1 *)
    auto.
  - (* Goal: (S (div2 p)) × 2 S (S p) *)
    (* Given: IHN0 : div2 p × 2 p *)
    simpl.
    (* Goal: S (S (div2 p × 2)) S (S p) *)
    auto with arith.
Qed.

Lists

Lists are defined in this way.
Inductive mylist (A : Type) :=
| mynil : mylist A
| mycons : Amylist Amylist A.

Definition mylist_ex1 : mylist nat := mycons nat 0 (mycons nat 1 (mynil nat)).

We'll use Coq's list type, which comes with nice notations.
Require Import List.
Import ListNotations.

Definition list_ex1 : list nat := [1; 2; 4; 3].
Definition list_ex2 : list nat := 1 :: 2 :: 4 :: 3 :: nil.
Definition list_ex_eq : list_ex1 = list_ex2 := eq_refl.

Fixpoint last {A : Type} (default : A) (lst : list A) : A :=
  match lst with
  | nildefault
  | hd :: nilhd
  | hd :: taillast default tail
  end.

Compute last 0 list_ex1. (* 3 : nat *)

last is inductively defined by structural recursion on the argument lst.
tail is a proper subterm of the list head :: tail.

Nested structural recursion

Now suppose I want to write a function that merges two sorted lists of natural numbers into a new sorted list. A naive try would be:
Fail Fixpoint merge (l1 l2 : list nat) : list nat :=
  match l1 with
  | nill2
  | h1 :: t1match l2 with
                | nill1
                | h2 :: t2
                  if leb h1 h2 then
                    h1 :: merge t1 l2
                  else
                    h2 :: merge l1 t2
                end
  end.
(* Error: Cannot guess decreasing argument of fix. *)

Here's one way to accomplish this.
Fixpoint merge (l1 l2 : list nat) : list nat :=
  match l1 with
  | nill2
  | h1 :: t1 ⇒ (fix merge_aux (l2' : list nat) :=
                   match l2' with
                   | nill1
                   | h2 :: t2
                     if leb h1 h2 then
                       h1 :: merge t1 l2'
                     else
                       h2 :: merge_aux t2
                   end) l2
  end.

The inner recursive function merge_aux is the "merge with l1" function. It is created each time we reach that spot, adapted to the particular l1 we have then. It calls the outer merge and itself recursively. The outer merge has l1 decreasing, and the inner merge_aux has l2' decreasing.

Compute merge [1; 3; 3; 6] [0; 1; 2; 5; 7].
(* [0; 1; 1; 2; 3; 3; 5; 6; 7] : list nat *)

Trees

We'll study a type of trees whose nodes are labelled with natural numbers, and whose children are ordered.

Inductive tree :=
  node : natlist treetree.

One parent node, with three children nodes.
Definition tree_ex1 : tree :=
  node 7 [node 6 nil; node 3 nil; node 3 nil].

Node 4 has two children, and the second of those has three children.
Definition tree_ex2 : tree :=
  node 4 [node 2 nil; tree_ex1].

Return label of the "leftmost" leaf node.
Fixpoint leftmost_leaf (t : tree) : nat :=
  match t with
  | node n niln
  | node n (child::rest) ⇒ leftmost_leaf child
  end.
Coq considers child a subterm of node n (child::rest), so accepts the above.

Compute leftmost_leaf tree_ex1. (* 6 : nat *)
Compute leftmost_leaf tree_ex2. (* 2 : nat *)

This function searches through the tree for a node with a particular label.
Fixpoint occur (m : nat) (t : tree) : bool :=
  match t with
  | node n children
    if beq_nat m n then true
    else (fix occur_list (m' : nat) (l : list tree) : bool :=
            match l with
            | nilfalse
            | hd::tlorb (occur m' hd) (occur_list m' tl)
            end) m children
  end.
We need nested recursion, because we need to search the tail list. One could try calling occur m (node n tl), but node n tl isn't a subterm.

Compute occur 2 tree_ex2. (* true : bool *)
Compute occur 9 tree_ex2. (* false : bool *)

Another approach is a recursive function defined from the start to look through a list of trees.
Fail Fixpoint occur_list (m : nat) (l : list tree) {struct l} : bool :=
  match l with
  | nilfalse
  | (node n children)::tlbeq_nat m n || occur_list m children || occur_list m tl
  end.
(* Error: Recursive call to occur_list has principal argument equal to
"children" instead of "tl". *)

I don't know why Coq accepts child as a subterm of node n (child::rest) but not children as a subterm of (node n children)::tl. If it had worked, one would then define:
Fail Definition occur' (m : nat) (t : tree) : bool := occur_list m [t].

In situations like this, it is sometimes fruitful to change the definition of the type.
Inductive tree' :=
  node' : natforesttree'
with forest :=
| fnil : forest
| fcons : tree'forestforest.
We have defined mutually inductive types, with forest essentially the same as list tree.
Now Coq will accept:
Fixpoint occur_forest (m : nat) (l : forest) : bool :=
  match l with
  | fnilfalse
  | fcons (node' n children) tlbeq_nat m n || occur_forest m children || occur_forest m tl
  end.

Definition occur' (m : nat) (t :tree') : bool := occur_forest m (fcons t fnil).

One can also define mutually recursive functions over mutually inductive types. The above two could be better organized in this way.

More complicated recursion

Sometimes recursion isn't structural. Then you need to prove to Coq that your function terminates. One way is to specify an argument X of type T and a measure function m : T nat such that m X decreases on each recursive call.

Require Import Recdef.

Function log2 (n : nat) {measure (fun mm) n} : nat :=
  match n with
  | S (S n') ⇒ S (log2 (div2 n))
  | _ ⇒ 0
  end.
Proof.
  (* Goal:  n n0 n' : nat, n0 = S n' n = S (S n') div2 (S (S n')) < S (S n') *)
  intros n n0 n'.
  intros H1 H2.
  (* Goal: div2 (S (S n')) < S (S n') *)
  auto with arith. (* I'm surprised this works! *)
Defined.

Program Fixpoint is another tool for defining a function using a measure. It is more flexible, in that the measure can depend on several arguments. (It's also an expression, rather than a function.) And the code it produces is more concise and faster, at least in one example I tested (game_half_sum).
Function doesn't allow nested recursion, mutual recursion, dependent types, or higher order functions. Program Fixpoint allows most or all of these, but doesn't generate as many inductive principles.

Code extraction

Once you've defined a function (and possibly proved it correct), you can extract code in a variety of languages. The extraction procedure is smart enough to throw away proofs that don't affect the result.
Extraction Language Ocaml.
Extraction "log2.ml" log2. (* See log2.ml *)

One can also extract to Haskell and Scheme. An entire C compiler, called CompCert, has been written in Coq, proved to be correct relative to a specification, and extracted to Ocaml!

(* Added later:  See
     https://www.irif.univ-paris-diderot.fr/~letouzey/download/letouzey_extr_cie08.pdf
   for more about extraction (and recursion in general).  And see the game_sum example.
   Note that the extracted code is generally much faster than running the code in Coq. *)


Proof using reflection

The rough idea is to prove something by translating it into a question that becomes true by definitional equality. Here is a toy example.

Inductive Even : natProp :=
| Even_O : Even 0
| Even_SS : n, Even nEven (S (S n)).

Goal Even 0.
  exact Even_O.
Qed.

Goal Even 4.
  exact (Even_SS 2 (Even_SS 0 Even_O)).
Qed.

We can even use repeat constructor to prove that large numbers are Even.

Lemma by_constructor_1024 : Even 1024.
Proof.
  repeat constructor. (* 1.5s *)
Defined. (* 1.6s *)
The time is roughly quadratic in the value! For Even 2n, we get n nested occurrences of Even_SS, but each of those requires an argument m which in unary requires m constructors.
An alternative approach.
Fixpoint is_even (n : nat) : bool :=
  match n with
  | 0 ⇒ true
  | 1 ⇒ false
  | S (S n') ⇒ is_even n'
  end.

Soundness theorem. Proved using match, to mirror the function.
Fixpoint is_even_sound (n : nat) : is_even n = trueEven n :=
  match n return is_even n = trueEven n with
  | 0 ⇒ fun _Even_O
  | 1 ⇒ fun pf : false = truematch pf with eq_refltt end
  | S (S n') ⇒ fun pf : is_even n' = trueEven_SS _ (is_even_sound n' pf)
  end.
Notice how we are using computational properties of is_even to make the types match up. This is why decidability of definitional equality is essential for type checking.
(The second line is pretty confusing, as is the proof that you can Print. I can explain on the board later if there is time.)

Lemma by_reflection_1024 : Even 1024.
Proof.
  apply is_even_sound.
  (* Goal: is_even 1024 = true *)
  reflexivity. (* 0.005s *)
Defined. (* 0.02s *)

Lemma by_reflection_big : Even 16384.
Proof.
  apply is_even_sound.
  reflexivity. (* 0.09s *)
Defined. (* 0.27s *)

The speed is because the proof term is concise.
Print by_reflection_1024. (* is_even_sound 1024 eq_refl : Even 1024 *)

Coq doesn't need to fully evaluate that proof term to check its type. (Laziness!) But if it did, it would obtain the same proof term we got the other way.
Goal by_reflection_1024 = by_constructor_1024.
  reflexivity.
Qed.

A more substantial example of reflection: commutative monoids

The point of reflection isn't really speed. It's the ability to have Coq automatically handle tedious proofs.

Section cmonoid.
  Variable A : Set.
  Variable e : A.
  Variable m : AAA.
  Infix "+" := m.

  Hypothesis identl : a, e + a = a.
  Hypothesis identr : a, a + e = a.
  Hypothesis assoc : a b c, (a + b) + c = a + (b + c).
  Hypothesis comm : a b, a + b = b + a.

Consider a theorem such as:
  Theorem monoid_ex1 (a b c : A) : (a + e) + (e + (b + (c + b))) = (e + (b + b) + (c + (a + e))).
  Abort.
The proof would be a bit tedious.
To reason about such equalities, we create a syntactic representation of them that we can manipulate in gallina. We use natural numbers to index variables.
  Inductive mexp : Type :=
  | E : mexp
  | Var : natmexp
  | Add : mexpmexpmexp.

For example, the following represent the above expressions, with different variables.
  Definition LHS : mexp := (Add (Add (Var 0) E) (Add E (Add (Var 1) (Add (Var 2) (Var 1))))).
  Definition RHS : mexp := (Add (Add E (Add (Var 1) (Var 1))) (Add (Var 2) (Add (Var 0) E))).

Next, we write an interpretation function, given variables var n of type A.
  Variable var : natA.

  Fixpoint mdenote (me : mexp) : A :=
    match me with
      | Ee
      | Var vvar v
      | Add me1 me2mdenote me1 + mdenote me2
    end.

  Compute mdenote LHS. (* var 0 + e + (e + (var 1 + (var 2 + var 1))) : A *)
  Compute mdenote RHS. (* e + (var 1 + var 1) + (var 2 + (var 0 + e)) : A *)

We will normalize expressions by first flattening them into lists (dropping E), which takes care of associativity and identity, and then sorting the result, to take care of commutativity.
  Fixpoint flatten (me : mexp) : list nat :=
    match me with
      | Enil
      | Var nn :: nil
      | Add me1 me2flatten me1 ++ flatten me2
    end.

  Require Import Sorting.
  Import NatSort.

  Definition sort_and_flatten (me : mexp) : list nat :=
    sort (flatten me).

Given a list of variable indices, we can get also interpret it.
  Fixpoint mldenote (ls : list nat) : A :=
    match ls with
      | nile
      | n :: tailvar n + mldenote tail
    end.

  Compute mldenote (sort_and_flatten LHS). (* var 0 + (var 1 + (var 1 + (var 2 + e))) : A *)
  Compute mldenote (sort_and_flatten RHS). (* var 0 + (var 1 + (var 1 + (var 2 + e))) : A *)

This might be a bit tricky to prove.
  Theorem sort_and_flatten_correct : me, mdenote me = mldenote (sort_and_flatten me).
  Admitted.

Now it is easy to prove the main soundness result.
  Theorem cmonoid_reflect : me1 me2,
    mldenote (sort_and_flatten me1) = mldenote (sort_and_flatten me2)
    → mdenote me1 = mdenote me2.
  Proof.
    intros; repeat rewrite sort_and_flatten_correct; assumption.
  Qed.

Now look at the motivating example again:
  Theorem cmonoid_ex1 : (var 0 + e) + (e + (var 1 + (var 2 + var 1)))
                        = ((e + (var 1 + var 1)) + (var 2 + (var 0 + e))).
  Proof.
    exact (cmonoid_reflect LHS RHS eq_refl).
  Qed.
It is still a bit tedious to write down LHS and RHS, but it's a purely mechanical process to translate the expressions in the statement into the syntactic type we created. In fact, one can create a tactic that does this for you. See the references at the end, and also the tactics built into Coq, like ring and omega.
Reflection is a very powerful tool. It is the basis for the SSReflect library, which was developed for the proof of the Four Colour Theorem, by Gonthier's team.

End cmonoid.

Efficiency (only if I have time to spare)

Here's an example illustrating a strategy for efficient recursion.

Fixpoint divonacci (n : nat) : nat :=
  match n with
  | 0 ⇒ 2
  | 1 ⇒ 2
  | S ((S n'') as n') ⇒ divonacci n' + div2 (divonacci n'')
  end.

Compute map divonacci [0; 1; 2; 3; 4; 5; 6; 7; 8; 9; 10].
(* [2; 2; 3; 4; 5; 7; 9; 12; 16; 22; 30] : list nat *)

Compute divonacci 34. (* 53134 : nat, 5.0s *)

Do the iteration n times, starting with dn'' and dn'.
Fixpoint divonacci_aux (n : nat) (dn' dn'' : nat): nat :=
  match n with
  | 0 ⇒ dn''
  | 1 ⇒ dn'
  | S n'divonacci_aux n' (dn' + div2 dn'') dn'
  end.

Definition divonacci_fast (n : nat) : nat := divonacci_aux n 2 2.

Compute map divonacci_fast [0; 1; 2; 3; 4; 5; 6; 7; 8; 9; 10].
(* [2; 2; 3; 4; 5; 7; 9; 12; 16; 22; 30] : list nat *)

Compute divonacci_fast 34. (* 53134 : nat, 0.2s *)

More generally, recursive calls can often be eliminated using an appropriate data structure to record past or pending work.

Endnotes