7. Lists
If you have already used a functional programming language like Haskell you know that lists are a ubiquitous data structure. Given A : Type there is list A : Type, which is the type of finite sequences of elements of A. So for example [1,2,3] : list ℕ or [tt] : list bool. We can also have lists of lists, for example [[],[0],[0,1],[0,1,2]] : list (list ℕ) is a list of lists of numbers. However, all elements of a list must have the same type, hence something like [1,tt] is not permitted.
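We can ask Lean to confirm these typings (a quick check; #check just reports the type of an expression):

#check ([1,2,3] : list ℕ)     -- list ℕ
#check [tt]                   -- list bool
#check [[],[0],[0,1],[0,1,2]] -- list (list ℕ)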
For any type A we can always construct the empty list [] : list A, which is also called nil (this comes from the phrase "not in list" and was used in the first functional language LISP, first implemented in 1959). On the other hand, if we have an element a : A and a list l : list A we can form the list a :: l : list A with a in front of l, e.g. 1 :: [2,3] = [1,2,3]. We call this cons - another heritage from LISP. Indeed the notation using [ .. ] is just a short-hand, e.g.:

[1,2,3] = 1 :: 2 :: 3 :: []

or using nil and cons:

[1,2,3] = cons 1 (cons 2 (cons 3 nil))
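Lean agrees that these are just different spellings of the same list (a small sanity check):

example : [1,2,3] = 1 :: 2 :: 3 :: [] := rfl
example : [1,2,3] = list.cons 1 (list.cons 2 (list.cons 3 list.nil)) := rfl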
All lists can be created from nil and cons, hence lists can be defined inductively in a manner very similar to ℕ:
inductive list (A : Type)
| nil : list
| cons : A → list → list
This declaration means:
- For any type A : Type there is a new type list A : Type,
- There are elements nil : list A, and given a : A and l : list A we have cons a l : list A,
- All the elements of list A can be generated by nil and then applying cons a finite number of times; nil and cons a l are different elements, and cons is injective, i.e. given cons a as = cons b bs we know that a = b and as = bs.
7.1. Basic properties of list
Most of the basic properties and their proofs are very similar to those for the natural numbers. First of all we want to show that nil ≠ a :: l, which corresponds to 0 ≠ succ n and true ≠ false which we have seen before. These properties are called no confusion properties. We can use contradiction as before for natural numbers:
theorem no_conf : ∀ (a:A)(l : list A), nil ≠ a :: l :=
begin
assume a l,
contradiction
end
Next we want to show injectivity of ::. Unlike in the case of natural numbers there are two results we want to prove: given a :: l = a' :: l' we need to show that a = a' and l = l'. The second part we can prove the same way as for ℕ: instead of pred we define tl (for tail) and use this in the proof:
def tl : list A → list A
| [] := []
| (a :: l) := l
theorem inj_tl : ∀ (a a':A)(l l' : list A), a :: l = a' :: l' → l = l' :=
begin
assume a a' l l' h,
change tl (a :: l) = tl (a' :: l'),
rewrite h,
end
However, the first part turns out to be more difficult. It seems that we need to define a function hd : list A → A which extracts the head of a list. But what should we do for the [] case?

def hd : list A → A
| [] := ?
| (a :: l) := a
There is really nothing we can plug in here because A may be empty, but there is an empty list over the empty type. We could define a function that returns an error, i.e.

def hd : list A → option A
| [] := none
| (a :: l) := some a

But this doesn't help us to prove injectivity. I leave this as a challenge because the only solution I know involves dependent types, which I am not covering in this course.
However, the injection tactic which we have seen before works in this case:
theorem inj_hd : ∀ (a a':A)(l l' : list A), a :: l = a' :: l' → a = a' :=
begin
assume a a' l l' h,
injection h,
end
injection also works in the proof of inj_tl, alleviating the need to define tl.
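For instance, the following variant of inj_tl goes through without defining tl (a sketch mirroring the proof of inj_hd above):

theorem inj_tl' : ∀ (a a':A)(l l' : list A), a :: l = a' :: l' → l = l' :=
begin
assume a a' l l' h,
injection h,  -- as in inj_hd, injection derives both equations from h
end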
7.2. The free monoid
In the section on natural numbers we encountered a monoid using + and 0 on ℕ. There is a very similar monoid for lists using ++ and []. Here ++ (append) is the operation that appends two lists, e.g. [1,2] ++ [3,4] = [1,2,3,4]. We define it recursively:
definition append : list A → list A → list A
| [] l := l
| (h :: s) t := h :: (append s t)
local notation l₁ ++ l₂ := append l₁ l₂
I am cheating a bit here because, like +, ++ is already defined in the Lean prelude.
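As a quick check (assuming the local notation above is in scope):

#reduce [1,2] ++ [3,4]   -- [1, 2, 3, 4]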
In the definition of append we are using pattern matching and structural recursion on lists. This works in the same way as for natural numbers: we can recursively use append for the smaller list s. To prove that this forms a monoid we need induction. However, unlike for +, left neutrality is now easy because it follows from the definition:
theorem app_lneutr : forall l : list A, [] ++ l = l :=
begin
assume l,
reflexivity,
end
But now right neutrality requires induction:
theorem app_rneutr : forall l : list A, l ++ [] = l :=
begin
assume l,
induction l with a l' ih,
reflexivity,
apply congr_arg (cons a),
exact ih,
end
Induction for lists means: to prove something for all lists you prove it for the empty list [], and assuming that it holds for a list l you show that it holds for a :: l for any a. Comparing the proof with the one for left neutrality for + we see that it is almost the same proof, replacing congr_arg succ with congr_arg (cons a).
The switch between left and right neutrality is a consequence of the fact that we have defined + by recursion over the 2nd argument while we have defined ++ by recursion over the first. We could have defined + by recursion over the first argument and then this difference would have disappeared (I prefer this but the designers of the Lean standard library had different ideas). We couldn't have defined ++ by recursion over the 2nd argument - do you see why?
The proof of associativity of ++ is again very similar to the one for +. The main difference is that now we do induction over the first argument instead of the last - again this is caused by the fact that we now use recursion over the first instead of the 2nd argument.
theorem app_assoc :
forall l₁ l₂ l₃ : list A , (l₁ ++ l₂) ++ l₃ = l₁ ++ (l₂ ++ l₃) :=
begin
assume l₁ l₂ l₃,
induction l₁ with a l₁' ih,
reflexivity,
dsimp [(++),list.append],
apply congr_arg (cons a),
exact ih,
end
For + we also proved commutativity, that is m + n = n + m, but clearly this doesn't hold for ++, e.g. [1,2] ++ [3,4] = [1,2,3,4] while [3,4] ++ [1,2] = [3,4,1,2].
Indeed list A with ++ and [] is the free monoid over A, which intuitively means that only the monoid equations hold but no additional laws like commutativity. In this sense this monoid is free not to follow any laws apart from the monoid ones.
7.3. Reverse
Since the list monoid is not commutative, order matters. In particular we can reverse a list. That is, we are going to define a function which, given a list like [1,2,3], produces the list with the elements in reverse order, in this case [3,2,1].
How do we reverse a list? Recursively! The reverse of the empty list is the empty list, and the reverse of a list of the form a :: l is the reverse of l with a put at the end. So for example the reverse of [1,2,3] = 1 :: [2,3] is the reverse of [2,3], i.e. [3,2], with 1 put at the end, giving [3,2,1].
To define reverse we need an auxiliary operation which puts an element at the end. We call this operation snoc because it is the reverse of cons. We could define snoc using ++, but for our purposes it is slightly more convenient to define it directly using recursion:
def snoc : list A → A → list A
| [] a := [a]
| (a :: as) b := a :: (snoc as b)
#reduce (snoc [1,2] 3)
Using snoc we are ready to implement the recursive recipe for reversing a list:
def rev : list A → list A
| [] := []
| (a :: as) := snoc (rev as) a
#reduce rev [1,2,3]
A central property of rev is that it is self-inverse, that is, if we reverse a list twice we obtain the original list, e.g. rev (rev [1,2,3]) = rev [3,2,1] = [1,2,3]. Ok, let's try and prove this using list induction:
theorem revrev : ∀ as : list A , rev (rev as) = as :=
begin
assume as,
induction as with a as' ih,
reflexivity,
dsimp [rev],
sorry,
end
I got stuck in this proof; I am in the following situation:
ih : rev (rev as') = as'
⊢ rev (snoc (rev as') a) = a :: as'
I cannot apply my induction hypothesis because my current goal doesn't contain an expression of the form rev (rev as'). However, I should be able to exploit the fact that to reverse a snoc is the same as a cons of the reverse. Then we can reason:
rev (snoc (rev as') a)
= a :: rev (rev as')
= a :: as'
This leads us to proving the following lemma:
theorem revsnoc :
∀ a : A, ∀ as : list A, rev (snoc as a) = a :: rev as :=
begin
assume a as,
induction as with b as' ih,
reflexivity,
dsimp [snoc,rev],
rewrite ih,
dsimp [snoc],
reflexivity,
end
This is exactly what was missing to complete the proof of revrev:
theorem revrev : ∀ as : list A , rev (rev as) = as :=
begin
assume as,
induction as with a as' ih,
reflexivity,
dsimp [rev],
rewrite revsnoc,
rewrite ih,
end
This is a nice example of the art of proving, which is a bit like putting stepping stones into a stream to cross it without getting wet feet. When we get stuck with an induction, looking at the point where we are stuck often leads us to identifying another property which we can prove and which helps us to complete the original proof. There is no fixed method for identifying a good auxiliary property (a lemma); it is a skill which improves with practice.
Here is another problem to do with reverse which you can use to practice this skill: if you have attended Prof Hutton's Haskell course you will know that the above definition of reverse is very inefficient; indeed, it has quadratic complexity. A better definition is the following:
def revaux : list A → list A → list A
| [] bs := bs
| (a :: as) bs := revaux as (a :: bs)
def fastrev (l : list A) : list A
:= revaux l []
#reduce fastrev [1,2,3]
fastrev only has linear complexity. However, we should convince ourselves that it behaves in the same way as rev, that is, we should prove the following theorem:
theorem fastrev_thm : ∀ as : list A , fastrev as = rev as :=
begin
sorry,
end
I leave it as an exercise to figure out what lemma(s) we need. Also it may be useful to employ some properties we have already proven.
7.4. Functors and Naturality
The list operator doesn’t just work on types but also on functions. That means we can write a map function which applies a function to every element of a list:
def map_list : (A → B) → list A → list B
| f [] := []
| f (a :: as) := (f a) :: (map_list f as)
Actually map_list satisfies some important equations. To understand them we need to have a closer look at functions. What are the basic ingredients of the function type? There are two: the identity function and function composition.
def id : A → A
| a := a
def comp : (B → C) → (A → B) → A → C
| g f a := g (f a)
local notation (name := comp) f ∘ g := comp f g
We write f ∘ g for the composition of f and g. You may notice that composition seems to be defined backwards, that is, we first run g and then f. The reason for this is that function application also works backwards: first we evaluate the argument and then the function. Since composition is defined via application it seems a good idea not to change the direction.
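For example (a small check using the notation just defined):

-- the doubling runs first, then we add one
#reduce ((λ n : ℕ, n + 1) ∘ (λ n : ℕ, 2 * n)) 3   -- 7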
There are some basic equations involving id and ∘, namely:
theorem idl : ∀ f : A → B, id ∘ f = f := sorry
theorem idr : ∀ f : A → B, f ∘ id = f := sorry
theorem assoc : ∀ h : C → D, ∀ g : B → C,
∀ f : A → B, (h ∘ g) ∘ f = h ∘ (g ∘ f) := sorry
These equations look like what? Yes, like a monoid. However, this isn’t really a monoid because there isn’t one fixed type on which the operations are defined but a whole family of types. This structure is called a category, which is the basic notion in the field of category theory.
To prove these equations we need to know how to prove equations about functions: this is given by a Lean axiom called funext which says that to prove two functions equal you need to show that they produce the same outputs for all inputs. In predicate logic: given f g : A → B,

(∀ a : A, f a = g a) → f = g
Using funext the laws are easy to show. I just give you the first:
theorem idl : ∀ f : A → B, id ∘ f = f :=
begin
assume f,
apply funext,
assume x,
reflexivity,
end
I leave the other two as an easy exercise.
Now back to map_list: we need to show that it preserves identity and composition, that is, mapping the identity function gives rise to the identity and mapping a composition of two functions is the composition of mapping them.
theorem map_id : map_list (@id A) = id := sorry
theorem map_comp : ∀ g : B → C, ∀ f : A → B,
map_list (g ∘ f) = (map_list g) ∘ (map_list f) := sorry
I am writing @id A instead of just id because I need to tell Lean that I want to use the type variable A here, so I have to give it explicitly. @id is id with the implicit parameter A : Type made explicit.
We can prove the laws using funext and list induction. Here is the first one - a proof in the style of app_rneutr, combining funext with list induction:
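theorem map_id : map_list (@id A) = id :=
begin
-- a sketch: funext first, then list induction as in app_rneutr
apply funext,
assume as,
induction as with a as' ih,
reflexivity,
dsimp [map_list, id],
apply congr_arg (cons a),
exact ih,
end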
Again I leave the 2nd one as an exercise. In the language of category theory, the fact that list comes with an operation on functions (map_list) satisfying these laws means that list is a functor.
Now rev : list A → list A is an operation on lists that is natural, which means it works on all types A uniformly. This can be expressed by the following theorem using map_list:
theorem nat_rev : ∀ f : A → B, ∀ l : list A,
map_list f (rev l) = rev (map_list f l) := sorry
This says that it is the same whether you first map a function and then reverse, or first reverse and then map the function. For example, if you have a function is_even : ℕ → bool and a list [1,2,4] you can first map is_even, obtaining [ff,tt,tt], and then reverse, obtaining [tt,tt,ff]; or you first reverse, obtaining [4,2,1], and then map is_even, also ending up with [tt,tt,ff]. The only way rev could break this would be by behaving differently on lists of natural numbers and lists of booleans - but this would be very unnatural.
Instead of proving the theorem (which I leave as an exercise - you will need a lemma) let me remark that equations like the one above are often shown in category theory using commutative diagrams, in this case, assuming f : A → B, the following naturality square:
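\[
\begin{array}{ccc}
\mathtt{list}\;A & \xrightarrow{\;\mathtt{map\_list}\;f\;} & \mathtt{list}\;B \\
\big\downarrow\,\mathtt{rev} & & \big\downarrow\,\mathtt{rev} \\
\mathtt{list}\;A & \xrightarrow{\;\mathtt{map\_list}\;f\;} & \mathtt{list}\;B
\end{array}
\]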
Following the arrows labelled by functions corresponds to function composition, and different paths express that these compositions have the same result; we say the diagram commutes. This also suggests that we could have written the naturality theorem using compositions:

rev ∘ map_list f = map_list f ∘ rev

This is called point-free style, since we avoid talking about elements.
7.5. Insertion sort
Finally, we are going to look at a slightly more interesting algorithm: insertion sort. Insertion sort is quite an inefficient algorithm because it has quadratic complexity while the best sorting algorithms have \(n \log{n}\) complexity. However, insertion sort is especially easy to implement and verify.
To keep things simple I will only sort lists of natural numbers wrt ≤ : ℕ → ℕ → Prop, using the fact that we have an implementation in the form of ble : ℕ → ℕ → bool. That means for the following we assume in particular that ble decides ≤, i.e.

lemma Le2ble : ∀ m n : ℕ, m ≤ n → ble m n = tt := sorry
lemma ble2Le : ∀ m n : ℕ, ble m n = tt → m ≤ n := sorry
7.5.1. Implementing sort
So our goal is to implement a function sort : list ℕ → list ℕ that sorts a given list, e.g. sort [6,3,8,2,3] = [2, 3, 3, 6, 8].
Insertion sort can easily be done with a deck of cards: you start with an unsorted deck, and each time you take a card from the top and insert it at the appropriate position in the already sorted pile. Inserting a card means you go through the sorted pile until you find a card that is greater and then insert it just before that. Ok, there is also the special case that the sorted pile is empty, in which case you just make a pile out of one card. Here is the recursive function implementing the algorithm:
def ins : ℕ → list ℕ → list ℕ
| a [] := [a]
| a (b :: bs) := if ble a b then a :: b :: bs else b::(ins a bs)
#reduce ins 6 [2, 3, 3, 8]
Using ins it is easy to implement sort by inserting one element after the other, recursively:
def sort : list ℕ → list ℕ
| [] := []
| (a :: as) := ins a (sort as)
#reduce (sort [6,3,8,2,3])
7.5.2. Specifying sort
To verify that the algorithm indeed sorts we need to specify what it means for a list to be sorted. That is, we need to define a predicate Sorted : list ℕ → Prop. To define this we also need an auxiliary predicate Le_list : ℕ → list ℕ → Prop which tells us that an element we want to add to a sorted list is less than or equal to the first element of the list.
To define these predicates we are going to use inductive definitions of predicates, which are similar to the inductive datatypes we have already seen. The basic idea is that we state some rules for how to prove the predicate, just as we used constructors like zero, succ or nil and cons for inductive types. We also presume that using those rules is the only way to prove the predicate. So for example we define Le_list:
inductive Le_list : ℕ → list ℕ → Prop
| le_nil : ∀ n:ℕ, Le_list n []
| le_cons : ∀ m n : ℕ, ∀ ns : list ℕ, m ≤ n → Le_list m (n :: ns)
That is, we can prove that any element fits in front of the empty list, and that an element that is less than or equal to the head of a list fits in front of that list; and these are the only ways to establish this fact.
So for example we can prove:
example : Le_list 3 [6,8] :=
begin
apply le_cons,
apply ble2Le,
trivial,
end
Note that I am using one direction of the proof that ble decides ≤. This was actually left as an exercise in the chapter on natural numbers.
Using the principle that the two rules le_nil and le_cons are the only ways to prove Le_list, we can also invert le_cons:
lemma le_cons_inv : ∀ m n : ℕ, ∀ ns : list ℕ,
Le_list m (n :: ns) → m ≤ n :=
begin
assume m n ns h,
cases h with _ _ _ _ mn,
exact mn
end
I am using cases here again. The idea is that the only ways to prove Le_list m (n :: ns) are le_nil and le_cons, but le_nil can be ruled out since it only proves the predicate for the empty list. Hence we must have used le_cons, but then we know that we have assumed mn : m ≤ n and we can use this.
The definition of Le_list is not recursive, hence cases is sufficient. However, the definition of Sorted is actually recursive:
inductive Sorted : list ℕ → Prop
| sorted_nil : Sorted []
| sorted_cons : ∀ n : ℕ, ∀ ns : list ℕ, Le_list n ns
→ Sorted ns → Sorted (n :: ns)
We say that the empty list is sorted, and that given a sorted list and an element that is less than or equal to its first element, the list obtained by consing this element in front is sorted. And these are the only ways to prove sortedness.
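For example, we can show that a concrete list is sorted by just applying the rules (a small sketch, using ble2Le and trivial as in the Le_list example above):

example : Sorted [2,5] :=
begin
apply sorted_cons,
apply le_cons,
apply ble2Le,
trivial,
apply sorted_cons,
apply le_nil,
apply sorted_nil,
end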
7.5.3. Verifying sort
We now want to prove that sort only produces sorted outputs. What principle are we going to use? Exactly: induction over lists. The base case is easy: the empty list is sorted by definition.
theorem ins_sort_sorts : ∀ ns : list ℕ, Sorted (sort ns) :=
begin
assume ns,
induction ns with a as' ih,
apply sorted_nil,
dsimp [sort],
sorry
end
In the cons case we are left in the following state:
ih : Sorted (sort as')
⊢ Sorted (ins a (sort as'))
We know that sort sorts as' and from this we have to conclude that ins a (sort as') is also sorted. This suggests the following lemma:
lemma ins_lem : ∀ n : ℕ, ∀ ns : list ℕ, Sorted ns → Sorted (ins n ns) :=
begin
assume n ns sns,
sorry,
end
To prove this lemma we actually have a choice: we can either do induction over the list ns or over the derivation of Sorted ns. I found it easier to do the former.
lemma ins_lem : ∀ n : ℕ, ∀ ns : list ℕ, Sorted ns → Sorted (ins n ns) :=
begin
assume n ns sns,
induction ns,
sorry,
end
In the base case we need to show Sorted (ins n nil), which reduces to Sorted [n]. It is easy to show that singletons (lists with just one element) are always sorted:
lemma sorted_sgl : ∀ n : ℕ, Sorted [n] :=
begin
assume n,
apply sorted_cons,
apply le_nil,
apply sorted_nil,
end
Ok, this kills the base case; the cons case is certainly more difficult:
lemma ins_lem : ∀ n : ℕ, ∀ ns : list ℕ, Sorted ns → Sorted (ins n ns) :=
begin
assume n ns sns,
induction ns,
apply sorted_sgl,
dsimp [ins],
sorry,
end
We end up with the following goal:
ns_ih : Sorted ns_tl → Sorted (ins n ns_tl),
sns : Sorted (ns_hd :: ns_tl)
⊢ Sorted (ite ↥(ble n ns_hd) (n :: ns_hd :: ns_tl) (ns_hd :: ins n ns_tl))
ite is just the internal code for if-then-else. Clearly the proof is stuck at the condition. We need to do a case analysis on the outcome of ble n ns_hd. The tt case is pretty straightforward, so let's do it first:
lemma ins_lem : ∀ n : ℕ, ∀ ns : list ℕ, Sorted ns → Sorted (ins n ns) :=
begin
assume n ns sns,
induction ns,
apply sorted_sgl,
dsimp [ins],
cases eq : ble n ns_hd,
sorry,
apply sorted_cons,
apply le_cons,
apply ble2Le,
exact eq,
exact sns
end
The ff case is more subtle; we are in the following situation:
ns_ih : Sorted ns_tl → Sorted (ins n ns_tl),
eq : ble n ns_hd = ff,
sns_a_1 : Sorted ns_tl,
sns_a : Le_list ns_hd ns_tl
⊢ Sorted (ns_hd :: ins n ns_tl)
We know that ble n ns_hd = ff and hence ¬ (n ≤ ns_hd), but we actually need ns_hd ≤ n. This actually follows from trichotomy, which we discussed in the section on natural numbers, so let's just assume it for the moment:
lemma ble_lem : ∀ m n : ℕ , ble m n = ff → n ≤ m := sorry
However, we still need another lemma because we need to know that if ns_hd ≤ n then it will still fit in front of ins n ns_tl, that is, Le_list ns_hd (ins n ns_tl) holds. This leads to another lemma:
lemma ins_lem_le_list : ∀ m n : ℕ, ∀ ns : list ℕ , n ≤ m
→ Le_list n ns → Le_list n (ins m ns) :=
begin
sorry,
end
You see what I mean by stepping stones? Anyway, now that we have all the stones together we can pass to the other side - here is the complete proof:
lemma sorted_sgl : ∀ n : ℕ, Sorted [n] :=
begin
assume n,
apply sorted_cons,
apply le_nil,
apply sorted_nil,
end
lemma ins_lem_le_list : ∀ m n : ℕ, ∀ ns : list ℕ , n ≤ m → Le_list n ns → Le_list n (ins m ns) :=
begin
assume m n ns nm nns,
cases ns with hd tl,
apply le_cons,
exact nm,
dsimp [ins],
cases eq : ble m hd,
cases nns with _ _ _ _ nhd,
apply le_cons,
exact nhd,
apply le_cons,
exact nm,
end
lemma ins_lem : ∀ n : ℕ, ∀ ns : list ℕ, Sorted ns → Sorted (ins n ns) :=
begin
assume n ns sns,
induction ns,
apply sorted_sgl,
dsimp [ins],
cases eq : ble n ns_hd,
cases sns with nx nsx lex sx,
apply sorted_cons,
apply ins_lem_le_list,
apply ble_lem,
exact eq,
exact lex,
apply ns_ih,
exact sx,
apply sorted_cons,
apply le_cons,
apply ble2Le,
exact eq,
exact sns
end
theorem ins_sort_sorts : ∀ ns : list ℕ, Sorted (sort ns) :=
begin
assume ns,
induction ns with a as' ih,
apply sorted_nil,
dsimp [sort],
apply ins_lem,
exact ih,
end
7.5.4. Permutations
Are we done now? Have we verified sort? That is, if you buy a tin and it says we have proven the theorem ins_sort_sorts, are you then satisfied that the program sort does the job of sorting?
Not so! We have missed one important aspect of sort, namely that the output should be a rearrangement, i.e. a permutation, of the input. Otherwise an implementation of sort that always returns the empty list would be ok.
We can specify the relation Perm, expressing that one list is a permutation of another, inductively, using an auxiliary relation Insert which means that a list is obtained from another by inserting an element somewhere.
inductive Insert : A → list A → list A → Prop
| ins_hd : ∀ a : A, ∀ as : list A, Insert a as (a :: as)
| ins_tl : ∀ a b : A, ∀ as as' : list A, Insert a as as' → Insert a (b :: as) (b :: as')

inductive Perm : list A → list A → Prop
| perm_nil : Perm [] []
| perm_cons : ∀ a : A, ∀ as bs bs' : list A, Perm as bs → Insert a bs bs' → Perm (a :: as) bs'
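For example, [2,1] is a permutation of [1,2]. A proof just applies the rules; here is a sketch (assuming the constructors are in scope, as with le_cons etc. above):

example : Perm [1,2] [2,1] :=
begin
apply perm_cons, -- peel off 1; it will be inserted later
apply perm_cons, -- peel off 2
apply perm_nil,
apply ins_hd,    -- insert 2 into [], giving [2]
apply ins_tl,    -- skip past the head 2
apply ins_hd,    -- insert 1 at the end, giving [2,1]
end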
Using these definitions it is relatively straightforward to establish that sort permutes its input:
theorem sort_perm : ∀ ns : list ℕ , Perm ns (sort ns) :=
begin
sorry
end
I leave it as an exercise to complete the proof, which also involves identifying a rather obvious lemma.
We would expect Perm to be an equivalence relation, i.e. reflexive, transitive and symmetric. While the proof of reflexivity is straightforward, the other two are rather hard. I leave this as a challenge.
7.5.5. Parametrisation
I have presented sort for a rather special case, i.e. for ≤ on the natural numbers. However, all my constructions and proofs should work in a more general setting. Can you parametrise both the function and the proof so that they are applicable in much more generality? A possible shape for the function part is sketched below.
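For instance, the comparison could become a parameter (a sketch; the names ins' and sort', and the choice of a boolean comparison leb, are mine - the proofs would then be parametrised by assumptions corresponding to Le2ble, ble2Le and ble_lem):

-- hypothetical parametrised variants of ins and sort
def ins' {A : Type} (leb : A → A → bool) : A → list A → list A
| a [] := [a]
| a (b :: bs) := if leb a b then a :: b :: bs else b :: (ins' a bs)

def sort' {A : Type} (leb : A → A → bool) : list A → list A
| [] := []
| (a :: as) := ins' leb a (sort' as)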