7. Lists
If you have already used a functional programming language like Haskell you know that lists are a ubiquitous data structure. Given A : Type there is list A : Type, which is the type of finite sequences of elements of A. So for example [1,2,3] : list ℕ or [tt] : list bool. We can also have lists of lists, for example [[],[0],[0,1],[0,1,2]] : list (list ℕ) is a list of lists of numbers. However, all elements of a list must have the same type, hence something like [1,tt] is not permitted.
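We can ask Lean to confirm these typings (a quick check; #check just reports the type of an expression):

#check ([1,2,3] : list ℕ)     -- list ℕ
#check [tt]                   -- list bool
#check [[],[0],[0,1],[0,1,2]] -- list (list ℕ)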
For any type A we can always construct the empty list [] : list A, which is also called nil (this comes from the phrase "not in list" and was used in the first functional language LISP, first implemented in 1959). On the other hand, if we have an element a : A and a list l : list A we can form the list a :: l : list A with a in front of l, e.g. 1 :: [2,3] = [1,2,3]. We call this cons - another heritage from LISP. Indeed the notation using [ .. ] is just a short-hand, e.g.:

[1,2,3] = 1 :: 2 :: 3 :: []

or using nil and cons:

[1,2,3] = cons 1 (cons 2 (cons 3 nil))
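Lean agrees that these are just different spellings of the same list (a small sanity check):

example : [1,2,3] = 1 :: 2 :: 3 :: [] := rfl
example : [1,2,3] = list.cons 1 (list.cons 2 (list.cons 3 list.nil)) := rfl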
All lists can be created from nil and cons, hence lists can be defined inductively in a manner very similar to ℕ:
inductive list (A : Type)
| nil : list
| cons : A → list → list
This declaration means:
- For any type A : Type there is a new type list A : Type,
- There are elements nil : list A, and given a : A and l : list A we have cons a l : list A,
- All the elements of list A can be generated by nil and then applying cons a finite number of times; nil and cons a l are different elements, and cons is injective, i.e. given cons a as = cons b bs we know that a = b and as = bs.
7.1. Basic properties of list
Most of the basic properties and their proofs are very similar to those for the natural numbers. First of all we want to show that nil ≠ a :: l, which corresponds to 0 ≠ succ n and true ≠ false which we have seen before. These properties are called no confusion properties. We can use contradiction as before for natural numbers:
theorem no_conf : ∀ (a:A)(l : list A), nil ≠ a :: l :=
begin
assume a l,
contradiction
end
Next we want to show injectivity of ::. Unlike in the case of natural numbers there are two results we want to prove: given a :: l = a' :: l' we need to show that a = a' and l = l'. The second part we can prove the same way as for ℕ: instead of pred we define tl (for tail) and use this in the proof:
def tl : list A → list A
| [] := []
| (a :: l) := l
theorem inj_tl : ∀ (a a':A)(l l' : list A), a :: l = a' :: l' → l = l' :=
begin
assume a a' l l' h,
change tl (a :: l) = tl (a' :: l'),
rewrite h,
end
However, the first part turns out to be more difficult. It seems that we need to define a function hd : list A → A which extracts the head of a list. But what should we do for the [] case?

def hd : list A → A
| [] := ?
| (a :: l) := a
There is really nothing we can plug in here because A may be empty, but there is an empty list over the empty type. We could define a function that returns an error, i.e.

def hd : list A → option A
| [] := none
| (a :: l) := some a

But this doesn't help us to prove injectivity. I leave this as a challenge because the only solution I know involves dependent types, which I am not covering in this course.
However, the injection tactic which we have seen before works in this case:
theorem inj_hd : ∀ (a a':A)(l l' : list A), a :: l = a' :: l' → a = a' :=
begin
assume a a' l l' h,
injection h,
end
injection also works in the proof of inj_tl, alleviating the need to define tl.
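For instance, the following variant of inj_tl goes through without defining tl (a sketch mirroring the proof of inj_hd above):

theorem inj_tl' : ∀ (a a':A)(l l' : list A), a :: l = a' :: l' → l = l' :=
begin
assume a a' l l' h,
injection h,  -- as in inj_hd, injection derives both equations from h
end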
7.2. The free monoid
In the section on natural numbers we encountered a monoid using + and 0 on ℕ. There is a very similar monoid for lists using ++ and []. Here ++ (append) is the operation that appends two lists, e.g. [1,2] ++ [3,4] = [1,2,3,4]. We define it recursively:
definition append : list A → list A → list A
| [] l := l
| (h :: s) t := h :: (append s t)
local notation l₁ ++ l₂ := append l₁ l₂
I am cheating a bit here because, like +, ++ is already defined in the Lean prelude.
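As a quick check (assuming the local notation above is in scope):

#reduce [1,2] ++ [3,4]   -- [1, 2, 3, 4]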
In the definition of append we are using pattern matching and structural recursion on lists. This works in the same way as for natural numbers: we can recursively use append for the smaller list s. To prove that this forms a monoid we need induction. However, unlike for +, left neutrality is now easy because it follows from the definition:
theorem app_lneutr : forall l : list A, [] ++ l = l :=
begin
assume l,
reflexivity,
end
But now right neutrality requires induction:
theorem app_rneutr : forall l : list A, l ++ [] = l :=
begin
assume l,
induction l with a l' ih,
reflexivity,
apply congr_arg (cons a),
exact ih,
end
Induction for lists means: to prove something for all lists you prove it for the empty list [], and assuming that it holds for a list l you show that it holds for a :: l for any a. Comparing the proof with the one for left neutrality for + we see that it is almost the same proof, replacing congr_arg succ with congr_arg (cons a).
The switch between left and right neutrality is a consequence of the fact that we have defined + by recursion over the 2nd argument while we have defined ++ by recursion over the first. We could have defined + by recursion over the first argument and then this difference would have disappeared (I prefer this but the designers of the Lean standard library had different ideas). We couldn't have defined ++ by recursion over the 2nd argument - do you see why?
The proof of associativity of ++ is again very similar to the one for +. The main difference is that now we do induction over the first argument instead of the last - again this is caused by the fact that we now use recursion over the first instead of the 2nd argument.
theorem app_assoc :
forall l₁ l₂ l₃ : list A , (l₁ ++ l₂) ++ l₃ = l₁ ++ (l₂ ++ l₃) :=
begin
assume l₁ l₂ l₃,
induction l₁ with a l₁' ih,
reflexivity,
dsimp [(++),list.append],
apply congr_arg (cons a),
exact ih,
end
For + we also proved commutativity, that is m + n = n + m, but clearly this doesn't hold for ++, e.g. [1,2] ++ [3,4] = [1,2,3,4] while [3,4] ++ [1,2] = [3,4,1,2].
Indeed list A with ++ and [] is the free monoid over A, which intuitively means that only the monoid equations hold but no additional laws like commutativity. In this sense this monoid is free not to follow any laws apart from the monoid ones.
7.3. Reverse
Since the list monoid is not commutative, order matters. In particular we can reverse a list. That is, we are going to define a function which, given a list like [1,2,3], produces the list with the elements in reverse order, in this case [3,2,1].
How do we reverse a list? Recursively! The reverse of the empty list is the empty list, and the reverse of a list of the form a :: l is the reverse of l with a put at the end. So for example the reverse of [1,2,3] = 1 :: [2,3] is the reverse of [2,3], i.e. [3,2], with 1 put at the end, giving [3,2,1].
To define reverse we need an auxiliary operation which puts an element at the end. We call this operation snoc because it is the reverse of cons. We could define snoc using ++, but for our purposes it is slightly more convenient to define it directly using recursion:
def snoc : list A → A → list A
| [] a := [a]
| (a :: as) b := a :: (snoc as b)
#reduce (snoc [1,2] 3)
Using snoc we are ready to implement the recursive recipe for reversing a list:
def rev : list A → list A
| [] := []
| (a :: as) := snoc (rev as) a
#reduce rev [1,2,3]
A central property of rev is that it is self-inverse, that is, if we reverse a list twice we obtain the original list, e.g. rev (rev [1,2,3]) = rev [3,2,1] = [1,2,3]. Ok, let's try and prove this using list induction:
theorem revrev : ∀ as : list A , rev (rev as) = as :=
begin
assume as,
induction as with a as' ih,
reflexivity,
dsimp [rev],
sorry,
end
I got stuck in this proof; I am in the following situation:
ih : rev (rev as') = as'
⊢ rev (snoc (rev as') a) = a :: as'
I cannot apply my induction hypothesis because my current goal doesn't contain an expression of the form rev (rev as'). However, I should be able to exploit the fact that to reverse a snoc is the same as a cons of the reverse. Then we can reason:
rev (snoc (rev as') a)
= a :: rev (rev as')
= a :: as'
This leads us to proving the following lemma:
theorem revsnoc :
∀ a : A, ∀ as : list A, rev (snoc as a) = a :: rev as :=
begin
assume a as,
induction as with b as' ih,
reflexivity,
dsimp [snoc,rev],
rewrite ih,
dsimp [snoc],
reflexivity,
end
This is exactly what was missing to complete the proof of revrev:
theorem revrev : ∀ as : list A , rev (rev as) = as :=
begin
assume as,
induction as with a as' ih,
reflexivity,
dsimp [rev],
rewrite revsnoc,
rewrite ih,
end
This is a nice example of the art of proving, which is a bit like putting stepping stones into a stream to cross it without getting wet feet. When we get stuck with an induction, looking at the point where we are stuck often leads us to identifying another property which we can prove and which helps us to complete the original proof. There is no fixed method for identifying a good auxiliary property (a lemma); it is a skill which improves with practice.
Here is another problem to do with reverse which you can use to practice this skill: if you have attended Prof Hutton's Haskell course you will know that the above definition of reverse is very inefficient; indeed, it has quadratic complexity. A better definition is the following:
def revaux : list A → list A → list A
| [] bs := bs
| (a :: as) bs := revaux as (a :: bs)
def fastrev (l : list A) : list A
:= revaux l []
#reduce fastrev [1,2,3]
fastrev only has linear complexity. However, we should convince ourselves that it behaves in the same way as rev, that is, we should prove the following theorem:
theorem fastrev_thm : ∀ as : list A , fastrev as = rev as :=
begin
sorry,
end
I leave it as an exercise to figure out what lemma(s) we need. Also it may be useful to employ some properties we have already proven.
7.4. Functors and Naturality
The list operator doesn’t just work on types but also on functions. That means we can write a map function which applies a function to every element of a list:
def map_list : (A → B) → list A → list B
| f [] := []
| f (a :: as) := (f a) :: (map_list f as)
Actually map_list satisfies some important equations. To understand them we need to have a closer look at functions. What are the basic ingredients of the function type? There are two: the identity function and function composition.
def id : A → A
| a := a
def comp : (B → C) → (A → B) → A → C
| g f a := g (f a)
local notation (name := comp) f ∘ g := comp f g
We write f ∘ g for the composition of f and g. You may notice that composition seems to be defined backwards, that is, we first run g and then f. The reason for this is that function application also works backwards: first we evaluate the argument and then the function. Since composition is defined via application it seems a good idea not to change the direction.
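For example (a small check using the notation just defined):

-- the doubling runs first, then we add one
#reduce ((λ n : ℕ, n + 1) ∘ (λ n : ℕ, 2 * n)) 3   -- 7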
There are some basic equations involving id and ∘, namely:
theorem idl : ∀ f : A → B, id ∘ f = f := sorry
theorem idr : ∀ f : A → B, f ∘ id = f := sorry
theorem assoc : ∀ h : C → D, ∀ g : B → C,
∀ f : A → B, (h ∘ g) ∘ f = h ∘ (g ∘ f) := sorry
These equations look like what? Yes, like a monoid. However, this isn’t really a monoid because there isn’t one fixed type on which the operations are defined but a whole family of types. This structure is called a category, which is the basic notion in the field of category theory.
To prove these equations we need to know how to prove equations about functions: this is given by a Lean axiom called funext which says that to prove two functions equal you need to show that they produce the same outputs for all inputs. In predicate logic: given f g : A → B,

(∀ a : A, f a = g a) → f = g
Using funext the laws are easy to show. I just give you the first:
theorem idl : ∀ f : A → B, id ∘ f = f :=
begin
assume f,
apply funext,
assume x,
reflexivity,
end
I leave the other two as an easy exercise.
Now back to map_list: we need to show that it preserves identity and composition, that is, mapping the identity function gives rise to the identity and mapping a composition of two functions is the composition of mapping them.
theorem map_id : map_list (@id A) = id := sorry
theorem map_comp : ∀ g : B → C, ∀ f : A → B,
map_list (g ∘ f) = (map_list g) ∘ (map_list f) := sorry
I am writing @id A instead of just id because I need to tell Lean that I want to use the type variable A here, so I have to give it explicitly. @id is id with the implicit parameter A : Type made explicit.
We can prove the laws using funext and list induction. Here is the first one - a proof in the style of app_rneutr, combining funext with list induction:
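theorem map_id : map_list (@id A) = id :=
begin
-- a sketch: funext first, then list induction as in app_rneutr
apply funext,
assume as,
induction as with a as' ih,
reflexivity,
dsimp [map_list, id],
apply congr_arg (cons a),
exact ih,
end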
Again I leave the 2nd one as an exercise. In the language of category theory, the fact that list comes with an operation on functions (map_list) satisfying these laws means that list is a functor.
Now rev : list A → list A is an operation on lists that is natural, which means it works on all types A uniformly. This can be expressed by the following theorem using map_list:
theorem nat_rev : ∀ f : A → B, ∀ l : list A,
map_list f (rev l) = rev (map_list f l) := sorry
This says that it is the same whether you first map a function and then reverse, or first reverse and then map the function. For example, if you have a function is_even : ℕ → bool and a list [1,2,4] you can first map is_even, obtaining [ff,tt,tt], and then reverse, obtaining [tt,tt,ff]; or you first reverse, obtaining [4,2,1], and then map is_even, also ending up with [tt,tt,ff]. The only way rev could break this would be by behaving differently on lists of natural numbers and lists of booleans - but this would be very unnatural.
Instead of proving the theorem (which I leave as an exercise - you will need a lemma) let me remark that equations like the one above are often shown in category theory using commutative diagrams, in this case, assuming f : A → B, the following naturality square:
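\[
\begin{array}{ccc}
\mathtt{list}\;A & \xrightarrow{\;\mathtt{map\_list}\;f\;} & \mathtt{list}\;B \\
\big\downarrow\,\mathtt{rev} & & \big\downarrow\,\mathtt{rev} \\
\mathtt{list}\;A & \xrightarrow{\;\mathtt{map\_list}\;f\;} & \mathtt{list}\;B
\end{array}
\]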
Following the arrows labelled by functions corresponds to function composition, and different paths express that these compositions have the same result; we say the diagram commutes. This also suggests that we could have written the naturality theorem using compositions:

rev ∘ map_list f = map_list f ∘ rev

This is called point-free style, since we avoid talking about elements.
7.5. Insertion sort
Finally, we are going to look at a slightly more interesting algorithm: insertion sort. Insertion sort is quite an inefficient algorithm because it has quadratic complexity while the best sorting algorithms have \(n \log{n}\) complexity. However, insertion sort is especially easy to implement and verify.
To keep things simple I will only sort lists of natural numbers wrt ≤ : ℕ → ℕ → Prop, using the fact that we have an implementation in the form of ble : ℕ → ℕ → bool. That means for the following we assume in particular that ble decides ≤, i.e.

lemma Le2ble : ∀ m n : ℕ, m ≤ n → ble m n = tt := sorry
lemma ble2Le : ∀ m n : ℕ, ble m n = tt → m ≤ n := sorry
7.5.1. Implementing sort
So our goal is to implement a function sort : list ℕ → list ℕ that sorts a given list, e.g. sort [6,3,8,2,3] = [2, 3, 3, 6, 8].
Insertion sort can easily be done with a deck of cards: you start with an unsorted deck, and each time you take a card from the top and insert it at the appropriate position in the already sorted pile. Inserting a card means you go through the sorted pile until you find a card that is greater and then insert it just before that. Ok, there is also the special case that the sorted pile is empty, in which case you just make a pile out of one card. Here is the recursive function implementing the algorithm:
def ins : ℕ → list ℕ → list ℕ
| a [] := [a]
| a (b :: bs) := if ble a b then a :: b :: bs else b::(ins a bs)
#reduce ins 6 [2, 3, 3, 8]
Using ins it is easy to implement sort by inserting one element after the other, recursively:
def sort : list ℕ → list ℕ
| [] := []
| (a :: as) := ins a (sort as)
#reduce (sort [6,3,8,2,3])
7.5.2. Specifying sort
To verify that the algorithm indeed sorts we need to specify what it means for a list to be sorted. That is, we need to define a predicate Sorted : list ℕ → Prop. To define this we also need an auxiliary predicate Le_list : ℕ → list ℕ → Prop which tells us that an element we want to add to a sorted list is less than or equal to the first element of the list.
To define these predicates we are going to use inductive definitions of predicates, which are similar to the inductive datatypes we have already seen. The basic idea is that we state some rules for how to prove the predicate, just as we used constructors like zero, succ or nil and cons for inductive types. We also presume that using those rules is the only way to prove the predicate. So for example we define Le_list:
inductive Le_list : ℕ → list ℕ → Prop
| le_nil : ∀ n:ℕ, Le_list n []
| le_cons : ∀ m n : ℕ, ∀ ns : list ℕ, m ≤ n → Le_list m (n :: ns)
That is, we can prove that any element fits in front of the empty list, and that an element that is less than or equal to the head of a list fits in front of that list; and these are the only ways to establish this fact.
So for example we can prove:
example : Le_list 3 [6,8] :=
begin
apply le_cons,
apply ble2Le,
trivial,
end
Note that I am using one direction of the proof that ble decides ≤. This was actually left as an exercise in the chapter on natural numbers.
Using the principle that the two rules le_nil and le_cons are the only ways to prove Le_list, we can also invert le_cons:
lemma le_cons_inv : ∀ m n : ℕ, ∀ ns : list ℕ,
Le_list m (n :: ns) → m ≤ n :=
begin
assume m n ns h,
cases h with _ _ _ _ mn,
exact mn
end
I am using cases here again. The idea is that the only ways to prove Le_list m (n :: ns) are le_nil and le_cons, but le_nil can be ruled out since it only proves the predicate for the empty list. Hence we must have used le_cons, but then we know that we have assumed mn : m ≤ n and we can use this.
The definition of Le_list is not recursive, hence cases is sufficient. However, the definition of Sorted is actually recursive:
inductive Sorted : list ℕ → Prop
| sorted_nil : Sorted []
| sorted_cons : ∀ n : ℕ, ∀ ns : list ℕ, Le_list n ns
→ Sorted ns → Sorted (n :: ns)
We say that the empty list is sorted, and that given a sorted list and an element that is less than or equal to its first element, the list obtained by consing this element in front is sorted. And these are the only ways to prove sortedness.
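For example, we can show that a concrete list is sorted by just applying the rules (a small sketch, using ble2Le and trivial as in the Le_list example above):

example : Sorted [2,5] :=
begin
apply sorted_cons,
apply le_cons,
apply ble2Le,
trivial,
apply sorted_cons,
apply le_nil,
apply sorted_nil,
end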
7.5.3. Verifying sort
We now want to prove that sort only produces sorted outputs. What principle are we going to use? Exactly: induction over lists. The base case is easy: the empty list is sorted by definition.
theorem ins_sort_sorts : ∀ ns : list ℕ, Sorted (sort ns) :=
begin
assume ns,
induction ns with a as' ih,
apply sorted_nil,
dsimp [sort],
sorry
end
In the cons case we are left in the following state:
ih : Sorted (sort as')
⊢ Sorted (ins a (sort as'))
We know that sort sorts as' and from this we have to conclude that ins a (sort as') is also sorted. This suggests the following lemma:
lemma ins_lem : ∀ n : ℕ, ∀ ns : list ℕ, Sorted ns → Sorted (ins n ns) :=
begin
assume n ns sns,
sorry,
end
To prove this lemma we actually have a choice: we can either do induction over the list ns or over the derivation of Sorted ns. I found it easier to do the former.
lemma ins_lem : ∀ n : ℕ, ∀ ns : list ℕ, Sorted ns → Sorted (ins n ns) :=
begin
assume n ns sns,
induction ns,
sorry,
end
In the base case we need to show Sorted (ins n nil), which reduces to Sorted [n]. It is easy to show that singletons (lists with just one element) are always sorted:
lemma sorted_sgl : ∀ n : ℕ, Sorted [n] :=
begin
assume n,
apply sorted_cons,
apply le_nil,
apply sorted_nil,
end
Ok, this kills the base case; the cons case is certainly more difficult:
lemma ins_lem : ∀ n : ℕ, ∀ ns : list ℕ, Sorted ns → Sorted (ins n ns) :=
begin
assume n ns sns,
induction ns,
apply sorted_sgl,
dsimp [ins],
sorry,
end
We end up with the following goal:
ns_ih : Sorted ns_tl → Sorted (ins n ns_tl),
sns : Sorted (ns_hd :: ns_tl)
⊢ Sorted (ite ↥(ble n ns_hd) (n :: ns_hd :: ns_tl) (ns_hd :: ins n ns_tl))
ite is just the internal code for if-then-else. Clearly the proof is stuck at the condition. We need to do a case analysis on the outcome of ble n ns_hd. The tt case is pretty straightforward, so let's do it first:
lemma ins_lem : ∀ n : ℕ, ∀ ns : list ℕ, Sorted ns → Sorted (ins n ns) :=
begin
assume n ns sns,
induction ns,
apply sorted_sgl,
dsimp [ins],
cases eq : ble n ns_hd,
sorry,
apply sorted_cons,
apply le_cons,
apply ble2Le,
exact eq,
exact sns
end
The ff case is more subtle; we are in the following situation:
ns_ih : Sorted ns_tl → Sorted (ins n ns_tl),
eq : ble n ns_hd = ff,
sns_a_1 : Sorted ns_tl,
sns_a : Le_list ns_hd ns_tl
⊢ Sorted (ns_hd :: ins n ns_tl)
We know that ble n ns_hd = ff and hence ¬ (n ≤ ns_hd), but we actually need ns_hd ≤ n. This actually follows from trichotomy, which we discussed in the section on natural numbers, so let's just assume it for the moment:
lemma ble_lem : ∀ m n : ℕ , ble m n = ff → n ≤ m := sorry
However, we still need another lemma because we need to know that if ns_hd ≤ n then it will still fit in front of ins n ns_tl, that is, Le_list ns_hd (ins n ns_tl) holds. This leads to another lemma:
lemma ins_lem_le_list : ∀ m n : ℕ, ∀ ns : list ℕ , n ≤ m
→ Le_list n ns → Le_list n (ins m ns) :=
begin
sorry,
end
You see what I mean by stepping stones? Anyway, now that we have all the stones together we can pass to the other side - here is the complete proof:
lemma sorted_sgl : ∀ n : ℕ, Sorted [n] :=
begin
assume n,
apply sorted_cons,
apply le_nil,
apply sorted_nil,
end
lemma ins_lem_le_list : ∀ m n : ℕ, ∀ ns : list ℕ , n ≤ m → Le_list n ns → Le_list n (ins m ns) :=
begin
assume m n ns nm nns,
cases ns with hd tl,
apply le_cons,
exact nm,
dsimp [ins],
cases eq : ble m hd,
cases nns with _ _ _ _ nhd,
apply le_cons,
exact nhd,
apply le_cons,
exact nm,
end
lemma ins_lem : ∀ n : ℕ, ∀ ns : list ℕ, Sorted ns → Sorted (ins n ns) :=
begin
assume n ns sns,
induction ns,
apply sorted_sgl,
dsimp [ins],
cases eq : ble n ns_hd,
cases sns with nx nsx lex sx,
apply sorted_cons,
apply ins_lem_le_list,
apply ble_lem,
exact eq,
exact lex,
apply ns_ih,
exact sx,
apply sorted_cons,
apply le_cons,
apply ble2Le,
exact eq,
exact sns
end
theorem ins_sort_sorts : ∀ ns : list ℕ, Sorted (sort ns) :=
begin
assume ns,
induction ns with a as' ih,
apply sorted_nil,
dsimp [sort],
apply ins_lem,
exact ih,
end
7.5.4. Permutations
Are we done now? Have we verified sort? That is, if you buy a tin and it says we have proven the theorem ins_sort_sorts, are you then satisfied that the program sort does the job of sorting?
Not so! We have missed one important aspect of sort, namely that the output should be a rearrangement, i.e. a permutation, of the input. Otherwise an implementation of sort that always returns the empty list would be ok.
We can specify the relation Perm, expressing that one list is a permutation of another, inductively, using an auxiliary relation Insert which means that a list is obtained from another by inserting an element somewhere.
inductive Insert : A → list A → list A → Prop
| ins_hd : ∀ a : A, ∀ as : list A, Insert a as (a :: as)
| ins_tl : ∀ a b : A, ∀ as as' : list A, Insert a as as' → Insert a (b :: as) (b :: as')

inductive Perm : list A → list A → Prop
| perm_nil : Perm [] []
| perm_cons : ∀ a : A, ∀ as bs bs' : list A, Perm as bs → Insert a bs bs' → Perm (a :: as) bs'
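For example, [2,1] is a permutation of [1,2]. A proof just applies the rules; here is a sketch (assuming the constructors are in scope, as with le_cons etc. above):

example : Perm [1,2] [2,1] :=
begin
apply perm_cons, -- peel off 1; it will be inserted later
apply perm_cons, -- peel off 2
apply perm_nil,
apply ins_hd,    -- insert 2 into [], giving [2]
apply ins_tl,    -- skip past the head 2
apply ins_hd,    -- insert 1 at the end, giving [2,1]
end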
Using these definitions it is relatively straightforward to establish that sort permutes its input:
theorem sort_perm : ∀ ns : list ℕ , Perm ns (sort ns) :=
begin
sorry
end
I leave it as an exercise to complete the proof, which also involves identifying a rather obvious lemma.
We would expect Perm to be an equivalence relation, i.e. reflexive, transitive and symmetric. While the proof of reflexivity is straightforward, the other two are rather hard. I leave this as a challenge.
7.5.5. Parametrisation
I have presented sort for a rather special case, i.e. for ≤ on the natural numbers. However, all my constructions and proofs should work in a more general setting. Can you parametrise both the function and the proof so that they are applicable in much more generality? A possible shape for the function part is sketched below.
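For instance, the comparison could become a parameter (a sketch; the names ins' and sort', and the choice of a boolean comparison leb, are mine - the proofs would then be parametrised by assumptions corresponding to Le2ble, ble2Le and ble_lem):

-- hypothetical parametrised variants of ins and sort
def ins' {A : Type} (leb : A → A → bool) : A → list A → list A
| a [] := [a]
| a (b :: bs) := if leb a b then a :: b :: bs else b :: (ins' a bs)

def sort' {A : Type} (leb : A → A → bool) : list A → list A
| [] := []
| (a :: as) := ins' leb a (sort' as)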