8. Trees¶
Trees are everywhere: there isn’t just one type of tree but many datatypes that can be subsumed under the general concept of a tree. Indeed, the types we have seen in the previous chapters (natural numbers and lists) are special instances of trees.
Trees are also the most dangerously underused concept in programming. Bad programmers tend to try to do everything with strings, unaware of how much easier it would be with trees. You don’t have to be a functional programmer to use trees; trees can be just as easily defined in languages like Python - see my computerphile video.
In this chapter we are going to look at two instances of trees: expression trees and sorted trees.
Expression trees are used to encode the syntax of expressions. We are going to define an interpreter and a compiler which compiles expressions into code for a simple stack machine, and then show that the compiler is correct with respect to the interpreter. This is an extended version of Hutton’s razor, an example invented by Prof Hutton.
The second example is tree sort: an efficient alternative to insertion sort. From a list we produce a sorted tree and then we turn this into a sorted list. Actually, tree sort is just quicksort in disguise.
8.1. Expression trees¶
I am trying to hammer this in: when you see an expression like x * (y + 2) or (x * y) + 2, then try to see the tree. Really, expressions are a 2-dimensional structure, but we use a 1-dimensional notation with brackets to write them.
But it isn’t only about seeing the tree: we can turn this into a datatype, indeed an inductive datatype.
inductive Expr : Type
| const : ℕ → Expr
| var : string → Expr
| plus : Expr → Expr → Expr
| times : Expr → Expr → Expr
open Expr
def e1 : Expr
:= times (var "x") (plus (var "y") (const 2))
def e2 : Expr
:= plus (times (var "x") (var "y")) (const 2)
An expression is either a constant, a variable, a plus-expression or a times-expression. To construct a constant we need a number, for a variable we need a string, and for both plus and times we need two expressions which serve as the first and second argument. The last two constructors show that expression trees are recursive.
I am not going to waste any time proving no-confusion and injectivity, e.g.
theorem no_conf : ∀ n : ℕ, ∀ l r : Expr, const n ≠ plus l r :=
begin
sorry,
end
theorem inj_plus_l : ∀ l r l' r' : Expr , plus l r = plus l' r' → l=l' :=
begin
sorry,
end
Btw, the name of the theorem inj_plus_l is a bit misleading: it is the tree constructor plus that is injective, not the operation +. Actually, is + injective?
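A sketch of how one might check both claims in Lean 3 (assuming the Expr constructors are open, as in the examples above): the injection tactic extracts the componentwise equalities from an equality of constructor terms, while a simple counterexample shows that + as a function of two arguments is certainly not injective.

```lean
-- the constructor plus is injective: from an equality of
-- plus-expressions, `injection` derives l = l' and r = r'
example : ∀ l r l' r' : Expr, plus l r = plus l' r' → l = l' :=
begin
  assume l r l' r' h,
  injection h with h1 h2,
  exact h1,
end

-- the function + is not injective as a function of both arguments:
-- different argument pairs can give the same sum
example : 2 + 3 = 1 + 4 :=
begin
  refl,
end
```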
8.1.1. Evaluating expressions¶
Instead let’s evaluate expressions! To do this we need an assignment from variable names (i.e. strings) to numbers. For this purpose I introduce a type of environments - I am using functions to represent environments.
def Env : Type
:= string → ℕ
def my_env : Env
| "x" := 3
| "y" := 5
| _ := 0
#reduce my_env "y"
The environment my_env assigns the number 3 to "x", the number 5 to "y", and 0 to all other variables. Really, I should have introduced some error handling for undefined variables, but for the sake of brevity I am going to ignore this. To look up a variable name we just have to apply the function, e.g. my_env "y".
Ok, we are ready to write the evaluator for expressions which gets an expression and an environment and returns a number. And it uses - oh horror - recursion on trees.
def eval : Expr → Env → ℕ
| (const n) env := n
| (var s) env := env s
| (plus l r) env := (eval l env) + (eval r env)
| (times l r) env := (eval l env) * (eval r env)
#reduce eval e1 my_env
#reduce eval e2 my_env
eval looks at the expression: if it is a constant it just returns the numerical value of the constant; it looks up a variable in the environment; and to evaluate a plus or a times we first recursively evaluate the subexpressions and then add them or multiply them together.
I hope you are able to evaluate the two examples e1 and e2 in your head before checking whether you got it right.
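In case you want to check your head-calculation, unfolding the definitions with my_env gives:

```lean
-- e1 = x * (y + 2) : eval e1 my_env = 3 * (5 + 2) = 21
-- e2 = (x * y) + 2 : eval e2 my_env = 3 * 5 + 2   = 17
#reduce eval e1 my_env -- 21
#reduce eval e2 my_env -- 17
```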
8.1.2. A simple Compiler¶
To prove something interesting, let’s implement a simple compiler. Our target is a little stack machine. We first define the instructions:
inductive Instr : Type
| pushC : ℕ → Instr
| pushV : string → Instr
| add : Instr
| mult : Instr
open Instr
def Code : Type
:= list Instr
We can push a constant or the value of a variable (taken from the environment) onto the stack, and we can add or multiply the top two items of the stack, which has the effect of removing them and replacing them with their sum or product. Machine code is just a sequence of instructions.
We define a run function that executes a piece of code, returning what is on top of the stack, which is represented as a list of numbers.
def Stack : Type
:= list ℕ
def run : Code → Stack → Env → ℕ
| [] [n] env := n
| (pushC n ::c) s env := run c (n :: s) env
| (pushV x ::c) s env := run c (env x :: s) env
| (add :: c) (m :: n :: s) env := run c ((n + m) :: s) env
| (mult :: c) (m :: n :: s) env := run c ((n * m) :: s) env
| _ _ _ := 0
The function run analyzes the first instruction (if there is one), modifies the stack accordingly and then runs the remaining instructions. Again no error handling: if something is wrong, I return 0. This calls for some serious refactoring! But not now.
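One way the refactoring could go - just a sketch, not used in the rest of the development - is to change the result type to option ℕ, so that stuck configurations return none instead of a default value:

```lean
-- hypothetical error-handling variant of run: a stuck machine
-- (empty code with a bad stack, or add/mult with fewer than two
-- stack items) yields none instead of a made-up result
def run_opt : Code → Stack → Env → option ℕ
| [] [n] env := some n
| (pushC n :: c) s env := run_opt c (n :: s) env
| (pushV x :: c) s env := run_opt c (env x :: s) env
| (add :: c) (m :: n :: s) env := run_opt c ((n + m) :: s) env
| (mult :: c) (m :: n :: s) env := run_opt c ((n * m) :: s) env
| _ _ _ := none
```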
As an example, let’s run some code that computes the value of e1:
def c1 : Code
:= [pushV "x",pushV "y",pushC 2,add,mult]
#eval run c1 [] my_env
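To see how the machine works, here is the run of c1 in my_env traced step by step (recall my_env "x" = 3 and my_env "y" = 5; the top of the stack is on the left):

```lean
-- run [pushV "x", pushV "y", pushC 2, add, mult] []        my_env
-- run [pushV "y", pushC 2, add, mult]            [3]       my_env
-- run [pushC 2, add, mult]                       [5, 3]    my_env
-- run [add, mult]                                [2, 5, 3] my_env
-- run [mult]                                     [7, 3]    my_env
-- run []                                         [21]      my_env
-- result: 21
```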
Now, I have compiled e1 by hand to c1, but certainly we can do this automatically.
Here is the first version of a compiler:
def compile_naive : Expr → Code
| (const n) := [pushC n]
| (var x) := [pushV x]
| (plus l r) := (compile_naive l) ++ (compile_naive r) ++ [add]
| (times l r) := (compile_naive l) ++ (compile_naive r) ++ [mult]
The compiler translates const and var into the corresponding push instructions, and for plus and times it creates code for the subexpressions and inserts an add or a mult instruction afterwards.
The naive compiler is inefficient due to the use of ++, which has to traverse the already created code each time. It is also a bit harder to verify, because we need to exploit the fact that lists are a monoid. However, there is a nice trick to make the compiler more efficient and easier to verify: we add an extra argument to the compiler which is the code that should come after the code for the expression we are currently compiling (the continuation). We end up with:
def compile_aux : Expr → Code → Code
| (const n) c := pushC n :: c
| (var x) c := pushV x :: c
| (plus l r) c := compile_aux l (compile_aux r (add :: c))
| (times l r) c := compile_aux l (compile_aux r (mult :: c))
def compile (e :Expr) : Code
:= compile_aux e []
#reduce run (compile e1) [] my_env
#reduce run (compile e2) [] my_env
This version of the compiler is more efficient because it doesn’t need to traverse the code it has already produced. Indeed, this is basically the same issue as with rev vs fastrev. The other advantage is that we don’t need to use any properties of ++ in the proof, because we aren’t using it!
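If you do want to relate the two compilers, the connection can be stated as a lemma - here only a sketch, with the proof left as an exercise; this is exactly where the monoid laws for ++ would come in:

```lean
-- hypothetical link between the two compilers: compiling with a
-- continuation is the same as compiling naively and appending
-- the continuation (provable by induction on e)
lemma compile_aux_naive : ∀ e : Expr, ∀ c : Code,
  compile_aux e c = compile_naive e ++ c :=
begin
  sorry,
end
```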
8.1.3. Compiler correctness¶
Looking at the examples, we can see that compile and run produce the same results as eval, but we would like to prove this, i.e. the correctness of the compiler.
theorem compile_ok : ∀ e : Expr, ∀ env : Env,
run (compile e) [] env = eval e env :=
begin
sorry,
end
However, we won’t be able to prove this directly, because here we state a property which only holds for the empty stack, but once we start compiling a complex expression the stack won’t be empty anymore.
This means we have to find a stronger proposition for which the induction goes through and which implies the proposition we actually want to prove. This is known as induction loading.
Clearly we need to prove a statement about compile_aux, namely that running compile_aux for some expression is the same as evaluating the expression and putting the result on the stack. This implies the statement we want to prove for the special case where both the remaining code and the stack are empty.
lemma compile_aux_ok : ∀ e : Expr, ∀ c : Code, ∀ s : Stack, ∀ env : Env,
  run (compile_aux e c) s env = run c ((eval e env) :: s) env :=
begin
  assume e,
  induction e,
  sorry,
end
I have already started the proof. We are going to do induction over the expression e. After this we are in the following state:
case Expr.const
e : ℕ
⊢ ∀ (c : Code) (s : Stack) (env : Env), run (compile_aux (const e) c) s env = run c (eval (const e) env :: s) env
case Expr.var
e : string
⊢ ∀ (c : Code) (s : Stack) (env : Env), run (compile_aux (var e) c) s env = run c (eval (var e) env :: s) env
case Expr.plus
e_a e_a_1 : Expr,
e_ih_a : ∀ (c : Code) (s : Stack) (env : Env), run (compile_aux e_a c) s env = run c (eval e_a env :: s) env,
e_ih_a_1 : ∀ (c : Code) (s : Stack) (env : Env), run (compile_aux e_a_1 c) s env = run c (eval e_a_1 env :: s) env
⊢ ∀ (c : Code) (s : Stack) (env : Env),
run (compile_aux (plus e_a e_a_1) c) s env = run c (eval (plus e_a e_a_1) env :: s) env
case Expr.times
e_a e_a_1 : Expr,
e_ih_a : ∀ (c : Code) (s : Stack) (env : Env), run (compile_aux e_a c) s env = run c (eval e_a env :: s) env,
e_ih_a_1 : ∀ (c : Code) (s : Stack) (env : Env), run (compile_aux e_a_1 c) s env = run c (eval e_a_1 env :: s) env
⊢ ∀ (c : Code) (s : Stack) (env : Env),
run (compile_aux (times e_a e_a_1) c) s env = run c (eval (times e_a e_a_1) env :: s) env
We see that we have four cases, one for each of the constructors, and in the recursive cases for plus and times I have induction hypotheses which say that my theorem holds for the left and right subexpressions.
I don’t want to use the names generated by Lean (like e_ih_a_1), but using with here would also get a bit complicated given all the names we are using. Hence I am going to use case for each of the cases, which allows me to introduce the variables separately. I also have to use {..} to turn the proof for each case into a block.
Hence our proof is going to look like this:
lemma compile_aux_ok : ∀ e : Expr, ∀ c : Code, ∀ s : Stack, ∀ env : Env,
  run (compile_aux e c) s env = run c ((eval e env) :: s) env :=
begin
  assume e,
  induction e,
  case const : n
  { sorry, },
  case var : name
  { sorry, },
  case plus : l r ih_l ih_r
  { sorry, },
  case times : l r ih_l ih_r
  { sorry, }
end
The cases for const and var are easy: we just need to appeal to reflexivity. The cases for plus and times are virtually identical (another case for refactoring, in this case of a proof). Let’s have a look at plus:
l r : Expr,
ih_l : ∀ (c : Code) (s : Stack) (env : Env), run (compile_aux l c) s env = run c (eval l env :: s) env,
ih_r : ∀ (c : Code) (s : Stack) (env : Env), run (compile_aux r c) s env = run c (eval r env :: s) env,
c : Code,
s : Stack,
env : Env
⊢ run (compile_aux (plus l r) c) s env = run c (eval (plus l r) env :: s) env
By unfolding the definition of compile_aux (plus l r) c we obtain:
run (compile_aux (plus l r) c) s env
= run (compile_aux l (compile_aux r (add :: c))) s env
And now we can use ih_l to push the value of l onto the stack:
... = run (compile_aux r (add :: c)) ((eval l env) :: s) env
So indeed the value of l has landed on the stack, and now we can use ih_r:
... = run (add :: c) ((eval r env) :: (eval l env) :: s) env
We are almost done; we can run run for one step:
... = run c (((eval l env) + (eval r env)) :: s) env
And now, working backwards from the definition of eval, we arrive at the other side of the equation:
... = run c (eval (plus l r) env :: s) env
Ok, here is the proof so far, using calc for the reasoning above:
lemma compile_aux_ok : ∀ e : Expr, ∀ c : Code, ∀ s : Stack, ∀ env : Env,
  run (compile_aux e c) s env = run c ((eval e env) :: s) env :=
begin
  assume e,
  induction e,
  case const : n
  { assume c s env,
    reflexivity, },
  case var : name
  { assume c s env,
    reflexivity, },
  case plus : l r ih_l ih_r
  { assume c s env,
    dsimp [compile_aux],
    calc
      run (compile_aux (plus l r) c) s env
          = run (compile_aux l (compile_aux r (add :: c))) s env : by refl
      ... = run (compile_aux r (add :: c)) ((eval l env) :: s) env : by rewrite ih_l
      ... = run (add :: c) ((eval r env) :: (eval l env) :: s) env : by rewrite ih_r
      ... = run c (((eval l env) + (eval r env)) :: s) env : by refl
      ... = run c (eval (plus l r) env :: s) env : by refl, },
  case times : l r ih_l ih_r
  { sorry, }
end
I leave it to you to fill in the case for times. You may have noticed that I didn’t introduce all the assumptions at the beginning. Could I have done the proof starting with:
lemma compile_aux_ok : ∀ e : Expr, ∀ c : Code, ∀ s : Stack, ∀ env : Env,
  run (compile_aux e c) s env = run c ((eval e env) :: s) env :=
begin
  assume e c s env,
  induction e,
  sorry,
end
It seems I would have avoided the repeated assume c s env, or is there a problem? Try it out.
8.2. Tree sort¶
Finally, we will look at another application of trees: sorting. The algorithm I am describing is tree sort and, as I already said, it is a variation of quicksort.
The idea is that we turn a list like our favorite example
[6,3,8,2,3]
into a tree like this one:
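Written with the leaf and node constructors of the Tree type introduced below, the tree in question looks like this (assuming I have traced the insertions correctly - you can check it against #reduce list2tree [6,3,8,2,3] later in this section):

```lean
-- 3 at the root; 2 and 3 in the left subtree, 6 and 8 in the right
node (node leaf 2 (node leaf 3 leaf))
     3
     (node (node leaf 6 leaf) 8 leaf)
```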
The nodes of the tree are labelled with numbers, and the tree is sorted in the sense that at each node all the labels in the left subtree are less than or equal to the number at the current node, and all the ones in the right subtree are greater than or equal to it. Once we flatten this tree into a list we get the sorted list [2,3,3,6,8].
8.2.1. Implementing tree sort¶
First of all we need to define the type of binary trees with nodes labelled with natural numbers:
inductive Tree : Type
| leaf : Tree
| node : Tree → ℕ → Tree → Tree
To build a sorted tree from a list we need to write a function that inserts an element into a sorted tree, preserving sortedness. We define this function by recursion over trees:
def ins : ℕ → Tree → Tree
| n leaf := node leaf n leaf
| n (node l m r) :=
if ble n m
then node (ins n l) m r
else node l m (ins n r)
Here we use the function ble, which we have already seen earlier, to decide whether to recursively insert the number into the left or the right subtree.
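As a reminder, ble is the boolean comparison from the earlier chapter; a definition along these lines (if your version differs in naming or form, use that one instead):

```lean
-- boolean "less than or equal" on ℕ, as used by ins:
-- 0 is below everything; otherwise compare the predecessors
def ble : ℕ → ℕ → bool
| 0 m := tt
| (n+1) 0 := ff
| (n+1) (m+1) := ble n m
```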
To turn a list into a sorted tree we need to fold ins
over the
list, mapping the empty list to a leaf.
def list2tree : list ℕ → Tree
| [] := leaf
| (n :: ns) := ins n (list2tree ns)
#reduce list2tree [6,3,8,2,3]
Now all that is left to do is to implement a function that flattens a tree into a list:
def tree2list : Tree → list ℕ
| leaf := []
| (node l m r) := tree2list l ++ m :: tree2list r
Putting both together, we have constructed a sorting function on lists - tree sort:
def sort (ns : list ℕ) : list ℕ
:= tree2list (list2tree ns)
#reduce (sort [6,3,8,2,3])
8.2.2. Verifying tree sort¶
To verify that tree sort returns a sorted list, we have to specify what a sorted tree is. To do this we need to be able to say things like: all the numbers in a tree are smaller (or greater) than a given number. We can do this even more generally by defining a higher order predicate that applies a given predicate to all nodes of a tree:
inductive AllTree (P : ℕ → Prop) : Tree → Prop
| allLeaf : AllTree leaf
| allNode : ∀ l r : Tree, ∀ m : ℕ,
AllTree l → P m → AllTree r → AllTree (node l m r)
That is, AllTree P t holds if the predicate P holds for all the numbers in the nodes of t.
Using AllTree we can define SortedTree:
inductive SortedTree : Tree → Prop
| sortedLeaf : SortedTree leaf
| sortedNode : ∀ l r : Tree, ∀ m : ℕ,
SortedTree l → AllTree (λ x:ℕ, x ≤ m) l
→ SortedTree r → AllTree (λ x:ℕ, m ≤ x) r
→ SortedTree (node l m r)
We are now ready to state the correctness of sort, which is the same as for insertion sort, using the predicate Sorted on lists that we defined in the previous chapter:
theorem tree_sort_sorts : ∀ ns : list ℕ, Sorted (sort ns) :=
begin
sorry,
end
It is not difficult to identify the two lemmas we need to show:

- list2tree produces a sorted tree (list2tree_lem)
- tree2list maps a sorted tree into a sorted list (tree2list_lem)
Hence the top-level structure of our proof looks like this:
lemma list2tree_lem : forall l : list ℕ, SortedTree (list2tree l) :=
begin
sorry,
end
lemma tree2list_lem : ∀ t : Tree, SortedTree t → Sorted (tree2list t) :=
begin
sorry
end
theorem tree_sort_sorts : ∀ ns : list ℕ, Sorted (sort ns) :=
begin
assume ns,
dsimp [sort],
apply tree2list_lem,
apply list2tree_lem,
end
Since you have now seen enough proofs, I will omit the gory details and only tell you the lemmas (stepping stones). First of all, we want to prove list2tree_lem by induction over lists. Hence another lemma pops up:
lemma ins_lem : ∀ t : Tree,∀ n:ℕ,SortedTree t → SortedTree (ins n t) :=
begin
sorry,
end
This we need to prove by induction over trees. At some point we need a lemma about the interaction of ins with AllTree; I used the following:
lemma insAllLem : ∀ P : ℕ → Prop, ∀ t : Tree, ∀ n : ℕ,
AllTree P t → P n → AllTree P (ins n t) :=
begin
sorry,
end
Again, this just requires a tree induction. To prove the other direction it is helpful to also introduce a higher order predicate for lists:
inductive AllList (P : ℕ → Prop) : list ℕ → Prop
| allListNil : AllList []
| allListCons : ∀ n : ℕ, ∀ ns : list ℕ, P n → AllList ns → AllList (n :: ns)
And then I prove a lemma:
lemma AllTree2list : ∀ P : ℕ → Prop, ∀ t : Tree,
AllTree P t → AllList P (tree2list t) :=
begin
sorry,
end
To complete the proof of tree2list_lem I needed some additional lemmas about Sorted and Le_list, but you may find a different path.
8.2.3. Tree sort and permutation¶
As before for insertion sort, we also need to show that tree sort permutes its input. The proof is actually very similar to the one for insertion sort; we just need to adapt the lemma for ins:
lemma ins_inserts : ∀ n : ℕ,∀ t : Tree,
Insert n (tree2list t) (tree2list (ins n t)) :=
begin
sorry
end
theorem sort_perm : ∀ ns : list ℕ , Perm ns (sort ns) :=
begin
assume ns,
induction ns,
apply perm_nil,
apply perm_cons,
apply ns_ih,
apply ins_inserts,
end
To show ins_inserts I needed two lemmas about Insert and ++:
lemma insert_appl : ∀ n:ℕ, ∀ ms nms is : list ℕ,
Insert n ms nms → Insert n (is ++ ms) (is ++ nms) :=
begin
sorry,
end
lemma insert_appr : ∀ n:ℕ, ∀ ms is nms : list ℕ,
Insert n ms nms → Insert n (ms ++ is) (nms ++ is) :=
begin
sorry,
end
Both can be shown by induction over lists but the choice of which list to do induction over is crucial.
8.2.4. Relation to quicksort¶
I have already mentioned that tree sort is basically quicksort. How can this be, you ask, since quicksort doesn’t actually use any trees? Here is quicksort in Lean:
def split : ℕ → list ℕ → list ℕ × list ℕ
| n [] := ([] , [])
| n (m :: ns) := match split n ns with (l,r) :=
if ble n m
then (m :: l, r)
else (l, m::r) end
def qsort : list ℕ → list ℕ
| [] := []
| (n :: ms) := match split n ms with (l,r) :=
(qsort l)++(n::(qsort r)) end
#eval (qsort [6,3,8,2,3])
The program uses × and match, which I haven’t explained, but I hope it is obvious. Lean isn’t happy about the recursive definition of qsort, because it cannot see that the recursive calls are on smaller lists. This can be fixed by using well-founded recursion, but that is beyond the scope of this course. However, Lean is happy running the program using #eval.
We can get from tree sort to quicksort by a process called program fusion. In a nutshell: the function list2tree produces a tree which is consumed by tree2list, but we can avoid the creation of the intermediate tree by fusing the two together, hence arriving at quicksort.
Can you see how to tree-ify merge sort? Hint: in this case you need to use trees where the leaves contain the data, not the nodes.