6. The Natural Numbers

We have already used the natural numbers (ℕ) in examples but now we will formally define them. We will in spirit follow Giuseppe Peano who codified the laws of the natural numbers using predicate logic in the late 19th century. This is referred to as Peano Arithmetic.

[Image: Giuseppe Peano (1858 – 1932)]

Peano viewed the natural numbers as generated from zero (0 = zero) by the successor operation succ, i.e. 1 = succ 0, 2 = succ 1 and so on. In Lean this corresponds to the following inductive definition:

inductive nat : Type
| zero : nat
| succ : nat → nat

This declaration means:

  • There is a new type nat : Type,
  • There are two constructors: zero : nat, and, given n : nat, we have succ n : nat.
  • All the elements of nat can be generated by zero and then applying succ a finite number of times,
  • zero and succ n are different elements,
  • succ is injective, i.e. given succ m = succ n then we know that m = n.

We adopt the notation ℕ = nat and Lean also automatically translates the usual decimal notation into elements of ℕ. This is convenient because otherwise it would be quite cumbersome to write out a number like 1234 : ℕ.
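For example, the numeral 2 is definitionally just succ (succ zero). Here is a quick sanity check (my own hypothetical example, assuming the definitions above and Lean's numeral parsing):

-- the numeral 2 unfolds to a tower of succs
example : (2 : ℕ) = succ (succ zero) := by refl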

6.1. Basic properties of ℕ

Let’s verify some basic properties of ℕ which actually correspond to some of Peano’s axioms. First of all we want to verify that 0 ≠ succ n, which corresponds to tt ≠ ff for bool. We could apply the same technique as before but actually contradiction does the job straight away:

example : ∀ n : ℕ, 0 ≠ succ n :=
begin
  assume n h,
  contradiction,
end

Next let’s show that succ is injective. To do this we define a predecessor function pred using pattern matching which works the same way as for bool:

def pred : ℕ → ℕ
| zero := zero
| (succ n) := n

#reduce (pred 7)

We can test the function using #reduce (pred 7).

We defined pred 0 = 0 which is a bit arbitrary. However, pred does the job to show that succ is injective:

theorem inj_succ : ∀ m n : nat, succ m = succ n → m = n :=
begin
  assume m n h,
  change pred (succ m) = pred (succ n),
  rewrite h,
end

Here I am using change again to replace the goal m = n with pred (succ m) = pred (succ n), exploiting that pred (succ x) = x holds by definition. On the new goal I can apply rewrite h.

However, there is also a tactic called injection which does this automatically for all inductive types, avoiding the need to define pred:

example : ∀ m n : nat, succ m = succ n → m = n :=
begin
  assume m n h,
  injection h,
end

injection h can be applied to a hypothesis of the form h : succ m = succ n.

6.2. Structural recursion

You may have already seen recursive programs. When defining functions on ℕ we will need recursion, but unlike the general recursion available in programming languages we will only use structural recursion. That is, when we define a function on the natural numbers we can use its value at n to compute its value at succ n. A simple example is the double function which doubles a number:

def double : ℕ → ℕ
| zero := 0
| (succ n) := succ (succ (double n))

#reduce (double 7)

Here double (succ n) is succ (succ (double n)). That is, basically every succ is replaced by two succs. For example:

double 3
= double (succ 2)
= succ (succ (double 2))
= succ (succ (double (succ 1)))
= succ (succ (succ (succ (double 1))))
= succ (succ (succ (succ (double (succ 0)))))
= succ (succ (succ (succ (succ (succ (double 0))))))
= succ (succ (succ (succ (succ (succ (double zero))))))
= succ (succ (succ (succ (succ (succ zero)))))
= 6

That we are allowed to use recursion really is a consequence of the idea that every natural number can be obtained by applying succ a finite number of times. Hence when we run the recursive program it will terminate in a finite number of steps. Other uses of recursion are not allowed; for example, we are not allowed to write:

def bad : ℕ → ℕ
| zero := zero
| (succ n) := succ (bad (succ n))

Indeed this function would not terminate, and Lean complains about it. However, structural recursion is actually a bit more general than what I just said; for example, we can define the inverse of double (which is division by 2 without remainder):

def half : ℕ → ℕ
| zero := zero
| (succ zero) := zero
| (succ (succ n)) := succ (half n)

#reduce (half 7)
#reduce (half (double 7))

We define half so that it replaces every two succs by one. In the recursion we use half n to compute half (succ (succ n)), which is fine because n is clearly structurally smaller than succ (succ n).

For more sophisticated uses of recursion we may have to prove that the recursion is actually terminating but we leave this for the advanced wizard level.

6.3. Induction

Proof by induction is very closely related to the structural recursion we have just seen; it is basically the same idea, but for proofs. As an example, let’s actually prove that half is the inverse of double:

theorem half_double : ∀ n : ℕ, half (double n) = n :=
begin
  assume n,
  induction n with n' ih,
  refl,
  dsimp [double,half],
  apply congr_arg succ,
  exact ih,
end

After assume n we are in the following state:

n : ℕ
⊢ half (double n) = n

Now induction n works very similarly to cases (which we can also use on natural numbers) but it gives us an extra assumption when proving the successor case, which I have labelled ih for induction hypothesis. So after induction n we have two subgoals:

2 goals
case nat.zero
⊢ half (double 0) = 0

case nat.succ
n' : ℕ,
ih : half (double n') = n'
⊢ half (double (succ n')) = succ n'

The first one is easily disposed of using refl, because half (double 0) = half 0 = 0. The successor case is more interesting:

n' : ℕ,
ih : half (double n') = n'
⊢ half (double (succ n')) = succ n'

Here we see the extra assumption that what we want to prove for succ n' already holds for n'. Now we can apply the definitions of double and half using dsimp [double,half]:

n' : ℕ,
ih : half (double n') = n'
⊢ succ (half (double n')) = succ n'

And now we can use that succ preserves equality by appealing to congr_arg succ:

n' : ℕ,
ih : half (double n') = n'
⊢ half (double n') = n'

And we are left with a goal that exactly matches the induction hypothesis, hence we are done using exact ih.

This is a very easy inductive proof but it serves to show the general idea. Because every number is finitely generated from zero and succ, we can run an inductive proof for any concrete number by repeating the inductive step as many times as there are succs, obtaining a concrete proof without using induction.
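For instance, for the concrete number 2 the equation holds by pure computation and no induction is needed (a small check of my own, assuming the definitions above):

-- for a concrete numeral, computation alone proves the instance
example : half (double 2) = 2 := by refl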

There is nothing mysterious or difficult about induction itself, but it is the main workhorse for proving properties of inductive types like ℕ. Not all proofs are as easy as this one: in many cases we have to think a bit to generalize the goal we want to prove so that the induction goes through, or we may have to prove an auxiliary theorem (a lemma) first to make progress. So it is not induction itself that is difficult, but it is used in many situations which require some thinking.

The best way to learn how to do proofs using induction is to look at examples, so let’s just do this.

6.4. Addition and its properties

While addition is an operation which you may have learned already in kindergarten, it still needs to be defined. And, horror, its definition already uses recursion. They don’t tell the kids this in kindergarten!

Here is the definition of add:

def add : ℕ → ℕ → ℕ
| m  zero     := m
| m  (succ n) := succ (add m n)

So add m n applies n succs to m. We define m + n as add m n. So for example:

3 + 2
= add 3 2
= add 3 (succ 1)
= succ (add 3 1)
= succ (add 3 (succ 0))
= succ (succ (add 3 0))
= succ (succ (add 3 zero))
= succ (succ 3)
= 5

Lean defines addition by recursion over the 2nd argument, while I actually think it is better to recurse over the first one; the output is the same because add is commutative (m + n = n + m), as we will show soon. The reasons are a bit involved, but I am happy to explain them if you ask me. In any case, let’s stick with the Lean definition.
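For the curious, here is a sketch of what my preferred definition would look like, recursing over the first argument (the name add' is made up for illustration; it is not the Lean definition):

-- hypothetical variant: recursion over the first argument
def add' : ℕ → ℕ → ℕ
| zero     n := n
| (succ m) n := succ (add' m n)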

Now what are the basic algebraic properties of +? First of all, 0 is a neutral element, that is n + 0 = n and 0 + n = n. We may think that we only need to prove one of them since addition is commutative, but actually we will need exactly this property when proving commutativity. It turns out that one of the two is trivial while the other needs induction. First the easy one:

theorem add_rneutr : ∀ n : ℕ, n + 0 = n :=
begin
  assume n,
  refl,
end

This theorem is easy because n + 0 = n holds by definition of add. However, this is not the case for the other one which does need induction:

theorem add_lneutr : ∀ n : ℕ, 0 + n = n :=
begin
  assume n,
  induction n with n' ih,
  refl,
  apply congr_arg succ,
  exact ih,
end


Another important property of addition is that brackets don’t matter, that is (l + m) + n = l + (m + n); this is called associativity. We need induction to prove this, but there is a lot of choice: do we do induction over l, m, or n? And yes, it does matter!

If you remember the two definitions of band, you should know that we need to analyse the argument which is used in the pattern matching. That is, we need to do induction on the 2nd argument of addition, which is n, because addition matches on this (had we used my preferred definition of addition then it would be l).

theorem add_assoc : ∀ l m n : ℕ, (l + m) + n = l + (m + n) :=
begin
  assume l m n,
  induction n with n' ih,
  refl,
  dsimp [(+), nat.add],
  apply congr_arg succ,
  exact ih,
end


Already the 0 case only works if we choose n:

l m : ℕ
⊢ l + m + 0 = l + (m + 0)

Both sides reduce to l + m: notice that l + m + 0 really means (l + m) + 0, which is equal to l + m by definition. But also l + (m + 0) = l + m because m + 0 = m. Hence we just say refl.

Now in the successor case:

l m n' : ℕ,
ih : l + m + n' = l + (m + n')
⊢ l + m + succ n' = l + (m + succ n')

we observe that:

(l + m) + succ n'
= succ ((l + m) + n')

and:

l + (m + succ n')
= l + (succ (m + n'))
= succ (l + (m + n'))

Hence after using congr_arg succ we are back to the induction hypothesis.

We have shown the following facts about + and 0:

  • 0 is right neutral: n + 0 = n (add_rneutr),
  • 0 is left neutral: 0 + n = n (add_lneutr),
  • + is associative: (l + m) + n = l + (m + n) (add_assoc).

Such a structure is called a monoid. If you look this up on Wikipedia, don’t get distracted by the philosophical notion with the same name. Do you know some other monoids?

However, we have not yet talked about commutativity, which isn’t required for a monoid. OK, how do we prove m + n = n + m? The choice of variable to do induction over isn’t clear here because they swap places; hence we may as well just go for m. Here is a sketch of an attempt:
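-- a sketch of the attempt (reconstructed from the goal states below);
-- the succ case will get stuck
example : ∀ m n : ℕ, m + n = n + m :=
begin
  assume m n,
  induction m with m' ih,
  apply add_lneutr,
  sorry,  -- stuck: ⊢ succ m' + n = n + succ m'
end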

The 0 case looks like this:

n : ℕ
⊢ 0 + n = n + 0

Clearly the right hand side reduces to n and hence we just need to apply add_lneutr. But the succ case is less clear:

n m' : ℕ,
ih : m' + n = n + m'
⊢ succ m' + n = n + succ m'

Again the right hand side reduces to succ (n + m') but there isn’t much we can do with the left hand side. This calls for a lemma stating that the succ equation of the alternative definition of + holds as a theorem, provable by induction:

lemma add_succ_lem : ∀ m n : ℕ, succ m + n = succ (m + n) :=
begin
  assume m n,
  induction n with n' ih,
  refl,
  apply congr_arg succ,
  exact ih,
end

Now we can complete the proof of add_comm:

theorem add_comm : ∀ m n : ℕ, m + n = n + m :=
begin
  assume m n,
  induction m with m' ih,
  apply add_lneutr,
  transitivity,
  apply add_succ_lem,
  apply congr_arg succ,
  exact ih,
end

We need to combine two steps here, which is why we use transitivity. Before transitivity the state is:

n m' : ℕ,
ih : m' + n = n + m'
⊢ succ m' + n = n + succ m'

And after transitivity we see:

2 goals
n m' : ℕ,
ih : m' + n = n + m'
⊢ succ m' + n = ?m_1

n m' : ℕ,
ih : m' + n = n + m'
⊢ ?m_1 = n + succ m'

The ?m_1 is a placeholder which can be filled in later. We are using our lemma which forces ?m_1 to be succ (m' + n) and we are left with:

n m' : ℕ,
ih : m' + n = n + m'
⊢ succ (m' + n) = n + succ m'

which now can be reduced to ih by using congr_arg succ.

Using transitivity and tactics to do equational proofs isn’t very readable; hence there is the calc syntax, which looks like a usual equational derivation:

theorem add_comm : ∀ m n : ℕ, m + n = n + m :=
begin
  assume m n,
  induction m with m' ih,
  apply add_lneutr,
  calc
    succ m' + n = succ (m' + n) : by apply add_succ_lem
    ... = succ (n + m') : by apply congr_arg succ ih
    ... = n + succ m' : by refl
end

The disadvantage is that we have to construct each step by hand but it is certainly more readable. Also we can make the unfolding of definitions explicit by putting in trivial steps using refl.

Together with the previous facts we have now shown that ℕ with + and 0 forms a commutative monoid.

Mathematicians prefer it if you also have inverses, as for the integers, where for every integer i there is -i such that i + (-i) = 0 and (-i) + i = 0. Such a structure is called a group.

6.5. Multiplication and its properties

OK, we are doing things slowly, but I won’t go into as much detail this time and will leave the fun to you. First of all let’s define multiplication:

def mul : ℕ → ℕ → ℕ
| m 0     := 0
| m (succ n) := (mul m n) + m

And as usual we define x * y to stand for mul x y. As + was repeated succ, * is repeated +. That is, m * n is m added n times, for example:

3 * 2
= mul 3 2
= mul 3 (succ 1)
= mul 3 1 + 3
= mul 3 (succ 0) + 3
= mul 3 0 + 3 + 3
= 0 + 3 + 3
= 6

What are the properties of multiplication? First we note that it also forms a commutative monoid, with 1 now playing the role that 0 played for +: it is the neutral element. We can show:

theorem mult_rneutr : ∀ n : ℕ, n * 1 = n :=
begin
  sorry,
end

theorem mult_lneutr : ∀ n : ℕ, 1 * n = n :=
begin
  sorry,
end

theorem mult_assoc : ∀ l m n : ℕ, (l * m) * n = l * (m * n) :=
begin
  sorry,
end

theorem mult_comm : ∀ m n : ℕ, m * n = n * m :=
begin
  sorry,
end


This time I leave it as an exercise to prove the properties. You will certainly need to use the properties of addition we have already shown, and the order in which you prove these propositions may not be the order in which I have written them. Indeed, you may want to take the next properties into account as well.

Apart from addition and multiplication both being commutative monoids, they also interact in an interesting way. You will have seen this in school when asked to simplify an expression of the form x*(y + z) by multiplying it out to x*y + x*z. This property is called distributivity. There is also a case for an empty addition, i.e. when one argument is 0 we want that x*0 = 0. And if we don’t want to assume commutativity (indeed we may use these properties to prove it) we also need the symmetric cases. That is, we have the following properties:

theorem mult_zero_l : ∀ n : ℕ, 0 * n = 0 :=
begin
  sorry,
end

theorem mult_zero_r : ∀ n : ℕ, n * 0 = 0 :=
begin
  sorry,
end

theorem mult_distr_r : ∀ l m n : ℕ, l * (m + n) = l * m + l * n :=
begin
  sorry,
end

theorem mult_distr_l : ∀ l m n : ℕ, (m + n) * l = m * l + n * l :=
begin
  sorry,
end

Now this structure (two monoids satisfying the distributivity laws just stated, where we do need to require that addition is commutative) is called a semiring; in this case, where multiplication is commutative too, it is a commutative semiring. Rings and semirings are very central structures in algebra and they are closely related to polynomials (that is, expressions like 7*x^2 + x + 5, but possibly with higher exponents and several variables).

6.6. Some Algebra

Once we have established that multiplication and addition form a commutative semiring, we can establish many well known equations, for example the binomial equality \((x + y)^2 = x^2 + 2xy + y^2\). To state this, let’s define exponentiation, which is just repeated multiplication, in the same way as multiplication is repeated addition (you may notice a pattern):

def exp : ℕ → ℕ → ℕ
| n zero := 1
| n (succ m) := exp n m * n

In this case we have to do recursion over the 2nd argument, because exponentiation, unlike addition and multiplication, is not commutative.
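As a quick check (my own, assuming the definition above), the expected values come out:

#reduce (exp 2 3)   -- expect 8
#reduce (exp 3 2)   -- expect 9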

The Lean prelude also introduces the standard notation for exponentiation, i.e. we define x^y = exp x y. Now we can state the binomial theorem:

theorem binom : ∀ x y : ℕ, (x + y)^2 = x^2 + 2*x*y + y^2 :=
begin
  sorry,
end

Now you can prove this using the properties we have established above, but it is very laborious. You should do it at least once, but luckily some clever people have implemented a tactic which does this automatically:

theorem binom : ∀ x y : ℕ, (x + y)^2 = x^2 + 2*x*y + y^2 :=
begin
  assume x y,
  ring,
end

That is, ring automatically solves any equation which can be shown using the fact that multiplication and addition form a (semi-)ring. How does it work? You may not want to know. The basic idea is to reduce both sides to a normal form for polynomials, and then it is easy to check whether two polynomials are equal. The nice thing about a system like Lean is that we also generate the proof from basic principles; there is no danger of cheating.
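For example, ring also disposes of the corresponding identity for the third power in one step (a sketch of my own; the name binom3 is made up for illustration):

theorem binom3 : ∀ x y : ℕ, (x + y)^3 = x^3 + 3*x^2*y + 3*x*y^2 + y^3 :=
begin
  assume x y,
  ring,
end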

We have just opened the door to the exciting realm of algebra. There is not enough time in this course to cover this in any depth but here is a table with a quick overview over different algebraic structures:

Example                      Algebraic structure
Natural numbers (ℕ)          Semiring
Integers (ℤ)                 Ring
Rational numbers (ℚ)         Field
Real numbers (ℝ)             Complete field
Complex numbers (ℂ)          Algebraically closed field

A ring is a semiring with additive inverses, that is, for every number \(x:ℤ\) there is \(-x : ℤ\) such that \(x + (-x) = 0\) and \(-x + x = 0\). Here subtraction is not a primitive operation but is defined by adding the additive inverse: \(x - y := x + (-y)\).

A field also has multiplicative inverses for all numbers different from \(0\). The simplest example is the rational numbers, i.e. fractions. For every \(p : ℚ\) different from \(0\) there is \(p^{-1} : ℚ\) such that \(p * p^{-1} = 1\) and \(p^{-1} * p = 1\). Note that for a fraction \(\frac{a}{b} : ℚ\) the multiplicative inverse is \((\frac{a}{b})^{-1} = \frac{b}{a}\).

The real numbers include numbers like \(\sqrt{2} : ℝ\) and \(\pi : ℝ\); they have the additional property that every Cauchy sequence of real numbers (one whose elements get arbitrarily close to each other) has a limit, which is called completeness. The complex numbers include \(i = \sqrt{-1} : ℂ\) and have the additional feature that every non-constant polynomial equation (e.g. \(x^2 + 1 = 0\)) has a solution; this is called being algebraically closed.

6.7. Ordering the numbers

Next we will look at the ≤ relation, which defines a partial order on the numbers. We say that m ≤ n if there is a number k : ℕ such that n = k + m. In full Lean glory that is:

def le (m n : ℕ) : Prop :=
  ∃ k : ℕ, k + m = n

local notation (name := le) x ≤ y := le x y

I hasten to add that this is not the official definition of ≤ in Lean, because they use an inductive relation and I don’t want to introduce this concept just now.
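To see the definition in action, a concrete inequality is proven by exhibiting the difference (a small sketch of my own, assuming the definition and notation above):

-- 3 ≤ 5 because 2 + 3 = 5
example : 3 ≤ 5 :=
begin
  existsi 2,
  refl,
end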

There is a precise meaning of the term partial order: a partial order is a relation which is reflexive, transitive and antisymmetric. Maybe you remember that we have already met the words reflexive and transitive when we looked at equality, which is an equivalence relation. Here is a reminder of what they mean, this time applied to ≤:

  • reflexive (∀ x : A, x ≤ x),
  • transitive (∀ x y z : A, x ≤ y → y ≤ z → x ≤ z).

For an equivalence relation we also demanded that it should be symmetric (∀ x y : A, x = y → y = x), but this doesn’t apply to ≤. Almost the opposite is true: the only situation in which we have both x ≤ y and y ≤ x is when they are actually equal. That is, we have

  • antisymmetry (∀ x y : ℕ, x ≤ y → y ≤ x → x = y).

A relation with these three properties is called a partial order. It is not hard to prove that ≤ is reflexive and transitive:

theorem le_refl : ∀ x : ℕ , x ≤ x :=
begin
  assume x,
  existsi 0,
  ring,
end

theorem le_trans : ∀ x y z : ℕ , x ≤ y → y ≤ z → x ≤ z :=
begin
  assume x y z xy yz,
  cases xy with k p,
  cases yz with l q,
  existsi (k+l),
  rewrite ←q,
  rewrite ←p,
  ring,
end

For reflexivity we choose k = 0 and exploit that 0 + x = x, and for transitivity we add the differences, exploiting associativity. I am leaving all the equational reasoning to the ring tactic, avoiding unnecessary detail in the proofs.

The 3rd one, antisymmetry, I found a bit harder to prove and I needed to show some lemmas first. But I don’t want to spoil the fun and leave this to you:

theorem anti_sym : ∀ x y : ℕ, x ≤ y → y ≤ x → x = y :=
begin
  sorry,
end

We can also define < by saying that m < n means m + 1 ≤ n:

def lt (m n : ℕ) : Prop :=
  m + 1 ≤ n

local notation (name := lt) x < y := lt x y

I am not going to discuss < in much detail now: it is antireflexive (∀ n : ℕ, ¬ (n < n)), transitive and strictly antisymmetric (∀ m n : ℕ, m < n → n < m → false). Indeed, antireflexivity and strict antisymmetry follow from another important property of <: it is well-founded. This means that any sequence which starts with a number and keeps choosing smaller numbers, like 10 > 5 > 3 > 0, will eventually terminate. This property is important to prove the termination of algorithms which are not primitive recursive. We are not going to define well-foundedness now because it goes a bit beyond what we have learned so far. The first two properties are stated in Lean below.
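Stated in Lean, antireflexivity and strict antisymmetry read as follows (a sketch; the theorem names are my own and the proofs are left as exercises):

theorem lt_antirefl : ∀ n : ℕ, ¬ (n < n) :=
begin
  sorry,
end

theorem lt_strict_antisym : ∀ m n : ℕ, m < n → n < m → false :=
begin
  sorry,
end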

Another important property of < is trichotomy, that is, for any two numbers, either one is greater than the other, or they are equal, or one is smaller than the other:

theorem trich : ∀ m n : ℕ, m < n ∨ m = n ∨ n < m :=
begin
  sorry
end

A relation that is transitive, trichotomous and well-founded is called a well-order. Well-orders are very useful, and there is a famous theorem, conjectured by Cantor and proved by Zermelo, that every set can be well-ordered. However, this is not accepted by intuitionists, and indeed no well-ordering of the real numbers is known.

6.8. Decidability

Equality for natural numbers is decidable, that is, we can actually implement it as a function into bool:

def eq_nat : ℕ → ℕ → bool
| zero     zero      := tt
| zero     (succ n)  := ff
| (succ m) zero      := ff
| (succ m) (succ n)  := eq_nat m n

We need to show that eq_nat indeed decides equality. First of all we show that eq_nat returns tt for equal numbers:

lemma eq_nat_ok_1 : ∀ n : ℕ, eq_nat n n = tt :=
begin
  assume n,
  induction n with n' ih,
  reflexivity,
  exact ih,
end

On the other hand, we can also show that if eq_nat returns tt then its arguments must be equal:

lemma eq_nat_ok_2 : ∀ m n : ℕ, eq_nat m n = tt → m = n :=
begin
  assume m,
  induction m with m' ih_m,
  assume n,
  induction n with n' ih_n,
  assume _,
  reflexivity,
  assume h,
  contradiction,
  assume n,
  induction n with n' ih_n,
  assume h,
  contradiction,
  assume h,
  apply congr_arg succ,
  apply ih_m,
  exact h,
end

Here we need a double induction. Putting everything together, we can show that eq_nat decides equality:

theorem eq_nat_ok : ∀ m n : ℕ, m = n ↔ eq_nat m n = tt :=
begin
  assume m n,
  constructor,
  assume h,
  rewrite h,
  apply eq_nat_ok_1,
  apply eq_nat_ok_2,
end

We can do the same for ≤, that is, we implement a function leb : ℕ → ℕ → bool and then show ∀ m n : ℕ, m ≤ n ↔ leb m n = tt; a possible definition of leb is sketched below.
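Here is one possible definition of leb, following the same recursion pattern as eq_nat (the name leb and this particular definition are my choice; the correctness proof is left as an exercise):

def leb : ℕ → ℕ → bool
| zero     n        := tt
| (succ m) zero     := ff
| (succ m) (succ n) := leb m n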

Certainly not all relations or predicates are decidable. An example of an undecidable relation is equality of functions: given f g : ℕ → ℕ, is f = g? Two functions are equal iff they agree on all arguments, f = g ↔ ∀ n : ℕ, f n = g n. It is clear that we cannot decide this, i.e. define the corresponding function into bool, because we would have to check infinitely many inputs.
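To illustrate, the two functions below agree on every input, which we can prove by reasoning, but no program could confirm this by testing, since there are infinitely many inputs (a hypothetical example; f and g are made-up names):

def f (n : ℕ) : ℕ := 0 + n
def g (n : ℕ) : ℕ := n

-- extensionally equal, provable by our earlier theorem but not by exhaustive testing
example : ∀ n : ℕ, f n = g n :=
begin
  assume n,
  dsimp [f, g],
  apply add_lneutr,
end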