6. The Natural Numbers
We have already used the natural numbers (ℕ) in examples but now we will formally define them. We will in spirit follow Giuseppe Peano, who codified the laws of the natural numbers using predicate logic in the late 19th century. This is referred to as Peano Arithmetic.

Giuseppe Peano (1858 - 1932)
Peano viewed the natural numbers as generated from zero (0 = zero) by the successor operation succ, i.e. 1 = succ 0, 2 = succ 1 and so on. In Lean this corresponds to the following inductive definition:
inductive nat : Type
| zero : nat
| succ : nat → nat
This declaration means:
- There is a new type nat : Type.
- There are two ways to construct elements: zero : nat, and given n : nat we have succ n : nat.
- All the elements of nat can be generated by zero and then applying succ a finite number of times.
- zero and succ n are different elements.
- succ is injective, i.e. given succ m = succ n we know that m = n.
We adopt the notation ℕ = nat, and Lean also automatically translates the usual decimal notation into elements of ℕ. This is convenient because otherwise it would be quite cumbersome to write out a number like 1234 : ℕ.
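To make the correspondence concrete, here is a small check (a sketch, using Lean's built-in ℕ, whose numerals unfold to iterated successors):
example : (2 : ℕ) = nat.succ (nat.succ nat.zero) := rfl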
6.1. Basic properties of ℕ
Let’s verify some basic properties of ℕ which actually correspond to some of Peano’s axioms. First of all we want to verify that 0 ≠ succ n, which corresponds to tt ≠ ff for bool. We could apply the same technique as before, but actually contradiction does the job straight away:
example : ∀ n : ℕ, 0 ≠ succ n :=
begin
  assume n h,
  contradiction,
end
Next let’s show that succ is injective. To do this we define a predecessor function pred using pattern matching, which works the same way as for bool:
def pred : ℕ → ℕ
| zero := zero
| (succ n) := n
#reduce (pred 7)
We can test the function using #reduce (pred 7).
We defined pred 0 = 0, which is a bit arbitrary. However, pred does the job to show that succ is injective:
theorem inj_succ : ∀ m n : nat, succ m = succ n → m = n :=
begin
  assume m n h,
  change pred (succ m) = pred (succ n),
  rewrite h,
end
Here I am using change again to replace the goal m = n with pred (succ m) = pred (succ n), exploiting that pred (succ x) = x. On the new goal I can apply rewrite.
However, there is also a tactic called injection which does this automatically for all inductive types and avoids the need to define pred:
example : ∀ m n : nat, succ m = succ n → m = n :=
begin
  assume m n h,
  injection h,
end
injection h can be applied to a hypothesis of the form h : succ m = succ n.
6.2. Structural recursion
You may have already seen recursive programs. When defining functions on ℕ we will need recursion, but unlike the general recursion available in programming languages we will only use structural recursion. That is, when we define a function on the natural numbers we can use the function on n to compute it for succ n. A simple example of this is the double function, which doubles a number:
def double : ℕ → ℕ
| zero := 0
| (succ n) := succ (succ (double n))
#reduce (double 7)
Here double (succ n) is succ (succ (double n)). That is, basically every succ is replaced by two succs. For example:
double 3
= double (succ 2)
= succ (succ (double 2))
= succ (succ (double (succ 1)))
= succ (succ (succ (succ (double 1))))
= succ (succ (succ (succ (double (succ 0)))))
= succ (succ (succ (succ (succ (succ (double 0))))))
= succ (succ (succ (succ (succ (succ (double zero))))))
= succ (succ (succ (succ (succ (succ zero)))))
= 6
That we are allowed to use recursion at all is really a consequence of the idea that every natural number can be obtained by applying succ a finite number of times. Hence when we run the recursive program it will terminate in a finite number of steps. Other uses of recursion are not allowed; for example, we are not allowed to write:
def bad : ℕ → ℕ
| zero := zero
| (succ n) := succ (bad (succ n))
Indeed this function would not terminate, and Lean complains about it. However, structural recursion is actually a bit more general than what I just said; for example, we can define the inverse of double (which is division by 2 without remainder):
def half : ℕ → ℕ
| zero := zero
| (succ zero) := zero
| (succ (succ n)) := succ (half n)
#reduce (half 7)
#reduce (half (double 7))
We define half, which replaces every two succs by one. In the recursion we use half n to compute half (succ (succ n)), because it is clear that n is structurally smaller than succ (succ n).
For more sophisticated uses of recursion we may have to prove that the recursion is actually terminating but we leave this for the advanced wizard level.
6.3. Induction
Proof by induction is very closely related to the structural recursion we have just seen; it is basically the same idea but for proofs. As an example let’s actually prove that half is the inverse of double:
theorem half_double : ∀ n : ℕ , half (double n) = n :=
begin
  assume n,
  induction n with n' ih,
  refl,
  dsimp [double,half],
  apply congr_arg succ,
  exact ih,
end
After assume n we are in the following state:
n : ℕ
⊢ half (double n) = n
Now induction n works very similarly to cases (which we can also use on natural numbers), but it gives us an extra assumption when proving the successor case, which I have labelled ih for induction hypothesis. So after induction n we have two subgoals:
2 goals
case nat.zero
⊢ half (double 0) = 0
case nat.succ
n' : ℕ,
ih : half (double n') = n'
⊢ half (double (succ n')) = succ n'
The first one is easily disposed of using refl, because half (double 0) = half 0 = 0. The successor case is more interesting:
n' : ℕ,
ih : half (double n') = n'
⊢ half (double (succ n')) = succ n'
Here we see the extra assumption that what we want to prove for succ n' already holds for n'. Now we can apply the definitions of double and half using dsimp [double,half]:
n' : ℕ,
ih : half (double n') = n'
⊢ succ (half (double n')) = succ n'
And now we can use that succ preserves equality by appealing to congr_arg succ:
n' : ℕ,
ih : half (double n') = n'
⊢ half (double n') = n'
And we are left with a goal that exactly matches the induction hypothesis; hence we are done using exact ih.
This is a very easy inductive proof but it serves to show the general idea. Because every number is finitely generated from zero and succ, we could in principle run an inductive proof for any given number by repeating the inductive step as many times as there are succs, obtaining a concrete proof without using induction.
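To illustrate this point, here is a small check (a sketch): for a fixed numeral the statement holds by pure computation, and no induction is needed.
example : half (double 2) = 2 := rfl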
There is nothing mysterious or difficult about induction itself, but it is the main workhorse for proving properties of inductive types like ℕ. Not all proofs are as easy as this one; in many cases we have to think a bit to generalize the goal we want to prove so that the induction goes through, or we may have to prove an auxiliary theorem (a lemma) first to make progress. So it is not induction itself that is difficult, but it is used in many situations which require some thinking.
The best way to learn how to do proofs using induction is to look at examples, so let’s just do this.
6.4. Addition and its properties
While addition is an operation which you may have learned already in kindergarten, it still needs to be defined. And, horror, its definition already uses recursion. They don’t tell the kids this in kindergarten!
Here is the definition of add:
def add : ℕ → ℕ → ℕ
| m zero := m
| m (succ n) := succ (add m n)
So add m n applies n succs to m. We define m + n as add m n. So for example:
3 + 2
= add 3 2
= add 3 (succ 1)
= succ (add 3 1)
= succ (add 3 (succ 0))
= succ (succ (add 3 0))
= succ (succ (add 3 zero))
= succ (succ 3)
= 5
Lean defines addition by recursion over the 2nd argument, while I actually think it is better to recurse over the first one, even though the output is the same because add is commutative (m + n = n + m), as we will show soon. In any case the reasons are a bit involved, but I am happy to explain why if you ask me. However, let’s stick with the Lean definition.
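For comparison, here is what the alternative definition recursing over the first argument could look like (a sketch; I call it add' to avoid a clash with the definition above):
def add' : ℕ → ℕ → ℕ
| zero n := n
| (succ m) n := succ (add' m n)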
Now what are the basic algebraic properties of +? First of all, 0 is a neutral element, that is n + 0 = n and 0 + n = n. We may think that we only need to prove one of them since addition is commutative, but actually we will need exactly this property when proving commutativity. It turns out that one of the two is trivial while the other needs induction. First the easy one:
theorem add_rneutr : ∀ n : ℕ, n + 0 = n :=
begin
  assume n,
  refl,
end
This theorem is easy because n + 0 = n holds by definition of add. However, this is not the case for the other one, which does need induction:
theorem add_lneutr : ∀ n : ℕ, 0 + n = n :=
begin
  assume n,
  induction n with n' ih,
  refl,
  apply congr_arg succ,
  exact ih,
end
Another important property of addition is that brackets don’t matter, that is (l + m) + n = l + (m + n); this is called associativity. We need induction to prove this, but there is a lot of choice: do we do induction over l, m or n? And yes, it does matter!
If you remember the two definitions of band, you should know that we need to analyse the argument which is used in the pattern matching. That is, we need to do induction on the 2nd argument of addition, which is n, because addition matches on this (had we used my preferred definition of addition then it would be l).
theorem add_assoc : ∀ l m n : ℕ , (l + m) + n = l + (m + n) :=
begin
  assume l m n,
  induction n with n' ih,
  refl,
  dsimp [(+), nat.add],
  apply congr_arg succ,
  exact ih,
end
Already the 0 case only works if we choose n:
l m : ℕ
⊢ l + m + 0 = l + (m + 0)
Both sides reduce to l + m; notice that l + m + 0 really means (l + m) + 0, which is equal to l + m by definition. But also l + (m + 0) = l + m because m + 0 = m. Hence we just say refl.
Now in the successor case:
l m n' : ℕ,
ih : l + m + n' = l + (m + n')
⊢ l + m + succ n' = l + (m + succ n')
we observe that:
(l + m) + succ n'
= succ ((l + m) + n')
and:
l + (m + succ n')
= l + (succ (m + n'))
= succ (l + (m + n'))
Hence after using congr_arg succ we are back to the induction hypothesis.
We have shown the following facts about + and 0:
- 0 is right neutral: n + 0 = n (add_rneutr),
- 0 is left neutral: 0 + n = n (add_lneutr),
- + is associative: (l + m) + n = l + (m + n) (add_assoc).
Such a structure is called a monoid. If you look this up on Wikipedia, don’t get distracted by the philosophical notion with the same name. Do you know some other monoids?
However, we have not yet talked about commutativity, which isn’t required for a monoid. Ok, how do we prove m + n = n + m? The choice of variable isn’t clear here because they swap places, hence we may as well just go for m. Here is an attempt (a sketch of how it might start; the successor case gets stuck):
The 0 case looks like this:
n : ℕ
⊢ 0 + n = n + 0
Clearly the right hand side reduces to n, and hence we just need to apply add_lneutr. But the succ case is less clear:
n m' : ℕ,
ih : m' + n = n + m'
⊢ succ m' + n = n + succ m'
Again the right hand side reduces to succ (n + m'), but there isn’t much we can do with the left hand side. This is a case for a lemma which states that the alternative definition of + holds as a theorem, provable by induction:
lemma add_succ_lem : ∀ m n : ℕ, succ m + n = succ (m + n) :=
begin
  assume m n,
  induction n with n' ih,
  refl,
  apply congr_arg succ,
  exact ih,
end
Now we can complete the proof of add_comm:
theorem add_comm : ∀ m n : ℕ , m + n = n + m :=
begin
  assume m n,
  induction m with m' ih,
  apply add_lneutr,
  transitivity,
  apply add_succ_lem,
  apply congr_arg succ,
  exact ih,
end
We need to apply two steps here, which is why we use transitivity. In the successor case we are in the following state:
n m' : ℕ,
ih : m' + n = n + m'
⊢ succ m' + n = n + succ m'
And after transitivity we see:
2 goals
n m' : ℕ,
ih : m' + n = n + m'
⊢ succ m' + n = ?m_1
n m' : ℕ,
ih : m' + n = n + m'
⊢ ?m_1 = n + succ m'
The ?m_1 is a placeholder which can be filled in later. We use our lemma, which forces ?m_1 to be succ (m' + n), and we are left with:
n m' : ℕ,
ih : m' + n = n + m'
⊢ succ (m' + n) = n + succ m'
which can now be reduced to ih by using congr_arg succ.
Using transitivity and tactics to do equational proofs isn’t very readable; hence there is the calc syntax, which looks like a usual equational derivation:
theorem add_comm : ∀ m n : ℕ , m + n = n + m :=
begin
  assume m n,
  induction m with m' ih,
  apply add_lneutr,
  calc
    succ m' + n = succ (m' + n) : by apply add_succ_lem
    ... = succ (n + m') : by apply congr_arg succ ih
    ... = n + succ m' : by refl
end
The disadvantage is that we have to construct each step by hand, but it is certainly more readable. Also we can make the unfolding of definitions explicit by putting in trivial steps using refl.
Together with the previous facts we have now shown that ℕ with + and 0 forms a commutative monoid.
Mathematicians prefer it if you also have inverses, as for the integers, where for every integer i there is -i such that i + (-i) = 0 and (-i) + i = 0. Such a structure is called a group.
6.5. Multiplication and its properties
Ok, we are doing things slowly, but I won’t go into as much detail this time and will leave the fun to you. First of all let’s define multiplication:
def mul : ℕ → ℕ → ℕ
| m 0 := 0
| m (succ n) := (mul m n) + m
And as usual we define x * y to stand for mul x y. As + was repeated succ, * is repeated +. That is, m * n is m added n times, for example:
3 * 2
= mul 3 2
= mul 3 (succ 1)
= mul 3 1 + 3
= mul 3 (succ 0) + 3
= mul 3 0 + 3 + 3
= 0 + 3 + 3
= 6
What are the properties of multiplication? First we note that it also forms a commutative monoid, with 1 now playing the role that 0 played for +: it is the neutral element. We can show:
theorem mult_rneutr : ∀ n : ℕ, n * 1 = n :=
begin
  sorry,
end
theorem mult_lneutr : ∀ n : ℕ, 1 * n = n :=
begin
  sorry,
end
theorem mult_assoc : ∀ l m n : ℕ , (l * m) * n = l * (m * n) :=
begin
  sorry,
end
theorem mult_comm : ∀ m n : ℕ , m * n = n * m :=
begin
  sorry,
end
This time I leave it as an exercise to prove these properties. You will certainly need to use the properties of addition we have already shown, and the order in which you prove these propositions may not be the order in which I state them. Indeed, you may want to take the next properties into account as well.
Apart from addition and multiplication both being commutative monoids, they also interact in an interesting way. You will have seen this in school when you are asked to simplify an expression of the form x*(y + z) by multiplying out to x*y + x*z. This property is called distributivity. There is also a case for an empty addition, i.e. when one argument is 0 we want that x*0 = 0. And if we don’t want to assume commutativity (indeed we may use these properties to prove it) we also need the symmetric cases. That is, we have the following properties:
theorem mult_zero_l : ∀ n : ℕ , 0 * n = 0 :=
begin
  sorry,
end
theorem mult_zero_r : ∀ n : ℕ , n * 0 = 0 :=
begin
  sorry,
end
theorem mult_distr_r : ∀ l m n : ℕ , l * (m + n) = l * m + l * n :=
begin
  sorry,
end
theorem mult_distr_l : ∀ l m n : ℕ , (m + n) * l = m * l + n * l :=
begin
  sorry,
end
This structure (two monoids together with the distributivity laws just stated, where we do need to require that addition is commutative) is called a semiring; in this case, where multiplication is commutative too, it is a commutative semiring. Rings and semirings are very central structures in algebra and they are closely related to polynomials (that is, expressions like 7*x^2 + x + 5, but possibly with higher exponents and several variables).
6.6. Some Algebra
Once we have established that multiplication and addition form a commutative semiring we can establish many well-known equations, for example the well-known binomial equality \((x + y)^2 = x^2 + 2xy + y^2\). To state this, let’s define exponentiation, which is just repeated multiplication, in the same way as multiplication is repeated addition (you may notice a pattern):
def exp : ℕ → ℕ → ℕ
| n zero := 1
| n (succ m) := exp n m * n
In this case we have to do recursion over the 2nd argument, because exponentiation, unlike addition and multiplication, is not commutative.
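A quick test in the style of the earlier ones (a sketch):
#reduce (exp 2 3)   -- expect 8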
The Lean prelude also introduces the standard notation for exponentiation, i.e. we define x^y = exp x y. Now we can state the binomial theorem:
theorem binom : ∀ x y : ℕ, (x + y)^2 = x^2 + 2*x*y + y^2 :=
begin
  sorry,
end
Now you can prove this using the properties we have established above, but it is very laborious. You should do it at least once, but luckily some clever people have implemented a tactic which does this automatically:
theorem binom : ∀ x y : ℕ, (x + y)^2 = x^2 + 2*x*y + y^2 :=
begin
  assume x y,
  ring,
end
That is, ring automatically solves any equation which can be shown using the fact that multiplication and addition form a (semi-)ring. How does it work? You may not want to know. The basic idea is to reduce both expressions to polynomials, and then it is easy to check whether two polynomials are equal. The nice thing about a system like Lean is that we also generate the proof from basic principles; there is no danger of cheating.
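For example, ring dispatches larger identities of the same flavour straight away (a sketch, showing the cubic version of the binomial equality):
example : ∀ x y : ℕ, (x + y)^3 = x^3 + 3*x^2*y + 3*x*y^2 + y^3 :=
begin
  assume x y,
  ring,
end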
We have just opened the door to the exciting realm of algebra. There is not enough time in this course to cover this in any depth, but here is a table with a quick overview of different algebraic structures:
| Example | Algebraic structure |
|---|---|
| Natural numbers (ℕ) | Semiring |
| Integers (ℤ) | Ring |
| Rational numbers (ℚ) | Field |
| Real numbers (ℝ) | Complete Field |
| Complex numbers (ℂ) | Algebraically complete Field |
A ring is a semiring with additive inverses, that is, for every number \(x:ℤ\) there is \(-x : ℤ\) such that \(x + (-x) = 0\) and \(-x + x = 0\). Here subtraction is not a primitive operation but is defined by adding the additive inverse: \(x - y := x + (-y)\).
A field also has multiplicative inverses for all numbers different from \(0\). The simplest example is the rational numbers, i.e. fractions. For every \(p : ℚ\) different from \(0\) there is \(p^{-1} : ℚ\) such that \(p * p^{-1} = 1\) and \(p^{-1} * p = 1\). Note that for a fraction \(\frac{a}{b} : ℚ\) the multiplicative inverse is \((\frac{a}{b})^{-1} = \frac{b}{a}\).
The real numbers include numbers like \(\sqrt{2} : ℝ\) and \(\pi : ℝ\); they have the additional property that any converging infinite sequence of real numbers has a unique limit. The complex numbers include \(i = \sqrt{-1} : ℂ\) and have the additional feature that every polynomial equation (e.g. \(x^2 + 1 = 0\)) has a solution; this is called being algebraically complete.
6.7. Ordering the numbers
Next we will look at the ≤ relation, which defines a partial order on the numbers. We say that m ≤ n if there is a number k : ℕ such that k + m = n. In full Lean glory that is:
def le (m n : ℕ) : Prop :=
  ∃ k : ℕ , k + m = n

local notation (name := le) x ≤ y := le x y
I hasten to add that this is not the official definition of ≤ in Lean, because Lean uses an inductive relation, and I don’t want to introduce this concept just now.
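To see the definition in action, here is a small check (a sketch): 3 ≤ 5 holds because it is witnessed by k = 2, since 2 + 3 = 5.
example : 3 ≤ 5 :=
begin
  existsi 2,
  refl,
end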
There is a precise meaning of the phrase partial order: a partial order is a relation which is reflexive, transitive and antisymmetric. Maybe you remember that we have already met the words reflexive and transitive when we looked at equality, which is an equivalence relation. Here is a reminder what these mean, this time applied to ≤:
- reflexive (∀ x : A, x ≤ x),
- transitive (∀ x y z : A, x ≤ y → y ≤ z → x ≤ z)
For an equivalence relation we also demanded that it should be symmetric (∀ x y : A, x = y → y = x), but this doesn’t apply to ≤. Almost the opposite is true: the only situation in which we have x ≤ y and y ≤ x is if they are actually equal. That is, we have
- antisymmetry (∀ x y : ℕ , x ≤ y → y ≤ x → x = y)
A relation which has these three properties (reflexive, transitive and antisymmetric) is called a partial order. It is not hard to prove that ≤ is reflexive and transitive:
theorem le_refl : ∀ x : ℕ , x ≤ x :=
begin
  assume x,
  existsi 0,
  ring,
end
theorem le_trans : ∀ x y z : ℕ , x ≤ y → y ≤ z → x ≤ z :=
begin
  assume x y z xy yz,
  cases xy with k p,
  cases yz with l q,
  existsi (k+l),
  rewrite ← q,
  rewrite ← p,
  ring,
end
For reflexivity we choose k = 0 and exploit that 0 + x = x, and for transitivity we add the differences, exploiting associativity. I am leaving all the equational reasoning to the ring tactic, avoiding unnecessary detail in the proof.
The third one, antisymmetry, I found a bit harder to prove, and I needed to show some lemmas first. But I don’t want to spoil the fun, so I leave this to you:
theorem anti_sym : ∀ x y : ℕ , x ≤ y → y ≤ x → x = y :=
begin
  sorry,
end
We can also define < by saying that m < n means m + 1 ≤ n:
def lt (m n : ℕ) : Prop :=
  m + 1 ≤ n

local notation (name := lt) x < y := lt x y
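Again a small check (a sketch): 3 < 5 unfolds to 3 + 1 ≤ 5, which is witnessed by k = 1.
example : 3 < 5 :=
begin
  existsi 1,
  refl,
end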
I am not going to discuss < in much detail now: it is antireflexive (∀ n : ℕ, ¬ (n < n)), transitive and strictly antisymmetric (∀ m n : ℕ, m < n → n < m → false). Indeed, antireflexivity and strict antisymmetry follow from another important property of <, namely that it is well-founded. This means that any sequence starting with a number and always choosing smaller numbers, like 10 > 5 > 3 > 0, will eventually terminate. This property is important for proving the termination of algorithms which are not primitive recursive. We are not going to define well-foundedness now because it goes a bit beyond what we have learned so far.
Another important property of < is trichotomy, that is, for any two numbers either one is greater than the other, or they are equal, or one is smaller than the other:
theorem trich : ∀ m n : ℕ, m < n ∨ m = n ∨ n < m :=
begin
  sorry
end
A relation that is transitive, trichotomous and well-founded is called a well-order. Well-orderings are very useful, and there is a famous theorem, conjectured by Cantor and proved by Zermelo, that every set can be well-ordered. However, this is not accepted by intuitionists, and indeed no explicit well-ordering of the real numbers is known.
6.8. Decidability
Equality of natural numbers is decidable, that is, we can actually implement it as a function into bool:
def eq_nat : ℕ → ℕ → bool
| zero zero := tt
| zero (succ n) := ff
| (succ m) zero := ff
| (succ m) (succ n) := eq_nat m n
We need to show that eq_nat indeed decides equality. First of all we show that eq_nat returns tt for equal numbers:
lemma eq_nat_ok_1 : ∀ n : ℕ , eq_nat n n = tt :=
begin
  assume n,
  induction n with n' ih,
  reflexivity,
  exact ih,
end
On the other hand we can also show that if eq_nat returns tt then its arguments must be equal:
lemma eq_nat_ok_2 : ∀ m n : ℕ, eq_nat m n = tt → m = n :=
begin
  assume m,
  induction m with m' ih_m,
  assume n,
  induction n with n' ih_n,
  assume _,
  reflexivity,
  assume h,
  contradiction,
  assume n,
  induction n with n' ih_n,
  assume h,
  contradiction,
  assume h,
  apply congr_arg succ,
  apply ih_m,
  exact h,
end
Here we need a double induction. Putting everything together we can show that eq_nat decides equality:
theorem eq_nat_ok : ∀ m n : ℕ, m = n ↔ eq_nat m n = tt :=
begin
  assume m n,
  constructor,
  assume h,
  rewrite h,
  apply eq_nat_ok_1,
  apply eq_nat_ok_2,
end
We can do the same for ≤, that is, we implement a function leb : ℕ → ℕ → bool and then show ∀ m n : ℕ, m ≤ n ↔ leb m n = tt.
Certainly not all relations or predicates are decidable. An example of an undecidable relation is equality of functions: given f g : ℕ → ℕ, is f = g? Two functions are equal iff they agree on all arguments, f = g ↔ ∀ n : ℕ, f n = g n. It is clear that we cannot decide this, i.e. define a function into bool, because we would have to check infinitely many inputs.