Wed 7 Nov 2007
Deriving Mathematics, part 2
Posted at 18:11 +1100
Having started with some thought experiments in the original article, today I want to start to make some concrete progress.
We're starting in the middle of a large map here. Where we are now is the place of knowing about numbers and how we use them in every day life. We can add, multiply, divide, maybe do some trigonometry. In each direction are slight variations on this theme. Removing properties, as we'll do today, or adding in extra features. Some of these directions are no doubt familiar to people, others will be new.
For anybody wanting to follow this series, without having to read any of my other writing, this Atom feed or this web link will contain just the articles in this feed (the Atom feed contains the full text of each article).
Road map
Firstly, a brief road map for the next few posts. We're starting with the integers today and what makes them special. Then we're going to gradually throw away assumptions and see what happens, having a look at other mathematical structures that behave like the integers or might provide clues to what differentiates the integers from other things. After a few posts along these lines, we'll come back to this place and branch out into the fractions (rational numbers) and things like the square root of two (and why it can't be a fraction) and pi. Why are those numbers different? Because the integers and integer-like structures are in some ways not complex enough to demonstrate all things we'd like, we'll also have to periodically retrace our steps as we introduce new ideas: coming back to how those new things work with reduced assumptions.
All the previous paragraph really means is that the concepts will be simple (and get simpler) for a little while before they get "harder", but the objects we'll be examining will still be interesting enough.
What Are Integers?
In my last post, I asked what makes the integers different from, say, one half, or one third. This is a question with a few different answers, depending on what you'd like to focus on. Let's concentrate on a couple of areas for now.
Firstly, if I give you a number, call it x, how can you tell if x is an integer? You have to tell me the algorithm before I tell you x, so it needs to be very general purpose. The reason this problem is hard is because you can't say something like "x is an integer if x divided by y for all possible integer values of y less than x is not always an integer". That argument is circular (attempts to define integers in terms of integers), although it is correct in the sense that it's true for integers, but it may not be true only for integers.
So, let's try this version:
- If x is less than 0, then x is an integer only if -x is an integer. So replace x with -x in the next step.
-
Start with the number 0 and call it y.
- If x is equal to y, x is an integer.
- If x is less than y, x is not an integer.
- Otherwise, add 1 to y and go back to step (2.1).
In informal language, this algorithm starts at 0 and gradually works its way up through the integers, 1, 2, 3, 4, ..., until we either hit x or we go past x (which would mean x lies between two integers and so is not an integer). We had to mess around a little bit to handle negative numbers, but that wasn't too hard.
What we have here is a construction of the set of integers: start with 0, add 1 to get a new number, add 1 to that to get another new number and so on. We also (step (1) in the algorithm) note that every positive integer has a negative counterpart. So we end up constructing the integers as a gradually growing set:
- {0}
- {0, 1, -1}
- {0, 1, -1, 2, -2}
- {0, 1, -1, 2, -2, 3, -3}
- ...
Some people who sat in some pure mathematics classes may recognise the similarity with the set-theoretic definition of the natural numbers here. The difference is that the natural numbers are the non-negative integers (0, 1, 2, 3, ...) and I've assumed that phrases like "add 1" make sense to us without having defined what it means. Remember that I'm starting from what we know now and throwing things away as the assumptions are identified. Certainly, knowing what "add 1" means is an assumption. Think back to our hypothetical "alien on the telephone" I referred to in the first article. How do we explain "add 1" to him? Let's postpone that question for a little bit, although interested readers can refer to the above wikipedia article (plus, possibly the one on Peano axioms, if you want to read ahead. I'll come back to it in the future, though. Pure sets and formal definitions are further down the road than our current location.
Operations On The Integers
So we have collection of numbers and a way to tell, however painfully, whether something is in that collection. This collection is infinitely large, too. We already know that (pick an integer and you can always create a large integer. The collection has no largest element).
We know there are other number than the integers, so it's interesting to ask (again!) what is it that differentiates them from the fractions. One answer was hinted at above: you cannot always divide an integer by another integer and get an integer. 3 divided by 2 is not an integer. However, we can add, subtract and multiply integers and always get an integer. Somehow division is special.
This "feature" of a collection — that you can always perform some operation on one, two or more elements of the collection and end up with another element of the collection — crops up regularly when we study the structure of things in mathematics. It's called closure.
Definition: A set (a bag of elements) is closed under a particular operation if apply that operation to the elements of the set always results in another element of the set.
So the integers are closed under addition, subtraction and multiplication, but not under division.
It's important to notice that closure is a property of both the elements and the operation. Change the operation from addition to division and the integers are no longer closed. Use the operation of division on the set of fractions (instead of integers), except for 0, and the result is always a fraction. So the set of fractions without 0 is closed under division (and addition, subtraction and multiplication).
Closure is a good thing to have, it turns out, not just in real life. Mathematically, it means that our structure (the elements and the operations we are using) is somehow complete and self-contained. We can work with it without worrying about other areas of mathematics.
A bit of experimentation reveals that there aren't many other really common mathematical operations under which the integers are closed. Square roots (and any other roots), for example, don't work (the square root of 2 is not an integer). Sine, cosine, etc are out. The calculus you learnt in high school doesn't make sense when you only have integers (intuitively, this seems obvious, but there are also some very rigorous formal reasons why you can't do this type of calculus on the integers). There are things like exponentiation that, when combined with the integers, form a closed set (2^3 = 2*2*2 = 8, etc).
So let's limit our field of vision to just the integers plus the operations of addition, subtraction and multiplication.
Simplifying: All You Need Is '+'
A common thread in investigating mathematical structures is differentiating between things that are nice to have because they make the work easier and things that are necessary.
In our case, I'm going to claim that we can make do without having three operations. We really only need addition! By need here, I mean that we can defined subtraction and multiplication in terms of addition.
Subtraction is easiest to handle: x - y is exactly the same as x + (-y), where I'm writing -y to mean the negative of y, rather than anything involving subtraction (the duplicate use of the '-' symbol is a slight annoyance to computer scientists and mathematicians everywhere). Also -y is just an element of our set of integers. It's not a new operation; I'm using the fact that we are mentally pairing up 1 and -1 as negatives of each other, and similarly for all other integers. I've now defined subtraction in terms of addition. So no further need for subtraction; it's a handy notation, being shorter than the equivalent version using addition, but it's not a requirement.
Multiplication is easy enough to dispose of if you think back to first grade in primary school (or even earlier). We learnt multiplication as repeated addition. 3 * 5 is 3 + 3 + 3 + 3 + 3. It's also 5 + 5 + 5, which is an equivalence we'll need to think about more later on.
Formally — a mathematician's favourite phrase, meaning "let's write it out as a sentence using symbols — we can define x * y as y + y +... + y, where there are x copies of y in the sum.
(Updated: I inadvertently skipped a bit here. This definition only makes sense for positive values of x. When we come back to multiplication, later on, I'll fill in the missing pieces for negative values of x and when x=0.)
Exercise 1: Notice that this is a strictly one-sided definition: it does explain why x * y is the same as y * x (which is the 3+3+3+3+3=5+5+5 case, above) all the time. You might be wondering why there is even a question here. We all know that "five times three" and "three times five" are the same and we hopefully all get 15 both times. In fact, every time we work out a multiplication, we know we can reverse the order of things. But why is this the case? How do you know you have merely avoided stumbling across the cases where this isn't true? I'm going to leave this as an exercise for the moment. It's actually a little tricky to prove, because it's easy to accidentally make assumptions that aren't true. For the time being, we aren't going to need multiplication much and when we return to it, we'll have a way to view things with fresh eyes.
Exercise 2: As a second exercise for people interested in this sort of thing (and I'll come back to answer this in the future, too): notice that our definition of multiplication doesn't work outside the integers. "Repeated addition" doesn't make sense when we have to use "one third copies of y". So our definition cannot be used to compute "one third times one half". How can you define multiplication for fractions? Why is your definition the same as my one for fractions that are also integers?
The Group Of Integers
We're almost at today's destination. We have a nice (beauty's in the eye of the beholder, I guess) infinite set called the integers. We have an operation called addition, which we'll write as +. We all understand, in some fashion, how + works on the integers. The integers are closed under addition.
Here are some other useful properties that we will be using in the future:
- The order of addition of two numbers does not matter. That is x+y = y+x. We say that addition on the integers is commutative.
- When adding any three numbers, we can do it in any order. That is, x+(y+z) = (x+y)+z. This means that addition on the integers is associative.
- There is a special integer, 0, such that adding 0 to any number equals that number. That is, 0+x = x+0 = x for all integers x (including 0 itself). 0 is called the identity element.
- For every integer x, there is another integer y such that x+y = 0. Because of commutativity, y+x = 0, too, in this case. y is the inverse of x (and vice-versa).
A couple of things to notice here. Firstly, not all operations are commutative. The simplest counter-example is subtraction: 5-3 and 3-5 are not equal.
Secondly, associativity is not true for all operations. Consider exponentiation, for example: 2^(3^2) = 2^9 = 512, whilst (2^3)^2 = 8^2 = 64.
As with closure, these four properties turn up together again and again in mathematical structures. If we have a set and a binary operation (an operation that operates on two elements of the set at a time), such that the set is closed under the operation and together they satisfy the above four points, we call that collection a commutative group.
The integers, together with addition (a binary operation), is a commutative group. The integers, together with multiplication, do not form a group, since point 4 is not satisfied, for example.
We've actually come a reasonably long way in small steps to get to this point. Starting from "the numbers" and everything we can do with them, we've concentrated on the integers, because they seemed interesting. Then we worked out that a few operations didn't really work on the integers, so we stuck with three that did seem useful. Then we were able to throw away two of those, since subtraction and multiplication could both be defined using only addition. Then we (somewhat arbitrarily for now, but I'm setting the stage for next time) defined a commutative group and realised that the integers together with + form a commutative group.
Are There Any Other Groups?
The concept of a group would not be particularly interesting, and certainly not name-worthy, if the integers were the only example possible. So far I haven't motivated why we picked those four definitions, but I'm not skipping ahead too far by doing so.
For now, let me leave you with these two things to contemplate:
- Go back to the example of arithmetic on a clock that I introduced in the first article. We are talking about normal addition on a normal 12-hour clock. Ignore concepts like "a.m." and "p.m.". So 11 plus 4 is 3, etc. Can you see that this is a commutative group? Work through the four points above to check. Pay particular attention to working out which is the identity element here. What hour plays the role that 0 does for the integers?
- Can we remove any integers from our group of integers and still have a group? For example, what about the set of all even integers together with +? What about all odd integers together with +?
Topics: mathematics/series: deriving