Amateur Topologist

Politics, programming, math, and science.

Category: Math

I heard you like spheres: the Banach-Tarski paradox

In most people’s minds, or at least in the minds of those with a mathematical bent, every set of points S in \mathbb{R}^3 (a fancy way of writing three-dimensional Euclidean space) has a volume. Some sets, such as any set consisting of a finite number of points, has zero volume, whereas other sets, such as the set of all points less than one unit away from the origin, have finite but non-zero volume, and still other sets, such as \mathbb{R}^3 itself, have infinite volume. But does every set necessarily have a volume? It turns out the answer is no, even in one-dimensional Euclidean space; I showed this in a previous post. Instead, I’ll show you the Banach-Tarski paradox, which states that it is possible, via translations and rotations, to dissect a three-dimensional sphere (technically a 3-ball) and rearrange it into two spheres of the same radius as the original.

For simplicity, we’ll start by only looking at the surface of the sphere, S^2. Let x refer to rotation by \pi ^ \circ (that is, pi degrees) about the x-axis and let y refer to rotation by \pi ^ \circ about the y-axis. The exact degree of the rotation isn’t important, only that it should be impossible via any combination of x and y to go back to the origin. Let H be the set of all rotations that you can get by combining x, y, and their inverses. Then x and y form what’s known as the ”free group on two generators“, which is composed of all strings using x, y, x^{-1}, and y^{-1} as symbols, with the provision that x and x^{-1} cannot appear together for x=a,b. The multiplication for this group is just writing the strings together, subject to the rule that x and x^{-1} cancel, as do y^{-1} and y. So, for example, (xy^{-1}xxy^{-1})(yx^{-1}y)=xy^{-1}xxy^{-1}yx^{-1}y=xy^{-1}xy. The name arises from the fact that there are two ‘fundamental’ symbols, x and y, and it is ‘free’, since they don’t commute (that is, xy \neq yx).

Now, one very weird fact about the free group on two generators (written F_2) is that you can break it into four pieces, rearrange those four pieces, and then reassemble them into two copies of it. To elaborate, let S(x) denote the set of strings in F_2 that start with x, and similarly for the other three symbols; let e denote the empty string. Then obviously

F_2=\{e\}\cup S(x) \cup S(x^{-1}) \cup S(y) \cup S(y^{-1})
. But we also have (if we include e in aS(a^{-1}))
F_2=aS(a^{-1})\cup S(a)
The reason for this is that if the string s doesn’t start with a, then it’s the same string as aa^{-1}s and so is in aS(a^{-1}); otherwise, it’s in S(a). We also have, for the same reason,
F_2=bS(b^{-1})\cup S(b)

How is that relevant? Well, H partitions S^2 into orbits, where an orbit is a collection of points such that each element of H moves points from the orbit into the orbit, and points not in the orbit into another point not in the orbit. For each orbit, we can pick a point p inside it; let the set of all these points be P. Then we can turn the decomposition of F_2 into a decomposition of H; using H(x) to indicate the set of all rotations in H that start with x, we have

S^2=xH(x^{-1})\cup H(x)=H(x)\cup H(y) \cup H(x^{-1}) \cup H(x^{-1})

These four sets together make up the sphere (except for P itself; we’ll get back to that), but if we pick two of them and rotate one of those two, we also get back the sphere. Therefore, we can break down the sphere into four pieces and recompose them into two spheres. If we then draw lines joining the sphere to the origin, we can turn this into a decomposition of two solid spheres.

Now, there are two omissions in this proof: first off, I ignored the origin. It turns out that you can cut the sphere without the origin up, rearrange it, and then put it back together so that you have the origin. Second, I ignored the axes of the rotations in H; they are the fixed points of these rotations, and so might cause problems; it turns out that if we call those points D; we can again cut up S^2-D and rearrange it into S^2.

So what does this mean? If we want to have any reasonable definition of volume, then we want translations and rotations to leave it the same. Then, we must have that some of these sets don’t have a well-defined volume; much like how you can construct a subset of the real line that doesn’t have one. We break the sphere down into a number of sets, some of which don’t have a defined volume; then we move them around and reassemble them. There’s no contradiction here, since two sets that don’t have a well-defined volume (or non-measurable sets, as they’re known) might have a union that has a volume. Of course, this is physically meaningless, since you can’t actually perform the required divisions; atoms are not infinitely small, after all.

All of my children are mass murderers

Formal logic and English conversational logic are quite different things; everybody knows that when the waitress asks whether you want ‘soup or salad’, ‘both’ is not a valid answer. But even when they have different rules, they can still come to the same conclusion. Consider the statement ‘not all of my children are mass murderers’. Is it true for me, given that I do not actually have any children? According to the usual rules of conversational English’s logical quantifiers (‘all’ and ‘there is’ and their various rephrasings), the answer is no: ‘not all of my children are mass murderers’ implicates that I have at least one child, and furthermore entails that that child is not a mass murderer. But what about if we interpret the statement in formal logic? The answer is still no, but for a completely different reason: it is the negation of the statement ‘all of my children are mass murderers’, and that statement is true.

The reason that I can state truthfully in formal logic ‘all of my children are mass murderers’ without actually having any children is for the same reason that the sum of an empty set is zero and the product of an empty set is one: if it were any other way, then several useful properties of those operations would not hold. For sum, we would lose the fact that the sum of the two sets A and B is the sum of the set A plus the sum of the set B, and similarly for products. These are true because 0 and 1 are identities of addition and multiplication, respectively. For the case of ‘for all’, we can think of it as first applying the predicate to the set, and then taking the logical and of the result; we then must have that \forall_{x \in S}p(x) is true when S is empty, or else we would lose the theorem that \forall_{x \in A}p(x) \wedge \forall_{x \in B}p(x) \Leftrightarrow \forall_{x \in A \cup B}p(x). While it could certainly be special-cased to take into account the cases where A or B is the empty set, it would detract from its simplicity, and would require proofs involving it to first show that neither of the two is empty (which can be a challenge, and might even require additional axioms if dealing with infinite sets, choice functions, etc.)

One of the interesting things that this implies is that for any predicate p, \forall_{x \in \emptyset}p(x), even when p is false for all x, much like how \prod_{x \in \emptyset}0=1. But returning to the linguistics theme, it seems rather odd; this is because the quantifier ‘all’ has an implication that the entities that are being talked about exist; in other words, the statement ‘all x are y’ implies that at least one x exists. If this is known to be false by the speaker, it sounds odd, much like the statement ‘John has two children’ sounds odd if the listener knows that John actually has exactly four children; the statement ‘John has two children’ implicates that he has exactly two.

Discrete Fourier Transforms

This post assumes a basic familiarity with complex numbers, specifically, e^{i\theta}.

What do this image:
The spectrogram spiral from Aphex Twin's Windowlicker
and this video:

have in common? They both involve discrete Fourier transform: taking an input signal (in these cases, music) and determining the frequencies of the sine and cosine waves that make it up, then doing something with those frequencies.
So the interesting question is: how do you do this? At its core, the discrete Fourier transform is this equation, which transforms the sequence x_0,x_1,\dots,x_{N-1} into X_0,X_1,\dots\,X_{N-1}:

X_k = \sum_{n=0}^{N-1} x_n e^{-\frac{2 \pi i}{N} k n} \quad \quad k = 0, \dots, N-1

The obvious question is: if this is only used for signal processing, why are there complex numbers? And the answer lies in the equation e^{i\theta}=\cos\theta +i\sin\theta. Even though the imaginary part isn’t physically meaningful, it turns out that adding it makes the mathematics a lot simpler to manipulate, because exponentials are easy to multiply, whereas individual cosines and sines are not.

How does this help you to figure out how to decompose this into sines and cosines? Well, consider the sequence x_n=e^{\frac{2\pi i n k_0}{N}}. This is the sequence that goes through k_0 complete cosine waves from 0 to N-1, with a matching real component. What’s X_k for this sequence? It’s just
X_k = \sum_{n=0}^{N-1} e^{-\frac{2\pi i}{N} (k-k_0) n}
Obviously, if k=k_0, then this is just \sum_{n=0}^{N-1} 1=N. Less obvious, though, is the fact that if k-k_0 is an integer, then X_k=0. So essentially by transforming the sequence in this fashion, you can ‘decompose’ it into cosine waves. (Sine waves will manifest themselves as imaginary components). So if you have a very large N, you can get a large number of different frequencies, ranging from 1 hertz all the way up to N hertz. There are various intricacies, such as the fact that signals are not, in general, complex numbers, but the core explanation here is enough.

Back to the original image and video. The image is just a spectrogram of the end of Aphex Twin’s ‘Windowlicker’; a spectrogram is just a graphic visualization of the discrete Fourier transform, so the curves in the spiral correspond to notes that move around in frequency, merge, and split. If you listen, you can make out the beginning and end with ease; the middle is harder, due to the sheer multiplicity of tones.
Click here to listen to the ‘spiral’ at the end of ‘Windowlicker’.
The analysis of which keys to press and for how long was likely done by analyzing the spectrum of each key on the piano, comparing it to the spectrum of a recitation of the Proclamation of the European Environmental Criminal Court, and then seeing how you can best combine the key presses to add up to the spectrum of the reading. Of course, it’s more complicated than that, but it just illustrates how powerful this sort of technique is.

As a final note, both the MP3 and JPEG formats are based off of variations on the DFT; however, I don’t understand enough about either of them to be able to explain further. But this does partially explain why JPEGs are so bad at representing sharp edges; due to the so-called ‘Gibbs phenomenon’, the DFT cannot represent a sharp, discontinuous edge well. This is why screenshots of web pages tend to suck as JPEGs; text is all about sharp boundaries, so you’ll always get JPEG artifacts. This is extra-noticeable since JPEGs also do not store high-frequency data; the magnitude of the Gibbs phenomenon is inversely proportional to the highest frequency that gets stored, so a low-frequency-only compression method will have much more artifacting.
Compression artifacts in a JPEG screenshot

Vitali Sets: Sets Without Length

We’re all vaguely familiar with the notion of length when it comes to intervals; the interval [1,2] (meaning the set of all numbers between 1 and 2 inclusive) has length 1, as does the interval (1,2) (the set of all numbers between 1 and 2, exclusive). Even though the latter excludes its endpoints, the points have length zero. So the question is, how do we measure more complicated sets? For example, what is the ‘length’ of the set of all rational numbers between 0 and 1? The intuitive way is this: suppose we have a set S that we want to find the length of. Now, let T be a (countable) of intervals such that S is completely contained in the union of all the elements of T; mathematically speaking, we want
Misplaced alignment tab character &.
leading text: $S\subseteq\bigcup_k T_k&
Since S is contained in T, we obviously should have that the length of S is the length of T, which is itself less than or equal to the sum of the lengths of its constituent intervals:
Misplaced alignment tab character &.
leading text: $\lambda(S) \leq \sum_k \lambda(T_k)&
(where \lambda(S) signifies the length of S.
The fact that T is countable is significant; if we allowed T to be uncountable, then we could define T as T_x=\{x\}, T=\bigcup_x T_x for every x\in S and then every set would have length zero! Not exactly useful.
Obviously any S will have many such Ts; so we can simply define \lambda(S) to be the sum of the lengths of the smallest possible T that covers S; or, if there is no smallest T, the largest l such that for all l’ > l, there is a T that covers S such that \lambda(T) = l’. Again, written out, this becomes
Misplaced alignment tab character &.
leading text: $\lambda(S) = \inf \{ \lambda(T) \}&

This is what is known technically as the Lebesgue measure of a set, after Henri Lebesgue, the 19th century mathematician. This definition gives each set a Lebesgue measure and seems to behave normally under transformations such as translation; it’s not easy to construct a set that acts oddly under Lebesgue measure. But it is doable, and it was first found by a mathematician named Giuseppe Vitali. Let V be the set of irrational numbers in [0,1], restricted so that if x, y \in V, then x-y \notin \mathbb{Q}. Oh, and we also have to include 0 in V. In other words, V is a subset of [0,1] such that no two members differ by a rational number. Then V does not have a well-defined Lebesgue measure, and is called a Vitali set. To show this, first we order the rational numbers in the interval [-1,1], q_1, q_2, q_3, \ldots. We can do this since the rational numbers are all countable. Then, we define
Misplaced alignment tab character &.
leading text: $V_k=V+q_k=\{v+q_k|v\in V\}&
Each V_k is disjoint from each other one, since if v was in V_i and V_j, then we would have v_1+q_i=v_2+q_j, for some v_1, v_2 \in V or v_1-v-2=q_j-q_i, an impossibility by the definition of V. Let V’ = \bigcup_k V_k. Obviously, V’ \subseteq [-1,2], but we also have [0,1] \subseteq V’ To show this, pick some x \in [0,1]. If x \in V, then we’re done, else it was excluded since there is some v\in V such that x-v\in\mathbb{Q}. So we have
Misplaced alignment tab character &.
leading text: $[0,1]\subseteq V' \subseteq [-1,2]&

.
Now, consider what this implies for \lambda(V). We know that
\lambda(V’) = \sum _k \lambda(V_k). But since each V_k is just \lambda V moved around on the real line, we have \lambda(V’) = \sum_k \lambda(V), which will either be infinity or zero depending on whether V has zero measure or not. But by the above relation, we have
&
which directly contradicts what we derived about \lambda(V’). So therefore, \lambda(V) is not well-defined.

This naturally raises the question of how one can determine what sets can be assigned a measure. And here I have to turn to Wikipedia; according to them, what I’ve been calling the measure is actually the outer measure, and the actual measure, which I’ll call \mu, is defined by
\mu(S)=\lambda(S) whenever Misplaced alignment tab character &.
leading text: ...a(A)=\lambda(A \bigcap S) + \lambda(A - S)&

for all sets A
If this condition is not satisfied, then S doesn’t have a measure. To see that V fails this test, consider A=[0,1]. Then obviously \lambda(A)=1. Your homework for today is to prove that \lambda(A \bigcap S) + \lambda(A-S) \not= 1. It’s not enough to show that those two sets are uncountable; there exist uncountable sets of measure 0, such as the Cantor set.

One point that I glossed over in the construction of V is that you have to somehow pick which irrational numbers you’re leaving out and which you’re leaving in. This requires what is known as the axiom of choice; the axiom of choice is independent of standard set theory, and mathematicians are divided on whether they work in set theory with or without the axiom of choice (though the majority do work with it). But that’s a topic for another time.

Constructing the Reals, part 4: Reals

At last, after all that work, we can finally make the real numbers out of rationals! This time, the method being used actually has a name; they’re called Dedekind cuts, after the mathematician Richard Dedekind, among the first mathematicians to actually give a rigorous definition of real numbers. One of the other methods is to simply define reals by their decimal expansion; this has the advantage of being simpler in some cases, but subtraction and division are much harder to define explicitly due to borrowing and the difficulty of explicitly dividing infinitely long numbers.

A real number is simply a nonempty set, A, of rational numbers with these properties:

  1. There is a rational number x not in A.
  2. If x is in A, and y < x, then y is in A.
  3. If x is in A, then there is some z also in A such that z > x.

We can then identify each rational q with the set of all rational x < q; this set is nonempty, it has a number not in it, if x < q and y < x, then y < q, and if x < q, then there is a z with x < z < q.

We first define comparison: A < B if and only if A is a subset of B. Addition is straightforward: A + B is just the set of all a + b, where a and b are in A and B. Subtraction is also fairly simple; define AB to be the set of all a – b, where a < 0 and a is in A and b is in Q\B, the set of all rational numbers not in B. Negation is then just 0 – B. Multiplication is a little harder; if A and B are both non-negative, then AB is just the set of all x that are less than some ab for a, b > 0 and a, b in A, B respectively. On the other hand, if at least one of A and B is negative, we define AB = (-A)B = A(-B) = (-A)(-B) and use these relations to compute a positive product. Division is similar; for B > 0, define B-1 to be the set of all 1/b where b is nonzero and in Q\B, and define A/B to be AB-1. If B < 0, then B-1 = -(-B)-1 and A/B = A(-B)-1.

And there you have it; we now have enough structure to get the rest of the operations we have on the reals, such as exponentiation, logarithms, and functions. As an aside, Dedekind cuts technically consist of two sets, A and Z, with Z = Q\A, but Z is very rarely used, as it’s uniquely determined by A. On the other hand, the surreal numbers are defined via an analogous construction: each surreal corresponds to at least one ordered pair of sets of surreal numbers (L, R). However, the only requirement for surreals is that if x is in L and y is in R, then x < y. At first, this definition of surreals as sets of surreals appears to be circular; if we need surreals to make surreals, how can we make one? But I’ll leave that for you to ponder.

Constructing the Reals, part 3: Rationals

Sorry for the delay; physics camp has been killing me lately.

In the previous Constructing the Reals post, I derived the non-negative rationals from the natural numbers. Now we’ll finally get symmetry by adding negative number to get the signed rational numbers. The naive way to do so is to simply say that a signed rational number is a rational number with a sign. This does make multiplication easy, but it makes addition fairly difficult to define. So we take an alternative definition that makes addition, subtraction, and multiplication all easy; the only difficult part is division.

Define a signed rational number x as an ordered pair (|a|,|b|) of unsigned rational numbers, identifying the unsigned rational x with the signed rational (|x|, |0|). We define (|a|, |b|) = (|c|, |d|) if and only if |a| + |d| = |b| + |c|. Addition is simply pairwise: (|a|, |b|) + (|c|, |d|) = (|a| + |c|, |b| + |d|). Multiplication isn’t, however, but by analogy with algebra, we can easily guess that (|a|, |b|)(|c|, |d|) = (|a||c| + |b||d|, |a||d| + |b||c|). As it turns out, we can finally define subtraction as well: (|a|, |b|) – (|c|, |d|) = (|a|, |b|) + (|d|, |c|).

Division, however, is harder; we have to define it by saying that when |w/x| > |y/z|, the inverse of (|w/x|, |y/z|), written (|w/x|, |y/z|)-1, is equal to (|zx/(wz-xy)|, |0|), where we define ab for non-negative integers a, b to be the unique integer c such that a = b + c (although I should probably prove that such a c exists, that’s not the point). On the other hand, if |w/x|< |y/z|, the inverse is (|0|, |zx/(xy-wz)|). If |w/x|= |y/z|, of course, then (|w/x|, |y/z|) has no inverse (since it would be like the inverse of 0). But using inverses, we can define x/y = x y-1. It can be shown that multiplication and addition under these definitions follow the rules you expect them to, as do addition and subtraction. In other words, they form what is called a field, which can be roughly defined as ‘anything with addition, subtraction, multiplication, and division that behave like they do for real numbers’. Other common fields are the complex numbers and the real numbers themselves, although there are obviously others.

Tomorrow, I’ll finally get to everyone’s favorite: the reals! And this time I mean it; I’ve got the post queued up in WordPress and everything. After that, I’ll show you a different method that both gets you to the reals faster (we get them all at once!) and extends them to both infinitely large and infinitely small numbers, and even numbers that are neither positive, negative nor zero. But that’ll have to wait, and with graduation and physics camp and everything, I don’t know when I’ll start on that series.

Constructing the Reals, part 2: Non-negative rationals

In my first post in the Constructing the Reals series, I explained how to create the natural numbers, such as 0, 7, and 23, using only sets, as well as define addition and multiplication. Now, using those tools, we can construct the (non-negative) rationals, like 11/8, 78/7, and 11/1. To avoid repetition, I’ll leave out the word non-negative in the rest of this post.

But first, we have to define an ordered pair, which is exactly what it sounds like. Formally, the ordered pair (a, b) is the set {{a}, {a, b}}. Using this definition, we distinguish the first part of the pair as the object in both sets and the second element as the object in only one set. We can now define the (non-negative) rational number |x/y|, with x and y natural numbers and y ? 0, as the set of ordered pairs of natural numbers (a, b) with� b ? 0 such that ay = bx. Obviously, this definition includes the ordered pair (x, y), since xy = yx. The reason for the bars is that I intend to use the general notation x/y for all rational numbers; they should not be confused with the absolute value operator.

Much like with the non-negative integers, now that we have the rationals, we need to define what you can do with them. Equality again is defined as being the same set; it’s not too hard to show then that |a/b| = |x/y| if and only if ay = bx. For addition and multiplication we basically pull the definition from how we ordinarily work with rationals: |a/b| < |x/y| if and only if ay < bx, |a/b| + |x/y| = |(ay + bx)/by|, and |a/b||x/y| = |(ax)/(by)|.� But we can now define a new operation that didn’t exist in the natural numbers: division. We just define it as usual: |a/b| / |x/y| = |a/b||y/x| = |ay/bx|. Again, these have the properties we usually associate with them.

Now that we have the rationals, the question arises: how do we place each one of the natural numbers as a rational? It turns out that the answer is simple: each natural number x is identified with the rational number |x/1|. If we do this, then 0 maps to |0/1|, which has the property that |a/b| + |0/1| = |(a1 + b0)/(b1)| = |a/b|: |0/1| is the additive identity. Similarly, 1 maps to |1/1|, which satisfies |a/b||1/1| = |(a1)/(b1)| = |a/b|; it’s the multiplicative identity. If you’re familiar with group theory, then the non-negative rationals as we’ve defined them form a group under multiplication with |0/1| excluded: closure is obvious, associativity follows from the associativity of natural number multiplication, since (|a/b||c/d|)|e/f| = |(ab)c/(de)f| = |a(bc)/x(yz)| = |a/b|(|c/d||e/f|), we just showed that |1/1| is an identity, and the inverse of |a/b| is |b/a|: |a/b||b/a| = |ab/ba| = |1/1|.

Strictly speaking, the standard method to construct the rationals from the natural numbers is to define an equivalence relation, ?, between ordered pairs (x, y) of natural numbers (again with y nonzero), and define rational numbers as equivalence classes of these ordered pairs (the equivalence class of an ordered pair (x, y) is the set of all pairs (a, b) with (a, b) ? (x, y)).� But that definition is essentially identical to the one I used. I will use this definition for negative numbers, to give you an idea of what it’s like, but that will have to wait until the next post! Your homework is to prove that multiplication of rational numbers distributes over addition.

Constructing the Reals, part 1: The natural numbers

We all work with real numbers every day, although different people use different subsets. Everybody uses the integers, numbers with no fractional part such as 2, -158, or 0, to count things out or to figure out how many minutes are left until a certain time. We also use rational numbers (which are fractions of integers such as -5/7 or 11/3 as well as the integers, since 11 = 11 / 1) when calculating prices: a price of $5.99 is the same as 599/100 dollars, and finding a 15% tip is the same as multiplying by 15/100. Mathematicians use real numbers, which are the rationals as well as the square root of 2 or ?, the ratio of the circumference of a circle to its diameter. But most people (including many mathematicians!) don’t really know what the reals are, only that they have certain properties. That’s what this series, ‘Constructing the Reals’, is about: constructing a solid definition of the real numbers starting with only the concept of sets, and in the process proving wrong the famous mathematician Leopold Kronecker, who once remarked that “God made the integers; all else is the work of man.” As we will see, even the integers are ‘the work of man’; God only made the empty set!
Of course, you can’t go straight from sets to real numbers; you have to make stops along the way. There are different orders in which various subsets of the reals can be constructed, but one common way is to start with the non-negative integers (usually, but not always!, called the natural numbers), then construct the non-negative rational numbers, then all rational numbers, and finally all real numbers. This is the path that I’ll take, with one construction per post.

A word of caution is required here: as we construct larger subsets of reals, the ‘same’ number will appear as different sets. For example, 0 is both the set {}, which contains nothing, and the set that contains all pairs of natural numbers where the first element is 0 and the second is not, such as (0, 2) or (0, 9). To avoid ambiguity, I’ll use subscripts on numbers to distinguish which is being considered: natural numbers will use ?, non-negative rationals will use ?+, general rationals will use ?, and the general real number will not use anything. The same convention applies for operations such as addition and comparisons like less than; <? refers to comparing natural numbers, as opposed to <?+ which compares non-negative rationals. It turns out equality is always the same, with two things being equal if and only if they have the same elements, so we don’t need a subscript for those. I’ll also leave out subscripts when there’s no danger of ambiguity. With that out of the way, let’s begin!

The concept of a set is hard to define exactly, but for our purposes we can define it as ‘a collection of other objects without repetition’. The simplest set is the set that contains nothing at all, the empty set, written as {} or ?. So we identify it with the simplest number, 0. The next simplest set is the set that contains the empty set, { {} }, which we call 1. But 2 is not the set that contains 1?, but rather the set containing both 0 and 1: 2 = {0, 1} = { {}, { {} } }. We continue to add more and more numbers like this, each one being the same as the set of all the numbers before it.

We now have the natural numbers, but we don’t yet know how to do anything with them. Equality is obvious: two numbers are equal if and only if they have the same elements in them. The next obvious operation is ‘less than’: we say x < y if and only if x is in y (or, in set-theory notation,