The Cantor set is defined by repeatedly removing the middle thirds of line segments. One starts by removing the open middle third (1/3, 2/3) from the unit interval [0, 1], leaving [0, 1/3] ∪ [2/3, 1]. Next, the "middle thirds" of all of the remaining intervals are removed. This process is continued ad infinitum. The Cantor set consists of all points in the interval [0, 1] that are not removed at any step in this infinite process.
Pictorially, the process would look something like this:
[Figure: Cantor set, in seven iterations]
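The construction can also be sketched in a few lines of Python. This is only an illustration, not part of the original definition: the function name `cantor_intervals` is made up for this example, and exact fractions are used so that the endpoints are not blurred by floating-point rounding.

```python
from fractions import Fraction


def cantor_intervals(steps):
    """Return the closed intervals that remain after `steps` removal rounds.

    Hypothetical helper for illustration; each interval is a (left, right)
    pair of exact fractions.
    """
    intervals = [(Fraction(0), Fraction(1))]
    for _ in range(steps):
        next_intervals = []
        for left, right in intervals:
            third = (right - left) / 3
            # Keep the two outer closed thirds; the open middle third is dropped.
            next_intervals.append((left, left + third))
            next_intervals.append((right - third, right))
        intervals = next_intervals
    return intervals


for left, right in cantor_intervals(2):
    print(f"[{left}, {right}]")
```

Each round doubles the number of intervals, so seven iterations leave 2^7 = 128 intervals.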
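The question becomes: what is left when you are done? Step n + 1 removes 2^n intervals, each of length 1/3^(n+1), so adding up the lengths of all the segments removed gives:

$$\sum_{n=0}^{\infty} \frac{2^n}{3^{n+1}} = \frac{1}{3} + \frac{2}{9} + \frac{4}{27} + \cdots = \frac{1}{3} \cdot \frac{1}{1 - \frac{2}{3}} = 1$$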
(For details, see geometric series).
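As a quick numerical check of this sum, the following sketch adds up the removed lengths step by step. The name `removed_length` is hypothetical and chosen only for this example.

```python
from fractions import Fraction


def removed_length(steps):
    """Total length removed during the first `steps` rounds (illustrative helper)."""
    total = Fraction(0)
    for n in range(steps):
        # Round n + 1 removes 2**n open intervals, each of length 1 / 3**(n + 1).
        total += Fraction(2**n, 3 ** (n + 1))
    return total


for steps in (1, 2, 3, 10):
    print(steps, removed_length(steps), float(removed_length(steps)))
```

The partial sums equal 1 - (2/3)^steps, so they approach 1 without ever reaching it at any finite step.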
Using this calculation, you may be surprised if there were anything left; after all, the sum of the lengths of the removed intervals is equal to the length of the original interval. However, a closer look at the process reveals that we must have something left, since removing the "middle third" of an interval means removing an open interval (an interval that does not include its endpoints). So removing the line segment (1/3, 2/3) from the original interval [0, 1] leaves behind the points 1/3 and 2/3. A little reflection will convince you quickly that they will never be removed: every later step removes only points strictly inside one of the remaining intervals. In fact, none of the endpoints of any of the intervals at any stage in the process will ever be removed. So we know for certain that the Cantor set is not empty.
It may appear that only the endpoints are left, but that would be an error. The number 1/4, for example, is in the bottom third, so it is not removed at the first step. It is then in the top third of the bottom third, in the bottom third of that, in the top third of that, in the bottom third of that, and so on ad infinitum, alternating between top third and bottom third. Since it is never in one of the middle thirds, it is never removed, and yet it is also not one of the endpoints of any middle third.
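The alternating pattern for 1/4 can be traced mechanically. The sketch below uses a hypothetical helper, `third_path`, that rescales the current interval to [0, 1] at each step and records which third the point falls in. It stops as soon as the point lands in an open middle third.

```python
from fractions import Fraction


def third_path(x, steps):
    """Record which third of the current interval x lies in, step by step.

    Illustrative helper: x is assumed to lie in [0, 1]. Endpoints of a
    middle third count as belonging to the adjacent closed outer third.
    """
    path = []
    for _ in range(steps):
        t = 3 * x
        if t <= 1:
            path.append("bottom")
            x = t          # rescale the bottom third to [0, 1]
        elif t >= 2:
            path.append("top")
            x = t - 2      # rescale the top third to [0, 1]
        else:
            path.append("middle")  # x is removed at this step
            break
    return path


print(third_path(Fraction(1, 4), 6))
print(third_path(Fraction(1, 2), 6))
```

For 1/4 the rescaled point cycles between 1/4 and 3/4, which is why the pattern repeats forever. For 1/2 the path ends at once, because 1/2 lies in the first middle third.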
It can be shown that there are as many points left behind in this process as there were that were removed. To see this, consider the points in the [0, 1] interval in terms of base 3 (or ternary) notation. In this notation, 1/3 can be written as 0.1 and 2/3 can be written as 0.2. If we remove everything between 1/3 and 2/3, in ternary notation this is equivalent to removing everything between 0.1 and 0.2. This means that any ternary expansion of the form 0.1xxxxxx... gets removed from the set, except those in which either every "x" is 0 or every "x" is 2; those two are the endpoints 1/3 and 2/3.
Because 0.1 = 0.02222222... in base three, the endpoint 1/3 can be represented without using a 1 in any position. See Cantor's diagonal argument for a discussion of this technical issue.
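The identity 0.1 = 0.0222... in base three can be checked with the same geometric series formula used for the removed lengths. The following is a short worked derivation, with the subscript 3 marking ternary notation:

```latex
\begin{aligned}
0.0222\ldots_3 &= \sum_{k=2}^{\infty} \frac{2}{3^k}
               = \frac{2}{9} \cdot \frac{1}{1 - \frac{1}{3}}
               = \frac{2}{9} \cdot \frac{3}{2}
               = \frac{1}{3}
               = 0.1_3
\end{aligned}
```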
The next step examines the intervals [0, 0.1] and [0.2, 1] (written in ternary) and removes their middle thirds. In this case we are removing everything between 0.01 and 0.02 in the first interval and between 0.21 and 0.22 in the second interval, or in other words, every remaining number whose ternary expansion must have a 1 in the second position after the point. Continuing in this way, the numbers which remain are those that can be represented in ternary (base 3) notation with no '1' in any position.
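One way to see the digit rule at work is to list every number with a fixed count of ternary digits, each 0 or 2. In the sketch below, the hypothetical helper `left_endpoints` does this, and the values it produces are the left endpoints of the intervals that survive that many removal steps.

```python
from fractions import Fraction
from itertools import product


def left_endpoints(steps):
    """All numbers 0.d1 d2 ... d_steps in base 3 with every digit 0 or 2.

    Illustrative helper; returns exact fractions in increasing order.
    """
    points = []
    for digits in product((0, 2), repeat=steps):
        # Digit d in position i + 1 after the point contributes d / 3**(i + 1).
        value = sum(Fraction(d, 3 ** (i + 1)) for i, d in enumerate(digits))
        points.append(value)
    return points


print([str(p) for p in left_endpoints(2)])
```

With two digits this gives 0, 2/9, 2/3 and 8/9, matching the four intervals [0, 1/9], [2/9, 1/3], [2/3, 7/9] and [8/9, 1] left after the second step.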
Stated another way, the Cantor set consists of all the numbers between 0 and 1 that can be represented using only 0s and 2s in ternary notation. Therefore, the numbers in the Cantor set can be mapped onto the numbers in [0, 1] by replacing every 2 in the ternary expansion with a 1, and treating the result as a binary expansion. So there are as many points in the Cantor set as there are in [0, 1], and the Cantor set is uncountable (see Cantor's diagonal argument). Since the set of endpoints of the removed intervals is countable, there must be uncountably many numbers in the Cantor set which are not interval endpoints. As pointed out above, one example of such a number is 1/4, which can be written as 0.02020202020... in ternary notation.
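The digit replacement described above can be tried on finite prefixes of an expansion. The sketch below is only an illustration and handles a finite list of digits, not a full infinite expansion. The function name `cantor_to_binary` is invented for this example.

```python
from fractions import Fraction


def cantor_to_binary(ternary_digits):
    """Replace each ternary digit 2 with 1 and read the result in base 2.

    Illustrative helper: `ternary_digits` is a finite sequence of digits
    after the point, each of which must be 0 or 2.
    """
    value = Fraction(0)
    for position, digit in enumerate(ternary_digits, start=1):
        if digit not in (0, 2):
            raise ValueError("Cantor-set expansions use only the digits 0 and 2")
        # A ternary 2 becomes a binary 1 in the same position.
        value += Fraction(digit // 2, 2**position)
    return value


# First eight ternary digits of 1/4 = 0.02020202..., read as binary 0.01010101
print(cantor_to_binary([0, 2, 0, 2, 0, 2, 0, 2]))
```

Longer prefixes of 0.0202... map to binary values approaching 0.010101... = 1/3. The map is onto but not one-to-one: 1/3 = 0.0222... and 2/3 = 0.2 are both sent to 1/2, since 0.0111... and 0.1 are the same number in binary.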