| • Science | • People | • Locations | • Timeline |
The field was developed by Andrey Kolmogorov, Ray Solomonoff and Gregory Chaitin starting in the late 1960s. There are several variants of Kolmogorov complexity or algorithmic information. The most widely used one is based on self-delimiting programs and is due to Leonid Levin (1974).
To formalize the above definition of complexity, one has to specify exactly what types of programs are allowed. Fortunately, it doesn't really matter: one could take a particular notation for Turing machines, or LISP programs, or Pascal programs, or Java virtual machine bytecode. If we agree to measure the lengths of all objects consistently in bits, then the resulting notions of complexity will differ only by a constant term: if K1(s) and K2(s) are the complexities of the string s according to two different programming languages L1 and L2, then there is a constant c (which only depend on the languages chosen, but not on s) such that
Here, c is the length in bits of an interpreter for L2 written in L1. (One technical requirement is that it must be possible to embed arbitrary binary data into programs without delimeters, e.g. by providing such data on "standard input" and considering all bits read from this stream as part of the program.)
In the following, we will fix one definition and simply write K(s) for the complexity of the string s.
The first surprising result is that K(s) cannot be computed: there is no general algorithm which takes a string s as input and produces the number K(s) as output. The proof is a formalization of the amusing Berry paradox: "Let n be the smallest number that cannot be defined in fewer than twenty English words. Well, I just defined it in fewer than twenty English words."
It is however straightforward to compute upper bounds for K(s): simply compressIn computer science, data compression is the process of encoding information using fewer bits, or information units, thanks to specific encoding schemes. For example, this article could be encoded with fewer bits if we accept the convention that the word the string s with some method, implement the corresponding decompressor in the chosen language, concatenate the decompressor to the compressed string, and measure the resulting string's length.
The next important result is about the randomnessIn ordinary language, the word random is used to express apparent lack of purpose or cause. This suggests that no matter what the cause of something, its nature is not only unknown but the consequences of its operation are also unknown. In most technical of strings. Most strings are complex in the sense that they cannot be significantly compressed: K(s) is not much smaller than |s|, the length of s in bits. The precise statement is as follows: the probabilityThe word probability derives from the Latin probare (to prove, or to test). Informally, probable is one of several words applied to uncertain events or knowledge, being more or less interchangeable with likely risky hazardous uncertain and doubtful depend that a random string of length n has complexity less than n − k is smaller than 2−k. The proof is a counting argument: you count the programs and the strings, and compare. This theorem is the justification for Mike Goldman's challenge in the comp.compression FAQ:
Now for Chaitin's incompleteness result: though we know that most strings are complex in the above sense, the fact that a specific string is complex can never be proven (if the string's length is above a certain threshold). The precise formalization is as follows. Suppose we fix a particular consistent axiomatic systemIn mathematics, an axiomatic system is any set of axioms from which some or all axioms can be used in conjunction to logically derive theorems. A mathematical theory consists of an axiomatic system and all its derived theorems. An axiomatic system that is for the natural numbersNatural number can mean either a positive integer ( 1, 2, 3, 4,. or a non-negative integer ( 0, 1, 2, 3, 4,. Natural numbers have two main purposes: they can be used for counting ("there are 3 apples on the table"), or they can be used for ordering ("this, say Peano's axioms. Then there exists a constant L (which only depends on the particular axiomatic system and the choice of definition of complexity) such that there is no string s for which the statement
can be proven within the axiomatic system (even though, as we know, the vast majority of those statements must be true). Again, the proof of this result is a formalization of Berry's paradox.
Similar ideas are used to prove the properties of Chaitin's constantIn the computer science subfield of algorithmic information theory the Chaitin constant or halting probability is a construction by Gregory Chaitin which describes the probability that a randomly generated program for a given model of computation or progr.
The minimum message length principle of statistical and inductive inference and machine learning was independently developed by C.S. Wallace and D.M. Boulton in 1968. MML is Bayesian (it incorporates prior beliefs) and information-theoretic. It has the desirable properties of statistical invariance (the inference transforms with a re-parameterisation, such as from polar coordinates to Cartesian coordinates), statistical consistency (even for very hard problems, MML will converge to any underlying model) and efficiency (the MML model will converge to any true underlying model about as quickly as is possible). C.S. Wallace and D.L. Dowe showed a formal connection between MML and algorithmic information theory (or Kolmogorov complexity) in 1999.