ssdfsdf | 11 years ago
Therefore I am not sure that compressing the string will give you a sense of the information contained within it, at least information in the sense in which we are interested.
gwern | 11 years ago
See for example http://www.illc.uva.nl/Research/Publications/Dissertations/D... 'Statistical Inference Through Data Compression'.
ssdfsdf | 11 years ago
However, I am still not convinced that this upper bound will provide the information we need in this case. What we are looking for is a relative measure of complexity between genomes, and the upper bound will not necessarily give us that, because a given algorithm may not compress each genome by the same factor. How compressible a particular genome is under a specific algorithm depends on the method the organism uses to encode information. For instance, an organism may repeat codes for redundancy but permute the letters of the copy in a predictable way, for its own reasons; the compression algorithm will not pick up on this.
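A toy illustration of that last point (my own sketch, not from the genomics literature), using Python's zlib: an exact repeat is found by the compressor's match search, but even a trivially predictable permutation of the copy (here, reversal) defeats it, although the algorithmic complexity of the two strings is nearly identical.

```python
import random
import zlib

random.seed(0)
# A pseudo-random 4-letter "genome" of 10,000 bytes (well within zlib's 32 KB window).
base = bytes(random.randrange(4) for _ in range(10_000))

exact_copy = base + base                   # redundancy as a literal repeat
permuted = base + bytes(reversed(base))    # redundancy as a predictable permutation

# The repeat is captured as one long back-reference; the reversed copy is not,
# so it costs roughly a full extra copy's worth of entropy-coded bytes.
print(len(zlib.compress(exact_copy, 9)))
print(len(zlib.compress(permuted, 9)))
```

The second size comes out much larger than the first, even though a short program generates either string from `base`, which is the gap between what a practical compressor measures and algorithmic complexity.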
It is useful to use compressibility under a range of algorithms as input to a machine learning algorithm, or as part of the model in AIXI, but it is not useful for estimating algorithmic complexity (as far as I am concerned; I am open-minded, but not convinced yet).
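For what it's worth, the "range of algorithms as features" idea is easy to sketch with the standard library (this is just an illustration of the shape of the feature vector, not a worked method): compress the same sequence with several algorithms and feed the normalised sizes to a downstream model.

```python
import bz2
import lzma
import zlib


def compression_features(genome: bytes) -> list[float]:
    """Compressed size under each algorithm, divided by input length.

    These ratios are intended as machine-learning features, not as
    estimates of algorithmic complexity.
    """
    return [len(comp(genome)) / len(genome)
            for comp in (zlib.compress, bz2.compress, lzma.compress)]


# A highly repetitive sequence scores low (very compressible) on all three.
print(compression_features(b"ACGT" * 1000))
```

Different algorithms exploit different regularities, which is exactly why the vector carries more signal than any single ratio, and also why no single ratio pins down the underlying complexity.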