Hutter Prize - Incompressibility of text

compression intelligence linguistics

The Hutter Prize reflects the fact that we cannot compress natural language as much as we would expect:

“…in 1950, Claude Shannon estimated the entropy (compression limit) of written English to be about 1 bit per character [3]. To date, no compression program has achieved this level.” (http://cs.fit.edu/~mmahoney/compression/rationale.html)
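
As a quick illustration of the gap, here is a minimal sketch (mine, not from Mahoney's page) that measures the bits per character a stock compressor achieves on English text. The filename is a placeholder; any long plain-English file will do:

```python
# Sketch: bits per character achieved by a general-purpose compressor,
# for comparison with Shannon's ~1 bit/char estimate for written English.
import bz2

# Placeholder path: substitute any long plain-English text file.
with open("english_sample.txt", "rb") as f:
    data = f.read()

compressed = bz2.compress(data, compresslevel=9)
print(f"{8 * len(compressed) / len(data):.2f} bits per character")
# Stock compressors typically land around 2-2.5 bits/char on English prose,
# roughly double Shannon's estimated limit.
```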

The prize is inspired by the work of Marcus Hutter, who argues that compression can be used as a functional definition of intelligence.

The idea behind the prize is an old one: that we cannot predict, and thus compress, text as well as we would expect because we need human knowledge to “disambiguate” natural language. I believe almost the opposite. But the prize, and the work of Marcus Hutter which motivated it, is interesting for what it says about the predictability of natural language, and in particular the “randomness” of meaning. By the “randomness of meaning” I mean that Hutter’s work (like Schmidhuber’s “New AI”) assumes a probabilistic model of intelligence is necessary.
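
The link between prediction and compression behind all this is the textbook Shannon identity, not anything specific to Hutter’s formulation: an arithmetic coder driven by a predictive model P can encode a string x in about -log2 P(x) bits, so a better predictor is, by construction, a better compressor:

```latex
% Standard identity linking prediction and compression (Shannon):
% a coder driven by model P spends about -log2 P(x) bits on string x,
% and no code beats the entropy on average.
\[
  L(x) \approx -\log_2 P(x),
  \qquad
  \mathrm{E}\,[L(X)] \;\ge\; H(X) \;=\; -\sum_x P(x)\,\log_2 P(x)
\]
```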

Note that this is also a definition of intelligence that depends on goals (cf. W. J. Freeman). As Hutter puts it: “No Intelligence without Goals.”

Hutter has written a book on this:

Universal Artificial Intelligence - Sequential Decisions based on Algorithmic Probability

http://www.hutter1.net/ai/uaibook.htm

There’s a 50,000 euro prize fund for the best efforts. (http://prize.hutter1.net/)