The Infinite Monkey Theorem
The infinite monkey theorem is a bar in Austin, TX, but it is also a very cool mathematical theorem that deals with very big numbers. This article is about the theorem which has become very popular in recent times. The theorem as I understand it is as follows:
A monkey hitting keys at random on a typewriter style keyboard for an infinite amount of time will almost surely type the complete works of William Shakespeare.
There are many variants of this theorem which can include pretty much any target text from a single sentence to an entire library. The reasoning behind this supposition is that given infinite time, random input should be able to produce all possible output.
The infinite monkey theorem is important and has been used in software development and testing, commodity computing, project management and SETI to support a greater allocation of resources. The theorem is also used to illustrate basic concepts in probability [1].
The theorem also finds itself in popular culture including the TV cartoon the Simpsons. In one [episode](Last Exit to Springfield) there is a scene where Mr. Burns brings Homer to his mansion. One of his rooms has a thousand monkeys typing on typewriters. One of the monkeys writes a slightly incorrect line from Charles Dickens: "It was the best of times, it was the blurst of times."[2]
In 2011 programmer Jesse Anderson decided to try to recreate this experiment using virtual monkeys that output gibberish. The algorithm would mimic monkey's banging on a keyboard. He wrote a computer program to see if the monkey's gibberish could match every work of Shakespeare. If there is a match from the gibberish to any portion of Shakespeare then the program will mark it matched and continue on. This isn't replicating the theorem exactly, but it is an interesting variation on it.
Jesse's program ran in the cloud on Amazon EC2 instances so he could create computational scale. He was able to start the project and match all 38 works of Shakespeare in 1.5 months. Over the course of the project, 7.5 trillion character groups were generated and checked out of 5.5 trillion possible combinations. [2:1] There are some other interesting statistics to come out of this project:
- monkeys ran 180,000,000,000 character groups per day.
- average iteration lasted 30 minutes and ran 5,000,000,000 character groups.
- monkeys found 1,982,507 distinct character groups and those character groups were found 3,788,175 times for a ratio of 1.8718555.
Proof
The proof of the infinite monkey theorem is pretty straight forward. If two events are statistically independent, then the probability of both of them happening equals the product of each one happening independently.
So if we wanted to show the probability of the monkey's typing the word "banana" it would be the probability of typing the letter b which is \(\frac{1}{50}\) multiplied by the probability of typing an a which is \(\frac{1}{50}\) and...
$$ \frac{1}{50} \times
\frac{1}{50} \times
\frac{1}{50} \times
\frac{1}{50} \times
\frac{1}{50} \times
\frac{1}{50} = (\frac{1}{50})^6 = \frac{1}{15,625,000 ,000}
$$
One in 15 billion sounds like a big number but it isn't zero so given an infinite amount of time you can be sure that you can eventually hit it. The formula for hitting any specific character sequence on a keyboard is:
$$
X_n = \big(\frac{1}{50} \big)^n
$$
As \(n\) grows, \(X_n\) grows smaller. For an \(n\) of a million, \(X_n\) is roughly 0.9999, but for an \(n\) of 10 billion, \(X_n\) is roughly 0.53. As \(n\) approaches infinity, \(X_n\) approaches zero. That is, by making \(n\) large enough, \(X_n\) can be made as small as you desire and the chances of hitting your phrase approaches 100%.
Probabilities
Ignoring punctuation, spacing and capitalization, a monkey typing letters uniformly at random has a chance of one in 26 of typing the first letter in Hamlet. It has a chance of one in \(676\) \( (26 \times 26) \) of typing the first two letters. Because the probability shrinks exponentially, at 20 letters it already has a chance of one in \(26^{20} = 19,928,148,895,209,409,152,340,197,376 \). In the case of the entire text of Hamlet, the probabilities are vanishingly small.[3] Thus there is a probability of one in \( 3.4 \times 10^{183,946}\) to get the text right on the first try. The math for that probability is based on 130,000 characters in Hamlet. There are \(n=26\) characters on a keyboard. Using our formula from earlier we have \(26^{130,000}\). If we take the logarithm of each side we end up with:
$$
log_{10}(n) = 130000 \times log_{10}(26) = 183946.5352
$$
Therefore \(n = 10^{0.5352} \times 10^{183,946} = 3.429 \times 10^{183,946}\).
To get an idea of the magnitude of this probability, lets assume that we have a monkey on every atom in the known Universe. Those monkeys will type away on their keyboards from the big bang until the known end of the Universe.
There are \(10^{80}\) protons in the observable Universe. Assume the monkeys work for \(10^{38}\) years. Assuming the monkeys work non-stop at \(400\) words per minute, that's about \(2000\) characters per minute. There are \(500,000\) minutes in a year which means each monkey types half a billion characters per year. This gives the total letters typed:
$$
10^{80} \times 10^{38} \times 10^9 = 10^{127}
$$
Even with that many monkeys over that very large period of time, they wouldn't likely type Hamlet. No, they will need a far greater amount of time. They will need more than \(360,000\) orders of magnitude more just to have a \(1\) in \(10^{500}\) chance. Or to put it another way, to have a one in a trillion chance to type out Hamlet we would need another \(10^{360,641}\) Universes full of monkeys. \(10^{127}\) might as well be zero compared to \(10^{360,641}\).
One of the more interesting features of this theorem is the magnitude of infinity. We often talk about infinity but it is really hard to gauge how big it is. The infinite monkey theorem shows that even very tiny probabilities are no match for the awesome size and power of infinity.
WhatIs - infinite monkey theorem ↩︎
Jesse Anderson's Million Monkeys Blog Article ↩︎ ↩︎
Wikipedia article: Infinite Monkey Theorem ↩︎