Breeding Shakespeare, Not Typing

“A thousand monkeys, typing on a thousand typewriters will eventually type the entire works of William Shakespeare.”

This quote is often attributed to Thomas Huxley, one of Darwin’s most faithful followers in the debate that followed the publication of “Origin of Species” in 1859. Other versions of the quote have a million monkeys or an infinite number of monkeys, in each case typing on equally many typewriters. Huxley is said to have used it as a metaphor to argue that chance alone would eventually result in the diversity of life on Earth. The story about Huxley is not true, but regardless of who came up with the notion, it certainly is thought-provoking.

Calculations show that the monkeys typing even just “To be or not to be, that is the question.” by chance would be incredibly unlikely. I decided, however, to attempt to breed Hamlet rather than rely on pure chance, and the results were quite interesting.

Well, firstly, if you didn’t follow the link to the calculations above then let me brief you on the results:

If 17 billion monkeys on each of 17 billion habitable planets in each of 17 billion galaxies in the universe were typing away at the rate of one 41-character line per second for 17 billion years, the odds that they would have come up with “To be or not to be, that is the question.” would still be only around 0.000000000005%. You can just imagine the odds that they would come up with the full text of Hamlet, let alone the entire works of Mr. William.
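The arithmetic behind that figure can be reproduced with a few lines of Python. This is only a rough sketch: the number of keys on the typewriter (`KEYS = 32`) is my assumption, not something stated above, and the function name `odds_of_line` is made up for illustration. With that assumption the result comes out at roughly 5 × 10^-12 percent, the same order of magnitude as the number quoted.

```python
KEYS = 32            # assumed number of typewriter keys (not given in the text)
LINE_LENGTH = 41     # characters in "To be or not to be, that is the question."
SECONDS_PER_YEAR = 365.25 * 24 * 3600


def odds_of_line(n=17e9, keys=KEYS, length=LINE_LENGTH):
    """Approximate chance that at least one typed line matches the target."""
    monkeys = n ** 3                          # monkeys x planets x galaxies
    attempts = monkeys * n * SECONDS_PER_YEAR  # one line per second for n years
    possible_lines = keys ** length
    # attempts is tiny compared with possible_lines, so the ratio is a
    # good approximation of the probability of at least one hit.
    return attempts / possible_lines


if __name__ == "__main__":
    print(f"{odds_of_line() * 100:.1e} percent")
```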

Because of these incredibly low odds, anti-evolutionists have used the monkey notion to ridicule the idea that the random processes of evolution could possibly be the force behind nature’s diversity (Brett Watson, the author of the article, seems to be one of them). These claims, like most other claims made by the adversaries of evolutionary theory, have already been answered by people more capable than I am.

Nevertheless, as I have lately been thinking (and writing) quite a lot about genetic algorithms, the “thousand monkeys problem” inspired a little experiment. What if, instead of letting them type randomly on their typewriters, we taught a thousand monkeys a few simple tricks of evolution and gave Shakespeare’s text a go?

The methods I propose are really simple.

  1. Initialize: Start off with a set of 1000 random text strings of the desired length.
  2. Select: Evaluate each string based on how many characters match the corresponding character in the desired text. Select the two strings that rate highest. If more strings are tied at the top, throw away all but two anyway.
  3. Breed: Generate a new set of a thousand strings where the first two are the “survivors” from step 2 and the remaining 998 are made by mixing the parent strings at a single random place (e.g. chars 1-10 from string 1 and chars 11-41 from string 2).
  4. Mutate: Change a single character in every one of the 998 new strings.
  5. Repeat: Now repeat steps 2 to 4 until you’ve got the desired string.
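The five steps above can be sketched in Python roughly as follows. This is a minimal illustration, not the simulation that produced the results in this post: the character set `ALPHABET`, the function names, the `seed` parameter and the `max_generations` safety cap are all assumptions of mine, and the number of generations it needs will depend on those choices.

```python
import random
import string

TARGET = "To be or not to be, that is the question."
ALPHABET = string.ascii_letters + " ,."  # assumed character set, not from the post
POPULATION = 1000


def fitness(candidate, target=TARGET):
    """Step 2: count the positions where candidate matches target."""
    return sum(a == b for a, b in zip(candidate, target))


def random_string(length, rng):
    """Step 1: one random string of the desired length."""
    return "".join(rng.choice(ALPHABET) for _ in range(length))


def breed(parent_a, parent_b, rng):
    """Steps 3 and 4: single-point crossover, then one mutated position."""
    cut = rng.randrange(1, len(parent_a))  # both parents contribute something
    child = list(parent_a[:cut] + parent_b[cut:])
    # The replacement is drawn from the whole alphabet, so it can
    # occasionally re-draw the character that was already there.
    child[rng.randrange(len(child))] = rng.choice(ALPHABET)
    return "".join(child)


def evolve(target=TARGET, population=POPULATION, seed=None,
           max_generations=10_000):
    """Return (generations used, best string found)."""
    rng = random.Random(seed)
    strings = [random_string(len(target), rng) for _ in range(population)]
    for generation in range(max_generations + 1):
        # Keep only the two best strings; ties are cut off arbitrarily.
        strings.sort(key=lambda s: fitness(s, target), reverse=True)
        best, second = strings[0], strings[1]
        if best == target:
            return generation, best
        # Step 5: the two survivors plus 998 mutated children go round again.
        strings = [best, second] + [
            breed(best, second, rng) for _ in range(population - 2)
        ]
    return max_generations, strings[0]


if __name__ == "__main__":
    generations, result = evolve()
    print(f"{result!r} after {generations} generations")
```

Because the two survivors are carried over unchanged, the best score can never drop from one generation to the next, which is what lets the process ratchet its way to the target.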

Now, this can probably be calculated using some elegant statistical computation methods (if you do the math, please send me your results). As I am not capable of such math, I decided to do it the fun way and wrote a simulation of the above process.

The results made me check my code several times before I believed them. On average the 41-character text would appear in a little less than 25 repetitions of the process (i.e. 25 generations)!

So, to match the (very impressive) working speed of the monkeys from the previous example, let’s say that cutting, pasting and altering one character takes each monkey one second, the evaluation takes another second and the photocopying a third. At three seconds per generation, we would have our text ready in less time than it takes to make a cup of tea.

Interestingly enough, the numbers seem to scale, so Excel tells me that a 500,000-character string (roughly the text of an average novel, Harry Potter excluded) would take about 352,000 generations (about 12 days using the same assumptions) and Shakespeare’s entire works (roughly 5,000 kilobytes) about 3.5 million generations (a little less than 125 days).
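For anyone who wants to check the conversion from generations to wall-clock time, here is a small sketch using the three-seconds-per-generation assumption from above. The function names are made up for illustration, and the generation counts are simply the ones quoted in the text, not something this snippet derives.

```python
SECONDS_PER_GENERATION = 3  # breed and mutate, evaluate, photocopy: 1 s each
SECONDS_PER_DAY = 24 * 60 * 60


def seconds_needed(generations):
    """Wall-clock seconds for the given number of generations."""
    return generations * SECONDS_PER_GENERATION


def days_needed(generations):
    """The same duration expressed in days."""
    return seconds_needed(generations) / SECONDS_PER_DAY


if __name__ == "__main__":
    cases = [
        ("41-character line", 25),
        ("500,000-character novel", 352_000),
        ("complete works", 3_500_000),
    ]
    for label, generations in cases:
        print(f"{label}: {seconds_needed(generations)} s "
              f"= {days_needed(generations):.1f} days")
```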

I don’t want to draw too many conclusions from this, but the results themselves are surely interesting. Obviously the evaluation of the “fittest” is never this clear in evolutionary processes in nature and there is never really any “goal” there, but the experiment shows the dramatic difference between a purely random process and the processes at work in evolution. It took our thousand-monkey army only about a minute and a half to produce the line that the intergalactic typewriting army would not have come up with in the entire lifetime of the universe.

Postnote:
Just as I was about to complete this entry, I found a reference to work by one Mr. Richard Hardison, who in the 1980s wrote a simple BASIC program with a similar idea in mind, to show that evolution is not like the thousand monkeys at all. He uses quite a different method from the one I used in my experiment but gets similar results, i.e. that if you add a little “preference” to the randomness, the process becomes much shorter.

Additional links:
15 Answers to Creationist Nonsense – Scientific American, July 2002
Monkeys Don’t Write Shakespeare – Wired, May 9, 2003 (what really happens when monkeys are given keyboards 🙂)

One comment

  1. There is a fun experiment going on at http://user.tninet.se/~ecf599g/aardasnails/java/Monkey/webpages/. While you have the page open, you are running a monkey-typing simulator (no peeing on the keyboard though). They have a very large population of monkeys (that breed, but not at an optimum rate, because they have too much fun typing!). The absolute record so far is the first 11 letters from “Antony and Cleopatra”. It seems easy to get 7 or 8 letters right from any of the works, but any additional letters seem to be rather painful.
