Can Random, Non-Directed Processes Create DNA Information?
In my previous post on DNA, I mentioned the following argument:
A) DNA is information.
B) Information cannot arise from a random, non-directed process.
C) Darwinism requires DNA to have arisen from a random, non-directed process.
D) Therefore, Darwinism cannot explain DNA.
In my first post, I demonstrated A) DNA is information. In this post, I will demonstrate B) Information cannot arise from a random, non-directed process.
The first thing to note is an example that Apolonio brought up. He said:
For example, we can conceive of a case where a person knocks over a scrabble box and the letters I Love You comes out with that order.
While this would be a semi-random process creating information, it is not using foundational forces. The specific example requires a person to knock over the Scrabble box. But even if we adjust for that and make it gravity pulling a box off a shelf or something similar, Scrabble tiles are not foundational in nature; they are designed. So the information still requires a non-foundational force (human ingenuity) to create the tiles which are used to create information in the pattern “I love you.”
Even then, the odds that “I Love You” would form are quite low. Assuming an equal chance of drawing each letter of the alphabet (as well as an infinite supply of them), you have 8 letters, so the odds of pulling these particular letters in order would be $1/26^8$, or 1 in 208,827,064,576, which is $4.79 \times 10^{-12}$. If you include the space as a character, we have 10 characters and 27 possibilities each draw: $1/27^{10}$, or $4.9 \times 10^{-15}$.
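As a rough check on the arithmetic above, here is a minimal Python sketch. It assumes every symbol is equally likely and every draw is independent (the "infinite supply" case); the function name is just an illustrative choice.

```python
from fractions import Fraction

def uniform_sequence_probability(alphabet_size: int, length: int) -> Fraction:
    """Chance of drawing one specific sequence when each symbol is equally
    likely and draws are independent (i.e., an infinite supply of tiles)."""
    return Fraction(1, alphabet_size ** length)

letters_only = uniform_sequence_probability(26, 8)    # "ILOVEYOU", no spaces
with_spaces = uniform_sequence_probability(27, 10)    # "I LOVE YOU", space as a 27th symbol

print(letters_only.denominator)        # 208827064576
print(f"{float(letters_only):.2e}")    # 4.79e-12
print(f"{float(with_spaces):.2e}")     # 4.86e-15
```

Using exact fractions avoids rounding until the final print, which is why the second figure shows as 4.86 rather than the rounded 4.9 quoted above.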
In reality, however, Scrabble boxes do not contain an equal sampling of each letter. Instead you have 12 Es; 9 As & Is; 8 Os; 6 Ns, Rs, Ts; 4 Ds, Ls, Ss, Us; 3 Gs; 2 Bs, Cs, Fs, Hs, Ms, Ps, Vs, Ws, and Ys; 1 J, K, Q, X, Z. Finally, there are 2 blanks. This yields 100 total pieces. If we use the blanks as spaces, the odds for each letter in “I [blank] Love [blank] you” are:
I = 9/100
Blank = 2/99
L = 4/98
O = 8/97
V = 2/96
E = 12/95
Blank = 1/94
Y = 2/93
O = 7/92
U = 4/91
Because we are not dealing with an infinite number of tiles, we have to reduce how many are available after each selection. Thus, we have a 9/100 chance of pulling an I on the first draw from the box. If we do so, there are now only 99 tiles remaining, 2 of which will be blanks. That means we have a 2/99 shot for the blank, etc. Note that when a letter repeats (for instance, the O), we have to decrease the number remaining too. Thus, the first draw of an O is 8/97 but the second is 7/92 (because the first draw picks one of the Os). Finally, we get the combined odds by the following:
9/100 x 2/99 x 4/98 x 8/97 x 2/96 x 12/95 x 1/94 x 2/93 x 7/92 x 4/91, which is:
774,144 / 62,815,650,955,529,472,000
Or $1.23 \times 10^{-14}$
Which is roughly 1 in 81 trillion. So even though the tiles were created by humans, a random arrangement of them to spell out “I love you” is still extremely rare.
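The draw-by-draw calculation can be reproduced with a short Python sketch. It assumes the tile counts listed above, uses "_" as a stand-in for a blank tile, and treats each draw as uniform over the tiles still in the box; the names are illustrative only.

```python
from collections import Counter
from fractions import Fraction

# Tile counts as listed in the post; "_" stands for a blank tile.
TILE_COUNTS = {
    "E": 12, "A": 9, "I": 9, "O": 8, "N": 6, "R": 6, "T": 6,
    "D": 4, "L": 4, "S": 4, "U": 4, "G": 3,
    "B": 2, "C": 2, "F": 2, "H": 2, "M": 2, "P": 2, "V": 2, "W": 2, "Y": 2,
    "J": 1, "K": 1, "Q": 1, "X": 1, "Z": 1, "_": 2,
}

def draw_probability(sequence: str, tiles: dict) -> Fraction:
    """Chance of drawing `sequence` in order, without replacement."""
    remaining = Counter(tiles)
    total = sum(remaining.values())
    prob = Fraction(1)
    for tile in sequence:
        prob *= Fraction(remaining[tile], total)  # e.g. 9/100 for the first I
        remaining[tile] -= 1                      # that tile is now out of the box
        total -= 1
    return prob

p = draw_probability("I_LOVE_YOU", TILE_COUNTS)
print(f"{float(p):.2e}")    # 1.23e-14
print(f"{float(1 / p):.2e}")  # 8.11e+13, i.e. roughly 1 in 81 trillion
```

Note that `Fraction` reduces the result to lowest terms, so the printed numerator and denominator would differ from 774,144 / 62,815,650,955,529,472,000 even though the value is the same.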
The above does, however, help us understand a bit about DNA. As most are already aware, DNA uses 3-base codons to encode amino acids. There are four possible DNA bases (ACGT), and that means there are $4^3$ (64) possible combinations of those letters. However, there are only 20 amino acids. As a result, most amino acids are encoded by more than one codon. For instance, Leucine (L) can be encoded by CTT, CTC, CTA, CTG, TTA, and TTG, which means there are 6 possibilities for L. In fact, quickly going through the amino acids (using their single-letter code name) we find:
I = 3
L = 6
V = 4
F = 2
M = 1
C = 2
A = 4
G = 4
P = 4
T = 4
S = 6
Y = 2
W = 1
Q = 2
N = 2
H = 2
E = 2
D = 2
K = 2
R = 6
Stop = 3
As you can see, all 64 possible combinations are represented in the above. Therefore, we can say that given a random piece of DNA three bases long (one codon), there is a 3/64 chance that it codes for I (Isoleucine) and a 2/64 chance that it codes for N (Asparagine), etc.
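The counts in the list above can be rebuilt from the standard genetic code. The following Python sketch assumes the standard code (no organism-specific variants) written in the conventional TCAG ordering, with "*" marking a stop codon; the variable names are my own.

```python
from collections import Counter
from itertools import product

BASES = "TCAG"
# Standard genetic code, one letter per codon in TCAG order; "*" is a stop codon.
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)

# Map every 3-base codon (TTT, TTC, TTA, ...) to its amino acid.
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
codon_counts = Counter(CODON_TABLE.values())

print(len(CODON_TABLE))     # 64
print(codon_counts["L"])    # 6
print(codon_counts["I"], codon_counts["N"], codon_counts["*"])  # 3 2 3
```

Dividing any of these counts by 64 gives the per-codon chance used in the text, under the assumption that all four bases are equally likely at every position.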
Because base pairs are so prevalent, we can treat them as if there is an infinite supply of them. As a result, if we wanted to calculate what the odds would be that six base pairs will code for Isoleucine and then Asparagine, we would simply multiply 3/64 and 2/64 to yield: 6/4096, or about 1 in 683.
Of course, proteins can have hundreds of amino acids chained together in polypeptides. (In fact, by convention, most scientists do not consider a polypeptide chain to be a protein until it has at least 50 amino acids in it, although that is an arbitrary dividing line.) Because of their size, the odds of even a single 50-amino acid polypeptide forming by chance are extremely low. In fact, even if it were simply a chain of L (Leucine), which has a 6/64 chance of forming for each L, the odds of 50 forming in a row would be $6^{50} / 64^{50}$, which is roughly $(8 \times 10^{38}) / (2 \times 10^{90})$, or approximately $4 \times 10^{-52}$: about 1 chance in $2.5 \times 10^{51}$.
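Both of the figures above (Isoleucine followed by Asparagine, and a chain of 50 Leucines) come from multiplying per-codon chances. A minimal Python sketch, assuming independent and equally likely codons and using a hypothetical helper name:

```python
from fractions import Fraction
from math import log10

def chain_probability(codons_per_residue, total_codons=64):
    """Chance that random codons spell a given chain, where each entry is the
    number of codons that encode the amino acid wanted at that position."""
    prob = Fraction(1)
    for n in codons_per_residue:
        prob *= Fraction(n, total_codons)
    return prob

ile_asn = chain_probability([3, 2])       # Isoleucine then Asparagine
poly_leu = chain_probability([6] * 50)    # fifty Leucines in a row

print(ile_asn)                       # 3/2048
print(round(float(1 / ile_asn)))     # 683
print(round(log10(poly_leu), 1))     # -51.4
```

A base-10 logarithm of about -51.4 corresponds to roughly $4 \times 10^{-52}$, matching the estimate in the text.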
Clearly, this method of explaining DNA is insufficient to explain even a basic protein, let alone complex cells and higher organisms.
This brings us to our next point, which is something that Mighty Pile brought up: the definition of information (i.e., something that is non-repeating, non-random, and not based on foundational forces) seems to exclude the ability of random, non-directed processes in the first place. As such, B) seems to be proven by stipulation, which means it relies on a circular argument.
However, when we examine B) carefully we see that it does not rely on circular reasoning when cashed out. To demonstrate how that is possible, I must first point out that the Darwinist must assert the opposite of B). They must assert that information can arise from random, non-directed processes (as evidenced by premise C)). And this is demonstrated by the fact that you are reading this blog post, which is information.
This blog post has an author. The author is not a random, non-directed process. But, if Darwinism is correct, at some point we can link my existence back to a random, non-directed process. Therefore, in a causative sense, the Darwinist would say that a random, non-directed process somehow created a non-random, directed process that was able to create information.
And it is because this option remains open to the Darwinist that B) does not entail circular reasoning. All the Darwinist needs to do is to show that Information can arise from forces that are non-random, non-repetitive (to exclude crystals) and non-foundational if those forces (we will call them meta-forces) are themselves built on random, non-directed forces. In other words, the Darwinist can argue: “Information comes from meta-forces, which are non-foundational; but meta-forces come from foundational forces.” Putting it into this two-step process would avoid the circular reasoning charge, while also giving the Darwinist a possible route to establishing C).
So the question now becomes, can random, non-directed processes create non-random and non-repeating meta-processes that could then create information in the form of DNA? DNA is one of the simplest information processes we can think of (compare it to trying to establish the framework for a spoken language), but even it is vastly complicated. In order for DNA to function, it has to store information that is used to create amino acids that bond together to form proteins that then create the mechanism for storing and reading DNA. In other words, in order for DNA to function biologically, we need to have a loop where DNA is used to create the processes needed to create more DNA. DNA is copied via cellular processes that are created with proteins that are themselves created by DNA. Thus, we have a vicious cycle going on.
But before we get to the loop, is there a simple way to just encode amino acids into DNA? Amino acids, after all, are fairly easy to create in a test tube, as Stanley Miller demonstrated (albeit his experiment does not prove what he thought it proved). Using those same “primitive” conditions, however, it is not possible to create DNA.
DNA also presents a problem because, as you’ve seen above, sometimes as many as six different DNA codons can represent a single amino acid. While moving from a DNA codon to an amino acid is easy, moving from the amino acid to a particular strand of DNA is much harder.
Due to the limitations of DNA, Francis Crick proposed that life began based on RNA instead of DNA. RNA is only single stranded, as opposed to the DNA double helix. RNA can also sometimes function similarly to proteins. DNA, however, is much more stable and less prone to errors (which is why an intelligent being would pick DNA instead of RNA to start life off; and which is why Darwinists claim DNA was “selected for” by Natural Selection).
Which brings up an important point. The “central dogma” (as Crick named it) is DNA to RNA to protein. It doesn’t go in the opposite order. (There are a few exceptions to the strictness of the “central dogma”, most notably retroviruses such as HIV, which go from a single strand of RNA to DNA before then going through the “central dogma”; but there are no instances that I am aware of where proteins go to RNA then to DNA.) This makes it highly unlikely that amino acids bonded to become proteins and then those proteins created RNA that was then made into DNA and eventually stored in cells.
That means we had to start in some way with DNA or RNA and then create proteins from that; but in order to create the proteins, we must have the structure in place by which RNA can be converted to a protein. Once again, we’re left with the chicken and the egg problem. And this system cannot have arisen by blind chance, since as you’ve seen even a single protein of 50 of the most common amino acids has astronomically long odds of forming randomly.
Regardless of where we start, we have to have some method of going from a random soup of amino acids to a particular sequence of amino acids being coded in information, be it RNA or DNA. But this will only start to happen if there is a reason for the information of a protein’s make-up to be converted to RNA or DNA.
That DNA is useful for life is not debated. Suppose that the amino acid “soup” manages to create a protein that could be used by a cell later on. It would be useful for the cell to have a way to rapidly create this protein. And the protein is created from amino acids that can be stored in DNA. Obviously, if we have this end in mind, we could design the process by which the DNA code comes about. But this requires teleology, which Darwinism denies. We cannot have the end of a working cell in mind; we have to have completely random processes that somehow create the necessary steps involved.
But suppose that we are left with only the random creation of the system to begin the evolutionary process. According to modern materialistic theory, life first became possible about 3.5 billion years ago. That is, the Earth cooled enough, the atmosphere was in the correct state, water existed, etc. so that life would not be extinguished if it was formed. Amazingly enough, according to these same scientists, the first life on Earth appeared roughly 3.5 billion years ago. In other words, as soon as it was possible for life to exist on Earth, life did exist on Earth. This must mean that the creation of life ought to be an “easy” process, given materialistic claims. If it is easy, then it should not rely on a process that has such poor odds of succeeding. Either life’s occurrence on Earth was a miracle against all odds, or else this cannot be how life began on Earth.
NOTE: This post has been updated since it was originally posted to correct the line: “While this would be a random process creating information, it is not using foundational forces” to “While this would be a semi-random process creating information, it is not using foundational forces.”






May 22nd, 2008 at 6:48 pm
I think you may be on the wrong side of the argument.
Here are some peer-reviewed papers that describe in painful mathematical detail exactly how natural selection generates genetic information:
http://web.danielmorgan.name/Articles/evpapers.html
Also, as Mark Chu-Carroll demonstrates, incompressibility is one of only a few meaningful ways of defining information, and as such, the oversimplification of using syntax/messages to represent information is actually wrong.
http://goodmath.blogspot.com/2006/06/information-theory-index.html
http://scienceblogs.com/goodmath/2006/06/dembskis_profound_lack_of_comp.php
http://scienceblogs.com/goodmath/2007/02/once_again_sal_and_friends_but.php
Best wishes,
sdm
May 23rd, 2008 at 6:54 pm
Thanks for the links. Part of the reason for this post was a fishing expedition, to get Darwinists to provide me with their responses so I could look at them in more detail. So I’ll look over the links in greater detail over the next few days, although scanning the titles through the “evpapers.html” link they don’t seem to be specifically relevant to my post (there may be something within them, of course; and if you know of one for sure, I’ll look at it immediately). But those examples seem to take the pre-existing organism as a given and then say, “Natural Selection can create information out of that.” But I’m not asking whether N.S. could do that here (although I also don’t think it can). My question goes more to the origin of life–how did the organism begin in the first place so that it could then have N.S. “act” upon it to generate more information?
In other words, one could use Dawkins’ response that six monkeys typing away on six typewriters will eventually produce the works of Shakespeare. I merely point out that I’m interested in how the typewriters and monkeys came about in the first place in order to then semi-randomly produce Shakespeare (I use the word “semi-random” to differentiate between the “random” actions of monkeys and the actual random actions of, say, radioactive decay).
As always, good to hear from you though. :-)
June 11th, 2008 at 4:38 pm
The “monkeys and typewriter” exist in the form of the laws of chemistry and physics — repeating stochastic processes that exist everywhere across the universe, but “select” for life under certain conditions (the right T, chemical composition of earth, water, &c.)…
Did you check out any of the papers? Kimura’s is classic, as he uses Malthus to estimate ~0.3 bits/generation, although this is WAY wrong given current information on rates of mutation…
Anyway, the important part is the approach he uses and the math.
June 11th, 2008 at 6:54 pm
Actually, I haven’t had time to do much reading lately as I’ve been working lots of overtime :-( It’s definitely good for you to remind me. I do have it on my list of things to do when I get some relatively free quantities of time coming my way, but the way my life goes pretty much nothing I plan for happens that way!
Anyway, I don’t think that even saying that the “monkeys and the typewriter” exist as the laws of chemistry and physics (which is where I thought you would probably go) solves the problem. After all, my first reaction is to ask: “Why are the laws of physics and chemistry the way they are?” And ultimately, that’s the point of the TOE and/or GUT (which I think make awesome acronyms regardless of anything else!). But even so, pretty much everything that I’ve read on the subject of what happened immediately after the Big Bang (i.e., what caused the break in symmetry, which led to the four “foundational” forces (gravity, strong & weak nuclear, and electromagnetic) coming about) was that there was no reason. That is, there’s no physical law that says the universe has to be the way it is (e.g., that magnetic forces should be so much stronger than gravity, such that an object the size of a refrigerator magnet can overcome the pull of an object with the mass of the Earth!). Interestingly, many people I’ve read say that it is impossible for us to ever know why the forces broke symmetry the way they did; it’s just the way it happened, period (and I have to point out that this is just as much a “science stopper” as saying “Goddidit” would be… :-) ).
But I don’t want to sidetrack too much along those lines. (Actually, I do want to because it’s interesting to me–but it doesn’t have much to do with the point of my post!)
Anyway, I’ll write more later. BTW, you can always feel free to drop me an e-mail too. It’s a yahoo address with the handle “petedawg34” (lovely spambots, oh how I doth hate thee!)
June 12th, 2008 at 12:06 am
BTW, I did take some time to read through a few of the links on the “Goodmath” site. One thing I want to point out is from this post. After MarkCC ran his previous post through GZIP and compressed the information, he said:
—
Here’s the same information, compressed (using gzip) and then made
readable using a unix utility called uuencode:
M’XL(”)E8$$0“VIU;FL`19$Q=JPP#$7[6?7KIN',`M+]?HHLPH“)\8BMOPYM[#Y/GCDG#MLY2EI8$9H5GLX=*R(_+ZP/,-5-1#\HRJNT`77@LL,MZD,"H7LSUKDW3@$#V2MH(KTO$Z^$%CN1Z>3L*J^?6ZW?^Y2+10;\+SOO'OC"/;7T5QA%987SY02I3I$MLKW"W,VZ-(J$E"[$;'2KYI^\-_L./3BW.+WF3XE8)?@D8X^U59DQ7IA*F+X/MM?I!RJ*%FE%])Z+EXE+LSN*,P$YNX5/P,OCVG;IK=5_K&CK6J7%’+5M,R&J]M95*W6O5EI%G^K)8B/XV#L=:5_`=5ELP#Y#\UJ??>[[DI=J''*2D];K_F230″$`@(“““
That’s only 465 characters; and if I didn’t have to encode it to stop
it from crashing your web-browser, it would only have been 319
characters. Those characters are a lot more random than the original -
so they have a higher information density.
—
The problem with this is that the characters aren’t “more random” (at least not in a meaningful mathematical sense, and it surprises me that he’d say so). Now they appear more random if you’re expecting English text. But in that case, the concept of “randomness” becomes completely relative.
As an example, it would be like saying “Te amo” is more random than saying “I love you” because “Te amo” isn’t English. A Spaniard, however, would say that “I love you” is more random than “Te amo.” The only difference is the language being used and who knows it.
FWIW, it would be true that the string “I love you” contains more information in the sense that it contains more characters.
Anyway, I think part of what many Darwinists forget when they look at information theory as it relates to DNA is that compressing data requires an increase in the complexity of the program that interprets the data. And we can see that even simply from the example given by MarkCC above.
It doesn’t take much “work” for an English reader to read through his original two paragraphs. It would be virtually impossible for that same reader to turn the ZIP file back into legible English so he could extract the message…and it’s all to save fewer than 200 characters (514 characters converted to 319 characters). Naturally, a computer can do the work for us–indeed, the computer must do the work for us.
Now to return to the idea that the ZIP file is “more random” than plain English: this is simply not accurate. Every single character in the ZIP file is required to be exactly what it is in order for the file to be decompressed. Ultimately, that text must obey rules that random texts do not. (Again, I refer to the passage David Kahn wrote where he showed that even two texts that have the same letter frequency as random texts (i.e., each letter appears at roughly the same frequency), if they are encoded with the same key, will display “coincidences” at a rate higher than random texts would when the texts are lined up where the key would fit.) ZIP compression would do the same thing, because zipped files have to follow the correct program, or else no one could unzip any files. Thus, there remains a mathematical difference between a random text and a text generated by ZIP compression; compression, therefore, does not increase randomness.
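To make the point about compressed files having to follow the program's rules concrete, here is a small Python sketch using the standard `gzip` module. The sample text and the helper name are my own inventions, and the sketch only shows that a damaged gzip stream fails to round-trip; it does not by itself settle how "random" the bytes are.

```python
import gzip
import zlib

text = ("It doesn't take much work for an English reader to read plain text, "
        "but a compressed file has to be decoded by a program first. ") * 4
original = text.encode("utf-8")
compressed = gzip.compress(original)

# An intact stream decompresses back to exactly the original bytes.
assert gzip.decompress(compressed) == original

def survives_corruption(blob: bytes, index: int) -> bool:
    """Flip every bit of one byte and report whether the original text
    can still be recovered from the damaged stream."""
    damaged = bytearray(blob)
    damaged[index] ^= 0xFF
    try:
        return gzip.decompress(bytes(damaged)) == original
    except (OSError, EOFError, zlib.error):
        # Invalid stream, truncated data, or checksum mismatch.
        return False

print(len(original), len(compressed))
print(survives_corruption(compressed, len(compressed) // 2))
```

The byte is flipped in the middle of the stream on purpose, so that it lands in the compressed data rather than in the gzip header.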
Finally, MarkCC has defined random characters as information; thus, in his system, there is no informational difference between signal and noise. That is: signal = information; noise = information; signal + noise = information. In fact, the more random a message is the more information it contains (since a completely random message cannot be compressed at all). This is, of course, how Shannon defined the terms, which is what MarkCC was going by, and under that system his conclusions follow. Information = entropy. The more entropy you have, the more information you have.
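For readers who want to see what "information = entropy" means in Shannon's sense, here is a minimal Python sketch that measures entropy from byte frequencies alone. This is only a rough proxy (it ignores the order of the bytes), and the sample sentence and function name are assumptions of mine, not anything from MarkCC's post.

```python
import gzip
import math
import os
from collections import Counter

def byte_entropy(data: bytes) -> float:
    """Shannon entropy of the byte-frequency distribution, in bits per byte."""
    if not data:
        return 0.0
    counts = Counter(data)
    n = len(data)
    # Sum of p * log2(1/p) over every byte value that actually occurs.
    return sum((c / n) * math.log2(n / c) for c in counts.values())

english = ("the quick brown fox jumps over the lazy dog and keeps running "
           "through the field until the sun goes down ") * 20
plain = english.encode("ascii")

print(byte_entropy(b"aaaaaaaa"))         # 0.0 (pure repetition)
print(byte_entropy(bytes(range(256))))   # 8.0 (every byte value equally often)
print(byte_entropy(plain))                      # English text
print(byte_entropy(gzip.compress(plain)))       # the same text, gzipped
print(byte_entropy(os.urandom(len(plain))))     # operating-system noise
```

Under this measure a repeated character scores zero and a perfectly even spread of byte values scores the maximum of 8 bits per byte, which is the sense in which Shannon's definition does not separate signal from noise.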
However, IIRC, you studied chemistry or physics, right? You may therefore recall that in physics, entropy destroys information. This would be the direct inverse of Shannon’s view because the more entropy you have, the less information you have.
The fact that there are competing definitions of information was why I specifically defined “information” in my original blog post as “something that is non-repeating and non-random and cannot be explained by only foundational forces.” Thus, my definition already conflicts with MarkCC’s definition, since his is that information is randomness and mine is that information is non-random (interestingly, we both agree that repetitions, or as he called it, redundancies, decrease information).
As I told one person that I spoke to on Triablogue, you can always feel free to substitute a different definition of information in there; but the key is that your definition must have some way to distinguish between signal and noise or else it’s not meaningful to the discussion of DNA.
Because we are working from different definitions, his post really doesn’t have anything to do with my argument. I did enjoy it, however; in fact, it makes me want to pick up my Gödel, Escher, Bach again. :-)