billion of those little letters, A, T, C, and G, which is sufficient to fill a two-hundred-thousand-page phone book, if that type of comparison is appealing. But every time a cell splits into two, from the first time your fertilized egg divides to the newborn skin cells in a paper cut, all the DNA in those cells copies itself. The new cell contains all the same DNA as its parent. 4
Being able to replicate from generation to generation is a neat trick indeed. But alone, that would just fill the world with a pretty molecule. DNA's secret power is that this string of chemical letters, the bases, is a code that harbors information. That information is an instruction manual for all of the processes of life, including the very instructions required for the replication process itself. Understanding how DNA's code works will unlock not just how mutation and variation take place, drawing us closer to an answer to the question posed in Darwin's "into few forms or into one," but will give us essential clues as to how life was initially formed.
How DNA Works
All life is made by, or of, proteins. They form the structures and catalysts of biology, and the manufactories of bone, hair, and all the bits of a body that aren't actually made of protein. Naturally, this isn't limited to us, or even to mammals. Every leaf, strip of bark, reptilian scale, horn, fungus, feather, and flower is made of or by proteins. These workhorses of life are themselves made up of strings of smaller units called amino acids, a generic name for a potentially infinite number of molecular parts that qualify for this chemical moniker.
We now know that each gene (those discrete units of inheritance) in a genome is a piece of code made up of the specific pairings of DNA bases that encrypt the construction of a protein. However, only a few parts in many species' genomes are genes. The rest, in fact the overwhelming majority of DNA in humans, comprises notes, instructions, scaffolding, and even insertions from interloping viruses. Some are the remnants of genes from our ancestors whose function has been lost in us, but whose ghosts remain in our genome, free from the pressure of natural selection to slowly rust. Having established in 1953 that DNA was a corkscrewed ladder, and that it had the twin powers of replication and coded meaning, the biggest challenge in biology after Crick and Watson's great leap forward was to decode DNA. This meant first identifying which bits of DNA were genes and which were not.
Imagine that every sentence in this book were a single gene: the human genome would be a book forty times longer than this one, filled with random text but with my sentences distributed randomly throughout it. How would you identify which of them were the relevant sentences? Language is studded with punctuation to complement the letters and add composite meaning above the words themselves. Indeeditisprettydifficultforustoextractmeaningfromasentencethathasnospacesorpunctuationinit.
Fortunately for science, and necessarily for the cell, DNA is no different. Before scientists could work out what the words and sentences of DNA mean, the first challenge was to work out where the spaces were (where a gene begins and ends), and this meant working out its punctuation.
By the 1960s, scientists knew that life was built of or by proteins, that proteins were built from amino acids, and that DNA was the hereditary matter that coded the proteins. The big gap was getting from one to the other, from DNA code to protein. By experimentally inserting a molecule in between two letters of functioning DNA, Crick and another future Nobel Prize laureate named Sydney Brenner intentionally disrupted its code and prevented a protein from being produced. These disruptions are known as frameshift mutations, like a film projector whose shutter speed is wrong, so you see half of one frame and half of another. By interrupting DNA