Their 1963 paper made an important distinction between molecules that carry genetic information—such as DNA or the proteins it encodes—and other molecules, such as vitamins, that cycle through a living creature and out the other end. Information molecules have histories that can be deduced; they have ancestors from which the variant forms, in this creature or that, have descended. Scrutiny of such molecules, wrote Zuckerkandl and Pauling, can tell us three things: how much time has passed since the lineages split, what the ancestral molecules must have looked like, and what were the lines of descent. The first of those three kinds of information became known as the molecular clock, although Zuckerkandl and Pauling hadn’t yet named it. The third kind implied trees.
Zuckerkandl continued reworking and developing these ideas, with Pauling as his coauthor and sponsor. In September 1964, before a distinguished and argumentative symposium audience at Rutgers University, he delivered a long paper that became the definitive version of their shared ideas and that, despite Zuckerkandl having done most of the writing, has been called the “most influential of Pauling’s later career.” In this paper, the two authors offered their memorable metaphor: if the minor changes in molecular variants are proportional to elapsed time over the eons, they said, what you have is “a molecular evolutionary clock.”
It was tentative, a hypothesis. The hypothesis was disputed at the Rutgers symposium and would be controversial in coming years, but it captured attention, it focused thought, and it promised a whole new way of measuring life’s history, if it was right. The molecular clock has since been called “one of the simplest and most powerful concepts in the field of evolution,” and also “one of the most contentious.” Crick himself later judged it “a very important idea” that turned out to be “much truer than people thought at the time.”
Emile Zuckerkandl, meanwhile, moved back to France. Along with Pauling and just a few others, he had helped launch a new scientific enterprise, and when a Journal of Molecular Evolution came into being, in 1971, he was its first editor in chief. His name isn’t familiar to the wider world, as Pauling’s is, but if you say “Zuckerkandl and Pauling” to a molecular biologist today, he or she will think “molecular clock.” Fitting as that may be, it overlooks the other important point: the other metaphor embedded in the long Rutgers paper, where Zuckerkandl wrote that “branching of molecular phylogenetic trees should in principle be definable in terms of molecular information alone.” This was a whole new way of sketching those trees, which rose and spread their branches as the clock ticked.
11
Carl Woese came to the University of Illinois, in Urbana, in 1964, the same year Zuckerkandl delivered the paper at Rutgers. The enterprise that would become molecular phylogenetics—back then bruited under other names, such as Crick’s protein taxonomy, and Pauling and Zuckerkandl’s chemical paleogenetics—had begun to attract interest. Woese saw its deepest possibilities more clearly than anyone else. Molecular sequence information, he realized, could be used to read the shape of the past.
Woese was thirty-six years old and was hired with immediate tenure, which gave him some latitude to undertake risky, laborious research projects without need to worry about quick publications. His professorship was in the Department of Microbiology, though he had trained as a biophysicist, not a microbiologist, and had spent little time if any peering through microscopes at bacteria and other tiny bugs. He was more interested in molecular biology, then still in its early phase. It was a thrilling new branch of science, its methods just being invented, its cardinal principles just taking shape, and he wanted to be part of that. But the molecular clock wasn’t Woese’s topic, and the prospect of a molecular tree of life hadn’t yet captured his imagination. He was focused instead on the genetic code—and not just what he called the cryptographic aspect: the matter of which bases in which combinations specified which amino acids for building proteins. He wanted to go deeper in time and meaning; he wanted to understand how the code had evolved.
He was well aware that Francis Crick and others, including the eclectic Russian physicist George Gamow, had been working on the cryptographic aspect as a theoretical problem, treating it like an abstract intellectual game. That problem had been illuminated, but not solved, since Crick’s 1958 paper, by a new recognition of RNA’s role, as a messenger molecule somehow carrying DNA instructions to the site in a cell where proteins are built. But what was the structure of RNA, and how did it play that role? Gamow and the others were puzzled, and to them the puzzle was a thrilling game. They had even formed an elite, semifacetious little club—limited to twenty members, reflecting the twenty amino acids of life—for the private exchange of ideas about how coding and protein synthesis might work. They called it the RNA Tie Club—RNA because that molecule was still the mysterious intermediary; Tie because such neckwear evoked, and mocked, the clubby bond of an old school tie. As tokens of club membership, these scientists had embroidered neckties, all alike. They had individual tiepins, each representing one amino acid. They embraced their respective amino identities, at least jocularly: Serine and Lysine and Arginine, etcetera. Cute. Woese wasn’t a member.
The cryptographic riddle, so intriguing to Gamow and Crick and the others, was this: How could the four bases of DNA—represented by those four cardinal letters, A, C, G, and T—be combined in groups of at least three, with or without commas, to produce the twenty different amino acids? Woese addressed it alone. He knew that a team led by Marshall Nirenberg, a young biochemist at the US National Institutes of Health, had made better progress with an experimental approach than the RNA Tie Club was making with collegial theorizing. But he wanted to go deeper.
“I differed from the whole lot of them,” Woese wrote decades later, “in perceiving the nature of the code as inseparable from the problem of the nature and origin of the decoding mechanism.” The decoding mechanism? By that, he meant whatever organ or molecule translated the DNA information into real, physical proteins. Its origin? To him, at that time, this was the central biological concern. He wanted to understand not just how that decoding mechanism worked but also how it had come into being roughly four billion years ago. He recognized, more clearly than anyone else, that life could not have progressed beyond its simplest primordial forms without a translation system for applying the information in DNA.
No statement from Woese is more telling of his character, his cantankerous self-image as a scientific outsider, than the beginning of that sentence just quoted: “I differed from the whole lot of them …” He was a loner by disposition. He took a separate path. Not in the club. No RNA tie. He published a few papers in Nature on the coding question, and a comment in Science—all under his sole authorship, suggesting ideas, critiquing what others had done. He offered his own view in full, an evolutionary view, in a 1967 book, The Genetic Code, which was visionary, ambitious, closely reasoned, and mostly wrong. But in science, wrong doesn’t mean useless. Trying to imagine the origins of the genetic code brought Woese around, almost reluctantly, to the tree of life.
He needed some such universal diagram, Woese realized, as a framework for understanding the evolution of that one crucial system at life’s core—the translation system, turning DNA-coded information into proteins. Deep biology required deep history. This conundrum has been nicely expressed by Jan Sapp, a plant geneticist who became a historian of biology and came to know Woese well: “A universal tree would therefore hold the secret to its own existence.” History illuminating biology and vice versa. Evolutionary biology is history, after all. But there was a problem. For microbiology—bacteria and other single-celled creatures—a tree didn’t exist. The known trees didn’t encompass such organisms, or portray their diversity, to any satisfactory extent. Animals could be compared with one another on the basis of their physical appearance and behavior, as Linnaeus and Darwin had compared them; plants could be compared; fungi could be compared. They could be arranged in treelike patterns that reflected their relationships as deduced from such external, visible evidence. But that was impossible with microbes because, even under a high-powered microscope, so many of them looked so much alike.
There were a few basic shapes—rods, spheres, filaments, spirals—and those had served (reliably or not) to define major groups of bacteria. But at the finer level, the level of what we would think of as species, bacterial classification into a natural system, showing evolutionary relationships, was difficult. Arguably impossible. Even some of the experts had given up. It couldn’t be done on the basis of appearance and behavior. It couldn’t be done by way of physiological characteristics (which, in microbes, are what pass for behavior). It couldn’t be done at all, unless someone invented a new method.
“A slight diversion in my research program would be necessary,” Woese recollected later—a wry comment, because by then the diversion had lasted two decades.
12
On June 24, 1969, Woese in Urbana wrote a revealing letter to Francis Crick in Cambridge. He had struck up an acquaintance with Crick about eight years earlier when Woese was an obscure young biologist at the General Electric Research Laboratory in Schenectady, New York, and Crick was already world renowned for the DNA structure discovery. It had begun as a tenuous exchange of courtesies, through the mail—Woese requesting, and receiving, a reprint of one of Crick’s papers on coding—but by 1969, they were friendly enough that he could be more personal and ask a larger favor. “Dear Francis,” he wrote, “I’m about to make what for me is an important and nearly irreversible decision,” adding that he would be grateful for Crick’s thoughts and his moral support.
What he hoped to do, Woese confided, was to “unravel the course of events” leading to the origin of the simplest cells—the cells that microbiologists called prokaryotes, by which they meant bacteria. Eukaryotes constituted the other big category, the other domain, and all forms of cellular life (that is, not including viruses) were classified as one or the other. Prokaryotes (pro being the Greek for “before,” karyon the Greek for “nut” or “kernel”) are cells without nuclei. Eukaryotes (eu for “true”) are the more complicated creatures, including multicellular animals, and plants, and fungi, plus certain single-celled but complex organisms such as amoebae, whose cells contain nuclei (hence the name, meaning “true kernel”). Prokaryotes (“before kernel”) seem to have existed on Earth before eukaryotes. Although bacteria are still around and still vastly successful, dominating many parts of the planet, they were thought in 1969 to be the closest living approximations of early life-forms. Investigating their origins, Woese told Crick, would require extending the current understanding of evolution “backward in time by a billion years or so,” to that point when cellular life was just taking shape from … something else, something unknown and precellular.
Oh, just a billion years further back? Woese was always an ambitious thinker. “There is a possibility, though not a certainty,” Woese told Crick, “that this can be done using the cell’s ‘internal fossil record.’” What he meant by that term was merely the evidence of long molecules, the linear sequences of units in DNA, RNA, and proteins. Comparing such sequences—variations on the same molecule, as seen in different creatures—would allow him to deduce the “ancient ancestor sequences” from which those molecules had diverged, in one lineage and another. And from such deductions, such ancestral forms, Woese hoped to glean some understanding of how creatures had evolved in the very deep past. He was talking about molecular phylogenetics, still without using that phrase, and he hoped by this technique to look back at least three billion years.
But which molecules would be the most telling? Which would represent the best internal fossil record? Frederick Sanger, a humble but visionary biochemist in England, had sequenced the amino acids of bovine insulin, and insulins are a fairly old family of molecules in animals and other eukaryotes, but they don’t go back nearly as far as Woese wanted. Other scientists had sequenced a protein called cytochrome c, also crucial in cell biochemistry among many creatures. But those didn’t satisfy Woese. He wanted something more basic, more universal—something that went all the way back, or nearly all the way, to the beginnings of life.
“The obvious choice of molecules here lies in the components of the translation apparatus,” he told Crick. “What more ancient lineages are there?” By “translation apparatus,” Woese meant the decoding mechanism, the system that turns DNA information into proteins—the same system that Crick had groped toward understanding in his 1958 paper “On Protein Synthesis.” Investigating the translation apparatus would in turn bring Woese around toward his starting point: his desire to learn how the genetic code itself might have evolved. Now, eleven years after Crick’s protein paper, the system was much better understood.
The components Woese had in mind were pieces of a tiny molecular mechanism common to all forms of cellular life. It’s called the ribosome. Nearly every cell contains ribosomes in abundance, like flakes of pepper in a stew, and they stay busy with the task of translating genetic information into proteins. Hemoglobin, for instance, that crucial oxygen-transporting protein. Architectural instructions for building hemoglobin molecules are encoded in the DNA, but where is hemoglobin actually produced? In the ribosomes. They are the core elements of what Woese called the translation apparatus.
Crick hadn’t used that phrase, “translation apparatus,” in his paper. He hadn’t even used the word ribosomes, but he touched upon them vaguely under their previous name, microsomal particles. These particles had only recently been discovered (in 1956, by a Romanian cell biologist using an electron microscope) and at first no one knew what they did. Then they became recognized as the sites where proteins are built, but a big question remained: how? Some researchers suspected that ribosomes might actually contain the recipes for proteins, extruding them as an almost autonomous process. That notion collapsed in 1960, almost with a single flash of insight, when Crick’s brilliant colleague Sydney Brenner, during a lively meeting at Cambridge University, hit upon a better idea. Matt Ridley has described the moment in his biography of Crick:
Then suddenly Brenner let out a “yelp.” He began talking fast. Crick began talking back just as fast. Everybody else in the room watched in amazement. Brenner had seen the answer, and Crick had seen him see it. The ribosome did not contain the recipe for the protein; it was a tape reader. It could make any protein so long as it was fed the right tape of “messenger” RNA.
This was back in the days before digital recording, remember, when sound was recorded on magnetic tape. The “tape” in Brenner’s metaphor was a strand of RNA—that particular sort called messenger RNA (one of several forms of RNA that perform various functions) because it carries messages from the cell’s DNA genome to the ribosomes. A ribosome consists of two subunits, one large, one small, fitted together and performing complementary functions. The small subunit reads the RNA message. The large subunit uses that information to join the appropriate amino acids into a chain, constituting the protein. The ribosomes and the messenger RNA, plus a few other pieces, constitute what Woese called the translation apparatus. By 1969, when Woese wrote to Crick, their crucial roles were appreciated.
Every living cell, including bacteria, including the cells of our own bodies, including those of plants and of fungi and of every other cellular organism, contains many ribosomes. They function as assembly mechanisms, taking in genetic information, plus raw material in the form of amino acids, and producing those larger physical products: proteins. In plainer words: ribosomes turn genes into living bodies. Because the proteins they produce become three-dimensional molecules, a better metaphor than Brenner’s tape-reader, for our own day, might be this: the ribosome is a 3-D printer.
Ribosomes are among the smallest of identifiable structures within a cell, but what they lack in size they make up for in abundance and consequence. A single mammalian cell might contain as many as ten million ribosomes; a single cell of the bacterium Escherichia coli, better known as E. coli, might get by with just tens of thousands. Each ribosome might crank out protein at the rate of two hundred amino acids per minute, altogether producing a sizzle of constructive activity within the cell. And this activity, because it’s so basic to life itself, life in all forms, has presumably been going on for almost four billion years. Few people, in 1969, saw the implications of that ancient, universal role of ribosomes more keenly than Carl Woese. What he saw was that these little flecks—or some molecule within them—might contain evidence about how life worked, and how it diversified, at the very beginning.
Another of Woese’s penetrating insights, back at this early moment, was to focus on a particular portion of ribosomes: their structural RNA. Usually we think of RNA in the role I mentioned above—as an information-bearing molecule, single stranded rather than double helical like DNA, carrying the coded genetic instructions to the ribosomes for application. Transient in space (through the cell) and transient in time (used and discarded). But that’s only one kind of RNA, messenger RNA, performing one function. There’s more. RNA can serve as a building block as well as a message. Ribosomes, for instance, are composed of structural RNA molecules and proteins, just as an espresso machine might be made of both steel and plastic. “I feel,” Woese confided to Crick in the letter, “that the RNA components of the machine hold more promise than (most of the) protein components.” Those RNA components held more promise for plumbing deep history, he reckoned, because they were so old and, probably, so little changed over time.
Woese saw the secret truth that RNA—not just a molecule, but a family of versatile, complex, underappreciated molecules—is really more interesting and dynamic than its famed counterpart, DNA. And this is where that family enters the story and begins taking its position near the center. Woese had decided he would use ribosomal RNA as the ultimate molecular fossil record.
“What I propose to do is not elegant science by my definition,” he confided to Crick. Scientific elegance lay in generating the minimum of data needed to answer a question. His approach would be more of a slog. He would need a large laboratory set up for reading at least portions of the ribosomal RNA. That itself was a stretch at the time. (The sequencing of very long molecules—DNA, RNA, or proteins—is so easily done nowadays, so elegantly automated, that we can scarcely appreciate the challenge Woese faced. Work that would eventually take him and his lab members arduous months, during the early 1970s, can now be done by a smart undergraduate, using expensive machines, in an afternoon.) Back in 1969, Woese couldn’t hope to sequence the entirety of a long molecule, let alone a whole genome. He could expect only glimpses—short excerpts, read from fragments of ribosomal RNA molecules—and even that much could be achieved only laboriously, clumsily, at great cost in time and effort. He planned to sequence what he could from one creature and another and then make comparisons, working backward to an inferred view of life in its earliest forms and dynamics. Ribosomal RNA would be his rabbit hole to the beginning of evolution.
[Figure: Ribosome structure and function: converting messenger RNA to protein.]
Gearing up the laboratory would be step one. Given his low level of administrative skill, he admitted to Crick, that much would be difficult. But besides lab equipment and money and administration, Woese perceived one other necessity. “Here is where I’d be particularly grateful for your advice and help,” he told Crick. He hoped to enlist “some energetic young product of Fred Sanger’s lab, whose scientific capacities complement mine.” By that, he meant: for this great sequencing effort, Woese would need a helper who knew how to sequence.
13
Fred Sanger’s pioneering work was the standard at that time for efforts at sequencing RNA. Building on ideas from earlier researchers, Sanger had developed techniques for cutting a long molecule into short pieces, then separating those pieces by electrophoresis, pulling them apart within a column of gel. The gel column served as a racetrack for fragments of different sizes. With an electrical force applied, each fragment would be attracted toward one end and would migrate through the gel at its own speed, dependent on its molecular size and its electrical charge. As their differing speeds spread them apart, the fragments would show as characteristic oval spots in a two-dimensional pattern, as captured on film. Each oval could then be read as a short squib of code, using other means of cutting and pulling. This was an advance on the same general method that Pauling had recommended to Zuckerkandl for distinguishing variant forms of a molecule by “fingerprinting.”
Fred Sanger had two things, but perhaps not much else, in common with Linus Pauling: chemistry and a pair of Nobel Prizes. Unlike Pauling, he was a quiet, unassuming man, from a Quaker upbringing in the English Midlands, who won both his Nobels in the branch of science he and Pauling shared—he was the only person awarded twice for chemistry. He received the first prize in 1958, at age forty, for work on the molecular structure of a protein—specifically, bovine insulin. To solve that structure, Sanger adapted some relatively primitive methods from other researchers, in an ingenious way, allowing him to determine which sequences of amino acids compose the two long branches of the insulin molecule. This was a Nobel-worthy achievement for what it said not just about blood-sugar regulation in cows but also about proteins in general: that they’re not amorphous things but have, each protein, a determined chemical composition. From proteins, Sanger turned to sequencing RNA, then DNA, and won his second Nobel in 1980 for the culminating phase of his DNA work. Soon after, at age sixty-five, he retired from science and turned his energies to gardening. He had a nice little home in a village near Cambridge.
“My work had sort of come to a climax,” he said later, and he didn’t care to morph into an administrator. He declined a knighthood, having no desire to be addressed as “Sir Fred” by friends and strangers. “A knighthood makes you different, doesn’t it,” he said, “and I don’t want to be different.” But that Cincinnatus retirement lay long in the future when Carl Woese, in his 1969 letter to Crick, daydreamed of getting a Sanger protégé to help him.