Anything less than a cell, then, has at best a questionable claim to be alive; from cells, you can make every organism on Earth. We have known about the fundamental status of the cell for about two centuries but have not always acknowledged it. For much of the late twentieth century, the cell was relegated before the supremacy of the gene: the biological “unit of information” inherited between generations. Now the tide has turned again. “The cell is making a particular kind of reappearance as a central actor in today’s biomedical, biological, and biotechnological settings,” writes sociologist of biology Hannah Landecker. “At the beginning of the 21st century, the cell has emerged as a central unit of biological thought and practice … the cell has deposed the gene as the candidate for the role of life itself.”
Cells do more than persist. Crucially, they can replicate: produce copies of themselves. Ultimately, cell replication and proliferation drive evolution. Life is not what makes this propagation of cells possible; rather, that is what life is.
Biologists towards the end of the nineteenth century recognized that reproduction of cells happens not by the spontaneous formation of new cells, as Schwann believed, but by cell division as Virchow asserted: one cell dividing in two. Single-celled organisms such as bacteria simply replicate their chromosomes and then bud in two, a process called binary fission. But in eukaryotic cells the process is considerably more complex. Cell “fission” was first seen in the 1830s and was called mitosis in 1882 by the German anatomist Walther Flemming, who studied the process in detail in amphibian cells.
Flemming was a champion of the filamentary model of cells – the idea that their contents are organized mainly as long fibrous structures. In the 1870s, he showed that as animal cells divide, the dense blob of the nucleus dissolves into a tangle of thread-like structures (mitosis stems from the Greek word for thread). The threads then condense into X-shaped structures that are arranged on a set of star-like protein filaments dubbed an aster. (The word means “star”, but actually the appearance is more reminiscent of an aster flower.) Flemming saw that the aster gets elongated and then rearranged into two asters, on which the chromosomes break in half. As the cell body itself splits in two, these chromosomal fragments are separated into the two “daughter” cells and enclosed once again within nuclei.
Various stages of cell division or mitosis as recorded by Walther Flemming in his 1882 book Zellsubstanz, Kern und Zelltheilung (Cell Substance, Nucleus and Cell Division).
So cell division is preceded by a reorganization of its contents: apparently, they are apportioned rather carefully into two. The thread-like material seen by Flemming unravelling from the nucleus readily takes up a staining dye (so that it is more easily seen under the microscope), leading it to be called, after the Greek word for colour, chromatin. The individual threads themselves were christened chromosomes – “coloured bodies” – in 1888.
In that same year, the German biologist Theodor Boveri discovered that the movement of chromosomes during cell division is controlled by a structure he called the centrosome, from which the strands of asters radiate. The two asters that appear just before a cell splits in two, each with a centrosome at their core, could in fact be seen to be connected by a bulging bridge of fine filaments, called the mitotic spindle. Flemming became convinced that these spindle fibres act as a kind of scaffold to direct the segregation of the chromosome threads into two groups. He was right, but he lacked a sufficiently sharply resolved microscopic technique to prove it.
So the division of animal cells isn’t just like the splitting of a water droplet into two. It has to be accompanied by a great deal of internal reorganization. Flemming and others identified a series of distinct stages along the way. While cells are going about their business with no sign of dividing, they are said to be in the interphase state. The unpacking of the nucleus into filamentary chromosomes is called prophase, and the formation and elongation of the aster is called metaphase. As the aster-like cluster splits in two, the cell enters the anaphase, from where it is downhill all the way to fission and the re-compaction of the nucleus.
This procedure is called the cell cycle, which is an interesting phrase when you think about it. Its implication is that, rather than thinking of biology as being composed of cells that do their thing until they eventually divide, we might regard it as a process of continual replication and proliferation that involves cells. With all due warning about the artificiality of narratives in biology, we might thus reframe the Great Chain of Being as instead a Great Chain of Becoming.
* * *
It was a fundamental – perhaps the fundamental – turning point for modern biology when, around the turn of the century, scientists came to appreciate that much of the complicated reorganization that goes on when cells divide is in order to pass on the genes, the basic units of inheritance, that are written into the strands called chromosomes. What they were seeing in their microscopes was the underlying principle that enables inheritance and evolution.
The notion of the gene as a physical entity that confers inheritance of traits appeared in parallel with the development of cell theory in the mid-nineteenth century. The story of how “particulate factors” governing inheritance were posited by the Moravian monk Gregor Mendel from his studies on the cultivation of pea plants has been so often told that we needn’t dwell on it. In the 1850s and ’60s Mendel observed that inheritance seemed to be an all-or-nothing affair: peas made by interbreeding plants that make smooth or wrinkly versions are either one type or the other, not a blend (“a bit wrinkly”) of the two. Of course, real inheritance in humans is more complicated: some traits (like hair or eye colour) may be inherited discretely, like Mendel’s peas, while others (like height or skin pigmentation) may be intermediate between those of the biological parents. The puzzle Mendel’s observations raised was why inheritance is not always such a mix, given that it comes from a merging of the parental gametes.
Charles Darwin didn’t know of Mendel’s work, but he invoked a similar idea of particulate inheritance in his theory of evolution by natural selection. Darwin believed that the body’s cells produced particles that he called gemmules, which influence an organism’s development and are passed on to offspring. In this view, all the cells and tissues of the body play a role in inheritance, whence the term “pangenesis” that Darwin coined for his speculative mechanism of evolution. These gemmules may be modified at random by influences from the environment, and the variations are acquired by progeny. In the 1890s, the Dutch botanist Hugo de Vries and German biologist August Weismann independently modified Darwin’s theory by proposing that transmission of gemmules could not occur between body (somatic) cells and the so-called “germ cells” that produce gametes. Only the latter could contribute to inheritance. De Vries used the term “pangene” instead of gemmule to distinguish his theory from Darwin’s.
At the start of the twentieth century, the Danish botanist Wilhelm Johannsen shortened the word for these particulate units of inheritance to “gene”. He also drew the central distinction between an organism’s genotype – the genes it inherits from the biological parents – and its phenotype, the expression of those genes in appearance and behaviour.
In 1902 Theodor Boveri, working on sea urchins in Germany, and independently the American zoologist Walter Sutton, who was studying grasshoppers, noticed that the faithful passing on of chromosomes across generations of cells mirrored the way that genes were inherited. Perhaps, they concluded, chromosomes are in fact the carriers of the genes. Around 1915, the American biologist Thomas Hunt Morgan established, from painstaking studies of the inheritance of characteristics in fruit flies, that this is so. Moreover, Morgan showed how one could deduce the approximate positions of two different genes relative to one another on the chromosomes by observing how often the two genes – or rather, the manifestation of the corresponding phenotypes – appear together in fruit flies made by mating of individuals with the respective genes. As the chromosomes were divvied up to form egg and sperm cells, genes that sat close together were more likely to remain together in the offspring. Morgan’s work established the idea of a genetic map: literally a picture of where genes sit on the various chromosomes.
The sum total of an organism’s genetic material is called its genome, a word introduced in 1920. For many years after Morgan’s work, it was suspected that genes are composed of the molecules called proteins, in which the much smaller molecules called amino acids are linked together in chains. Proteins, after all, seemed to be responsible for most of what goes on in cells – they are the stuff of enzymes. And chromosomes were indeed found to consist partly of protein. But those threads of heredity were also known to contain a molecule called DNA, belonging to the class known as nucleic acids (that’s what the “NA” stands for).
No one knew what this stuff did until the mid-1940s, when the Canadian-American physician Oswald Avery and his co-workers at the Rockefeller University Hospital in New York reported rather conclusive evidence that genes in fact reside on DNA. That idea was not universally accepted, however, until James Watson, Francis Crick, Maurice Wilkins, Rosalind Franklin and their co-workers revealed the molecular structure of DNA – how its atoms are arranged along the chain-like molecule. This structure, first reported in 1953 by Watson and Crick, who relied partly on Franklin’s studies of DNA crystals, showed how genetic information could be encoded in the DNA molecule. It is a deeply elegant structure, composed of two chain-strands entwined in a double helix.
The double helix of DNA. This iconic image creates a somewhat misleading picture, since for most of the time DNA in a cell’s chromosomes is packaged up quite densely in chromatin, in which it is wrapped around proteins called histones like thread on a bobbin. The “rungs” of the double-helical ladder consist of pairs of so-called nucleotide bases (denoted A, T, C and G) with shapes that complement each other and fit together well.
So beautiful, indeed, was this molecular architecture and the story it seemed to disclose that modern biology was largely seduced by it. It was immediately obvious to Watson and Crick how heredity could be enacted on the molecular scale. The information in genes could be replicated by unzipping the double helix so that each strand could act as the template on which replicas could be assembled.
Here, then, was how genetic information could be copied into new chromosomes when cells divide: a molecular-scale mechanism for the inheritance described by Mendel and Darwin, which Morgan and others had situated on the chromosomes. DNA married genetics with inheritance at the molecular level, bringing coherence to biology.
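The templating idea can be made concrete with a toy sketch. What follows is only an illustration under stated assumptions: the pairing rule (A with T, C with G) is the standard one, but the function names are hypothetical, each strand is written as a plain string aligned base by base with its partner, and the sketch ignores the opposite chemical direction of the two strands as well as all the enzymes that do the real work.

```python
# Standard base-pairing rule: A pairs with T, C pairs with G.
PAIRING = {"A": "T", "T": "A", "C": "G", "G": "C"}


def template_copy(strand: str) -> str:
    """Build the partner strand, base by base, using `strand` as the template.

    Simplification: strands are aligned position by position, so the
    antiparallel orientation of real DNA strands is ignored.
    """
    return "".join(PAIRING[base] for base in strand)


def replicate(helix: tuple[str, str]) -> tuple[tuple[str, str], tuple[str, str]]:
    """'Unzip' a double helix and rebuild a partner on each old strand.

    Each daughter helix keeps one original strand and gains one new strand.
    """
    first, second = helix
    daughter_one = (first, template_copy(first))
    daughter_two = (template_copy(second), second)
    return daughter_one, daughter_two


# Hypothetical four-base helix, purely for illustration.
parent = ("ATGC", template_copy("ATGC"))
daughter_one, daughter_two = replicate(parent)
```

Because each base admits only one partner, both daughter helices come out identical to the parent, which is the sense in which either strand alone carries the whole message.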
And Darwinian evolution? If genes govern an organism’s traits, then random copying errors in DNA replication could alter a trait, mostly to the detriment of an organism but occasionally to its advantage. This is the variation on which natural selection acts to make organisms adapted to their environment.
It all seemed to fall into place. All the important questions – about evolution, genetic disease, development – might now be answered by referring to the information in the genome. Cells didn’t seem to be a very important part of the story except as vehicles for genes and as machines for enacting their commands.
To speak of information being “encoded” in DNA is to speak literally. Genes deploy a code: the genetic code. But what exactly do genes encode? For the most part, it is the chemical structure of a protein molecule, typically an enzyme. Because of the ways in which different amino acids “feel” one another and interact with the watery solvent all around them in the cell, a particular sequence of amino acids determines the way most protein chains fold up into a compact three-dimensional shape. This shape enables enzymes to carry out particular chemical transformations in the cell: they are catalysts that facilitate the cell’s chemistry. So the protein’s sequence, encoded in the respective gene, dictates its function.
A protein’s amino-acid sequence is represented in its gene by the sequence of chemical constituents that make up DNA. There are four of these, called nucleotide bases and denoted by the labels A, T, G and C. Different triplets of bases represent particular amino acids in the resultant protein: AAA, for example, corresponds to the amino acid called lysine.
Turning a gene into its corresponding protein is a two-step process. First, the gene on a piece of DNA in a chromosome is used as a template for building a molecule of another kind of nucleic acid, called RNA. This is called transcription. The piece of RNA made from a gene is then used as a template for putting the protein together, one amino acid at a time. This is called translation, and it is performed by a complex piece of molecular machinery called the ribosome, made of proteins and other pieces of RNA.
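The two steps can be sketched in a few lines of code, offered as a cartoon and not as a description of what the cell's machinery actually does. The assumptions are these: the function names and the example sequence are hypothetical; only five entries of the standard genetic code are included; and transcription is reduced to rewriting the gene's coding strand with U in place of T (RNA uses the base U where DNA uses T), which skips the real templating step.

```python
# A small subset of the standard genetic code, written as RNA triplets.
# None marks a "stop" triplet, which ends the protein chain.
CODON_TABLE = {
    "AUG": "Met",  # methionine, also the usual start signal
    "AAA": "Lys",  # lysine, the example given in the text
    "GGC": "Gly",  # glycine
    "UUU": "Phe",  # phenylalanine
    "UAA": None,   # stop
}


def transcribe(coding_dna: str) -> str:
    """Step one, simplified: write the gene's coding strand as RNA."""
    return coding_dna.upper().replace("T", "U")


def translate(rna: str) -> list[str]:
    """Step two: read the RNA three bases at a time, adding one amino acid
    per triplet until a stop triplet is reached.

    Triplets missing from the small table above raise a KeyError.
    """
    protein = []
    for start in range(0, len(rna) - 2, 3):
        amino_acid = CODON_TABLE[rna[start:start + 3]]
        if amino_acid is None:
            break
        protein.append(amino_acid)
    return protein


# Hypothetical mini-gene, invented for illustration.
rna = transcribe("ATGAAAGGCTTTTAA")
protein = translate(rna)
```

The sketch gets from DNA sequence to amino-acid sequence and stops there; it says nothing about how the chain folds or what the finished protein does.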
Chromosomes consist of lengths of DNA double-helix wound around disk-like protein molecules called histones, like the string on a yoyo. This combination of DNA and its protein packaging is what we call chromatin. The genomes of eukaryotes are divided up into a number of chromosomes that is always the same for every cell of a particular species (if they are not abnormal) but can differ between species. Human cells have 46 chromosomes, arranged in 23 pairs.
* * *
It’s common to see genes called the instructions to make an organism. In this view, the entire genome is then the “instruction booklet”, or even the “blueprint”. This is an understandable metaphor, but misleading. Genes are fundamental to the way an organism turns out: the genome of a frog egg guides it to become a frog, not an elephant, and vice versa. But the way genes influence and to some degree dictate that proliferation of cells is subtle, complex, and resistant to any convenient metaphors from the technological world of design and construction. By leaping from genome to finished organism without taking into account the process of development from cells, we risk simplifying biology in ways that can create some deep misconceptions about how life proceeds and evolves.
To the extent that a gene is an “instruction”, it is an instruction to build a protein molecule. It is far from obvious what, in general, this has to do with the growth and form of an organism: with the generation of our flesh. We know of no way to map an organism’s complement of proteins onto its shape, traits and behaviour: its phenotype. The two are worlds apart: it’s rather like trying to understand the meaning of a Dickens novel from a close consideration of the shapes of its letters and the correlations in their order of appearance.
Besides, this conventional “blueprint” description of what genomes do is too simplistic even if we consider only how they dictate that roster of proteins. Here are some reasons why:
Only about 1.5 per cent of the human genome encodes proteins, and a further 8 to 15 per cent or so is thought to “regulate” the activity of other genes by encoding RNA that turns their transcription up or down. We don’t know what the rest does, and scientists aren’t agreed on whether it is just useless “junk” accumulated, like rubbish in the attic, over the course of evolution, or whether it has some unknown but important biological function. In all probability, it is a bit of both. But at any rate, a lot of this DNA with no known protein-coding or regulatory function is nonetheless transcribed by cells to RNA, and no one is sure why.
Most protein-coding human genes each encode more than one protein. Genes are not generally simply a linear encoding of protein sequences that start at one end of the protein chain and finish at the other; they are, for example, interspersed with sequences called introns that are carefully snipped out of the transcribed RNA before it is translated. Sometimes the transcribed RNA then gets reshuffled before translation, providing templates for several different proteins.
Proteins are not just folded chains of amino acids. Sometimes those folded chains are “stapled” in place by chemical bonds, or clipped together by other chemical entities such as electrically charged ions. Most proteins have other chemical groups added to them (by other enzymes) – for example, a group containing an iron atom is needed by the protein haemoglobin to bind oxygen and carry it around the body in the blood. None of these details, essential to the protein’s structure and function, is encoded in DNA. You would not be able to deduce them from a gene sequence.
We only know what around 50 per cent of gene-encoded proteins do, or even what they look like. The rest are sometimes called “dark” proteins: we assume they have a role but we don’t know what it is.
Plenty of proteins do not seem to have well-defined folded states but appear loose and floppy. Understanding how such ill-defined “intrinsically disordered proteins” can have specific biological roles is a very active area of current research. Some researchers think that the floppiness may not reflect the state of these proteins in cells themselves – but we don’t really know if that is so or not.
Ah, details, details! How much should we care? Do they really alter the picture of genes dictating the organism?
That depends, to some degree, on what questions you are asking. A genome sequence – the ordered list of nucleotides A, T, C and G along the DNA strands of chromosomes – does specify the nature of the organism in question. From this sequence you can tell in principle if the cell that contains it is from a human, a dog or a mouse (something that may not be obvious from a cursory look at the cell as a whole). These distinctions are found only in some key genes: the human genome differs from that of chimpanzees in just 1 per cent of the sequence, and a third of it is essentially the same as the genome of a mushroom.
The differences between the genomes of individual people are even tinier.
But whereas you can look at a real blueprint, and probably an instruction manual, and figure out what kind of object will emerge from the plan, you can’t do that for a genome. Indeed, you can only deduce that a genome will “produce” a dog at all if you have already decoded the generic dog genome for comparison, laying the two side by side. It’s simply a case of seeing if the two genomes superimpose; there’s nothing intrinsic in the sequence that hints at its “dogness”.
This isn’t because we don’t yet know enough about the “instructions” in a genome (although that is the case too). It’s because there is no direct relationship between the informational content of a gene – which, as I say, typically dictates the structure of a class of protein molecules, or at least of the basic amino-acid fabric of those proteins – and a trait or structure apparent in the organism. Most proteins do jobs that can’t easily be related to any particular trait. Some can be: for example, there’s a protein that helps chloride ions get through the membranes of our cells, and if this protein is faulty – because of a mutation in the corresponding gene – then the lack of chloride transport into cells causes the disease cystic fibrosis. But in general, proteins carry out “low-level” biochemical functions that might be involved in a whole host of traits, and which might have very different outcomes if the protein is produced (“expressed”) at different stages in the development or life cycle of the organism. As microbiologist Franklin Harold has said, “the higher levels of order, form and function are not spelled out in the genome.”
Might we, then, call a genome not a blueprint but a recipe? The metaphor has rather more appeal, not least because many recipes assume implicit knowledge (especially in older cookbooks). But a recipe is still a list of ingredients plus instructions to assemble them. Genomes do not come with users’ instructions, more’s the pity. Harold offers a different image, allusive and poetic and all the more appealing for that:
I prefer to think of the genome as akin to Hermann Hesse’s Magister Ludi [aka The Glass Bead Game]: master of an intricate game of cues and responses, in which he is fully enmeshed and absorbed; a game that is shaped as much by its own internal rules as by the will of that masterful player.
If there were better public communication of the complex, contingent and often opaque relationship of genotype to phenotype, there might be rather less anxiety about the idea that genes affect behaviour. Small variations in each individual’s genetic make-up can have an influence – sometimes a rather strong one – not just on what you look like but on what your behaviour and personality are like. This much is absolutely clear: there is not a single known aspect of human behaviour so far investigated that does not turn out to show some correlation with what gene variants we have. Even habits or experiences as apparently contingent and environmental as the amount we watch television or our chance of getting divorced are partly heritable, meaning that the differences between individuals can be partly traced to differences in their genes.
Far from alarming us, this shouldn’t surprise us. We have always been content to believe that, for example, some people seem blessed with talents that can’t obviously be explained by their environment and upbringing alone. By the same token, some seem hardwired to find particular tasks challenging, such as reading or spatial coordination.
Yet perhaps because we have a strong sense of personal agency, autonomy and free will, many people are disturbed by the idea that there are molecules in our cells that are pulling our strings. They needn’t worry. It is precisely because genetic propensities are filtered, interpreted and modified by the process of growing a human cell by cell that they don’t fully determine how our bodies turn out, let alone how our brains get wired … let alone how we actually behave.
Genes supply the raw material for developing our basic cognitive capabilities – to put it crudely, they are a key part of what allows most human embryos to grow into bodies that can see, hear, taste, that have minds and inclinations. But how they exert their effects is very, very complicated. In particular, very few genes affect one trait alone. Most genes have influences on many traits. Some traits, both behavioural and medical (such as susceptibility to heart disease), seem to be influenced – in ways that are imperceptible gene by gene, but detectable when their effects are added up – by most of the genome. That’s why the popular notion of a “gene for” some behavioural trait is misguided. In fact, it means that there may be no meaningful “causal” narrative that can take us from particular genes to behaviours.
* * *
This is precisely why we need to resist seductively simple metaphors in genetics: blueprints, selfish genes, “genes for”. Of course, science always needs to reduce complex ideas and processes to simpler narratives if it is going to communicate to a broader audience. But I’ve yet to see a metaphor in genomics that does not risk distorting or misrepresenting the truth, so far as we currently know it. Fortunately, I do not think this matters for talking about the roles of genes in making a human. We will deal with those roles as they arise, without resorting to any overarching story about what genes “do”.
I haven’t even told you yet the worst of it, though. It’s not simply difficult to articulate clearly what, in the scheme of growing humans the natural way, genes do. For we don’t exactly know how to define a gene at all.