For many genes, the two alleles are always identical in all individuals within a population, but for others they are not. For example, alleles at genes that control the development of our two arms and two legs are always the same within each of us, while alleles at genes that determine our hair colour can differ. It is genes that have more than one type of allele that make each of us different. The more closely related you are to someone, the more similar your genomes.
The further apart two species are on the tree of life, the more different their genomes are. Their genomes have been diverging ever since they shared a common ancestor. However, not all parts of DNA diverge at the same rate as we move backwards along our line of descent towards LUCA. Some genes have changed very little with time. For example, genes associated with translating the genetic code into proteins are said to be highly conserved, for they have hardly changed in the last 4 billion years. Other genes can differ substantially between even closely related species. Genes that help build heterochromatin, the tightly packed form of DNA involved in switching other genes off, are one example: they differ even between very closely related species of fruit fly.
New alleles, and new genes that produce new proteins, arise via mutations to the genetic code. Mutations occur because errors are made when strands of DNA are copied as cells divide. A mutation is a copying error, and such errors are how the SARS-CoV-2 virus gave rise to new strains.
Mutations appear to occur at random, with many impacting the way an organism develops. These impacts can be mild, resulting in outcomes such as my impaired vision due to a failure of my retina to develop properly. Other mutations can be much more debilitating, resulting in blindness, deafness, premature ageing, early death, chronic pain, Down’s syndrome and some types of personality disorder. Deleterious mutations consequently prevent development proceeding as it should. However, a small proportion of mutations are advantageous. They result in phenotypic traits being expressed that allow individuals to better avoid death or reproductive failure. It is these advantageous mutations that have allowed life to thrive, to increase in complexity on some branches of the tree of life, and to conquer the world. Without these advantageous mutations, we could never have evolved from LUCA. However, the cost of life’s success is the deleterious mutations. The cost of our existence is the countless premature deaths, the suffering and the failure to thrive of endless organisms that had flawed self-assembly manuals, which meant they left no descendants. Mutation and selection drive evolution, but because mutations can happen anywhere in the genome, some individuals lose life’s developmental lottery while a few others win.
The simplest type of mutation occurs when a cytosine, thymine, adenine or guanine is not copied correctly. Such a mutation in one of my ancestors is responsible for my poor eyesight (more on that later). In a point mutation such as this, a cytosine nucleobase might be accidentally copied as adenine at some point on chromosome 14. The new allele that is produced might change a single amino acid in the chain the gene codes for, and this could mean the protein it produces takes a different shape. The protein’s changed structure might slightly alter the way a phenotypic trait develops, for example by slowing down a chemical reaction relative to the non-mutated allele. If the new phenotypic trait gives individuals that carry it a survival or reproductive advantage, the newly mutated allele should start to spread within the population. Most mutations, however, are deleterious and are quite quickly lost from the population.
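The step from a miscopied nucleobase to a changed amino acid can be made concrete with a few lines of Python. This is only a minimal sketch: the nine-base sequence is invented for illustration and is not a real stretch of chromosome 14, and the function names `translate` and `point_mutate` are hypothetical. The codon table is the standard genetic code, in which each triplet of nucleobases specifies one amino acid, written here with one-letter codes.

```python
from itertools import product

# Standard genetic code, written compactly: codons are enumerated in
# T, C, A, G order and paired with one-letter amino acid codes ("*" = stop).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate a DNA coding sequence into one-letter amino acid codes."""
    usable = len(dna) - len(dna) % 3  # ignore any incomplete trailing codon
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def point_mutate(dna, position, new_base):
    """Return a copy of dna with the base at position swapped for new_base."""
    return dna[:position] + new_base + dna[position + 1:]


# Hypothetical nine-base stretch of a gene (an assumption, not real data).
original = "ATGCCTGAA"
changed = point_mutate(original, 3, "A")  # cytosine miscopied as adenine
silent = point_mutate(original, 5, "A")   # thymine miscopied as adenine

print(translate(original))  # MPE
print(translate(changed))   # MTE: one amino acid differs
print(translate(silent))    # MPE: the protein is unchanged
```

The first miscopy swaps a proline (P) for a threonine (T), which is the kind of single amino acid change described above. The second leaves the protein untouched, because several codons can specify the same amino acid, which is why a point mutation only might change the chain.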
Single-point mutations, where one nucleobase is switched to another, are just one type of mutation. Other types involve more substantial changes. Perhaps the most intriguing are mutations where ‘junk DNA’ that didn’t produce a protein starts to do so. For a sequence of DNA to start producing a chain of amino acids, the molecules in your cells that read the DNA sequence and translate the code into a protein need to encounter a start codon that initiates the production of a string of amino acids. There are a number of nucleobase triplets that initiate the production of an amino acid chain, but ATG is a common start codon. There are also codons that tell the machinery to stop. Every now and then a point mutation will produce a new start codon in a stretch of junk DNA that was not used to produce proteins. When that happens, a completely new protein can be produced. In most cases, it will not serve a useful function, but on rare occasions the protein may find a role for itself. Mutations such as these are uncommon, but they can produce completely new phenotypic traits that have never been seen before by life.
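The way a single miscopied nucleobase can switch on a silent stretch of DNA can be sketched in Python. This is a deliberately simplified illustration: the sixteen-base sequence is made up, the function name `first_open_reading_frame` is hypothetical, and the sketch ignores everything else a real cell needs before a protein is made. It looks only for the start codon ATG and then reads triplets until it meets one of the three standard stop codons.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def first_open_reading_frame(dna):
    """Return the codons from the first ATG up to the next in-frame stop.

    Returns None if there is no start codon, or no stop codon after it.
    """
    start = dna.find("ATG")
    if start == -1:
        return None
    codons = []
    for i in range(start, len(dna) - 2, 3):
        codon = dna[i:i + 3]
        if codon in STOP_CODONS:
            return codons
        codons.append(codon)
    return None  # ran off the end without meeting a stop codon


# Hypothetical stretch of junk DNA with no start codon (an assumption).
junk = "GGACGGCTGTTTAAGG"
# A point mutation: the cytosine at position 3 is miscopied as thymine.
mutated = junk[:3] + "T" + junk[4:]

print(first_open_reading_frame(junk))     # None
print(first_open_reading_frame(mutated))  # ['ATG', 'GCT', 'GTT']
```

Before the mutation there is nothing to read. Afterwards, ACG has become ATG, and the three codons that follow it, up to the stop codon TAA, would specify a short chain of amino acids that did not exist before.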
We tend to think of viruses as being nothing but trouble, but they can also occasionally be the source of new, useful genes. Viruses, such as the coronavirus that causes COVID, or the viruses that cause measles or rabies, cannot reproduce on their own. They need to use the machinery of your cells, and they do this by infecting you. Once inside your cells, they hijack the cellular machinery to make copies of their genetic code and the proteins that coat their exterior, before self-assembling as new virus particles. Most cells that are infected by viruses die, but not always. The genetic code of some viruses can end up being permanently spliced into the DNA of an infected cell. If this happens in the stem cells that make sperm and eggs (the germline), then the new gene can be passed on to offspring, and can then be used to make proteins and new phenotypic traits. On most occasions these proteins may not have a useful role, but they sometimes do, and viral genes co-opted by hosts are known to have played a role in the evolution of the placenta in early mammals.
There are even more types of genetic mutation, including events known as deletions, insertions and inversions. Stretches of DNA can get copied from one part of the genome and inserted into another part, while DNA sequences can also get flipped around, appearing in reverse order. Sometimes two chromosomes can join to form a larger new one, while others may split in two. On occasion, the whole genome can be duplicated, doubling the number of genes by making a new copy of each one. Recent research has even revealed genes jumping from parasites into their hosts. DNA is dynamic and subject to change.
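If a stretch of DNA is treated as a string of letters, these larger mutations can be sketched as simple string operations in Python. The function names and the nine-base sequence are hypothetical, chosen only for illustration. The inversion here just reverses the order of the letters, as described above; it leaves out the fact that DNA is double-stranded.

```python
def delete(dna, start, end):
    """Remove the bases from start up to, but not including, end."""
    return dna[:start] + dna[end:]


def insert(dna, position, segment):
    """Insert segment so that it begins at position."""
    return dna[:position] + segment + dna[position:]


def invert(dna, start, end):
    """Flip the bases from start up to end into reverse order."""
    return dna[:start] + dna[start:end][::-1] + dna[end:]


def copy_and_insert(dna, start, end, target):
    """Copy the bases from start up to end and insert the copy at target."""
    return insert(dna, target, dna[start:end])


# Hypothetical nine-base sequence (an assumption, not real data).
genome = "AAACCGTTT"

print(delete(genome, 3, 6))              # AAATTT
print(insert(genome, 3, "GG"))           # AAAGGCCGTTT
print(invert(genome, 3, 6))              # AAAGCCTTT
print(copy_and_insert(genome, 3, 6, 9))  # AAACCGTTTCCG
```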
The history of mutations in our lines of descent means different species have different genes, and different alleles at some of the same genes, and they also have different genome sizes. The Japanese alpine plant Paris japonica has a genome consisting of 149 billion nucleobases. The human genome is about fifty times smaller, consisting of a paltry 3 billion or so nucleobases. Even this is vast compared to the genome of the bacterium Nasuia deltocephalinicola, which lives inside insects. Its genome consists of only 112,000 nucleobases, just over six orders of magnitude fewer than that of P. japonica. Some species appear to require more DNA than others to live their lives, although, perhaps surprisingly, the size of the genome doesn’t necessarily correspond to the number of genes. The plant Arabidopsis thaliana has 27,416 genes packed into a genome of 135 million nucleobases, while Norway spruce trees have about the same number of genes but a genome of 19 billion nucleobases.
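These comparisons are simple ratios, and they can be checked with a short Python calculation using the figures quoted above. The variable and function names are hypothetical. The spruce calculation assumes, as a rough approximation, that the tree has exactly the same number of genes as Arabidopsis.

```python
import math

# Genome sizes in nucleobases, as quoted in the text.
PARIS_JAPONICA = 149e9
HUMAN = 3e9
NASUIA = 112_000
ARABIDOPSIS = 135e6
NORWAY_SPRUCE = 19e9

# Gene count quoted for Arabidopsis; reused for spruce as an assumption.
ARABIDOPSIS_GENES = 27_416


def fold_difference(larger, smaller):
    """How many times bigger one genome is than another."""
    return larger / smaller


def orders_of_magnitude(larger, smaller):
    """The same comparison expressed as a power of ten."""
    return math.log10(larger / smaller)


print(round(fold_difference(PARIS_JAPONICA, HUMAN)))           # 50
print(round(orders_of_magnitude(PARIS_JAPONICA, NASUIA), 1))   # 6.1

# Average number of nucleobases of genome per gene.
print(round(ARABIDOPSIS / ARABIDOPSIS_GENES))    # 4924
print(round(NORWAY_SPRUCE / ARABIDOPSIS_GENES))  # about 693,000
```

On these figures the spruce genome carries roughly 140 times more DNA per gene than Arabidopsis, which is why genome size is a poor guide to the number of genes.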
The way that DNA is organized also differs between species. The atlas blue butterfly is the animal with the largest number of chromosomes, with 450 arranged into 225 chromosome pairs. In contrast, females of the jack jumper ant have only one chromosome pair. Males of the species, as in all ant species, have only one copy of each chromosome rather than two. Biologists describe them as haploid rather than diploid, as their chromosomes do not come in pairs. Remarkably, some species of plant have three or more copies of each chromosome. The adder’s tongue fern can have up to ten copies of each, while even the humble banana is triploid, with its chromosomes coming in three copies. Genome size, the number of chromosomes and the number of copies of each chromosome are phenotypic traits, and so can evolve much like any other.