If there are a lot of CpG motifs in a stretch of DNA, there are lots of sites where the methyl group can be added epigenetically. This attracts proteins that repress expression of that gene. In extreme cases, where there are lots of CpG motifs in close proximity, DNA methylation can have an exceptionally profound effect. Essentially, the DNA changes its shape and the gene is completely switched off. Remarkably, it can be switched off not just in that cell, but in all the daughter cells that are created when it divides. In non-dividing cells, such as the neurons in our brains, these patterns of DNA methylation may be established while we are in the womb. Many of them will still be in place 100 years later, if we make it that far.
The realisation that DNA methylation could switch genes off more or less permanently during the lifetime of an individual caused great excitement. This was because it finally gave scientists a mechanism to probe something that had been puzzling everyone for decades. Essentially, we have known for a long time that not everything can be explained by genetics. We know this because there are a lot of situations where two things are genetically identical and yet clearly differ from each other. When a caterpillar pupates and then turns into a butterfly, it continues to use the same genome. Genetically identical mice, reared under completely standard laboratory conditions, aren’t all the same weight.
You and I, dear readers, are masterpieces of epigenetics. The 50–70 trillion cells in a human body pretty much all contain exactly the same genetic code.[19] Whether they are salt-secreting cells in our sweat glands, the skin cells on our eyelids or the cells that produce the shock-absorbing cartilage in our knees, they all contain exactly the same DNA. They just use the information in those genes in different ways, depending on the tissue. For instance, the neurons in the brain express the receptors for neurotransmitters but switch off the genes for haemoglobin, the pigment that carries oxygen in our red blood cells.
These are all examples of situations we have referred to for decades as epigenetic phenomena. Yes, exactly the same word as for the modifications, and it makes sense. These are all situations where something else is happening in addition to the genetic code.
The discovery of DNA methylation finally gave us a mechanism to understand how epigenetic phenomena happen. In a neuron, the genes responsible for producing haemoglobin become heavily methylated and are switched off. They stay switched off through life. In the cells that give rise to red blood cells, however, these genes are not methylated and haemoglobin is produced. Instead, in these cells it is the genes that code for neurotransmitter receptors that are switched off using this epigenetic mechanism.
DNA methylation is pretty stable. It’s surprisingly difficult to remove this modification. This is a good thing if your cells need to keep certain genes switched off for long periods. But often our cells need to respond to short-term changes in their environment, if we drink alcohol or are stressed out by a job interview, for example. Here they turn to a second system. They add modifications to the histone proteins adjacent to genes. Changing the histone modifications can turn genes off, but because these modifications are relatively easy to remove, the cell has the option of turning the genes back on fairly quickly if it needs to. The histone modifications can also be used to modulate the expression of a gene — turn it on a little, quite a bit, quite a lot, a heck of a lot and so on. At a simplistic level we can think of DNA methylation as the on/off switch and histone modifications as the volume control.
The reason histone modifications can act as the fine-tuning mechanism for gene expression is that there are lots of different ones. If DNA methylation is black-and-white, with perhaps a few shades of grey depending on the level of methylation, histone modifications are glorious technicolour. There are multiple amino acids that can be modified on histone proteins, and there are at least 60 different chemical groups that can be added to the various amino acids. That creates an extraordinary degree of complexity because at different genes, or the same gene in different cell types, there are thousands of possible combinations of histone modifications. These will be interpreted by the cell in different ways, because they will attract different complexes of proteins that control the gene expression levels and patterns. Some combinations will drive up gene expression, others will drive it down.
But for years we were faced with a puzzle. The enzymes that add modifications to histone proteins are blind to DNA sequence. They don’t bind DNA and they can’t distinguish one DNA sequence from another. And yet, in the presence of a relevant stimulus, whatever that might be, the enzymes were very precise in how they modified specific histones. They would add (or remove) modifications at the histones positioned at relevant genes, but ignore nearby histones associated with irrelevant genes.
It’s now starting to look as if one of the roles of long non-coding RNAs is to act as a kind of molecular Blu-Tack, attracting histone-modifying enzymes into the vicinity of selected genes. One of the pieces of evidence that this might be the case came from the work analysing the effects of certain long non-coding RNAs in human ES (embryonic stem) cells that was presented in Chapter 8. The researchers showed that about a third of the long non-coding RNAs they examined bound to complexes of proteins that included histone-modifying enzymes. To examine whether this binding of long non-coding RNAs to the proteins had any functional consequences, they knocked down expression of the histone-modifying enzyme in the complex. In almost half the cases, the effects on the cell and on gene expression were the same as if they had knocked down the long non-coding RNA itself. This suggested that the long non-coding RNA and the histone-modifying enzymes really were working together in the cell.{169}
Many of the investigations of this cross-talk between the long non-coding RNA and epigenetics systems have focused on a specific epigenetic enzyme. This enzyme deposits a specific histone modification that is strongly associated with switching off genes. We can refer to this enzyme as the major repressor.[20] This has been shown to interact with lots of different long non-coding RNAs.
The long non-coding RNA from a gene targets the major repressor to that gene. The major repressor enzyme then creates repressive modifications on the histones, driving down expression of that gene. The repressive modifications attract other proteins, which bind and repress the gene even further.
This control by the major repressor epigenetic enzyme is frequently used to control genes that code for other epigenetic enzymes. Often, these will be genes that have the opposite effect to the major repressor, i.e. they tend to turn genes on. The overall effect is that the major repressor has a strong influence on general patterns of gene expression.{170} It represses genes directly, but also indirectly by preventing expression of epigenetic enzymes that normally switch other genes on. An epigenetic double-punch.
Usually this is a completely normal part of the control of gene expression that happens in our cells, and the system is doing exactly what it’s supposed to, making sure that all the complex cellular pathways run in an integrated fashion. But if one part of the complex interaction between long non-coding RNAs and the epigenetic machinery goes out of kilter, problems may develop.
Unfortunately, this seems to be exactly what is happening in some cancers. The major repressor is over-expressed in certain cancers, such as subsets of prostate{171} and breast{172} cancer, and this over-expression is associated with poor prognosis. In certain types of blood cell cancer, the major repressor has mutated, making it abnormally active.{173} The outcome in each case appears to be that the ‘wrong’ genes are repressed. This creates an imbalance where proteins that drive the cell into proliferation outrun those that usually act as a brake, promoting a cancerous state. Drugs that inhibit the activity of the major repressor are in early clinical trials.{174}
[19] The exceptions are the cells of the immune system that fight off specific infections. Unusually, these cells rearrange some of their genes to create different combinations of antibodies and receptors, able to respond to a vast range of foreign proteins.
[20] The name for this major repressor enzyme is EZH2. It is responsible for adding three methyl groups to an amino acid called lysine at position 27 on histone H3. The technical nomenclature for this modification is H3K27me3 and it is the best-characterised repressive mark in epigenetics outside of DNA methylation.