Scientists at deCODE Genetics, a subsidiary of Amgen, sought to deepen the understanding of how sequence variants act in cis on CpG methylation. In the study, they assigned CpG methylation, gene expression, and alleles of sequence variants to parental chromosomes, which allowed them to investigate correlations among the three sets of measurements at the haplotype level. The group’s findings show that sequence variants drive the correlation between DNA methylation and gene expression.
This work is published in Nature Genetics in the paper, “The correlation between CpG methylation and gene expression is driven by sequence variants.”
One advantage of sequencing DNA using nanopores, a technology commercialized by Oxford Nanopore Technologies (ONT), is that it allows DNA sequence analysis in real time. It also detects chemical modifications of the nucleotides, such as DNA methylation, from the same measurements.
Directly measuring DNA methylation while also obtaining longer reads of DNA sequence allows methylation to be determined separately for the chromosomes inherited from each parent.
More specifically, the group determined haplotype-specific methylation rates of 15.3 million CpG units in 7,179 whole blood genomes. They then identified 189,178 methylation-depleted sequences (MDSs), defined as regions where three or more proximal CpGs were unmethylated on at least one haplotype.
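The MDS definition above (three or more proximal unmethylated CpGs on at least one haplotype) can be illustrated with a minimal sketch. This is not the authors' pipeline: the function name `find_mds`, the methylation-rate cutoff `max_rate`, the proximity window `max_gap`, and the example positions and rates are all assumptions chosen for illustration.

```python
def find_mds(positions, rates, max_rate=0.2, max_gap=500, min_cpgs=3):
    """Return (start, end, n_cpgs) for runs of proximal unmethylated CpGs.

    positions: sorted CpG coordinates on one haplotype
    rates:     methylation rate (0..1) of each CpG on that haplotype
    max_rate, max_gap, and min_cpgs are illustrative thresholds,
    not values taken from the study.
    """
    regions = []
    run = []  # positions of the current run of unmethylated CpGs
    for pos, rate in zip(positions, rates):
        unmethylated = rate <= max_rate
        if unmethylated and run and pos - run[-1] <= max_gap:
            run.append(pos)  # extend the current run
            continue
        # The current run ends here; keep it if it is long enough.
        if len(run) >= min_cpgs:
            regions.append((run[0], run[-1], len(run)))
        run = [pos] if unmethylated else []
    if len(run) >= min_cpgs:
        regions.append((run[0], run[-1], len(run)))
    return regions


# Hypothetical CpG positions and per-haplotype methylation rates.
positions = [100, 150, 210, 900, 950, 1000, 1040]
hap1 = [0.05, 0.10, 0.00, 0.90, 0.85, 0.95, 0.90]
hap2 = [0.90, 0.80, 0.95, 0.90, 0.10, 0.05, 0.15]

# A region counts if it is depleted on at least one haplotype,
# so each haplotype is scanned separately.
mds_by_haplotype = {
    "hap1": find_mds(positions, hap1),  # [(100, 210, 3)]
    "hap2": find_mds(positions, hap2),  # [(950, 1040, 3)]
}
print(mds_by_haplotype)
```

Scanning each haplotype on its own is what the phased, long-read data makes possible: in this toy example, averaging the two haplotypes would give intermediate rates at every CpG and hide both regions.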
In addition, 77,789 MDSs (~41%) were associated with 80,503 cis-acting sequence variants, termed allele-specific methylation QTLs (ASM-QTLs). Taken together, the findings show that sequence variants affect DNA methylation and, furthermore, that some of these variants can be linked to various diseases as well as other human traits.
RNA sequencing data from 896 samples, taken from the same blood draws used for nanopore sequencing, showed that “the ASM-QTL, DNA sequence variability, drives most of the correlation found between gene expression and CpG methylation. ASM-QTLs were enriched 46.4-fold (95% CI: 36.0, 58.7) among sequence variants associating with hematological traits, demonstrating that ASM-QTLs are important functional units in the non-coding genome.”
The authors noted that the correlation between DNA methylation and gene expression can be attributed to sequence variants, which makes these variants the driving factor rather than methylation itself.
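What it means for a variant to "drive" a correlation can be shown with a toy simulation. The sketch below uses entirely simulated data, not the study's measurements or statistical method: a hypothetical allele shifts both methylation and expression, and the effect sizes, noise levels, and sample count are arbitrary assumptions.

```python
import math
import random


def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)


rng = random.Random(0)  # fixed seed so the run is reproducible
alleles, methylation, expression = [], [], []
for _ in range(2000):  # 2000 simulated haplotypes (arbitrary)
    a = rng.randint(0, 1)  # hypothetical allele carried by this haplotype
    alleles.append(a)
    # The allele lowers methylation and raises expression independently;
    # methylation has no direct effect on expression in this toy model.
    methylation.append(0.8 - 0.6 * a + rng.gauss(0, 0.05))
    expression.append(5.0 + 2.0 * a + rng.gauss(0, 0.5))

# Across all haplotypes, methylation and expression look strongly correlated.
raw_r = pearson(methylation, expression)

# Within haplotypes carrying the same allele, the correlation largely vanishes.
within_r = {}
for allele in (0, 1):
    idx = [i for i, a in enumerate(alleles) if a == allele]
    within_r[allele] = pearson(
        [methylation[i] for i in idx], [expression[i] for i in idx]
    )

print("all haplotypes:", round(raw_r, 2))
print("within allele groups:", {k: round(v, 2) for k, v in within_r.items()})
```

In this simulation the strong negative correlation across all haplotypes exists only because the allele differs between them. Haplotype-level data is what allows this kind of within-allele comparison.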
The majority of sequence variants that have been linked to diseases are found in the noncoding genome, which has made it difficult to understand how they lead to disease. Studying their effects on DNA methylation has shown that many of the variants that affect methylation are the same variants that had previously been associated with disease, thereby enabling us to better understand how they contribute to disease progression.