Over the past 15 years scientists have identified hundreds of regions in the human genome associated with heart attack risk. However, efficient ways to explore how these genetic variants are molecularly connected to cardiovascular disease have been lacking, limiting efforts to develop therapeutics. Researchers led by investigators at Brigham and Women’s Hospital, in collaboration with the Broad Institute of MIT and Harvard and Stanford Medicine, combined multiple sequencing and experimental techniques to create a new approach, Variant-to-Gene-to-Program (V2G2P), that allowed them to map the relationship between known coronary artery disease (CAD) variants and the biological pathways they impact.
The study, which focused on endothelial cells (ECs) that line blood vessels, highlighted a previously unrecognized role for the TLNRD1 gene. The researchers hypothesized that this gene may be involved in both CAD, which is common, and the much rarer condition cerebral cavernous malformations (CCM).
“Studying how hundreds of regions of the genome, individually or in groups, influence risk of heart attack can be a painstaking process,” said Rajat Gupta, MD, at the Divisions of Genetics and Cardiovascular Medicine at Brigham and Women’s Hospital. “We decided we needed to have better maps showing how genetic variants affect gene expression and how genes affect biological function. If we could combine those two kinds of maps, we could make the bigger connection from variant to biological function.” Gupta is corresponding author of the researchers’ published study in Nature, titled “Convergence of coronary artery disease genes onto endothelial cell programs.” In their report they stated, “… our approach establishes a new, generalizable path to systematically link risk variants to disease genes and to convergent transcriptional programs, providing a rich foundation for further studies to dissect new disease mechanisms.”
Genome-wide association studies (GWAS) have identified hundreds of loci for many common complex diseases such as CAD, the authors wrote, and it is reasoned that genetic variants that influence complex traits regulate genes that work together in biological pathways. “Identifying convergence on particular pathways can help discover genes and cellular functions that causally influence disease risk.” However, identifying this convergence is challenging, they continued. This is because complex traits involve contributions from multiple cell types, most risk variants are noncoding and can regulate multiple nearby genes, and it remains unclear which genes work together in which pathways in which cell types. “Linking variants from genome-wide association studies (GWAS) to underlying mechanisms of disease remains a challenge,” they stated. “… pathway-level convergence has been difficult to identify for many diseases because existing approaches can be limited to studying one gene or pathway at a time, underpowered and/or biased towards rediscovering known genes and pathways.”
V2G2P is designed to address these challenges, linking GWAS variants to genes in a systematic and unbiased manner, and so identifying convergence onto specific disease-associated transcriptional programs, they further explained. “Our study introduced a new method to address this challenge, in which we built unbiased maps of genome function using epigenomic data and Perturb-seq and then combined these maps to identify convergence of risk variants onto pathways,” they stated. The method is systematic in that it measures the full transcriptomic effects of all genes in all relevant GWAS loci, facilitating the discovery of new disease-associated pathways and new functions for uncharacterized genes.
To apply the new approach, the researchers first, in collaboration with researchers at Stanford Medicine, matched CAD loci previously identified through GWAS to the genes affected by these genetic variants. They then used CRISPRi-Perturb-seq, a technology developed at the Broad Institute of MIT and Harvard, to silence thousands of CAD-associated genes, one at a time, and to examine how each knockdown affected the expression of all the other genes in that cell. (CRISPRi represses a gene’s expression rather than removing it from the genome.) In total, the researchers sequenced 215,000 endothelial cells to determine how 2,300 perturbations influenced expression of 20,000 other genes in each cell. Using machine learning algorithms, they were able to identify the biological mechanisms that consistently appeared to be related to CAD-associated variants.
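The two-step logic described above, linking loci to genes and then genes to programs, can be illustrated with a toy sketch. Everything in it is hypothetical: the locus, gene, and program names are placeholders, the mappings are invented rather than taken from the study, and the simple "at least two loci" cutoff stands in for the statistical analysis the researchers actually performed.

```python
from collections import defaultdict

# Hypothetical toy inputs (placeholders, not data from the study).
# Step 1, variant-to-gene: each GWAS locus is linked to candidate genes.
locus_to_genes = {
    "locus_A": ["GENE1", "GENE2"],
    "locus_B": ["GENE3"],
    "locus_C": ["GENE4"],
    "locus_D": ["GENE5"],
}

# Step 2, gene-to-program: the transcriptional programs that change
# when each gene is perturbed (an empty set means no program was affected).
gene_to_programs = {
    "GENE1": {"program_1"},
    "GENE2": {"program_3"},
    "GENE3": {"program_1", "program_2"},
    "GENE4": {"program_2"},
    "GENE5": set(),
}


def loci_per_program(locus_to_genes, gene_to_programs):
    """Combine both maps: for each program, collect the loci that reach it."""
    hits = defaultdict(set)
    for locus, genes in locus_to_genes.items():
        for gene in genes:
            for program in gene_to_programs.get(gene, ()):
                hits[program].add(locus)
    return dict(hits)


def convergent_programs(hits, min_loci=2):
    """Keep programs reached by at least `min_loci` independent loci.

    The threshold is an assumption made for illustration only.
    """
    return {
        program: sorted(loci)
        for program, loci in hits.items()
        if len(loci) >= min_loci
    }


hits = loci_per_program(locus_to_genes, gene_to_programs)
convergent = convergent_programs(hits)

for program, loci in sorted(convergent.items()):
    print(program, "is reached by", ", ".join(loci))
```

In this toy example, two programs are each reached by two separate loci and so count as convergent, while a third is reached by only one locus and is dropped. Counting independent loci, rather than genes, reflects the idea that several distinct risk signals pointing at the same program is what suggests a shared disease mechanism.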
Through their study, the researchers found that 43 of the 306 CAD-associated variants in endothelial cells were linked to genes in the cerebral cavernous malformations (CCM) signaling pathway. “Applying this method to ECs revealed that 43 out of 306 CAD GWAS signals indeed converge onto five transcriptional programs, all related to CCM signaling,” they noted. Their results highlighted a previously unrecognized role for the TLNRD1 gene in regulating the CCM pathway, alongside other known CCM regulators. “Two newly prioritized CAD genes, CCM2 and TLNRD1, strongly regulated these programs (which included dozens of other CAD genes), showed highly similar transcriptional and cellular phenotypes, and physically interacted with one another.”
CCM is a rare, devastating vascular disease that impacts the brain, but the researchers hypothesized that smaller, subtler mutations in the genes involved in CCM may contribute to CAD risk by affecting vascular inflammation, thrombosis, and the structural integrity of the endothelium. Their results led them to consider that TLNRD1 may be involved in both the relatively common CAD and rare CCM. The collective data, they noted, “… indicate that TLNRD1 is a previously unrecognized, evolutionarily conserved member of the CCM signalling pathway.”
The study focused on endothelial cells, which line blood vessels and are increasingly understood to influence CAD risk. The work considered endothelial mechanisms unrelated to lipid metabolism – a known driver of CAD risk for which there are effective therapies, such as statins – in hopes of uncovering other mechanisms driving CAD risk for which therapies may yet be developed.
“Now that we know more about this collection of endothelial cell variants, we can return to patients who have them to see if they have different clinical features or respond differently to the therapies we are already using,” Gupta said. “We are also focused on this study’s implications for CCM patients. It was a coincidence that from this genetic screen designed to look at coronary disease, we implicated new genes for a rare vascular disease, CCM. Perhaps now we can better describe the risk factors and pathways that drive it.”
Going forward, the researchers hope to study patients with endothelial CAD-associated variants as well as CCM patients to determine whether there are distinct opportunities for treating these populations. For the latter, the researchers are interested in determining whether further investigation into TLNRD1 can lead to better forms of genetic testing and risk stratification.
Beyond CAD and CCM, the researchers emphasize that the V2G2P approach can be used to explore the biological mechanisms driving any disease for which a relevant cell type can be genetically modified in the lab.
“It was remarkable that this unbiased, systematic approach — in which we deleted all candidate CAD genes in a single experiment —pointed us straight to new genes and pathways that had escaped notice,” added co-corresponding author Jesse Engreitz, PhD, assistant professor of genetics at Stanford Medicine. “This approach will be a powerful strategy for studying many other diseases where genetic risk factors remain to be discovered.”
They suggested that, “by applying Perturb-seq and the V2G2P approach across many cell types and states relevant to various complex diseases, it should be possible to nominate causal disease genes for a large fraction of GWAS loci and map how they converge onto particular cellular pathways. Such a project is becoming increasingly feasible and would provide a foundation for systematic efforts to leverage human genetic data to discover disease mechanisms.”