A novel method, based on studying transcription factor protein interactomes, has been developed to identify genetic variants that are likely to play important roles in congenital heart disease. The new strategy—which combines techniques from genetics, computational biology, stem cell biology, and proteomics—could also be applied to study numerous other diseases with complex genetic causes.
The work is published in Cell in the paper “Transcription Factor Protein Interactomes Reveal Genetic Determinants in Heart Disease.”
Nearly 1% of all children are born with congenital heart disease, a range of potentially life-threatening problems with the structure and function of the heart. For most children, the precise causes of the defects are unknown. One possible cause is variants in genes involved in forming the heart in the womb. But it is largely unknown which genes contribute to congenital heart disease and how they interact with each other.
“Previous methods have generated long lists of variants detected in patients, but many actually turned out to be inconsequential, so a major challenge in the field has been identifying which variants are most important,” said Deepak Srivastava, MD, Gladstone president and professor in the department of pediatrics at the University of California, San Francisco (UCSF). “Our approach pinpoints variants that are most likely to be involved in disease, allowing us to focus on those variants, deepen understanding of the underlying biology of the disease, and, we hope, move more rapidly toward new treatments.”
Rather than looking at variants in isolation, the novel strategy considers the interactions between proteins to zero in on which variants might be causing disease—in this case, congenital heart disease.
The proteins GATA4 and TBX5 were already known to be required for healthy human heart formation, and to collaborate with a network of additional proteins to help grow a heart.
The researchers mapped the network of proteins that interact with GATA4 and TBX5, using precursor heart cells grown from human induced pluripotent stem cells. Next, they cross-referenced this 273-protein network with DNA sequencing data from over 3,000 children with congenital heart disease and their parents.
Several dozen variants in the children’s sequencing data fell in genes that encode proteins found in the GATA4-TBX5 network, far more than expected by chance, pinpointing them as candidates that may contribute to congenital heart disease.
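One common way to ask whether an overlap like this is "more than expected" is a hypergeometric enrichment test. The sketch below is only an illustration of that general idea, not the analysis used in the study. Apart from the 273-protein network size, every number in it (the gene universe, the number of genes carrying patient variants, and the overlap) is a hypothetical placeholder, and the test ignores factors such as gene length and mutation rate.

```python
from math import comb


def hypergeom_sf(k, population, successes, draws):
    """Return P(X >= k) for a hypergeometric random variable X.

    population: total number of genes considered
    successes:  genes in the protein network
    draws:      genes carrying a candidate variant in patients
    k:          observed overlap between the two sets
    """
    upper = min(successes, draws)
    total = comb(population, draws)
    tail = sum(
        comb(successes, i) * comb(population - successes, draws - i)
        for i in range(k, upper + 1)
    )
    return tail / total


# Hypothetical inputs, for illustration only.
GENE_UNIVERSE = 20_000      # assumed number of protein-coding genes
NETWORK_SIZE = 273          # network size reported in the article
GENES_WITH_VARIANTS = 1_000 # hypothetical
OBSERVED_OVERLAP = 40       # hypothetical

# Overlap expected if network membership were unrelated to carrying a variant.
expected_overlap = GENES_WITH_VARIANTS * NETWORK_SIZE / GENE_UNIVERSE
p_value = hypergeom_sf(
    OBSERVED_OVERLAP, GENE_UNIVERSE, NETWORK_SIZE, GENES_WITH_VARIANTS
)
print(f"expected overlap: {expected_overlap:.2f}, p-value: {p_value:.3g}")
```

A small p-value here would mean that an overlap at least this large is unlikely under random sampling of genes; a real analysis would also need matched controls and a correction for per-gene mutation rates.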
“We first identified important protein networks in the types of cells affected in congenital heart disease, and then integrated large-scale, protein-coding sequencing data,” said Bárbara González Terán, PhD, a postdoctoral scholar in Srivastava’s lab. “Many scientists had speculated this approach was possible, but to our knowledge, this is the first time it has actually been done, for any disease.”
Maureen Pittman, a graduate student in the lab of Katie Pollard, PhD, director of the Gladstone Institute of Data Science and Biotechnology, developed a computational tool that ranks the candidates according to their likelihood of contributing to congenital heart disease. This ranking algorithm takes into account characteristics of the variant, the affected gene, and the type of heart defect found in patients with the variant.
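To make the idea of ranking candidates on several kinds of evidence concrete, here is a minimal sketch that combines three scores into one and sorts by it. It is not the tool described above: the `Candidate` fields, the 0 to 1 score scale, the equal default weights, and the placeholder gene names are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    gene: str
    variant_score: float    # hypothetical: how damaging the variant looks (0-1)
    gene_score: float       # hypothetical: how relevant the gene is to the heart (0-1)
    phenotype_score: float  # hypothetical: how well the heart defect type fits (0-1)


def combined_score(candidate, weights=(1.0, 1.0, 1.0)):
    """Weighted sum of the three evidence scores (weights are assumptions)."""
    w_variant, w_gene, w_phenotype = weights
    return (
        w_variant * candidate.variant_score
        + w_gene * candidate.gene_score
        + w_phenotype * candidate.phenotype_score
    )


def rank_candidates(candidates, weights=(1.0, 1.0, 1.0)):
    """Return candidates sorted from highest to lowest combined score."""
    return sorted(
        candidates,
        key=lambda c: combined_score(c, weights),
        reverse=True,
    )


# Placeholder data, not taken from the study.
candidates = [
    Candidate("GENE_A", 0.2, 0.5, 0.1),
    Candidate("GENE_B", 0.9, 0.8, 0.7),
    Candidate("GENE_C", 0.6, 0.4, 0.9),
]
ranked = rank_candidates(candidates)
for c in ranked:
    print(c.gene, round(combined_score(c), 2))
```

A weighted sum is the simplest possible choice; changing the weights shifts how much each kind of evidence counts toward the final order.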
“Of the top-ranking variants we identified with the algorithm, some were in genes already known to contribute to congenital heart defects,” said Pittman. “But many had never before been linked to heart development, including a protein called GLYR1, which is involved in turning other genes on and off.”
Additional experiments in cells and mice indicated that GLYR1 indeed plays a central role in the formation of the heart, and that a patient variant of GLYR1 disrupts heart development by hampering the protein’s interaction with GATA4.
“Identifying GLYR1 as a key gene in heart development opens up a whole new biological space for understanding how this system works,” said Srivastava. “We will continue to study the biology of GLYR1, and we hope that others will follow up on the other high-scoring variants we found.”
The new study relied heavily on proteomics techniques from the lab of Nevan Krogan, PhD, senior investigator at Gladstone and director of the Quantitative Biosciences Institute at UCSF. “The dynamic and teamwork-focused efforts at Gladstone really made this possible,” said Srivastava.
The researchers believe the power of their new method lies in its promise to help illuminate how combinations of variants—rather than single variants on their own—work together to cause congenital heart disease. This method could also be adapted to identify combinations of variants that may underlie other complex diseases. For instance, Pollard’s team is already looking into applying it to neurodevelopmental disorders, including autism and epilepsy.
“With more and more sequencing data being generated every year from patients with complex diseases, our approach will help guide where to focus among all the detected variants,” Srivastava said.