A new computational method can read genomic signatures on ribbons of DNA that loop, bow-like, to facilitate interactions between distantly situated enhancers and promoters. The method, called TargetFinder, uses machine learning to distinguish between interacting and noninteracting enhancer–promoter pairs, and it has generated accurate predictions 85% of the time, suggesting that it can identify subtle gene regulation mechanisms—and thereby reveal new therapeutic targets for genetic disorders.
TargetFinder, which was developed by scientists at the Gladstone Institutes, reconstructs regulatory landscapes from diverse features along the genome. According to a study that appeared April 4 in the journal Nature Genetics (“Enhancer–Promoter Interactions Are Encoded by Complex Genomic Signatures on Looping Chromatin”), TargetFinder was used to analyze hundreds of existing datasets from six different cell types to look for patterns in the genome that identify where a gene and an enhancer interact.
The Gladstone team, led by Katherine Pollard, Ph.D., discovered several patterns that exist on the loops that connect enhancers to genes. The team found that its approach accurately predicted individual enhancer–promoter interactions across multiple cell lines with a false-discovery rate up to 15 times smaller than that obtained by assigning each enhancer to its closest gene.
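To make the comparison concrete, the false-discovery rate is the share of predicted enhancer–promoter links that turn out to be wrong. The sketch below computes it for a closest-gene baseline and for a set of model predictions. Everything in it is an illustrative assumption: the positions, the "true" pairs, the model output, and the function names are hypothetical toy values, not data or code from the study.

```python
def false_discovery_rate(predicted, true_pairs):
    """FDR = false positives / all positive predictions."""
    predicted = set(predicted)
    if not predicted:
        return 0.0
    false_positives = len(predicted - set(true_pairs))
    return false_positives / len(predicted)


def closest_gene_pairs(enhancers, promoters):
    """Baseline: link each enhancer to the promoter nearest to it."""
    pairs = set()
    for enh_name, enh_pos in enhancers.items():
        nearest = min(promoters, key=lambda p: abs(promoters[p] - enh_pos))
        pairs.add((enh_name, nearest))
    return pairs


# Hypothetical toy data: positions in base pairs on a single chromosome.
enhancers = {"E1": 1000, "E2": 5000, "E3": 10000, "E4": 15000}
promoters = {"P1": 1500, "P2": 6000, "P3": 12000, "P4": 20000}

# Hypothetical ground truth: several enhancers skip over the nearest gene.
true_pairs = {("E1", "P2"), ("E2", "P2"), ("E3", "P4"), ("E4", "P3")}

# Hypothetical output of a trained classifier (one wrong call for E4).
model_pairs = {("E1", "P2"), ("E2", "P2"), ("E3", "P4"), ("E4", "P4")}

baseline_fdr = false_discovery_rate(closest_gene_pairs(enhancers, promoters), true_pairs)
model_fdr = false_discovery_rate(model_pairs, true_pairs)

print(f"closest-gene FDR: {baseline_fdr:.2f}")
print(f"model FDR:        {model_fdr:.2f}")
```

In this toy case the baseline is wrong whenever an enhancer loops past its nearest gene, which is the kind of interaction the closest-gene rule cannot capture.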
“By evaluating the genomic features driving this accuracy, we uncover interactions between structural proteins, transcription factors, epigenetic modifications, and transcription that together distinguish interacting from non-interacting enhancer–promoter pairs,” wrote the authors of the Nature Genetics article. “We conclude that complex but consistent combinations of marks on the one-dimensional genome encode the three-dimensional structure of fine-scale regulatory interactions.”
The authors also noted that most of the “combinations of marks” that matter are not proximal to the enhancers and promoters. Instead, they decorate the looping DNA.
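One way to picture this is to split each candidate pair into three regions (the enhancer, the promoter, and the stretch of DNA between them) and count the marks that fall in each. The sketch below does that for a single pair. It is only an illustration under stated assumptions: the coordinates, the mark names, the label "window" for the intervening DNA, and the function `region_features` are all hypothetical and do not reproduce the study's feature set or code.

```python
from collections import Counter


def region_features(enhancer, promoter, marks):
    """Count marks per (region, mark name) for one enhancer-promoter pair.

    enhancer, promoter: (start, end) half-open intervals; the enhancer is
    assumed to lie upstream of the promoter.
    marks: iterable of (position, mark_name) tuples.
    """
    window = (enhancer[1], promoter[0])  # DNA between the two elements
    regions = {"enhancer": enhancer, "window": window, "promoter": promoter}
    counts = Counter()
    for pos, name in marks:
        for region, (start, end) in regions.items():
            if start <= pos < end:
                counts[(region, name)] += 1
    return counts


# Hypothetical coordinates (base pairs) and example marks, for illustration only.
enhancer = (1000, 1500)
promoter = (50000, 51000)
marks = [
    (1200, "H3K27ac"),
    (10000, "CTCF"),
    (30000, "CTCF"),
    (42000, "RAD21"),
    (50500, "H3K4me3"),
]

features = region_features(enhancer, promoter, marks)
for (region, name), count in sorted(features.items()):
    print(region, name, count)
```

Counts like these, computed for many pairs, are the sort of table a classifier could be trained on. In this toy example most of the signal sits in the window rather than at the enhancer or promoter.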
“It’s remarkable that we can predict complex three-dimensional interactions from relatively simple data,” said the study’s first author, Sean Whalen, Ph.D., a biostatistician at Gladstone. “No one had looked at the information stored on loops before, and we were surprised to discover how important that information is.”
Performing experiments in the lab to identify all of these gene–enhancer interactions can cost millions of dollars and take years of research. The new computational approach is a much cheaper and less time-consuming way to identify gene–enhancer connections in the genome. The technology also provides insight into how DNA loops form and how they might break in disease. The scientists have made all of the code and data from TargetFinder freely available online.
“Most genetic mutations that are associated with disease occur in enhancers, making them an incredibly important area of study,” said Dr. Pollard. “Before now, we struggled to understand how enhancers find the distant genes they act upon.
“Our ability to predict the gene targets of enhancers so accurately enables us to link mutations in enhancers to the genes they target. Having that link is the first step toward using these connections to treat diseases.”