The field of genetic engineering has amassed a huge toolbox: massive online datasets, gene editing tools, and genomic sequencing that is quick and cheap. The missing piece of the puzzle, according to a team of scientists and engineers at Harvard’s Wyss Institute for Biologically Inspired Engineering, Harvard Medical School (HMS), and the MIT Media Lab, has been a clear roadmap to help researchers figure out which genes to target, which tools to use, and how to interpret their results.
To this end, that team has created an integrated pipeline for performing genetic screening studies, encompassing every step of the process from identifying target genes of interest to cloning and screening them quickly and efficiently.
The protocol, called Sequencing-based Target Ascertainment and Modular Perturbation Screening (STAMPScreen), is published in Cell Reports Methods in the paper, “An integrated pipeline for mammalian genetic screening,” and the associated open-source algorithms are available on GitHub.
“STAMPScreen is a streamlined workflow that makes it easy for researchers to identify genes of interest and perform genetic screens without having to guess which tool to use or what experiments to perform to get the results they want,” said author Pranam Chatterjee, PhD, a former graduate student at the MIT Media Lab who is now the Carlos M. Varsavsky research fellow at HMS and the Wyss Institute. “It is fully compatible with many existing databases and systems, and we hope that many scientists are able to take advantage of STAMPScreen to save themselves time and improve the quality of their results.”
The research of Chatterjee and Christian Kramme, a PhD candidate in the lab of George Church, PhD, a Wyss core faculty member and professor of genetics at HMS, explores the genetic underpinnings of different aspects of biology—fertility, aging, and immunity—by combining the strengths of digital methods and genetic engineering.
But the two ran into limitations. The algorithms available to them could indicate when a gene’s expression pattern changed, but provided no insight into the cause of that change. When they wanted to test a list of candidate genes in living cells, it wasn’t immediately clear what type of experiment they should run. And many of the tools available to insert genes into cells and screen them were expensive, time-consuming, and inflexible.
“We figured there had to be a better way to do this kind of research, and when we couldn’t find one, we took on the challenge of creating it ourselves,” said Kramme.
Kramme and his colleagues outlined what would be required to make an end-to-end platform for genetic screening that would work across the board.
First, they developed two computational algorithms that use transcriptomic data from cell differentiation to identify key transcription factors. The first algorithm gives a high score to genes that are highly connected to other genes and whose activity is correlated with large, cell-level changes. The second algorithm generates networks that represent the dynamic changes in gene expression during cell-type differentiation and then applies centrality measures, such as Google’s PageRank algorithm, to rank the key regulators of the process.
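To give a rough sense of how a centrality measure such as PageRank can rank regulators in a gene network, here is a minimal Python sketch. It is not the team's published code: the gene names, the toy edge list, and the choice to draw edges from each target gene to its putative regulator (so that rank accumulates on regulators) are all assumptions made for illustration.

```python
def pagerank(edges, damping=0.85, iterations=100, tol=1e-10):
    """Plain power-iteration PageRank over a directed edge list."""
    nodes = sorted({node for edge in edges for node in edge})
    out_links = {node: [] for node in nodes}
    for src, dst in edges:
        out_links[src].append(dst)

    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Nodes with no outgoing edges spread their rank evenly over all nodes.
        dangling = sum(rank[node] for node in nodes if not out_links[node])
        base = (1.0 - damping) / n + damping * dangling / n
        new_rank = {node: base for node in nodes}
        for src in nodes:
            if out_links[src]:
                share = damping * rank[src] / len(out_links[src])
                for dst in out_links[src]:
                    new_rank[dst] += share
        delta = sum(abs(new_rank[node] - rank[node]) for node in nodes)
        rank = new_rank
        if delta < tol:
            break
    return rank


# Hypothetical toy network: each edge points from a target gene to a
# transcription factor assumed to regulate it (illustrative names only).
toy_edges = [
    ("GENE_1", "TF_A"),
    ("GENE_2", "TF_A"),
    ("GENE_3", "TF_A"),
    ("GENE_3", "TF_B"),
    ("TF_B", "TF_A"),
]

ranks = pagerank(toy_edges)
for gene, score in sorted(ranks.items(), key=lambda item: -item[1]):
    print(f"{gene}\t{score:.3f}")
```

In this toy network the factor with the most incoming edges ends up at the top of the list, which is the intuition behind using centrality to shortlist candidate regulators. A real analysis would build the network from expression data and would likely weight the edges.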
Once the target genes have been identified, the STAMPScreen protocol disrupts those genes in cells. The team of researchers systematically evaluated multiple gene perturbation tools including complementary DNA (cDNA) and several versions of CRISPR in human induced pluripotent stem cells (hiPSCs).
They then created a new tool that allows CRISPR and cDNA to be used within the same cell to unlock synergies between the two methods. For example, CRISPR can be used to turn off expression of all isoforms of a gene, and cDNA to sequentially express each isoform individually, allowing more nuanced genetic studies and greatly reducing background expression of off-target genes.
The candidate transcription factors can then be cloned into barcoded PiggyBac vectors using the team’s patented cloning method named MegaGate. MegaGate, which employs meganucleases to enable high-efficiency cloning, had a cloning success rate of 99.8% and also allowed them to barcode their vectors with ease.
“MegaGate not only solves many of the problems that we kept running into with older cloning methods, it is also compatible with many existing gene libraries like the TFome and hORFeome. You can essentially take Gateway and meganucleases off the shelf, put them together with a library of genes and a library of barcoded destination vectors, and two hours later you have your barcoded genes of interest. We’ve cloned nearly 1,500 genes with it, and have yet to have a failure,” said Alexandru Plesa, a graduate student at the Wyss Institute and HMS.
Lastly, as Chatterjee explained on Twitter, the barcoded vectors can be combinatorially screened in iPSCs at varied copy numbers, and the resulting transcriptomes can be read out by barcode-based sequencing at both bulk and single-cell resolution.
They also successfully used a variety of methods, including RNA-Seq, TAR-Seq, and Barcode-Seq, to read both the genetic barcodes and the entire transcriptomes of hiPSCs, enabling researchers to use whichever tool they are most familiar with.
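As a simplified illustration of what a barcode-based readout involves, the Python sketch below tallies how many sequencing reads carry each known barcode. It is not taken from the STAMPScreen code. It assumes that barcodes have a fixed length, sit at a fixed position in each read, and are matched exactly; the barcode sequences, gene names, and reads are hypothetical.

```python
from collections import Counter


def count_barcodes(reads, barcode_to_gene, start=0):
    """Count reads per gene by exact match of a fixed-position barcode."""
    # Assumption: every barcode in the lookup table has the same length.
    length = len(next(iter(barcode_to_gene)))
    counts = Counter()
    for read in reads:
        tag = read[start:start + length]
        gene = barcode_to_gene.get(tag)
        if gene is not None:  # reads with unknown barcodes are skipped
            counts[gene] += 1
    return counts


# Hypothetical barcode table and reads, for illustration only.
barcode_to_gene = {"ACGTAC": "TF_A", "GGATCC": "TF_B"}
reads = ["ACGTACTTGA", "GGATCCAAGT", "ACGTACCCTA", "TTTTTTGGGG"]

barcode_counts = count_barcodes(reads, barcode_to_gene)
print(dict(barcode_counts))
```

Real pipelines typically also tolerate sequencing errors in the barcode and, for single-cell data, group counts by cell; both are left out here to keep the idea visible.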
The team anticipates that STAMPScreen could prove useful for a wide variety of studies, including pathway and gene regulatory network studies, differentiation factor screening, drug and complex pathway characterizations, and mutation modeling. STAMPScreen is also modular, allowing scientists to integrate different parts of it into their own workflows.
“There’s a treasure trove of information housed in publicly available genetic datasets,” said Church, “but that information will only be understood if we use the right tools and methods to analyze it. STAMPScreen will help researchers get to eureka moments faster and speed up the pace of innovation in genetic engineering.”