Cell by Cell Mapping of the Cancer Transcriptome

Capturing Tumor Heterogeneity for Better Therapeutics and Diagnostics

A biological system may resist interrogation, slowing our efforts to identify system components and understand how they work, individually and in combination. And so we work to enhance our interrogation techniques, for while extracting information from a biological system is challenging, it is also indispensable to a range of missions: revealing how cells organize into tissues, dissecting disease pathogenesis, and advancing diagnostic and treatment efforts.

To find ways of improving cell-level intelligence gathering, a global initiative was organized in late 2016. This initiative, called the Human Cell Atlas, proposes to create comprehensive reference maps of all human cells as a basis for both understanding human health and diagnosing, monitoring, and treating disease. The Human Cell Atlas relies, to a great extent, on the use of transcriptome sequencing to profile the gene expression of individual cells.

Not-So-Hidden GemCodes

“We developed a fast way of profiling tens of thousands of cells, where for each cell, we capture gene expression in an unbiased way,” said Benjamin J. Hindson, Ph.D., CSO, president, and cofounder of 10x Genomics. The resulting profile, he noted, is “essentially like a fingerprint of each different cell in a particular sample.”

In a recent study, Dr. Hindson and colleagues described 10x Genomics’ GemCode technology, a droplet-based platform that combines microfluidics with molecular barcoding and custom bioinformatics software to enable 3′ mRNA counting from thousands of single cells. With GemCode, cells are captured in droplets, and the reverse transcription that occurs within each droplet is used to generate barcoded cDNA.

After the resulting libraries are sequenced, the company’s Cell Ranger workflow performs automated clustering on the data. Cell Ranger is a set of analysis pipelines that process Chromium single-cell 3′ RNA-Seq output to align reads, generate gene-cell matrices, and perform clustering and gene-expression analysis.
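The step from barcoded reads to a gene-cell matrix can be illustrated with a minimal Python sketch. It is not Cell Ranger's implementation. It assumes each aligned read has already been reduced to a hypothetical (cell barcode, molecular barcode, gene) tuple, and it assumes that reads sharing all three values come from the same original molecule and should be counted once.

```python
from collections import defaultdict


def build_gene_cell_matrix(reads):
    """Count distinct molecules per gene and per cell.

    `reads` is an iterable of (cell_barcode, molecule_barcode, gene) tuples.
    This tuple layout is an assumption made for illustration only.
    """
    seen = set()  # (cell, molecule, gene) combinations already counted
    counts = defaultdict(lambda: defaultdict(int))
    for cell_barcode, molecule_barcode, gene in reads:
        key = (cell_barcode, molecule_barcode, gene)
        if key in seen:
            continue  # duplicate read of a molecule that was already counted
        seen.add(key)
        counts[gene][cell_barcode] += 1
    # Convert nested defaultdicts to plain dicts: gene -> {cell: count}
    return {gene: dict(cells) for gene, cells in counts.items()}


example_reads = [
    ("AAAC", "m1", "CD3E"),
    ("AAAC", "m1", "CD3E"),  # duplicate read, ignored
    ("AAAC", "m2", "CD3E"),
    ("TTTG", "m1", "CD19"),
]
matrix = build_gene_cell_matrix(example_reads)
```

Each row of the resulting matrix is a gene and each column a cell, which is the form that downstream clustering typically starts from.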

Using the GemCode approach, Dr. Hindson and colleagues collected transcriptome data from about 250,000 cells across 29 samples. “Single-cell RNA-Seq allows heterogeneities within a population to be captured,” Dr. Hindson stated. “It is still necessary, however, to scale analyses and to capture as many cells as possible from a sample.”

In a study on human and mouse cell lines, Dr. Hindson and colleagues showed that GemCode captured about half of the cells loaded. With about 100,000 reads per cell, GemCode detected about 27,000 transcripts per cell, a sensitivity comparable to that of existing droplet-based methods.

In a proof-of-concept analysis that used 68,000 fresh peripheral blood mononuclear cells obtained from a healthy donor, Dr. Hindson and colleagues showed that GemCode can be used to interrogate cellular heterogeneity and cluster large immune cell populations. The same proportions of cellular subpopulations were also captured from frozen cells, indicating that this method can be used on frozen samples.

“We need to be able to prepare cells from complex tissues, get them into a suspension, and analyze them,” Dr. Hindson explained. He added that this kind of sample preparation—which may involve samples supplied by different laboratories and taken from different tissues—is a challenge when RNA-Seq experiments are to be performed.

Another challenge is the need to examine the vast amount of data that is generated by such high-throughput, large-scale technologies. “The analysis itself is going to be a very important piece,” commented Dr. Hindson. “In the future, we will see more assays being run on a single sample at the same time and more combination assays.”

Making Every Cell Count

“It is important for oncologists to know which drug to select for a given patient, particularly following relapse,” said Scott R. Manalis, Ph.D., professor of biological engineering at the Koch Institute for Integrative Cancer Research at MIT. For certain cancers with well-defined mutations, such as lung cancer, genomics has been very successful in identifying biomarkers and specifying their use for therapeutic purposes. “But for many cancers,” added Dr. Manalis, “therapies are selected empirically.”

One of the challenges associated with cancer treatment is the cellular heterogeneity that has been increasingly described in tumors and is critical in shaping tumor behavior. Even genetically identical cells may grow at different rates, as shown by experiments on bacteria and some eukaryotes. In addition, differences in the growth rates of cells within a malignancy may shape their proliferative or metastatic potential. These differences in growth, despite being biologically critical, are obscured in population-level measurements.

In a recent study, Dr. Manalis and colleagues developed a high-throughput method to measure the growth rate of a cell population’s individual cells simultaneously. The method promises to capture cell-to-cell variations and characterize growth dynamics in different environmental conditions.

Cells in suspension flow through a sealed microfluidic channel with 10–12 resonant mass sensors distributed along its length, so each cell is weighed repeatedly during the time it spends in the channel. Multiple cells traverse the channel at the same time.

In a proof-of-concept experiment, Dr. Manalis and colleagues showed that their microfluidic technique can measure the growth rates of various cell types, including single lymphocytic cells, mouse and human T cells, primary human leukemia cells, budding yeast, and bacteria, at a resolution of 0.2 pg/h for mammalian cells and 0.02 pg/h for bacterial cells. “The cell’s mass accumulation rate provides a biophysical readout of the cell that integrates numerous molecular characteristics,” asserted Dr. Manalis.

Using the mass accumulation rate as a biomarker, Dr. Manalis and colleagues showed that some cells, after being treated with a drug, continue to accumulate mass while the drug is present, a behavior that suggests that the cells could be resistant to the drug. In contrast, cells that show a decreased mass accumulation rate after being exposed to a drug could be responding to the drug.
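One simple way to picture how repeated weighings become a mass accumulation rate is a straight-line fit of mass against time for a single cell, with the slope taken as the rate. The Python sketch below is an illustration under that assumption and is not the group's published analysis. The function names, the example values, and the threshold are hypothetical.

```python
def mass_accumulation_rate(times_h, masses_pg):
    """Least-squares slope of mass (pg) versus time (h) for one cell."""
    n = len(times_h)
    if n < 2 or n != len(masses_pg):
        raise ValueError("need at least two paired time and mass values")
    mean_t = sum(times_h) / n
    mean_m = sum(masses_pg) / n
    sxx = sum((t - mean_t) ** 2 for t in times_h)
    if sxx == 0:
        raise ValueError("time points must not all be identical")
    sxy = sum((t - mean_t) * (m - mean_m) for t, m in zip(times_h, masses_pg))
    return sxy / sxx  # pg per hour


def keeps_accumulating_mass(rate_pg_per_h, threshold_pg_per_h):
    """Flag a cell whose rate stays above a user-chosen (hypothetical) cutoff."""
    return rate_pg_per_h > threshold_pg_per_h


# Hypothetical cell weighed at four sensors while a drug is present
rate = mass_accumulation_rate([0.0, 0.1, 0.2, 0.3], [50.0, 50.2, 50.4, 50.6])
possibly_resistant = keeps_accumulating_mass(rate, threshold_pg_per_h=0.2)
```

In practice any cutoff would have to be calibrated against untreated cells of the same type, since baseline growth rates differ between cell types.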

Recently, Dr. Manalis and his laboratory members developed an approach in which cells are treated with a compound, sorted one by one by means of a microfluidic device, and placed in a well plate. In collaboration with MIT investigators (in the laboratories of Alex K. Shalek, Ph.D., and Douglas A. Lauffenburger, Ph.D.) and Dana-Farber Cancer Institute scientists (in the laboratories of David M. Weinstock, M.D., William C. Hahn, M.D., Ph.D., and Keith L. Ligon, M.D., Ph.D.), they are using this approach to discover mechanisms of resistance.

“We look inside each well and determine whether the cell it contains is growing or not growing in response to a particular drug, and then we take that plate, which might have one or many cells, perform single-cell RNA sequencing, and compare the transcriptome of a responsive cell to that of a nonresponsive one, with the hope of identifying new pathways to target,” explained Dr. Manalis.

In this experiment, the mass accumulation rate, a biomarker that integrates many signals, is informative of drug responses that might otherwise not be captured simply by looking at a particular mutation. “If we can use this device to guide how we analyze the single-cell RNA-Seq data and annotate cells as resistant or nonresistant, then that would allow us to analyze the data in a new way,” Dr. Manalis stated.

This approach links a biophysical measurement with the transcriptome to capture a level of insight that one might not be able to obtain from either of the measurements by itself. “We developed the pipeline for doing this,” remarked Dr. Manalis. “And we are starting to take a deep dive into various cancer systems to test this out and see if we can discover new ways to target based on this approach.”


To determine the response of cancer cells to therapy, MIT researchers measure how individual cell masses change when cells are exposed to drugs. Their microchip fluidically links a series of tiny fluid-filled diving boards, whose vibrations precisely reveal the masses of the cells passing through them. As cells flow across the array of sensors, each cell is weighed multiple times, thereby revealing the rate at which individual cells undergo mass changes. Researchers are exploring whether these measurements, when applied to patient tumor cells, can predict the optimal treatment strategy. [Selim Olcum, Ph.D., Manalis Lab, and Nathan Cermak, Ph.D., Manalis Lab]

Giving Cell Cycle Progression a Spin

“When we look at all the cells together, that pulls away the individual differences between the cells and can mask potential subpopulations,” said Anders Ståhlberg, Ph.D., associate professor in the department of pathology and genetics, Sahlgrenska Cancer Center, University of Gothenburg, Sweden. “Studies at the single-cell level, however, provide a different resolution.”

A major effort in Dr. Ståhlberg’s lab focuses on understanding the regulation of genes involved in cell cycle progression. In most studies that examine the cell cycle, cells are artificially synchronized in preparation for the experiment. In previous work, Dr. Ståhlberg’s and other groups have reported that cell synchronization may cause cellular stress and change gene-expression profiles.

“Traditional approaches to synchronize cells can introduce a lot of biases,” commented Dr. Ståhlberg. As a result, obtaining and preparing the sample emerges as potentially the most challenging part of the overall experiment, “and simply analyzing the material becomes the easy part,” he continued.

Recently, Dr. Ståhlberg and colleagues examined single-cell gene-expression profiles in cells at different phases of the cell cycle and of different sizes. For these experiments, fluorescence-activated cell sorting and quantitative real-time PCR (qPCR) were used to profile 93 genes at the single-cell level across three cell lines and to examine the transitions between stages of the cell cycle.

The distribution of transcripts varied greatly among individual cells and correlated with the cell cycle but not with cell size. Individual cells showed highly variable and overlapping gene-expression patterns. “We now have techniques that help us work better with individual cells,” asserted Dr. Ståhlberg. “And we need to become even better in handling small sample sizes or samples in which the RNA is partly degraded.”

Further analyses that used feature elimination revealed that a nine-gene panel performs comparably to the full gene set in classifying the cell cycle phase, and the panel helped identify a cellular subpopulation that was not clearly visible when all the genes were used. This gene panel was informative for characterizing cell cycle progression in all three cell lines used in the study. The analysis illustrated that excluding noninformative genes can reveal a novel subpopulation of cells that is characterized by low total transcript levels and the downregulation of several genes involved in cellular proliferation.

Gene Signature Measurement

“The transcriptome provides a functional readout in which one can visualize the status of the material to a higher degree compared to analyzing the structural information provided by DNA,” said Yujin Hoshida, M.D., Ph.D., associate professor of medicine at the Icahn School of Medicine at Mount Sinai.

In many medical conditions, including cancers, the elusive mechanisms of pathogenesis remain a major obstacle in identifying therapeutic strategies. Sometimes, these challenges emerge due to the multiple potential causes that may converge on a pathological outcome. One example is liver cirrhosis, a risk factor for liver cancer, which may have several different etiologies, including hepatitis B or C viruses, alcohol, or non-alcoholic fatty liver disease.

Describing a study that set out to identify a liver cancer risk gene signature and cancer chemoprevention targets, Dr. Hoshida explained, “The main issue has been the technical challenge in measuring the transcriptome in a robust way.” For robust gene signature measurement, Dr. Hoshida and colleagues used a digital transcript counting assay known as NanoString.

“This is one of the platforms designed to survey the transcriptome more robustly,” he noted. His team also performed a transcriptome meta-analysis of over 500 patients with cirrhosis to explore cancer chemoprevention targets. This approach led to the identification of several gene-expression modules, one of which was uniquely informative about the future de novo development of liver cancer, suggesting its specific involvement in carcinogenesis.

An unbiased survey of the transcriptome signatures produced by shRNA library-based knockdown of over 5,000 genes, together with an in silico screen of over 20,000 chemical perturbations, supported the involvement of the lysophosphatidic acid pathway in the development of liver cancer in cirrhotic humans and rodents.

In an in vivo experimental model of cirrhosis-driven liver cancer, the pharmacological inhibition of the lysophosphatidic acid pathway reversed the liver cancer gene-expression signature and suppressed cancer development, demonstrating the promise of transcriptome analysis in precision cancer prevention therapy.

“In translating such research findings into the clinic, the business development aspect poses significant challenges due to the still elusive patent landscape in transcriptome-based biomarkers,” Dr. Hoshida added.

Gene Fusions

“The presence of gene fusions in cancer has been known for many years,” said Hui Li, Ph.D., associate professor of pathology at the University of Virginia School of Medicine. The classical example of a gene fusion is the Philadelphia chromosome, in which the ABL1 gene from chromosome 9 is fused to the BCR gene from chromosome 22, and the hybrid tyrosine kinase that is formed as a result leads to the uncontrolled cell proliferation that occurs in chronic myelogenous leukemia. “Essentially this lays the foundation of molecular diagnosis for detecting the fusion product and diagnosing a patient,” he added.

For many years, it has been assumed that all fusion products result from chromosomal rearrangements that form when pieces of a gene are juxtaposed to pieces of other genes. “However, we found that these fusions can be made not only by DNA-level changes but they also occur in the RNA by intergenic splicing,” Dr. Li explained.

Intergenic splicing occurs when the exon of one gene splices with the exon of another gene during the RNA processing step. Another assumption has been that the presence of these fusion products is generally indicative of pathological processes. “However, totally by accident, we found something against this traditional wisdom, which is that fusion RNAs are not cancer-specific, and they are also present as part of normal physiology,” Dr. Li noted.

These findings point toward the dangers of assuming that fusion RNAs invariably have utility as cancer biomarkers. “One has to first carefully validate fusion RNAs and settle out all these normal existing fusion RNAs,” he said. RNA fusion events cannot be captured by exome sequencing or even by whole genome sequencing.
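The filtering step Dr. Li describes can be pictured with a short Python sketch. It is a simplified illustration and not the pipeline used by his laboratory. It assumes an upstream aligner has already reported, for each chimeric read, a hypothetical (read ID, upstream gene, downstream gene) record, and that a set of gene pairs observed in normal tissue is available. The gene names and the read-support cutoff are placeholders.

```python
from collections import defaultdict


def candidate_fusions(split_reads, min_support=2):
    """Count distinct reads joining two different genes.

    `split_reads` holds (read_id, upstream_gene, downstream_gene) records;
    this record layout is an assumption made for illustration.
    """
    supporting = defaultdict(set)
    for read_id, upstream_gene, downstream_gene in split_reads:
        if upstream_gene != downstream_gene:  # ignore ordinary within-gene splicing
            supporting[(upstream_gene, downstream_gene)].add(read_id)
    return {
        pair: len(read_ids)
        for pair, read_ids in supporting.items()
        if len(read_ids) >= min_support
    }


def remove_normal_fusions(candidates, normal_pairs):
    """Drop gene pairs that are also observed in normal samples."""
    return {pair: n for pair, n in candidates.items() if pair not in normal_pairs}


reads = [
    ("r1", "GENE_A", "GENE_B"),
    ("r2", "GENE_A", "GENE_B"),
    ("r2", "GENE_A", "GENE_B"),  # same read reported twice, counted once
    ("r3", "GENE_C", "GENE_D"),
    ("r4", "GENE_C", "GENE_D"),
    ("r5", "GENE_E", "GENE_E"),  # not a chimera
]
candidates = candidate_fusions(reads, min_support=2)
tumor_only = remove_normal_fusions(candidates, {("GENE_C", "GENE_D")})
```

A pair that survives this filter is only a candidate, and it would still need experimental validation before being treated as a biomarker.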

“Performing RNA sequencing adds another layer, and capturing all these molecules that are potentially misregulated in cancer provides a new repertoire for biomarkers and drug targets,” Dr. Li said.

In a recent study, Dr. Li and colleagues sequenced RNA from a cell line of alveolar rhabdomyosarcoma, an aggressive myogenic pediatric cancer. “One thing about this type of cancer is that for years, people have been seeking the cell of origin,” explained Dr. Li. For patients with advanced stages of alveolar rhabdomyosarcoma, the prognosis has not improved significantly in recent years, and insufficient knowledge about the cell of origin is a therapeutic limitation.

In studies that sequenced the RNA of rhabdomyosarcoma cells and of cells undergoing physiological muscle differentiation, Dr. Li and colleagues revealed that 18 chimeric RNA molecules from an alveolar rhabdomyosarcoma cell line were expressed at myogenic time points during muscle cell differentiation. This occurred at time points when PAX3–FOXO1, a fusion RNA critical for the development of this cancer, is also expressed. Only one of these fusions was induced by PAX3–FOXO1, and the other chimeras were not downstream of it.

These findings revealed the possibility of using fusion RNA profiling to interrogate the etiology of fusion genes relevant for cancer biology. The expression of PAX3–FOXO1 in this cancer and also in certain cells during normal myogenesis supports the myogenic origin of this cancer.

“RNA sequencing expands our tools, and we should look for this underdeveloped and underappreciated pool of biomarkers and potential drug targets, as well as enhancing our basic understanding of disease,” Dr. Li noted.

Identifying the true cell of origin for a malignancy may help develop more rational therapies. “One of the bottlenecks in transcriptome analyses is that the sequencing length and the sequencing depth are still limiting. Longer and deeper reads together with proper bioinformatics provide a better chance to identify fusion RNAs,” Dr. Li concluded.