Feature Articles: March 15, 2017 (Vol. 37, No. 6)

Gene Expression’s Big Rethink

If The “One Gene, One Protein, One Function” Idea Were True, We Would Have Genomic Gridlock

  • One gene, one protein? No. One gene, one functional product? No, not that, either—even though saying “functional product” has the virtue of recognizing that a stretch of DNA may give rise to a protein or a noncoding RNA.

    Whatever we may assume a gene will do, we should avoid perpetuating the idea that it will do just one thing—or that it will do one thing all by itself.

    So, we should forget “one to one.” Instead, we should think “one to many” or even “many to many.” If we feel unequal to the task, we needn’t despair. We can always resort to bioinformatics.

    We might have dispensed with one-to-one thinking a long time ago, at least as far back as the Human Genome Project. Back then, it was still surprising that the genome contained just 22,000 protein-encoding genes. Not only was this number smaller than scientists had expected, it corresponded to just 1.5% of the genome’s total content. The remaining 98.5% of the genome was sometimes called “junk.” That remainder has since come to be appraised more highly. It is now better understood to contain stretches of DNA that encode RNA molecules that function not as templates for protein synthesis, but as regulatory elements.

    Yet, even as we expand our concept of genomic function, we still need to guard against one-to-one thinking. Whether we are dealing with the genome’s protein-encoding elements or noncoding RNAs, we need to be aware, at a minimum, of variant forms—both protein isoforms and RNA isoforms, the latter of which may include isoforms of miRNAs (microRNAs), or isomiRs.

    “Most of the downstream analyses have been based on the assumption that one gene makes one functional product,” says Ramana V. Davuluri, Ph.D., professor of preventive medicine at Northwestern University Feinberg School of Medicine. “This assumption, from a bioinformatics perspective, is too simplistic.”

    Dr. Davuluri leads a bioinformatics group that is interrogating gene-expression signatures and developing diagnostic and prognostic tools. The group is well aware of recent findings that over half of the genes encoded in the human genome produce multiple protein isoforms with potentially varied functions. These findings cast doubt on the notion that the “gene” is the functional unit in a living cell.

    “In mammalian cells,” notes Dr. Davuluri, “the total products of the transcriptome could be up to 200,000 if all variants and noncoding genes are included.”

  • Complementary Assays

    Two technologies used to interrogate gene expression, RNA-Seq and microarray analysis, often return strongly correlated results. These technologies, however, have not been evaluated for their concordance at the isoform level.

    To understand the correlation between RNA-Seq and exon-array platforms in detecting isoforms, Dr. Davuluri and colleagues compared gene- and isoform-level expression for glioblastoma multiforme transcripts from The Cancer Genome Atlas (TCGA). Glioblastoma multiforme is one of the three malignancies for which TCGA contains both RNA-Seq and exon-array data.

    The investigation revealed that only about 36% of the differentially expressed isoforms identified by RNA-Seq were also classified as differentially expressed by exon arrays, and that about 70% of the ones classified as differentially expressed by exon arrays were also classified as such by RNA-Seq, indicating that isoform-level expression may be masked by gene-expression estimates.
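
    The two percentages differ because overlap is measured in each direction against a different denominator. A minimal sketch of that calculation follows; the isoform identifiers are made up for illustration and are not TCGA data, and the function name is hypothetical.

```python
def directional_concordance(calls_a, calls_b):
    """Return (fraction of A's calls also in B, fraction of B's calls also in A)."""
    a, b = set(calls_a), set(calls_b)
    shared = a & b
    frac_a = len(shared) / len(a) if a else 0.0
    frac_b = len(shared) / len(b) if b else 0.0
    return frac_a, frac_b


# Hypothetical differentially expressed isoform IDs (illustrative only).
rnaseq_calls = {"iso1", "iso2", "iso3", "iso4", "iso5", "iso6"}
array_calls = {"iso1", "iso2", "iso7"}

rnaseq_in_array, array_in_rnaseq = directional_concordance(rnaseq_calls, array_calls)
print(f"RNA-Seq calls also made by the array: {rnaseq_in_array:.0%}")
print(f"Array calls also made by RNA-Seq: {array_in_rnaseq:.0%}")
```

    When one platform reports a longer list than the other, the same shared set is a small fraction of the longer list and a large fraction of the shorter one, which is the pattern reported above.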

    “Gene-expression arrays and RNA-Seq will be used in a complementary manner,” asserts Dr. Davuluri. “And if the costs of sequencing drop further, people will use sequencing more and more.”

    While microarrays are more cost-effective, RNA-Seq provides several advantages, including single-nucleotide resolution and the possibility of performing analyses without prior knowledge about the targeted sequences.

    To quantitatively compare gene-expression measurements between different analytical platforms and allow signatures to be transferred across them, Dr. Davuluri and colleagues made use of the PIGExClass (platform-independent isoform-level gene-expression-based classification) system. Using this computational tool, the investigators performed the first isoform-level assay for the molecular stratification of cancer.

    Dr. Davuluri’s group examined exon-array and RNA-Seq isoform-level profiles for glioblastoma multiforme samples, and it illustrated the possibility of stratifying patients into one of the four molecular subgroups. As a result of the isoform-level analysis, the subgroup classification changed for 19% of the samples, leading to a different prognostic classification, a finding of critical therapeutic and prognostic relevance.

    “The technology for the data-generating platforms moves fast,” comments Dr. Davuluri. “But the data that comes out from the platforms cannot be understood without informatics.”

  • miRNA Isoform Analysis

    “We used to assume that disease was a matter of a certain number of regulatory molecules and a certain number of regulatory targets,” says Isidore Rigoutsos, Ph.D., professor of pathology, anatomy, and cell biology and director of the Computational Medicine Center at Thomas Jefferson University. “But we have shifted away from this abstract or reductionist view. We have developed an understanding of disease that is much more complex.”

    Paralleling the conceptual shift that led to the departure from the one gene-one polypeptide hypothesis, Dr. Rigoutsos and colleagues revealed that a similar reductionist view has existed when describing nonprotein-encoding genomic loci. In a recent study, Dr. Rigoutsos and colleagues sifted through TCGA data, catalogued miRNA isoforms that could be detected by RNA-Seq, and revealed that some miRNA loci produce several isoforms. In this dataset, the analysis indicated that each locus generated about five isomiRs on average. The distribution of isoforms among loci was uneven, however, and as many as a few dozen distinct isomiRs could be detected from an individual locus.
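
    The per-locus tally described here can be sketched in a few lines. This is an illustration under stated assumptions, not the group's pipeline: the locus names and sequences are invented, and the input is assumed to be a list of (locus, isomiR sequence) observations.

```python
from collections import defaultdict
from statistics import mean


def isomirs_per_locus(records):
    """Map each locus to the number of distinct isomiR sequences observed."""
    sequences = defaultdict(set)
    for locus, sequence in records:
        sequences[locus].add(sequence)
    return {locus: len(seqs) for locus, seqs in sequences.items()}


# Hypothetical (locus, isomiR sequence) observations; repeats are counted once.
records = [
    ("locus_A", "AAAC"), ("locus_A", "AACG"), ("locus_A", "AAAC"),
    ("locus_B", "GGCU"),
    ("locus_C", "UUAG"), ("locus_C", "UAGC"), ("locus_C", "AGCC"), ("locus_C", "UUAGC"),
]

counts = isomirs_per_locus(records)
average = mean(counts.values())
busiest = max(counts, key=counts.get)
print(f"mean isomiRs per locus: {average:.2f}; most diverse locus: {busiest}")
```

    An average alone hides the unevenness noted above, so the sketch reports the most diverse locus alongside the mean.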

    “If we include the isoforms, the same number of loci is found to encode more players, which have many more interactions with their own mRNA [messenger RNA] partners, and this provides many more opportunities to create therapeutic targets and approaches,” says Dr. Rigoutsos.

    Experiments from Dr. Rigoutsos’ group support the involvement of different miRNA isoforms in shaping subgroup classification and therapeutic and prognostic outcomes. In a study that involved patients with triple-negative breast cancer, Dr. Rigoutsos found that several isoforms of miRNA-183-5p were upregulated in triple-negative breast cancer in Caucasian but not in African-American women. Integrative analyses of miRNA/mRNA expression also revealed that the isoforms’ putative interactions differed extensively between luminal A and luminal B breast cancers, presenting distinct therapeutic and prognostic challenges.

    In cell-culture studies, Dr. Rigoutsos’ laboratory also found that different isomiRs from the same hairpin have distinct effects on mRNAs and the cellular transcriptome. For example, different isomiRs encoded by the miR-183-5p locus had different targetomes, and even a shift of two nucleotides with respect to the archetype miRNA markedly changed the effect of each individual isomiR on the transcriptome.
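
    One commonly cited reason a small shift can matter, offered here as background rather than as a claim from the study, is that miRNA target recognition depends heavily on a short “seed” near the 5′ end (nucleotides 2 to 8 in a widely used convention). The sketch below assumes the shift occurs at the 5′ end and uses an invented hairpin fragment, not the real miR-183 sequence; all function names are hypothetical.

```python
def mature_from_hairpin(hairpin, offset, length=22):
    """Cut a mature sequence of the given length starting at a 0-based offset."""
    return hairpin[offset:offset + length]


def seed(mirna):
    """Return nucleotides 2-8 (1-based) of a mature miRNA, one common seed convention."""
    return mirna[1:8]


# Invented hairpin fragment for illustration only (not a real miRNA precursor).
hairpin_fragment = "ACGUUGCAAGCUUCGGAUCCAUGGCUAAGCU"

archetype = mature_from_hairpin(hairpin_fragment, offset=2)
shifted = mature_from_hairpin(hairpin_fragment, offset=4)  # start moved by two nucleotides

print("archetype seed:", seed(archetype))
print("shifted seed:  ", seed(shifted))
print("seeds differ:  ", seed(archetype) != seed(shifted))
```

    Although the two mature sequences share 20 of their 22 nucleotides, their seeds are different strings, which is consistent with each isomiR having its own set of targets.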

    Collectively, these analyses revealed that the multitude of miRNA isoforms produced from a miRNA locus provides a much more detailed understanding of the post-transcriptional processes that orchestrate the regulatory events in breast cancer, as compared to only the archetype miRNA produced by the respective locus.

    While protein isoforms have been known for many years, the discovery of miRNA isoforms is much more recent. “We were able to use a lot of approaches and learn a lot about what protein isoforms do, but we did not have the same amount of time, and did not spend the same amount of effort, to understand what the different microRNAs from the same locus do,” says Dr. Rigoutsos.

  • Genetic Variation and Drug Response

    Imagine taking a patient’s skin cells, using them to derive induced pluripotent stem cells [iPSCs], differentiating the stem cells to produce cells of a particular type, and then exposing the differentiated cells to drugs that the patient might be given, suggests Russ B. Altman, M.D., Ph.D., professor of bioengineering, genetics, medicine, and biomedical data science at Stanford University. Such procedures might detect the potential for drug-induced toxicity and reduce the incidence of serious side-effects in the clinical setting.

    The ability to predict adverse effects is particularly important for therapeutic agents that are associated with a high likelihood of failure or adverse effects. Predicting adverse effects could also help tailor treatments in a more rational manner.

    An example of a drug with a challenging adverse effect profile is doxorubicin. This chemotherapeutic agent is known to be cardiotoxic in some patients, but predicting which patients are at risk is difficult. In fact, no reliable means of predicting doxorubicin-induced cardiotoxicity (DIC) exists, so the drug cannot be administered with confidence.

    In a recent study conducted in collaboration with Dr. Paul Burridge from Northwestern University School of Medicine, Dr. Joseph Wu from Stanford Cardiovascular Institute, and other colleagues, bioinformatics analyses performed by Dr. Altman’s group were critical in showing that patient-specific human induced pluripotent stem cell-derived cardiomyocytes can recapitulate, at the single-cell level, the predilection to develop doxorubicin-induced cardiotoxicity.

    “It was pretty straightforward, on the informatics side, to show a correlation between the cellular responses and the clinical responses,” asserts Dr. Altman. “This correlation is incredibly exciting.”

    Human iPSCs obtained from female patients with breast cancer and matched with healthy volunteers were differentiated into cardiomyocytes. RNA-Seq and microarray analyses were subsequently used to profile and compare gene-expression changes in the cardiomyocytes derived from the healthy volunteers and in those from the breast cancer patients with and without clinical DIC. Cells derived from patients presenting clinical DIC were more sensitive to therapy, exhibited increased metabolic stress and reactive oxygen species, and had impaired intracellular calcium signaling, as compared to cells derived from patients who did not show clinical DIC.

    Using microarray analyses to examine gene-expression perturbations in response to various doxorubicin concentrations, this study revealed that in vitro, the cardiomyocytes recapitulated patients’ predilection to DIC. The study also indicated that genetic and molecular analyses could provide a powerful tool to predict clinical toxicity to therapeutic agents.

    “The findings in the research setting are very intriguing,” comments Dr. Altman. “There is a lot of engineering to make them more reliable and reproducible.”

    Even though stem cell studies have shown a lot of promise, reproducibility has been particularly challenging, and results from different labs may vary depending on multiple factors, including small differences in experimental protocols and the versions of the stem cells used by various labs, for which it is very difficult to show equivalency.

    “The work is only half complete when the research is published,” Dr. Altman concludes. “Lots of details need to be addressed before this can be put into routine clinical use.”