Compared with RNA or DNA sequencing, protein sequencing got an early start. It dates to the 1950s, when Frederick Sanger adapted partition chromatography and other techniques to determine the amino acid sequences of insulin’s A and B chains. RNA and DNA sequencing appeared years later, and they quickly began to outpace protein sequencing. They continued to race ahead even after protein sequencing got a boost from mass spectrometry (MS).
Although MS doesn’t actually sequence proteins, it can analyze protein fragments well enough to allow protein identities to be inferred. In combination with a separation technology, such as liquid chromatography (LC), MS enables large-scale protein profiling. MS is, in fact, the workhorse of proteomics.
Still, MS systems have their limitations. They can derive only so much meaning from protein fragments, which may fail to convey structural information despite providing suggestive mass/charge ratios. Also, MS systems rely on protein databases, which may be incomplete. Finally, MS systems may lack sufficient sensitivity to allow the detection of less abundant proteins.
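The database dependence is easy to see in a toy example. The Python sketch below assumes a hypothetical two-entry peptide database and made-up observed masses; real search engines score whole fragment spectra rather than single masses, so this only illustrates why an entry missing from the database cannot be identified.

```python
# Toy mass-matching sketch. The database entries, observed masses, and
# the 10 ppm tolerance are illustrative assumptions, not real data.

def ppm_error(observed, theoretical):
    """Relative mass error in parts per million."""
    return (observed - theoretical) / theoretical * 1e6

def match_masses(observed_masses, database, tolerance_ppm=10.0):
    """Return, for each observed mass, the database entries within tolerance."""
    hits = {}
    for obs in observed_masses:
        hits[obs] = [
            name
            for name, theoretical in database.items()
            if abs(ppm_error(obs, theoretical)) <= tolerance_ppm
        ]
    return hits

# Hypothetical database of peptide names and monoisotopic masses (Da).
database = {"PEPTIDE_A": 799.3599, "PEPTIDE_B": 1045.5345}

# The first mass is close to PEPTIDE_A; the second has no counterpart.
observed = [799.3605, 1500.7000]

hits = match_masses(observed, database)
for mass, names in hits.items():
    print(mass, names if names else "no match in database")
```

The second mass goes unidentified even though it was measured, simply because nothing in the database corresponds to it.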
Although MS-based proteomics has been refined over the years, it is still labor intensive, time consuming, and expensive. And it still doesn’t provide true sequencing. In contrast, nucleic acid sequencing is an established technique. Moreover, it is automated, fast, and cheap.
Of course, proteomics and genomics are not competitors. Rather, they are partners. Together, along with other “omics” approaches, they promise to reveal molecular patterns spanning multiple layers of biology and informing new approaches to diagnostics and therapeutics. These complementary approaches stand to make precision medicine a reality.
Fortunately, there are signs that proteomics is beginning to progress the way genomics has progressed. For example, in proteomics, progress is becoming less dependent on the initiatives of individual laboratories and companies. Proteomics, like genomics before it, is starting to benefit from international collaborations. The Human Proteome Project (HPP), launched in 2010, is creating an international framework for collaboration, data sharing, quality assurance, and accurate annotation of the proteome.
Another promising sign is the continued refinement of MS-based proteomics and the development of new technologies that go beyond MS and approach true sequencing capabilities. New technologies include fluorescent fingerprinting and subnanopore arrays. Also, with the help of bioinformatics technology, proteomics is positioned to benefit from coordination with other omics technologies.
Improvements in proteomics are already having an impact. According to BCC Research, the global market for proteomics is expected to reach $16.8 billion by 2022 while growing at an annual rate of 16.2%.
Advances in MS
Two-dimensional gel electrophoresis was the popular method for separating and quantifying proteins until the introduction of MS in the 1990s. In combination with improved separation and bioinformatics technologies, MS has brought high-speed, high-resolution, and high-throughput operations to protein profiling while enabling the analysis of complex biological samples and facilitating novel applications in biomarker discovery, drug development, and diagnostics.
MS continues to demonstrate improvement, asserts Andreas Hühmer, PhD, senior director, life sciences research, chromatography, and mass spectrometry at Thermo Fisher Scientific. “Earlier this year, [we] introduced the Thermo Scientific Orbitrap Exploris 240 and Thermo Scientific Orbitrap Exploris 120 mass spectrometers,” he says. “[These systems can] simplify access to proteomics for nonexperts, making LC-MS-based protein identification and quantification … highly accessible in biological research.” He notes that users can analyze thousands of proteins in one hour.
Data acquisition on tandem mass spectrometers has been shifting from data-dependent acquisition (DDA), a mode in which only the most abundant precursor ions detected in a survey scan are selected for fragmentation, to data-independent acquisition (DIA), a mode in which all peptides within a narrow mass window are fragmented and measured as the window cycles across a larger predetermined range.
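In the usual sense of the terms, DDA fragments only the most intense precursors found in a survey scan, whereas DIA fragments everything that falls inside a stepped isolation window. The Python sketch below is a toy illustration of that difference, not vendor acquisition software. The survey scan, the top-N setting, and the window width are all made-up assumptions.

```python
# Toy illustration of precursor selection in DDA versus DIA.
# All m/z values, intensities, and settings are hypothetical.

def dda_select(precursors, top_n):
    """DDA: keep only the top_n most intense precursors from a survey scan."""
    ranked = sorted(precursors, key=lambda p: p[1], reverse=True)
    return sorted(mz for mz, _ in ranked[:top_n])

def dia_windows(mz_start, mz_end, width):
    """DIA: step fixed-width isolation windows across the whole m/z range."""
    windows = []
    lower = mz_start
    while lower < mz_end:
        windows.append((lower, min(lower + width, mz_end)))
        lower += width
    return windows

def dia_assign(precursors, windows):
    """DIA: every precursor is fragmented in the window that contains it."""
    return {
        w: [mz for mz, _ in precursors if w[0] <= mz < w[1]]
        for w in windows
    }

# (m/z, intensity) pairs from one imaginary survey scan.
survey_scan = [
    (402.2, 9.0e5),
    (455.7, 1.2e4),
    (510.3, 4.0e6),
    (623.8, 3.1e3),
    (688.4, 7.5e5),
]

dda_targets = dda_select(survey_scan, top_n=3)
windows = dia_windows(400, 700, 100)
dia_targets = dia_assign(survey_scan, windows)

print("DDA fragments:", dda_targets)
for window, members in dia_targets.items():
    print("DIA window", window, "fragments:", members)
```

In this toy scan, DDA skips the two weakest precursors, while DIA fragments all five, at the cost of mixing co-isolated peptides in the same spectrum.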
The first hurdle to clear in a proteomic experiment is sample preparation, a task that includes sample clean-up, fractionation, enrichment, and optimization. “We use various enrichment strategies to get us the right cells based on their surface markers, and to deliver that downstream functional readout of each cell that we analyze,” explains Sean Mackay, co-founder and CEO of IsoPlexis. “Our viewpoint is going to be the end-to-end workflow, which includes these sample prep or enrichment areas of expertise, which will allow us to unlock the clinical benefit of proteomics.”
Proteomic workflows typically require protein separation technology, which may use precipitation, membranes, chromatography, electrophoresis, molecular imprinting, or microfluidic chips to separate proteins from biomaterials.
“[We] recently evaluated Chromolith CapRod monolithic capillary columns for discovery proteomics,” says Kevin Ray, PhD, head of analytical research and development at MilliporeSigma. “The use of such ultralong columns with gradients of six hours or more provides a powerful option where ultradeep profiling is required.”
Quantitation and automation
Hurdles involved in sample preparation for MS are being addressed through quantitation and automation of complex liquid-handling workflows together with the introduction of more effective reagents.
“We are continuing to expand our MS reagent portfolio with an eye toward making the technology more accessible, robust, and reproducible,” Ray elaborates. “This includes the provision of immunoaffinity-MS kits for measurements of proteins and antibodies in serum; the expansion of our line of full-length stable isotope labeled proteins; and the use of certified reference materials for quantitative proteomics assays.
“We are expanding upon the success of our SOLu-Trypsin in the development of other solution-stable proteases and reagents, and we will be expanding our Universal Proteomics Standards portfolio to include solutions for high-throughput analyses and automation within discovery workflows.”
Additional quantitation issues are raised by John Rontree, PhD, portfolio director and leader of mass spectrometry at PerkinElmer. “To understand the functions of individual proteins and their place in complex biological systems, it’s necessary to measure quantitative changes in protein abundance relative to those of the system,” he points out. “This is a major factor in the field of translational proteomics, where triple quadrupole MS instruments, such as the PerkinElmer QSight® LC/MS/MS, are now deemed staples. In this respect, modern proteomics will shift from its initial high-end qualitative instruments to simple quantitative MS technologies akin to those in clinical laboratories today.
“Innovation will also revolve more around state-of-the-art workflows to include automated sample preparation, bespoke standards and reagents, and liquid chromatography separation—all controlled through a universal software architecture to embrace these techniques in a single solution,” he continues. “Sample preparation and its automation is perhaps the most challenging area to address among the platforms being created within workflows.
“PerkinElmer has worked hard on this area as part of the global COVID-19 testing push where we have seen widescale adoption of our Janus® G3 automated prep systems to support high-throughput PCR testing environments.”
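One common way to read "changes in protein abundance relative to those of the system" is to compare each protein's fold change with the typical fold change across all measured proteins. The Python sketch below assumes that interpretation, uses median centering as the normalization, and runs on invented abundance values. It is not a description of any vendor's software.

```python
import math
from statistics import median

def normalized_log2_fold_changes(control, treated):
    """Log2 fold change per protein, centered on the median across proteins.

    Median centering is an assumed normalization; it removes a global
    shift (for example, unequal sample loading) shared by all proteins.
    """
    raw = {
        protein: math.log2(treated[protein] / control[protein])
        for protein in control
    }
    shift = median(raw.values())
    return {protein: value - shift for protein, value in raw.items()}

# Hypothetical abundances in arbitrary units.
control = {"P1": 100.0, "P2": 200.0, "P3": 400.0}
treated = {"P1": 200.0, "P2": 400.0, "P3": 3200.0}

changes = normalized_log2_fold_changes(control, treated)
for protein, value in changes.items():
    print(protein, round(value, 2))
```

Here every protein doubles, so doubling is treated as the system-wide baseline, and only P3 stands out as changed relative to the rest.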
The power of one: Single-cell proteomics
Cells with identical genomes may have radically different proteomes, and single cells in a complex tissue may have proteomes that vary widely from the population average. Single-cell biology is therefore the key to understanding basic biology, stem cell differentiation, mechanisms of cancer recurrence, and drug efficacy.
“The exciting thing about what we’re doing is the convergence between single-cell biology and proteomics for the first time,” says Mackay of the work at IsoPlexis. The company, which developed the IsoLight single-cell proteomics system, recently launched the IsoSpark and IsoSpark Duo systems.
The IsoSpark is designed to be widely accessible. It occupies a small footprint and offers lower throughput than the IsoLight, which is suited to core laboratories that run numerous samples. The IsoSpark Duo can run two chip families simultaneously, facilitating functional immune profiling. The systems can be used for applications such as immune landscaping, intracellular signaling, omics, and high-plex automated immune assays.
“We are an engineering company that thinks of the space-time continuum in the context of biology, specifically in the single cell,” notes Mackay. Essentially, IsoPlexis is developing technology to enable explorations of the secretome, the phosphoproteome, and the metabolome that will help “connect the dots” in biology. “This approach,” Mackay insists, “has the potential to unlock breakthroughs in the clinic such as next-generation biomarkers.”
Similar aims may be pursued by Thermo Fisher customers. According to Hühmer, some of the company’s customers are taking advantage of the sophisticated signal amplification approaches offered by the Thermo Scientific Orbitrap Exploris 480 mass spectrometer and the Thermo Scientific Orbitrap Eclipse Tribrid mass spectrometer. These instruments, he says, can be used to “quantify more than 2,000 proteins from a single cell.”
Looking into the interactome
A single cell contains approximately 10 billion protein molecules that may be seen as interactive network elements, much as the 7.8 billion people on Earth may be seen as participants in a vast human social network. Although a cell’s proteomic network operates on a smaller scale than the human social network, it might exhibit as much complexity.
In proteomic networks, proteins and protein complexes participate in dynamic, context-specific interactions locally and globally to support biological processes. Understanding these interactions could help scientists develop complex-selective therapeutics and clinical diagnostic methods.
“Identifying interactions in endogenous settings is difficult due to the inherent complexity of the protein population and the fleeting nature of their interactions,” admits Ray. “Two labeling strategies, covalent crosslinkers and proximity biotinylation, are allowing these interactions to be studied en masse with MS. Both labeling techniques allow scientists to tag those events within cells as they occur.”
An innovative bead-conjugation assay developed by PerkinElmer is being leveraged by the National Center for Advancing Translational Sciences, an arm of the National Institutes of Health, to support the search for effective COVID-19 therapeutics. With this assay, which is called AlphaLISA, light is emitted if a biomolecular interaction occurs in proximity of a bead-binding complex.
“Our proximity assays are [facilitating] antiviral drug repurposing research during the COVID-19 pandemic,” states Anis H. Khimani, PhD, senior strategy and market segment leader at PerkinElmer. They are helping to identify therapeutic compounds capable of disrupting the interaction between the SARS-CoV-2 spike protein and the human host receptor, ACE2.
Beyond mass spectrometry
Currently, proteomics relies mainly on MS, an approach that differs from earlier protein-sequencing approaches such as Edman degradation. Rather than truly sequencing proteins, MS classifies enzymatically digested proteins based on the mass/charge ratios of the resulting fragments. Although MS-based proteomics matches genomics in terms of throughput and accuracy, it suffers in terms of sensitivity. Limited sensitivity and insufficiently long read lengths make it difficult for MS to detect rare proteins in mixtures containing abundant proteins.
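To make "mass/charge ratios of enzymatically digested proteins" concrete, the Python sketch below digests a sequence with a simplified trypsin rule (cleave after K or R unless the next residue is P) and computes the m/z of each peptide from standard monoisotopic residue masses. The protein sequence is hypothetical, and the sketch ignores missed cleavages and modifications.

```python
# Monoisotopic residue masses (Da) for the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565   # added once per peptide for the free termini
PROTON = 1.007276   # mass of each charge-carrying proton

def trypsin_digest(sequence):
    """Simplified trypsin rule: cleave after K or R unless followed by P."""
    peptides, start = [], 0
    for i, residue in enumerate(sequence):
        next_residue = sequence[i + 1] if i + 1 < len(sequence) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        peptides.append(sequence[start:])
    return peptides

def peptide_mass(peptide):
    """Neutral monoisotopic mass of an unmodified peptide."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER

def mz(peptide, charge):
    """Mass/charge ratio of the peptide carrying `charge` protons."""
    return (peptide_mass(peptide) + charge * PROTON) / charge

protein = "GASKPEPTIDERVLK"  # hypothetical sequence for illustration
peptides = trypsin_digest(protein)
for pep in peptides:
    print(pep, round(mz(pep, 1), 4), round(mz(pep, 2), 4))
```

Note that leucine and isoleucine have identical masses, one reason why mass/charge ratios alone fall short of true sequencing.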
In a recent review in Science Advances, Winston Timp, PhD, assistant professor at the department of biomedical engineering, Johns Hopkins, and Gregory Timp, PhD, professor, electrical engineering and biological studies, University of Notre Dame, note that in the near term, MS will likely be augmented by long-read transcriptomics and cellular indexing of transcriptomes and epitopes by sequencing (CITE-seq) or spatial transcriptomics, techniques that can simultaneously profile specific protein and nucleic acid signatures.
Alternative approaches include fluorescent fingerprinting methods, which can generate millions of single-protein reads using fluorosequencing by Edman degradation. In the longer term, the scientists predict that “once the kinks with throughput, bandwidth, and noise are worked out,” arrays of subnanopores will hold center stage in protein sequencing given their “extreme sensitivity and prospects for scaling.”
Therapeutics, biomarkers, and other applications
Proteomic signatures are providing diagnostic, therapeutic, and prognostic biomarkers. They are also deepening our understanding of basic biology.
“One area where we foresee a growing role for proteomics is structural biology,” comments Hühmer. “Ever since the detector revolution in cryo-electron microscopy, interest in this field has really been rekindled.”
Proteomics is leveraging the power of single-cell biology to point to novel ways of understanding how immunotherapies are creating more curative responses in patients. Examples from cell and gene therapy are offered by Mackay.
“We’ve been able to pick up proteomic signatures predictive of in vivo response, for instance, in CD19-oriented cell therapies, but also in novel preclinical cell therapies for solid tumor environments, which have traditionally been very difficult to treat,” he states. “We’re going to utilize the single-cell phosphoproteome to target the complete set of intracellular pathways to treat not only the first pathway that is engaged in a tumor cell, but also the pathways that cause resistance and metastases downstream.”
PerkinElmer is expanding its protein biomarker kit offerings across key disease and virus areas. The company’s assay for assessing the ACE2/S1 protein-protein interaction was described earlier. Other assays may advance development of cell and gene therapies, oncological therapies, and autoimmune therapies. “For downstream bioanalytical separation and analysis, we offer the LabChip® GX II for adeno-associated virus applications, as well as our new LC 300 UHPLC system with SimplicityChrom™ chromatography data system,” adds Khimani.
Quantitative proteomics is being used to support the discovery, characterization, and optimization of therapeutic monoclonal antibodies. The development of stable isotope–labeled antibodies has enabled researchers to use LC-MS/MS to characterize and optimize their biotherapeutics in new and innovative ways. For example, assays for pharmacokinetic studies in animal models can be developed in a much quicker fashion using LC-MS/MS without the extensive assay validation associated with immunoassays. MS-based studies can also provide insights into the mechanism of a drug’s degradation and clearance.
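The principle behind quantitation with a stable isotope–labeled standard can be sketched as a single-point calculation. A known amount of the heavy (labeled) standard is spiked into the sample, and the unlabeled analyte is estimated from the ratio of light to heavy peak areas. The Python example below assumes equal response for the light and heavy forms and uses invented peak areas. Validated assays typically rely on multi-point calibration curves instead.

```python
def isotope_dilution_conc(light_area, heavy_area, heavy_spike_conc):
    """Estimate analyte concentration from a light/heavy peak-area ratio.

    Assumes the labeled standard and the analyte ionize and fragment
    identically, so the area ratio equals the concentration ratio.
    """
    if heavy_area <= 0:
        raise ValueError("heavy standard peak area must be positive")
    return light_area / heavy_area * heavy_spike_conc

# Hypothetical LC-MS/MS peak areas and a hypothetical 50 ng/mL spike.
estimated = isotope_dilution_conc(
    light_area=2.4e6,
    heavy_area=1.2e6,
    heavy_spike_conc=50.0,
)
print(f"Estimated analyte concentration: {estimated:.1f} ng/mL")
```

Because the heavy standard passes through the same preparation and separation steps as the analyte, losses along the way tend to cancel in the ratio.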
“We see increased use of stable isotope–labeled materials,” notes Ray. “We also see the need for certified reference standards in therapeutic drug monitoring studies to provide accurate measurements of clearance of these drugs in patients.
“Biomarker discovery and stratification will continue to be a huge topic in the coming years, particularly in the identification of biomarkers to assess prognosis, and in the classification of responders versus nonresponders to therapeutics. We also expect to see more refined assays. They may include new and existing biomarkers and reflect an improved understanding of isoforms and post-translational modifications.”
Challenges
Advances in instrumentation and bioinformatics have widened the application of proteomics in basic research and in the development of clinical therapeutics. However, challenges remain in the automation of sample preparation, the standardization of analytical procedures, and the sharing of data (including data from reference datasets).
There is also an unwillingness to accept the expenses associated with initial setup and maintenance, and a reluctance to adopt innovative technology in the development of therapeutics. All these challenges discourage the widespread use of proteomics in basic, translational, and clinical laboratories. If they are not met, promising therapies will continue to be abandoned, and proteomics will be relegated to preclinical studies as a hypothesis-generating tool in select laboratories that can afford it. Democratizing proteomics technology will be pivotal in the coming decades.