In a proof-of-concept study, scientists at Delft University of Technology in the Netherlands and the University of Illinois have successfully repurposed DNA nanopore sequencing technology to scan single protein molecules.
As a helicase enzyme pulls a DNA-bound peptide string through a minuscule membrane channel, researchers can now decode changes in ion currents through the nanopore to read off the individual amino acid building blocks of the peptide one at a time. This ability is a landmark in protein identification, paving the way for single-molecule protein fingerprinting, de novo protein sequencing, and analyzing dynamic cellular proteomes.
The findings are published in an article in the journal Science titled, “Multiple re-reads of single proteins at single-amino-acid resolution using nanopores.”
Until now, information on the primary sequence of proteins has been largely obtained from DNA sequences. But neither DNA nor RNA sequences provide information on the abundance of proteins, their splicing, or their modifications post-synthesis. Despite proteins being the specialized functional machinery in our cells, we fall back on skeletal DNA blueprints to understand proteins.
Methods that do analyze proteins, such as mass spectrometry, chop them up into pieces and identify proteins from fragmented spectral signatures through comparisons with protein databases. Such methods are expensive, require large sample volumes, and cannot detect proteins found in low abundance in cells.
In the last few decades, scientists have sequenced single molecules of DNA using a cost-effective and portable nanopore-based technology capable of identifying epigenetic stamps in long reads. Cees Dekker, PhD, professor at Delft University of Technology, and his team have now adapted this method to sequence single protein molecules, one amino acid at a time.
“Over the past 30 years, nanopore-based DNA sequencing has been developed from an idea to an actual working device,” Dekker said. “This has even led to commercial hand-held nanopore sequencers that serve the billion-dollar genomics market. In our paper, we are expanding this nanopore concept to the reading of single proteins. This may have great impact on basic protein research and medical diagnostics.”
Henry Brinkerhoff, PhD, who pioneered this work as a postdoctoral fellow in Dekker’s lab and is first author on the paper, said, “Imagine the string of amino acids in one peptide molecule as a necklace with different-sized beads. Then, imagine you turn on the tap as you slowly move that necklace down the drain, which in this case is the nanopore. If a big bead is blocking the drain, the water flowing through will only be a trickle. If you have smaller beads in the necklace right at the drain, more water can flow through. With our technique we can measure the amount of water flow (the ion current actually) very precisely.”
“Peptide sequencing with nanopore technology faces challenges, primarily around the use of the controlled translocation of the molecule of interest through the nanopore for signal readouts from the ion current. For protein sequencing, this issue is further complicated by the overall charge distribution and bulkiness of the molecules,” said Nicholas Schork, PhD, deputy director and Distinguished Professor of Quantitative Medicine at the Translational Genomics Research Institute (TGen) in Phoenix, AZ, who has been working on nanopore technology for genomics and was not involved in the current study.
“The study by Brinkerhoff et al. resolved the issue of irregular translocation speed by attaching a short piece of DNA to a target peptide and included the use of a DNA helicase to pull the DNA component of a target molecule through a biological nanopore (MspA) at a controlled speed. As the DNA-peptide molecule is threaded through the pore, the change in ion current is read for the DNA section followed by the peptide section,” said Schork.
This new method of protein sequencing is highly specific and sensitive. The authors showed its ability to detect changes in single amino acid substitutions from changes in ion current readouts.
Dekker added, “A cool feature of our technique is that we were able to read a single peptide string again and again. We then average all the reads from that one single molecule, and thus identify the molecule with basically 100% accuracy.”
In the study, the authors demonstrated that they could rewind a peptide to obtain many independent scans of the same molecule, yielding an error rate of less than one in a million in identifying single amino acid variations.
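To see why repeated reads of one molecule can push the error rate so low, consider a minimal sketch in Python. It is not the authors' analysis. It assumes, purely for illustration, that two peptide variants differ in mean ion current by some amount `delta`, that each read carries independent Gaussian noise of width `sigma`, and that the variant is called by comparing the average of the reads with the midpoint between the two expected levels. The function names and parameters are hypothetical.

```python
import math


def misidentification_probability(delta, sigma, n_reads):
    """Chance of confusing two variants after averaging n_reads re-reads.

    Illustrative assumptions (not taken from the study): the variants differ
    in mean ion current by `delta`, each read has independent Gaussian noise
    with standard deviation `sigma`, and the call is made by thresholding the
    averaged signal at the midpoint between the two expected levels.
    """
    if n_reads < 1:
        raise ValueError("n_reads must be at least 1")
    if sigma <= 0 or delta <= 0:
        raise ValueError("delta and sigma must be positive")
    # Averaging n independent reads shrinks the noise to sigma / sqrt(n),
    # so the distance to the midpoint, in units of that noise, grows as sqrt(n).
    z = (delta / 2.0) * math.sqrt(n_reads) / sigma
    # Upper-tail probability of a standard normal beyond z.
    return 0.5 * math.erfc(z / math.sqrt(2.0))


def reads_needed(delta, sigma, target_error, max_reads=10_000):
    """Smallest number of re-reads whose averaged error is at or below target.

    Returns None if the target is not reached within max_reads.
    """
    for n in range(1, max_reads + 1):
        if misidentification_probability(delta, sigma, n) <= target_error:
            return n
    return None


if __name__ == "__main__":
    # Hypothetical numbers: a signal difference equal to the single-read noise.
    for n in (1, 10, 50, 100):
        p = misidentification_probability(delta=1.0, sigma=1.0, n_reads=n)
        print(f"{n:>3} reads -> error probability {p:.2e}")
    print("reads for a one-in-a-million error:",
          reads_needed(delta=1.0, sigma=1.0, target_error=1e-6))
```

Under these assumptions, a single read can be quite unreliable while the averaged call improves steeply with each additional pass, which is the intuition behind re-reading the same molecule. Real nanopore signals have correlated noise and irregular stepping, so the numbers here should not be read as estimates for the actual experiment.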
Aleksei Aksimentiev, PhD, professor of physics at the University of Illinois, and his team performed molecular dynamics simulations that showed how the ion current signals relate to the amino acids in the nanopore. These simulations showed that the changes in ion current detected as the peptide string makes its way down the nanopore drain result from size exclusion and from the binding of the peptide or its side chains to the inner walls of the nanopore.
“My lab used high-end computer simulations to determine how exactly the sequence of a polypeptide chain influences the blockade current through the MspA nanopore, explaining the counterintuitive dependence of the blockade current on the physical size of the amino acids,” said Aksimentiev.
Chemical groups, such as sugars or phosphates, that attach to proteins after they are synthesized change the proteins' native configurations and add another layer to the complexity of determining the true state of cellular proteomes. Dekker said, “These changes are crucial to protein function, and a marker for diseases such as cancer. We think our new approach will allow us to detect such changes, and thus shine some light on the proteins that we carry with us.”
De novo protein sequencing directly determines the amino acid sequence of a protein without referring to known sequences or protein databases. In its present form, the technology is not capable of de novo protein sequencing.
“Because a mapping of ion current to amino acids is lacking in the approach due to the complexity of this mapping, true de novo protein sequencing remains a challenge. Nevertheless, the proposed method is applicable to identifying peptide sequence variations, as shown when authors compared the results of their amino acid sequencing to known reference proteins harboring amino acid substitutions,” said Schork.
“A real strength of the technique is that it can be implemented using existing nanopore sequencing hardware (e.g. the commercial Oxford Nanopore MinION system) by simply changing sample preparation and data analysis protocols. This will allow studies exploring the strategy to be pursued easily and accommodate improvements in the necessary bioinformatics workflows for interpreting the assay,” added Schork.
Brinkerhoff said, “Our approach might lay a basis for a single-protein sequencer in the future, but de novo sequencing remains a big challenge. For that, we still need to characterize the signals from a huge number of peptides to create a ‘map’ connecting ion current signals to protein sequence. Even so, the ability to discriminate single amino acid substitutions in single molecules is a major advance, and there are many immediate applications for the technology as it is now.”
The technology has several limitations that need further attention, the researchers pointed out. For instance, positively charged peptides may not move efficiently through the nanopore. This, the authors noted, can be addressed by engineering the pore so that peptides move through it smoothly regardless of their charge. The method can currently scan peptides only around 25 amino acids long, which limits its application to short peptides. Although this is an improvement over the peptides of fewer than ten amino acids that can be identified using mass spectrometry, the read length could be extended through the fragmentation and shotgun strategies used in traditional protein sequencing.
“Our findings comprise a promising first step towards a low-cost method capable of single-cell proteomics at the ultimate limit of sensitivity to concentration, with a wide range of applications in both fundamental biology and the clinic,” the authors concluded.
The refinement and adoption of this new technology promise real-time, scalable, affordable, and easy-to-use molecular sequencers for proteins that could change how life scientists and healthcare researchers go about their daily work.