May 1, 2010 (Vol. 30, No. 9)

Kathy Liszewski

Innovative Solutions Seek to Remove Difficult-to-Express Tag from Recombinant Proteins

Many challenges confront researchers attempting to express recombinant proteins. Post-translational modifications, quantity produced, and recapitulating function are chief among them. These and other issues will be discussed at CHI’s “Difficult to Express Proteins” conference to be held later this month. Cutting-edge solutions include new ways to optimize gene sequences, novel fusion proteins or tags, and enhancements to functional assays for membrane proteins.

Many lines of evidence have demonstrated that changing the protein coding sequence of a gene can dramatically affect its expression. According to Mark Welch, Ph.D., director of gene design at DNA2.0, “the key is to know which codons to change. Although researchers previously had to isolate and clone genes of interest, we can now tailor-make genes synthetically. This allows us to optimize genes based upon what organism will be used for expression.”

Dr. Welch notes that the field is currently lacking systematic studies showing what gene characteristics are optimal. “There’s a lot of protein-expression folklore involved in design principles, yet also an absence of uniform experimental data. There are a confounding number of variables among examples in the literature that differ in the proteins expressed, vectors, host strains, etc. In each example you generally are presented with only two data points, a natural gene and a synthetic version. There has been no reliable way to combine results from these often contradictory experiments.”

To address this issue, DNA2.0 embarked on a detailed analysis in which it designed and independently synthesized about 40 genes for each of two proteins: a DNA polymerase and a single-chain antibody. Each set systematically sampled codon usage and other variables thought relevant to protein expression.

“We found that synonymous codon variation produced more than a 40-fold difference in expression levels. We identified sequence characteristics that correlated with expression by combining multivariate regression methods along with genetic algorithms. Contrary to popular assumptions, we found that codon preferences did not correlate with the codon bias found in natural host genes.”
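To make "synonymous codon variation" concrete, the following minimal Python sketch generates a set of gene variants that differ in codon choice but encode the same protein, then tabulates the codon frequencies that a regression model could take as inputs. It is an illustration only, not DNA2.0's design method: the toy peptide, the random (rather than systematic) sampling, and all function names are assumptions made for this example, and the codon table covers only the five amino acids the toy peptide uses.

```python
import random
from collections import Counter

# Partial standard genetic code: only the amino acids in the toy peptide below.
SYNONYMOUS_CODONS = {
    "M": ["ATG"],
    "K": ["AAA", "AAG"],
    "L": ["CTT", "CTC", "CTA", "CTG", "TTA", "TTG"],
    "S": ["TCT", "TCC", "TCA", "TCG", "AGT", "AGC"],
    "E": ["GAA", "GAG"],
}

# Reverse lookup used to confirm that each variant still encodes the protein.
CODON_TO_AA = {
    codon: aa for aa, codons in SYNONYMOUS_CODONS.items() for codon in codons
}


def translate(dna):
    """Translate a coding sequence codon by codon using the partial table."""
    return "".join(CODON_TO_AA[dna[i:i + 3]] for i in range(0, len(dna), 3))


def random_synonymous_variant(protein, rng):
    """Pick one codon at random for each residue (assumed sampling scheme)."""
    return "".join(rng.choice(SYNONYMOUS_CODONS[aa]) for aa in protein)


def codon_frequencies(dna):
    """Return the fraction of codon positions occupied by each codon."""
    codons = [dna[i:i + 3] for i in range(0, len(dna), 3)]
    counts = Counter(codons)
    return {codon: n / len(codons) for codon, n in counts.items()}


protein = "MKLSELLK"  # hypothetical toy peptide, not a sequence from the study
rng = random.Random(0)  # fixed seed so the variant set is reproducible
variants = [random_synonymous_variant(protein, rng) for _ in range(40)]

# Every variant must translate back to the same protein.
assert all(translate(v) == protein for v in variants)
```

In a real study, each variant's measured expression level would be paired with features such as `codon_frequencies(variant)` before any model is fitted; no expression data are simulated here.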

Dr. Welch suggests that DNA2.0’s strategy can be used in any expression system. It already has been successfully applied to expression in mammalian, yeast, plant, and fungal hosts. “Systematic variation and modeling provide a more reliable way to improve gene design for any host. In every study we’ve obtained dramatic improvements relative to prior optimization strategies. This adds great value to our genes, particularly for clients expressing high-value proteins where even a small boost in expression can mean considerable savings in time and money.”

Polyketide Challenges

Polyketides are natural products made by plants, fungi, and bacteria. They represent a diverse class of compounds with a broad range of activities, with applications ranging from anticancer agents to antibacterials. However, these small molecules are complex enough that they cannot be easily synthesized through organic chemistry. Producing them recombinantly has also presented significant gene-expression challenges.

Blaine A. Pfeifer, Ph.D., assistant professor of chemical and biological engineering at Tufts University, is working to overcome these hurdles. “In the last 15 years, there has been a steady effort to produce polyketides in engineering-friendly organisms such as E. coli. The problem is that many of the polyketide synthetases are large proteins (>300 kD) that have unique assembly characteristics.”

Dr. Pfeifer says pathways to express polyketides are often very complicated. “There can be up to 20 coordinately expressed genes that are required for complete biosynthesis. Some of the larger protein products may be dysfunctional when produced, further complicating the issue. We are finding that the typical rules of molecular biology need to be ‘bent’ to succeed.”

As an example, Dr. Pfeifer has targeted production of a polyketide that is normally purified from specific strains of soil-dwelling bacteria. “Our case study features introduction of 17 genes into an E. coli expression system. To do this requires a number of optimization steps such as engineering different promoters, optimizing codons, adding a chaperonin, and even adjusting the temperature for E. coli growth. Despite these challenges, we succeeded in producing our compound of interest.”

Aside from being able to produce polyketides recombinantly, another important advantage of heterologous production is the ability to engineer new derivatives. “By modifying and re-engineering the products, we may be able to produce new compounds, such as modified antibiotics, that have increased potency. A key goal would be to leverage our recombinant production platform to produce new compounds against, for example, antibiotic-resistant bacterial pathogens.”

Human In Vitro Translation

Some researchers opt for cell-free protein expression, i.e., in vitro translational systems. Benefits include compatibility with microliter-scale reactions and faster expression, since traditional cell-based expression can take from days to weeks. However, current in vitro expression systems suffer from low yields or the inability to include post-translational modifications such as glycosylation, according to Brian Webb, Ph.D., platform manager of proteomics R&D at Thermo Fisher Scientific.

“Typical systems such as wheat germ or E. coli cannot glycosylate proteins. Other mammalian systems such as rabbit reticulocyte lysates in combination with canine microsomal membranes produce low amounts of protein and are not very efficient at glycosylation.” 

Thermo Scientific has developed an in vitro system derived from immortalized human cell lines that provides biologically active proteins with up to a 15-fold increase in expression. “Initially, researcher’s cDNA is cloned into the kit’s expression vector followed by expression using kit reagents. In 90 minutes, our system can yield about 30 micrograms per milliliter of full-length, functional protein that is relatively clean and can be easily analyzed via Western blot.”

The system can be used to study numerous mutant variants simultaneously in a microplate or to express large quantities of a single protein for future experiments. Other uses include enabling expression of toxic proteins that cannot be produced in live cells, analyzing protein-nucleic acid interactions, and studying protein complexes.


Thermo Scientific Pierce human in vitro translation technology synthesizes protein in human cell-free extracts. The human translation machinery imparts superior functionality compared to alternative systems such as rabbit, E. coli, or wheat germ systems, the company reports.

Functional Protein Purification

Adding tags to recombinant proteins is a convenient means to monitor and/or purify proteins of interest. Traditional tags include fluorescent proteins (e.g., green fluorescent protein) and affinity tags such as glutathione-S-transferase (GST) and 6x-histidine (HisTag).

Promega has designed a fusion protein technology that enables protein analysis in vitro and in vivo. “Our approach was to design a platform technology based on the efficient formation of a covalent bond between HaloTag®, an engineered fusion tag, and a set of selective ligands,” explains Rachel Friedman Ohana, Ph.D., a senior R&D scientist. “Due to the versatility of the ligands, one genetic construct can be used for cellular imaging, including protein trafficking, in vitro protein detection by SDS PAGE, protein interaction analysis, and protein purification.”

HaloTag can be utilized to purify proteins from eukaryotic, prokaryotic, or cell-free expression systems. According to Dr. Ohana, “the combination of selective and covalent protein capture overcomes some of the challenges associated with equilibrium-based (affinity) tags and enables efficient capture and purification even at low expression levels. Following immobilization onto a specific resin, the protein of interest is released by cleavage at an optimized TEV protease recognition site resulting in purified protein free of tag. An additional benefit of this system is that it utilizes one physiological buffer throughout the purification, eliminating the need for dialysis.”

To show the utility of the HaloTag purification system in E. coli, the company targeted 23 difficult-to-express proteins. “We found that HaloTag provided superior solubility, purity, and yields compared to HisTag, GST, and MBP (maltose binding protein),” Dr. Ohana says.

The distinguishing purification capabilities of HaloTag are more evident in traditionally lower expressing mammalian cells. According to Dr. Ohana, “HaloTag enables quick optimization of expression levels and provides one-step purification with minimal protein loss. This has been demonstrated by the purification of several functional proteins including kinases, nuclear proteins, and secreted proteins.”


HaloTag® is an enabling technology for protein analysis. According to Promega, a single genetic construct can be combined with multiple ligands for different applications.

Novel Microtags

Another novel tagging and detection system has been developed by Geoffrey Waldo, Ph.D., team leader, biosciences, Los Alamos National Laboratory. “Existing small tags usually require a labeled antibody or the like,” he says. “These can be hard to use, especially in living cells. We are trying to overcome this by making families of tiny tags and a specific detector protein for each tag. The detectors bind like antibodies to each particular tag.

“The trick is they can be expressed right in the cell and give a fluorescent signal only when they bind a tag. We are engineering several different fluorescent proteins from various organisms. We split them into two unequal pieces. One is a tiny piece of about 15 amino acids that acts as the tag to attach to the protein. The larger remaining fragment acts as the detector. Neither piece is fluorescent alone. By carefully engineering the properties of the fragments, the larger detector fragment acts like an antibody to the tag and it spontaneously binds to become brightly fluorescent, as bright as the original full-length fluorescent protein.” 

Dr. Waldo is applying the technology to protein trafficking, protein interaction detection, high-throughput measurement of soluble protein in living cells, and engineering proteins for stability and solubility.

Aminoacyl-tRNA synthetases within cells serve to ligate a specific amino acid with its cognate tRNA. The tRNA subsequently contributes that amino acid to the growing peptide chain. Researchers at Princeton University are harnessing that power to incorporate unnatural amino acids as novel tags in recombinant proteins.

“In the last decade a lot of effort has been focused on engineering aminoacyl-tRNA synthetases to incorporate unnatural amino acids into recombinant proteins,” reports A. James Link, Ph.D., assistant professor, chemical engineering and molecular biology. “This is useful for a number of applications. For example, a fluorescent molecule can be linked that will allow tracking the protein intracellularly. For therapeutic applications a polymer such as polyethylene glycol could be added to allow the protein to remain in circulation. Such tags can functionalize a protein. We found a new way to introduce this functionality into proteins.”

Functional groups such as azides and alkynes have emerged as chemical handles to help couple tags to proteins. Dr. Link found that an efficient way to introduce such functional groups is via incorporation of unnatural amino acids.

“Previously we used combinatorial screening techniques to identify new aminoacyl-tRNA synthetases. We found a variant of the E. coli methionyl-tRNA synthetase (MetRS) with the capability to incorporate either its natural amino acid or the unnatural amino acid azidonorleucine into a protein depending on which amino acid it is presented with. In our latest work, we generated an E. coli strain that harbors a single genomic copy of this engineered MetRS.”

Because the bacterial strain can ligate either the unnatural amino acid or its natural substrate (methionine), these strains can be considered dual-purpose organisms. According to Dr. Link, “the genetic code changes as a function of whether or not the unnatural amino acid or the natural methionine is present in the medium in which the E. coli grows. We first grow the cell in the presence of methionine (since it cannot grow with the unnatural amino acid), then we remove that and switch the medium to one containing the azidonorleucine. We harvest and purify the protein and often get yields as high as 20–30 mg protein per liter.”

Once azidonorleucine has been introduced into a protein, the click chemistry reaction (a reaction between azide and alkyne functional groups) can be used to tag the protein with a wide variety of tags including biotin, fluorophores, polymers, and even conjugated metals for imaging applications.

Dr. Link is now working to utilize the bacterial strain in producing uniquely substituted proteins and as a tool for modeling host-pathogen interactions.  

Functional Screening Assays

Membrane proteins can be difficult not only to express but also to purify and assay. Many researchers approach the study of membrane proteins by over-expressing recombinant fragments of membrane-associated enzymes and studying their activity in solution. However, this can be a daunting challenge for many assay systems, according to Scott Gridley, Ph.D., director of bioproducts at BlueSky Biotech.

“Membrane-associated proteins derive significant structural, topological, and relational organization from being constrained on the fluid two-dimensional surface of the membrane.”

The company has developed a technology called TDA 2.0™ that more closely replicates native protein structure to enhance functional assays, Dr. Gridley says. “Template-directed assembly, the process of organizing recombinant protein on a membrane surface, such as a liposome, restores biological context resulting in a more relevant set of data. TDA 2.0 is the only commercially available technology that readily replicates the membrane context in a soluble, fluid, chemically defined system compatible with high-throughput screening.”

How does it work? According to Dr. Gridley, “to reproduce membrane association, the enzymes are polyhistidine-tagged, and the lipid nanospheres are derivatized with nickel-nitrilotriacetic acid so that polyhistidine-tagged proteins bind with high affinity. Membrane proteins in the context of TDA 2.0 naturally form biologically relevant multimers and interact with untagged interacting partners to form higher-order complexes without any special effort. We have some purified proteins to replicate pathway interactions in the context of TDA 2.0 in a chemically defined system already available.”

The company currently offers kits for the insulin receptor, insulin-like growth factor receptor tyrosine kinases, and lyn kinase. For the future, Dr. Gridley indicates the company is “developing kits for nearly every receptor tyrosine kinase family member (~70), as well as extending the technology for use with other membrane protein classes.”

While daunting challenges remain in the arena of difficult-to-express proteins, new paradigms in expression science are helping pave the way for solving those problems.
