While there are an increasing number of companies leveraging the capabilities of artificial intelligence (AI) in the pharmaceutical industry (Box 1), there are relatively few operations dedicated exclusively to biopharmaceutical protein discovery and development. One of those companies is BigHat Biosciences, based in San Mateo, California.
“Our vision was to build, from the ground up, a platform that integrated the best of AI/machine learning (ML) and life sciences wet lab technologies, and apply it to make better antibodies faster,” says Mark DePristo, co-founder and CEO of BigHat.
In 2019, Peyton Greenside, a Stanford Biomedical Informatics PhD and Schmidt Science Fellow, joined forces with DePristo to launch the company. DePristo left his position as Head of Genomics at Google.AI. (He was previously co-director of Medical Genetics at the Broad Institute.)
Several recent scientific and technical innovations in synthetic biology and ML enabled the duo to build BigHat’s Milliner platform. “It was the synthetic biology advances, in the early 2010s, that allowed us to build the rapid iteration lab that we needed to make the ML methods work,” said Greenside, who serves as BigHat’s chief scientific officer. “We use synthetic biology techniques throughout our platform, from DNA synthesis to cell-free protein synthesis.”
Achieving parity between informatics and life sciences resides at the engineering core of BigHat, enabling the company to animate its platform and derive unique value: an alloy of antibody modelling and empirical lab results. Above all, it enables the company to design what it says is the next generation of antibody therapeutics. By placing advances in ML and synthetic biology on an equal footing, BigHat achieves more than the sum of the parts. “This architecture enables us to close the loop between lab and computation,” says Greenside.
Milliner is a highly automated, closed-loop, ML-integrated, high-speed wet lab dedicated to antibody discovery and development (Fig. 1). “Like classic tech platforms, BigHat’s platform is reusable and scalable, but to a greater extent,” DePristo said. “It’s why we can pursue so many therapeutic programs in parallel in a lean organization.” 1, 2
BigHat occupies a new 30,000-square-foot facility in San Mateo, half of which is dedicated to a state-of-the-art laboratory. This lab space houses Milliner, driven by a fleet of heterogeneous robots and a proprietary integrated laboratory information management system (LIMS). These run everything from protein production to characterization (from biophysics to functional assays) to data processing and retraining machine learning models.
The company has approximately 50 employees today but expects to double in size in the next few years, building on its multidisciplinary roots. “It is not just one area such as AI/ML, or protein engineering. It is not just our computer engineers building our LIMS. It’s every single player on the team. Our integrated teams are getting us to the goal,” said Liz Schwarzbach, chief business officer.
Need for speed
One of BigHat’s values is a rapid turnaround time. “Each week we go from an in silico antibody design, to making that molecule, then purifying it, and fully characterizing it, across not just one attribute such as binding, but a host of properties that are needed to become a viable drug,” said Greenside. “Currently, BigHat designs, creates and tests approximately 800 molecules per week. That’s up from where we started, at 32 per week in 2020. We plan to double capacity several more times in the coming years.”
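As a rough illustration of the arithmetic behind these figures, the short Python sketch below computes how many doublings separate 32 and 800 molecules per week, and what further doublings would imply. The function names and the choice of three future doublings are assumptions made for illustration; the quote says only “several more times.”

```python
import math

def doublings_between(start_per_week, current_per_week):
    """Number of capacity doublings needed to go from start to current."""
    return math.log2(current_per_week / start_per_week)

def projected_capacity(current_per_week, further_doublings):
    """Weekly capacity after a given number of additional doublings."""
    return current_per_week * 2 ** further_doublings

fold_increase = 800 / 32                        # 25-fold growth since 2020
doublings_so_far = doublings_between(32, 800)   # about 4.6 doublings
hypothetical = projected_capacity(800, 3)       # assumes 3 more doublings

print(f"{fold_increase:.0f}-fold increase, {doublings_so_far:.1f} doublings")
print(f"After 3 more doublings: {hypothetical} molecules per week")
```

In other words, the reported growth corresponds to between four and five doublings in roughly three years; three more would put weekly throughput in the thousands.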
Automation certainly improves efficiency, but BigHat’s platform offers more than just speed. Another differentiator is generating the high-quality data needed for machine learning by developing smarter lab operations and advanced data processing.
“We can get quite far away from where we started because we can test all of these parameters. We can learn from higher risk designs in earlier iterations in a way that more traditional approaches cannot afford to do. The approach lets us spend more iterations probing and exploring a whole new area of sequence space. You can only afford to do this with a very rapid design-build-test cycle,” explained Greenside (Fig. 2).
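The idea of taking higher-risk designs early and refining later can be sketched as a toy design-build-test loop. This is not BigHat's method: the scoring function `toy_assay` is a hypothetical stand-in for wet-lab measurements, the sequences are made up, and the "model" is simply a ranking of everything measured so far. It only illustrates why many fast cycles allow bolder exploration of sequence space.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def toy_assay(seq, hidden_optimum):
    """Hypothetical stand-in for a lab assay: count positions matching a hidden optimum."""
    return sum(a == b for a, b in zip(seq, hidden_optimum))

def mutate(seq, rng, n_mutations):
    """Design step: substitute residues at n_mutations randomly chosen positions."""
    chars = list(seq)
    for pos in rng.sample(range(len(chars)), n_mutations):
        chars[pos] = rng.choice(AMINO_ACIDS)
    return "".join(chars)

def design_build_test(start, hidden_optimum, cycles, batch_size, seed=0):
    """Run a fixed number of cycles; return the best sequence and best score per cycle."""
    rng = random.Random(seed)
    measured = {start: toy_assay(start, hidden_optimum)}
    history = [measured[start]]
    for cycle in range(cycles):
        # Bolder (higher-risk) designs in early cycles, conservative ones later.
        n_mutations = 3 if cycle < cycles // 2 else 1
        # "Learn": rank everything measured so far and keep the top five as parents.
        parents = sorted(measured, key=measured.get, reverse=True)[:5]
        # "Design" and "build": propose a batch of variants from those parents.
        designs = [mutate(rng.choice(parents), rng, n_mutations)
                   for _ in range(batch_size)]
        # "Test": measure every design and add the results to the data set.
        for design in designs:
            measured[design] = toy_assay(design, hidden_optimum)
        history.append(max(measured.values()))
    best = max(measured, key=measured.get)
    return best, history

best, history = design_build_test("AAAAAAAAAAAA", "ACDEFGHIKLMN",
                                  cycles=10, batch_size=50, seed=1)
print(best, history)
```

Because every measurement is retained, the best score never decreases from one cycle to the next; a slower cycle would simply allow fewer such rounds within the same time.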
This is an application of the “learning and iterations” new product development methodology for early-stage discovery. Business experts advocate this methodology when companies use an advanced technology whose capacity for innovation and degree of uncertainty are unknown.
Or as Schwarzbach puts it, “We’re building the plane as we’re flying it.”
Professors De Meyer, Loch, and Pich have argued that a flexible, learn-as-you-go approach is appropriate when unforeseen risks exist.3 One attribute of AI is the capacity to make connections in big data that would otherwise be extremely difficult for humans to make. Unforeseen benefits may exist, too. BigHat’s present drug discovery strategy of exploring the discovery space is well suited to this early stage of development.
Small biomolecules and big ideas
BigHat is hoping to develop next-generation therapeutic antibodies such as bispecific T-cell engagers (BiTEs), bispecific antibodies (BsAbs), single-chain variable fragments (scFvs), heavy-chain variable domain only antibodies (VHHs, sdAbs or nanobodies®), and antibody-drug conjugates (ADCs) for a variety of applications (Fig. 3).
Milliner is capable of modeling single-domain antibodies that contain three complementarity-determining region (CDR) loops that provide specific antigen recognition. Single-domain antibodies (e.g., VHH from camelids) are ~15 kilodaltons, or about 10% of the size of natural, human IgG mAbs. Each paratope of an IgG mAb has about 15 amino acids, of which 5–8 residues contribute the majority of binding energy for a given epitope. The platform seeks to optimize a single VHH to 10x affinity while maintaining thermostability. In addition, BigHat’s synthetic biology AI/ML may design smaller fragments than naturally occurring single-domain antibodies.
“BigHat characterizes every synthesized molecule as comprehensively as possible on each cycle of the platform. This means assessing stability, solubility, yield, purity, affinity, and function. We refer to these assays as surrogates or proxies,” DePristo said. “These properties are predictive of safety and efficacy and other hard-to-measure properties such as in vivo immunogenicity.” Protein aggregation in biotherapeutics has been correlated with elevated immunogenicity, leading to immune-mediated adverse effects.
“Now that our vivarium (mouse and rat animal facility) is online, we are starting to explore how to ‘close the loop’ over more downstream but equally critical in vivo properties like pharmacokinetics (PK) and pharmacodynamics (PD) measurements,” said DePristo. “We’re getting more data on the relationship between in vitro platform measurements and in vivo properties. By learning more about this relationship, we aim to make better design decisions earlier in the platform.”
The company uses Milliner to create antibodies with optimized biophysical and functional properties so they are safer and more effective at treating disease. BigHat is also exploring how to apply its technologies to even more advanced antibody designs such as intrabodies (intracellular target antibodies), chimeric antigen receptor T cell (CAR-T) therapies, better blood-brain-barrier crossing vehicles, and environment-sensitive antibodies, among other classic design challenges.
“Next-generation antibodies may be environment-sensitive molecules that will behave differently in different environments,” explained DePristo. “We are interested in developing antibody fragments that bind to intracellular targets,” added Greenside. “This would increase the properties and scope of biomolecules that could not have been previously considered. As BigHat grows, we are excited to pursue that frontier.”
Cohesive business strategy
“For many companies, there is an intrinsic alignment to adopt a disease focus. One thing that makes BigHat interesting is that we have a different alignment. We align to areas with high-quality assays predictive of clinical efficacy and safety,” says DePristo. “That’s critical because we can directly optimize our molecules to those measurements.”
The current focus is in areas with well-understood biology, especially high-resolution and high-quality assays. “As diseases become well characterized, the challenge is doing something about it,” DePristo said. “There are well-established methods for developing monoclonal antibodies. We are exploring new, green fields (of molecular space) only made possible by our AI/ML integrated platform. We are particularly excited to partner with biopharma and biotech companies with strong domain expertise who want to explore that space with us to identify those new opportunities. In the long term, BigHat may expand into developing our own assays in therapeutic areas of strategic interest.”
BigHat’s growing internal pipeline consists of discovery programs traversing areas such as oncology and inflammation, focusing on areas with unmet medical needs and therapeutic targets that are challenging with traditional protein engineering methods. “Our general strategy is to focus on a handful of strategic partners. We are focusing on the depth and value of our partnerships. We are concentrating our capacity on those value-generating relationships,” said Schwarzbach.
This is a viable and sound strategy. According to Deloitte Insight 2020, Realising Biotech’s Potential, mature, diversified R&D platform technology companies maintain on average eight clinical programs, four licensing deals and ten assets across seven therapeutic areas.4 BigHat may be classified as a hybrid end-to-end and platform technology company.
“We work with partners who have proprietary, well-validated assays in other therapeutic areas, or have a unique biology expertise. If our partners have successfully de-risked specific areas, then we can focus on the design of biomolecules. Overall, it helps us create a balanced, internal pipeline strategy with a partner strategy,” Schwarzbach said.
BigHat’s portfolio is also consistent with the continued success of biologics obtaining FDA approval. In 2022, biomolecule approvals surpassed small molecules (19 of 37 CDER novel drugs5), a remarkable biotech milestone. According to a recent Nature Biotechnology publication, “new modalities such as antibody–drug conjugates, bispecific proteins, and cell and gene therapies accounted for about a third of 2022’s approvals, helping push biologics approvals ahead of small molecules for the first time.” 6
Biotech collaborations
In November 2022, BigHat announced a collaboration with Merck to design candidates for up to three drug discovery programs. Merck seeks “to leverage BigHat’s technology and expertise in enabling molecular design of novel biologic candidates.” It is a multi-program collaboration using the Milliner platform to resolve challenging antibody design problems.
In January 2022, “completion of the first stage of Amgen’s research collaboration with BigHat demonstrated the ability of their platform to quickly and significantly optimize next-generation single-domain antibodies, validating the platform as a path to generating target binders with improved properties compared with the original repertoire identified by traditional technologies,” said Philip Tagari, then VP of Research at Amgen. This milestone triggered the initiation of the second stage of work on VHH antibodies.
And in March 2022, BigHat acquired Frugi Biotechnology, a company developing cost-effective and high-quality cell-free protein synthesis (CFPS) technology. Cell-free protein synthesis is a foundational technology for BigHat’s platform, allowing the company to rapidly synthesize and test antibodies in a matter of days.
Funding funnel
To date, BigHat has raised more than $100 million from all sources. In July 2022, Section 32 led an $80-million Series B round with new investors: Amgen Ventures, Bristol Myers Squibb, Gaingels, GRIDS Capital, Quadrille Capital and others. Prior investors 8VC, AME Cloud Ventures and Andreessen Horowitz contributed as well.
An attraction was “the company’s alignment with our mission to accelerate the discovery, development and distribution of revolutionary technologies that improve the human condition,” said Steve Kafka, Managing Partner at Section 32. “BigHat is tackling a big challenge with next-generation AI/ML; driving antibody drug development has the potential to have a massive, positive impact on human health.”
“We invest in people and want to work with the top thinkers and doers in their respective fields,” said Kafka. “DePristo and Greenside are pioneers at the intersection of ML and drug discovery and development.”
Serial entrepreneur Rob Chess joined BigHat’s Board of Directors (BOD) last year, offering guidance in precision medicine. “Pioneering a field means not having a straight, paved path; they must create the path themselves. The best way to prepare for the inevitable twists and turns is by having the right people,” said Kafka.
“DePristo and his team are really terrific at building an advisory halo around them. People like Rob Chess are a great example. He has such an impressive history in drug discovery and development. As chairman at Twist Bioscience, he is connected to the entire synthetic biology space.” BigHat has also assembled a distinguished advisory board including Nobel laureate Brian Kobilka, MD (Stanford University).
AI/ML impact on biopharma
Kafka says that AlphaFold’s mapping of the human proteome was “a huge step forward for the industry and its ability to understand protein therapeutics.” He believes that “advanced molecules will soon positively impact human health, and BigHat has all the right ingredients to be the market leader. We are already seeing great progress with respect to BigHat’s pipeline and with our partnerships.”
“We were influenced by AlphaFold 2 (AF2) both directly and indirectly,” says DePristo. “BigHat directly leverages AF2 where appropriate in target selection and design work… AI tech is able to solve problems that were beyond our given capabilities even 10 years ago. It is one of the drivers behind today’s enthusiasm for generative AI technology in bio.”
Are modeling simulations based on big data sufficient to address the challenges in the antibody fragment therapeutics space that exist today? How important is integrated wet lab experimentation to discovery and innovation?
Conclusion
Leading AI drug discovery platforms feature robust, large-scale robotic wet lab integration that cycles through empirical data training, scoring and (virtual and wet) chemical synthesis. Increasingly large quantities of reliable data, across a wide range of parameters, lead to more drug-like, better-performing synthetic compounds.
“An increase in good quality data will arguably be the single most important factor in advancing ML driven drug discovery,” said Andrew Gordon Wilson, Professor at the Courant Institute of Mathematical Sciences and Center for Data Science, New York University. “Many groups have so far focused on generative models, which are an important component of drug discovery, but comparatively few have been advancing the front lines in (Bayesian) optimization, which is used to efficiently select for desirable properties.”
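For readers unfamiliar with the term, Bayesian optimization typically ranks untested candidates with an acquisition function that balances predicted performance against model uncertainty. The Python sketch below shows one common choice, expected improvement, under the assumption that a surrogate model has already supplied a predicted mean and standard deviation for each candidate. The candidate names and numbers are hypothetical, and nothing here describes BigHat's actual models.

```python
import math

def normal_pdf(z):
    """Standard normal probability density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def normal_cdf(z):
    """Standard normal cumulative distribution."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def expected_improvement(mean, std, best_so_far):
    """Expected improvement over best_so_far when higher values are better."""
    if std <= 0.0:
        # No uncertainty: improvement is known exactly.
        return max(mean - best_so_far, 0.0)
    z = (mean - best_so_far) / std
    return (mean - best_so_far) * normal_cdf(z) + std * normal_pdf(z)

def select_next(predictions, best_so_far):
    """Pick the candidate with the highest expected improvement."""
    return max(predictions,
               key=lambda name: expected_improvement(*predictions[name], best_so_far))

# Hypothetical surrogate predictions: name -> (predicted mean, predicted std).
predictions = {
    "design_A": (1.0, 0.1),   # matches the current best, low uncertainty
    "design_B": (0.9, 0.5),   # slightly lower mean, high uncertainty
}
choice = select_next(predictions, best_so_far=1.0)
print(choice)  # design_B: its uncertainty leaves more room for improvement
```

The example shows why such methods depend on data quality: the uncertainty estimates that steer the selection are only as good as the measurements used to train the surrogate.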
“We are building a broad toolkit that allows us to engineer diverse modalities to address whatever the therapeutic goal or need is,” said Greenside. BigHat’s AI/ML wet lab platform fully leverages large-scale experimentation through high-quality, data-driven drug discovery. While no AI-discovered drug has been approved for commercial use to date, there is nevertheless abundant enthusiasm, activity and investment in this area because of AI’s potential to transform the industry.
Pharmaceutical R&D generates and requires big data, and that is exactly where the application of AI excels. The aspiration is that this approach will not only generate superior drug-like compounds more rapidly, but also ultimately yield (bio)molecular candidates that advance through clinical trials and prove to be invaluable treatments for patients. With the advent of AI, it is a very exciting time to explore novel approaches to drug discovery and personalized medicines.
References
- Gruver N. et al. Effective Surrogate Models for Protein Design with Bayesian Optimization. The ICML Workshop on Computational Biology (2021).
- Stanton S. et al. Accelerating Bayesian Optimization for Biological Sequence Design with Denoising Autoencoders. International Conference on Machine Learning, 20459–20478 (2022).
- De Meyer A et al. Management of Novel Projects Under Conditions of High Uncertainty. Working Paper Series, Judge Business School, University of Cambridge, (2006).
- Deloitte Insight 2020, Realising Biotech’s Potential. What is required to scale successfully? Deloitte Development LLC, (2021).
- FDA Center for Drug Evaluation and Research (CDER). Advancing Health Through Innovation: New Drug Therapy Approvals 2022 (2023).
- Senior M. Fresh from the biotech pipeline: fewer approvals, but biologics gain share. Nature Biotech. 41, 174–182 (2023).
Barry Davidson is a biopharma strategy and partnering consultant at EtoilePharma, specialized in emerging technologies.