Google DeepMind’s renowned protein structure prediction algorithm, AlphaFold, which made a grand leap in solving one of biology’s biggest problems, determining a protein’s 3D structure from its sequence, has received a major update. In a report published in Nature on May 8, researchers unveiled AlphaFold 3’s expanded predictive capabilities from proteins to a broad spectrum of biomolecular interactions, including DNA, RNA, ligands, and more.
The publication, which is a collaboration between DeepMind and Isomorphic Labs, a London-based company launched in 2021 to build upon DeepMind’s AI research to tackle biological problems, emphasizes AlphaFold 3’s potential to advance therapeutic applications, including drug discovery.
Sir Demis Hassabis, PhD, co-founder and CEO of DeepMind and Isomorphic Labs, declared he was “thrilled to announce AlphaFold 3, which can predict the structures and interactions of nearly all of life’s molecules with state-of-the-art accuracy… Biology is a complex dynamical system so modeling interactions is crucial.”
But his enthusiasm was not universally shared. Unlike the launch of AlphaFold 2 in 2021, Nature’s publication of AlphaFold 3 did not include open-source code. That omission has sparked outcry from the research community, culminating in a protest letter signed by more than 1,000 scientists.
Stephanie Wankowicz, PhD, a computational biologist at the University of California, San Francisco, says that within an hour of AlphaFold 3’s publication there were “emails flying” between structural biologists, chemical biologists, and others exclaiming, “how can they make claims like this? We can’t test them. We can’t reproduce them.”
In lieu of code, AlphaFold 3 was released to the community as a web server, accessible to individuals without a computational background. While a recent blog post from DeepMind and Isomorphic Labs described the new AlphaFold Server as a facilitator of “novel hypotheses to test in the lab, speeding up workflows and enabling further innovation,” researchers were quick to note the web server’s limitations.
Scientists took to social media to complain that ligand inputs to the web server were limited to a finite list of biological molecules and did not allow custom inputs, such as small molecule drugs. In that vein, while Isomorphic Labs has reported applying AlphaFold 3 to drug design for internal projects as well as with pharmaceutical partners, the public web server is under a license that is limited to non-commercial use.
Upon launch, the AlphaFold Server also imposed a limit of 10 requests per day. While that number later increased to 20, with DeepMind planning to “explore other approaches for quota allocation in the future, including weekly or monthly allocations,” high-throughput studies currently remain inaccessible.
“If you’re a cell biologist interested in a particular protein-protein interaction, you could manually input a few jobs [into the AlphaFold 3 web server] and get a result for a couple of systems,” said Roland Dunbrack, PhD, professor at the Fox Chase Cancer Center in Philadelphia, who was one of the original reviewers of the AlphaFold 3 Nature paper.
“But if you’re in structural bioinformatics where you’re interested in methods development or technology, then you really need to be able to do high-throughput calculations,” Dunbrack continued.
Dunbrack emphasizes that these access restrictions prevent the scientific community from applying AlphaFold 3 to its full potential. In his review of the original AlphaFold 3 manuscript, Dunbrack described how AlphaFold 2’s open source code allowed other researchers to make further advances such as the establishment of bespoke servers for ease of use, high-throughput structure prediction, and method development for specific protein types or structural features. These applications were not made directly available by DeepMind.
“It was an incredible disservice to science that [DeepMind] did this wonderful thing with AlphaFold 2, but the lack of availability and the things that we could possibly do with [AlphaFold 3] was just a big step backwards,” Dunbrack told GEN Biotechnology.
Tellingly, after insisting to the Nature editors that “publishing the paper without code would be a mistake,” Dunbrack was not given the opportunity to review the revision prior to publication despite multiple outreach attempts to Nature.
Mohammed AlQuraishi, PhD, assistant professor in the department of systems biology at Columbia University, notes that not being able to access the model weights presents additional restrictions for research building on AlphaFold 3.
“After AlphaFold 2 came out, many people spent time analyzing the behavior of the system and understanding its failure and success modes. This line of research has been incredibly valuable in elucidating how and what these systems learn. This will not be possible for AlphaFold 3 until DeepMind releases the model weights or someone reproduces their results,” AlQuraishi told GEN Biotechnology.
In a Nature Methods paper, published a few days after AlphaFold 3, AlQuraishi and Nazim Bouatta, PhD, senior research fellow at Harvard Medical School, presented OpenFold, a fast, memory-efficient, and trainable implementation of AlphaFold 2. According to AlQuraishi, the team used OpenFold to better understand how AlphaFold generalizes to unseen regions of structure space and found that it is quite robust. For example, the system could still predict the rough geometry of beta sheets even if it was only trained on alpha helices.
AlQuraishi added that independently reproducing AlphaFold 3 would be highly valuable to permit training new variants of the system, including potentially training on private data repositories. To pursue this goal, his group has begun to reproduce AlphaFold 3 based on the pseudocode published in Nature, a representation of code used to describe the implementation of an algorithm.
Deviation from community standards
Amid the social media outcry, researchers in structural and computational biology, including Wankowicz and Dunbrack, co-authored an open letter to Nature’s editors.
The letter amassed more than 1,000 signatures by the end of May and outlined “several deviations from our community’s standards,” including not making the code available to peer reviewers despite “repeated requests,” which the letter authors describe as a failure by the journal to enforce its own policies.
“I have no doubt that [the ability to predict molecular interactions with AlphaFold 3] could be a transformative advance for the field,” Wankowicz told GEN Biotechnology. “That’s why this evokes such a strong response from the scientific community. We want access to it because we know that this is probably really good science.”
Nature’s editor-in-chief, Magdalena Skipper, PhD, stated that when making a decision on data and code availability, the journal reflects on several factors including “the potential implications for biosecurity and the ethical challenges this presents.” In such cases, the journal works with authors to provide alternatives that will support reproducibility, such as pseudocode.
However, John Jumper, PhD, one of the senior authors of the AlphaFold 3 article and lead author on AlphaFold 2, reportedly told the press that DeepMind and Isomorphic Labs had consulted more than 50 experts in biosecurity, bioethics, and AI safety and concluded that AlphaFold 3’s marginal biosecurity risks were far outweighed by the system’s potential benefits to science.
Max Jaderberg, PhD, chief AI officer at Isomorphic Labs, and Pushmeet Kohli, PhD, vice president of research at DeepMind, stated that the team is “working on releasing the AF3 model (incl weights) for academic use” within six months. Nature said the journal would update the published paper with the code once it is released. When asked for further comment, DeepMind directed inquiries to the AlphaFold Server FAQ section.
Many researchers are also dissatisfied with Nature’s response in its latest editorial, which described the decision not to publish the code as an “opportunity for conversation” at a time when the majority of global research is privately funded and not published in peer-reviewed journals. “We at Nature think it’s important that journals engage with the private sector and work with its scientists so they can submit their research for peer review and publication,” the editorial continued.
But AlQuraishi responds that the issue is a “seemingly double standard that Nature applies,” where academics are expected to provide the source code behind their work but the same is not true of an industrial lab. “The concern here, of course, is that this may lower the openness standards in the field,” said AlQuraishi.
“[Nature] seems to have set a two-tiered set of standards, one for academics, one for for-profits. This is an incredibly poor precedent to set. Peer-reviewed research should be held to the same standard, regardless of the author’s name or affiliations,” weighed in Wankowicz.
“Many companies want the Nature ‘stamp’ of approval,” said James Fraser, PhD, chair of the department of bioengineering and therapeutic sciences at UCSF. “This editorial shows, nakedly, that this ‘stamp’ is a toxic part of our current research ecosystem, one that bends easily to corporate interests and applies inequitable standards.”
In a letter sent to Nature and posted on the social media platform X, Anshul Kundaje, PhD, associate professor of genetics and computer science at Stanford University, wrote that while commercial entities are under no obligation to open source or share details about their products, “this does not mean they get to bypass canonical standards for what constitutes a peer-reviewed and verifiable scientific publication. What Nature published as a peer-reviewed article is in fact an advertisement and at best a white paper.”
Taken together, the publication of AlphaFold 3 has sparked a larger conversation within the scientific community about the changing landscape of research tools and the role of journals in ensuring that published science is reproducible and accessible enough to allow further progress. That conversation will continue as the field eagerly awaits the release of AlphaFold 3’s nuts and bolts.
A note of caution, however, was injected by Derek Lowe, PhD, medicinal chemist and veteran blogger at In the Pipeline. “Structure is not everything,” Lowe wrote shortly after the Nature report was posted. “It’s very useful, very good to have, and it will accelerate a lot of really useful research. But it does not take you directly to a drug, nor to a better idea about a target for a drug, nor to a better chance of passing toxicity tests, nor to a better chance of surviving oral dosing and the bloodstream and the liver. Better structure predictions are tools that we can use to attack those crucial problems, but they don’t answer any of them. Drug discovery has not been solved by software, no matter what you might read.”
This article was published in the June 2024 issue of GEN’s sister peer-reviewed journal, GEN Biotechnology.
Fay Lin, PhD, is senior editor for GEN Biotechnology.