Google's AI Co-Scientist Is Changing the Face of Scientific Research

25 September 2025

By Elie Dolgin

Hey Google! What if, instead of setting reminders or fetching restaurant reviews, you helped crack the mysteries of biology?

That playful question hints at a radical vision now being tested in labs. AI systems are being recast not as digital secretaries, but as scientific partners—co-pilots built to dream up bold, testable ideas.

The pitch sounds revolutionary. But it also makes many scientists bristle. How much true novelty can a machine conjure? Isn’t it more likely to remix the past than to uncover something genuinely new?

For months, the controversy over “AI scientists” has simmered: hype versus hope, parroting versus discovery. But two new studies offer some of the strongest evidence to date that large language models (LLMs) can generate truly novel scientific ideas, leaping to non-obvious insights that might otherwise require many years of painstaking lab work. Both studies showcase Google’s AI-powered scientific research assistant, known as the AI co-scientist.

“These early examples are unbelievable—it’s so compelling,” says Dillan Prasad, a neurosurgery researcher at Northwestern University and an outside observer who has written about the potential for AI co-scientists to supercharge hypothesis generation. “You have AI agents that are producing scientific discovery! It’s absolutely exciting.”

AI Takes on Drug Repurposing

In one of these proof-of-concept demonstrations, a team led by Gary Peltz, a liver disease researcher at Stanford Medicine, tasked the AI assistant with finding drugs already on the market that could be repurposed to treat liver fibrosis, an organ-scarring condition with few effective therapies.

He prompted the tool to look for medicines directed at epigenetic regulators—proteins that control how genes are switched on or off without altering the underlying DNA—and the AI, after mining the biomedical literature, came back with three reasonable suggestions. Peltz added two candidates of his own, and put all five drugs through a battery of tests on lab-grown liver tissue.

Two of the AI’s picks—but none of Peltz’s—reduced fibrosis and even showed signs of promoting liver regeneration in the lab tests. Peltz, who published the findings 14 September in the journal Advanced Science, hopes the results will pave the way for a clinical trial of one standout candidate, the cancer drug vorinostat, in patients with liver fibrosis.

Bacterial Mystery Solved

In the second validation study, a team led by microbiologists José Penadés and Tiago Costa at Imperial College London challenged the AI co-scientist with a thorny question about bacterial evolution. The researchers had shown in 2023 that parasitic scraps of DNA could spread within bacterial populations by hitching rides on the tails of infecting viruses. But that mechanism seemed confined to one host species. How, then, did identical bits of DNA surface in entirely different types of bacteria?

So they tasked the AI with solving the mystery. They fed the system their data, background papers, and a pointed question about what hidden mechanism might explain the jump. The AI, after “thinking” and processing for two days, proposed a handful of solutions—the leading one being that the DNA fragments could snatch viral tails not just from their own host cell but also from neighboring bacteria to complete their journey.

It was uncannily correct.

What the system could not know was that Penadés and Costa already had unpublished data hinting at exactly this mechanism. The AI had, in effect, leapt to the same conclusion that the researchers had taken years of benchwork to reach, a convergence that astonished the Imperial team and lent credibility to the tool.

“I was really shocked,” says Penadés, who at first thought the AI had hacked into his computer and accessed additional data to arrive at the correct result. Reassured that it hadn’t, he delved into the logic the AI co-scientist used for its various hypotheses and found surprising rigor. “Even for the ones that were not correct,” Penadés says, “the thinking was extremely good.”

An AI Scientific Method

That sound logic prompted the Imperial team to explore one of the AI’s runner-up ideas—one in which bacteria might directly pass the DNA fragments to one another. Working with microbial geneticists in France, the group is now probing that possibility further, with promising early results. “Our preliminary data seem to be pointing toward that hypothesis [also] being correct,” says Costa.

He and Penadés published both the AI’s predictions and their experimental results in the journal Cell earlier this month.

Notably, the Imperial researchers also tried various LLMs not specifically designed for scientific reasoning. These included systems from OpenAI, Anthropic, DeepSeek, and Google’s general-purpose Gemini 2.0 model. None of those jack-of-all-trades models came up with the hypotheses that proved experimentally correct.

Vivek Natarajan from Google DeepMind, who helped develop the co-scientist platform, thinks he knows what explains that edge. He points to the system’s multi-agent design, which assigns different AI roles to generate, critique, refine, and rank hypotheses in iterative loops, all overseen by a “supervisor” that manages goals and resources. Unlike a generic LLM, it grounds ideas in external tools and literature, strategically scales up compute for deeper reasoning, and vets hypotheses through automated tournaments.
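As a very rough illustration of how such a generate-critique-rank loop might be wired together, here is a minimal Python sketch. Everything in it is a hypothetical stand-in: the class and function names (Hypothesis, generate, critique, duel, supervisor) are invented for illustration, and where the real system would make grounded LLM calls and search the literature, this sketch uses placeholders and a coin-flip ranking. Only the tournament bookkeeping, here done with a chess-style Elo rating, is spelled out.

```python
# Minimal sketch of a multi-agent hypothesis loop, loosely modeled on the
# co-scientist's described design (generate, critique, rank, supervise).
# All names are hypothetical illustrations, not Google's actual API.
import itertools
import random
from dataclasses import dataclass, field

@dataclass
class Hypothesis:
    text: str
    critique_notes: list[str] = field(default_factory=list)
    elo: float = 1200.0  # chess-style tournament rating

def generate(goal: str, n: int = 4) -> list[Hypothesis]:
    """Stand-in for a generation agent that would call an LLM."""
    return [Hypothesis(f"candidate {i} for: {goal}") for i in range(n)]

def critique(h: Hypothesis) -> None:
    """Stand-in for a reflection agent that flags weaknesses."""
    h.critique_notes.append("check against literature and prior data")

def duel(a: Hypothesis, b: Hypothesis) -> Hypothesis:
    """Stand-in for a pairwise ranking agent; here, a coin flip."""
    return random.choice([a, b])

def update_elo(winner: Hypothesis, loser: Hypothesis, k: float = 32.0) -> None:
    """Standard Elo update: the winner gains what the loser sheds."""
    expected = 1.0 / (1.0 + 10 ** ((loser.elo - winner.elo) / 400))
    winner.elo += k * (1 - expected)
    loser.elo -= k * (1 - expected)

def supervisor(goal: str, rounds: int = 3) -> list[Hypothesis]:
    """Orchestrates iterative generate -> critique -> tournament loops."""
    pool = generate(goal)
    for _ in range(rounds):
        for h in pool:
            critique(h)
        for a, b in itertools.combinations(pool, 2):
            winner = duel(a, b)
            loser = b if winner is a else a
            update_elo(winner, loser)
        pool.sort(key=lambda h: h.elo, reverse=True)  # best ideas first
    return pool

if __name__ == "__main__":
    for h in supervisor("drugs to repurpose for liver fibrosis"):
        print(f"{h.elo:7.1f}  {h.text}")
```

The design choice the sketch tries to capture is that ranking is relative rather than absolute: hypotheses earn their standing by surviving repeated head-to-head comparisons, which is generally more robust than asking a single model to score each idea in isolation.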

According to Natarajan, academic institutions around the world are now piloting the system, with plans to expand access—though the company’s “trusted tester program” is currently at capacity and not accepting new applications. “Clearly we see a lot of potential,” he says. “We imagine that, every time you’re going to try and solve a new problem, you’re going to use the co-scientist to come along on the journey with you.”

Constellation of Co-Scientists

Google is not alone in chasing this vision. In July, computer scientist Kyle Swanson and his colleagues at Stanford University described their Virtual Lab, an LLM-based system that strings together reasoning steps across biology datasets to propose new experiments.

Rivals are moving fast, too: Biomni, another Stanford-led system, autonomously executes a wide range of research tasks in the life sciences, while the nonprofit FutureHouse is building a comparable platform. Each is vying to show that its approach can turn language models into real engines of discovery.

Many onlookers have been impressed, noting that the studies offer some of the clearest evidence yet that LLMs can generate ideas worth testing at the bench. “This is going to make our jobs much easier,” says Rodrigo Ibarra Chávez, a microbiologist at the University of Copenhagen in Denmark who studies the kind of bacterial genetic hitchhiking explored by the Imperial team.

But critics warn that an over-reliance on AI-generated hypotheses in science risks creating a closed loop that recycles old information instead of producing new discoveries.

“We need tools that augment our creativity and critical thinking, not repackage existing information using alternative language,” Kriti Gaur of the life sciences analytics firm Elucidata wrote in a white paper that evaluated the Google platform. “Until this ‘AI co-scientist’ can demonstrate original, verifiable, and meaningful insights that stand up to scientific scrutiny, it remains a powerful assistant, but certainly not a co-scientist.”

[Figure] A flowchart timeline of the research into how cf-PICIs are mobilized between bacterial species. The blue section traces the experimental pipeline that led to the discovery of DNA transfer among bacterial species; the orange section shows how the AI, with no prior knowledge, rapidly reached the same conclusions. José R. Penadés, Juraj Gottweis, et al.

Reasoning, Not Just Recall

Supporters counter that the latest generation of models shows glimmers of what scientists might reasonably call “intelligence.” Systems like Google’s co-scientist not only recall and synthesize vast libraries but also reason through competing possibilities, discard weaker ideas, and refine stronger ones in ways that can feel strikingly human.

“I find it very invigorating,” says Peltz. “It’s like having a conversation with someone who knows more than you.”

Still, the magic doesn’t happen automatically. Extracting valuable hypotheses requires careful prompting, iterative feedback, and a willingness to engage in a kind of dialogue with the AI, notes Swanson. It’s less like pressing a button for an answer and more like mentoring a junior colleague—asking the right questions, pushing back on shallow reasoning, and nudging the system toward sharper insights.

“For now, you still need to be a bit of an expert to get the most use out of these systems,” Swanson says. “But if you ask a well-designed question, you can get really good answers.”

