A report from a British university warns that scientific knowledge itself is under threat from a flood of low-quality AI-generated research papers.
The research team from the University of Surrey notes an "explosion of formulaic research articles," including inappropriate study designs and false discoveries, based on data cribbed from the US National Health and Nutrition Examination Survey (NHANES), a nationwide health database.
The study, published in PLOS Biology, a journal from the nonprofit open-access publisher PLOS, found that many post-2021 papers used "a superficial and oversimplified approach to analysis." These often focused on a single variable while ignoring more realistic, multi-factor explanations of the links between health conditions and potential causes, and some cherry-picked narrow subsets of the data without justification.
"We've seen a surge in papers that look scientific but don't hold up under scrutiny – this is 'science fiction' using national health datasets to masquerade as science fact," states Matt Spick, a lecturer in health and biomedical data analytics at Surrey University, and one of the authors of the report.
"The use of these easily accessible datasets via APIs, combined with large language models, is overwhelming some journals and peer reviewers, reducing their ability to assess more meaningful research – and ultimately weakening the quality of science overall," he added.
The report notes that AI-ready datasets, such as NHANES, can open up new opportunities for data-driven research, but also create a risk of data exploitation by what it calls "paper mills" – entities that churn out questionable scientific papers, often for paying clients seeking confirmation of an existing belief.
Surrey Uni's work involved a systematic literature search going back ten years to retrieve potentially formulaic papers covering NHANES data, which were then analyzed for telltale statistical approaches and study designs.
The team identified and retrieved 341 reports published across a number of different journals. It found that over the last three years there has been a rapid rise in the number of publications analyzing single-factor associations between predictors (independent variables) and various health conditions using the NHANES dataset. An average of four papers per year was published between 2014 and 2021, rising to 33 in 2022, 82 in 2023, and 190 in the first ten months of 2024.
Also noted is a change in the origins of the published research. From 2014 to 2020, just two out of 25 manuscripts had a primary author affiliation in China. Between 2021 and 2024, this rose to 292 out of 316 manuscripts.
The report says this jump in single-factor associative research means there is a corresponding increase in the risk of misleading findings being introduced to the wider body of scientific literature.
For example, it says that some well-known multifactorial health issues – depression, cardiovascular disease, and cognitive function among them – were nonetheless investigated using simplistic, single-factor approaches in some of the papers reviewed.
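To illustrate the distinction the researchers are drawing, here is a minimal Python sketch – not code from the study, and using a hypothetical NHANES extract with hypothetical column names – contrasting the flagged single-predictor pattern with an analysis that adjusts for plausible confounders:

```python
# Illustrative sketch only: the file name and column names below are hypothetical,
# not taken from the Surrey study or the NHANES codebook.
import pandas as pd
import statsmodels.formula.api as smf

df = pd.read_csv("nhanes_extract.csv")  # hypothetical pre-built NHANES extract

# The "single-factor" pattern the report flags: one predictor, no covariates.
single_factor = smf.logit("depression ~ vitamin_d", data=df).fit()

# A multifactorial alternative: the same association, adjusted for plausible confounders.
adjusted = smf.logit(
    "depression ~ vitamin_d + age + sex + bmi + smoking + income", data=df
).fit()

print(single_factor.summary())
print(adjusted.summary())
```

In the single-factor version, any confounding – say, income influencing both vitamin D levels and depression – can surface as an apparent association; the adjusted model at least attempts to account for such factors before a "discovery" is claimed.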
To combat this, the team sets out a number of suggestions, including that editors and reviewers at scientific journals should regard single-factor analysis of conditions known to be complex and multifactorial as a "red flag" for potentially problematic research.
Providers of datasets should also take steps, such as requiring API keys and application numbers, to prevent data dredging, an approach already used by the UK Biobank, the report says. Publications referencing such data should be required to include an auditable account number as a condition of access.
Another suggestion is that analysis of the full dataset should be made mandatory unless the use of data subsets can be justified.
"We're not trying to block access to data or stop people using AI in their research – we're asking for some common sense checks," said Tulsi Suchak, a post-graduate researcher at the University of Surrey and lead author of the study. "This includes things like being open about how data is used, making sure reviewers with the right expertise are involved, and flagging when a study only looks at one piece of the puzzle."
This isn't the first time the issue has come to light. Last year, US publishing house Wiley discontinued 19 scientific journals overseen by its Hindawi subsidiary that were publishing reports churned out by AI paper mills.
It is also part of a wider problem of AI-generated content appearing online and in web searches that can be difficult to distinguish from reality. Dubbed "AI slop," this includes fake pictures and entire video sequences of celebrities and world leaders, but also fake historical photographs and AI-generated portraits of historical figures appearing in search results as if they were genuine.
Truly, AI is the gift that keeps on giving. ®