作者:Loreben Tuquero
The authors of the “Make America Healthy Again” report issued by Health and Human Services Secretary Robert F. Kennedy Jr. touted it as a landmark assessment providing a “common scientific basis” to shape health policy.
But that “scientific basis” appeared to have errors generated by a likely culprit: generative artificial intelligence.
At least seven of the report’s citations were problematic, as NOTUS first reported. Four contained titles of papers that don’t exist, and three mischaracterized the articles’ findings.
When asked about the report, White House Press Secretary Karoline Leavitt attributed the errors to “formatting issues” that do not “negate the substance of the report.” When asked if AI was used in producing the report, Leavitt deferred to the Health and Human Services Department.
The MAHA report has since been updated online. (Here’s the archived version.) PolitiFact reached out to the Health and Human Services Department but did not hear back.
AI models are trained to mimic human language by predicting one word after another in a sequence. Although AI chatbots like ChatGPT often succeed in producing text that reads as if a human wrote it, they do not reliably ensure that what they say is factual.
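To illustrate that idea in miniature, here is a toy Python sketch. It is nothing like a production chatbot and the training string is invented for illustration; the point is only that each word is chosen from statistical patterns in prior text, with plausibility rather than truth guiding every choice.

```python
# Toy illustration (not how production models work): pick the next word
# purely from statistical patterns observed in a sample of text.
import random
from collections import Counter, defaultdict

# Hypothetical training text, used only for this example.
training_text = "the report cited the study and the report cited the survey"

# Count which word tends to follow which (a simple bigram table).
follows = defaultdict(Counter)
words = training_text.split()
for current, nxt in zip(words, words[1:]):
    follows[current][nxt] += 1

def next_word(word):
    """Sample a likely next word; plausibility, not truth, drives the choice."""
    options = follows.get(word)
    if not options:
        return None
    candidates, counts = zip(*options.items())
    return random.choices(candidates, weights=counts)[0]

# Generate a short sequence one word at a time, the way chatbots do at scale.
word, output = "the", ["the"]
for _ in range(6):
    word = next_word(word)
    if word is None:
        break
    output.append(word)
print(" ".join(output))
```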
The fake citations were formatted correctly, listed reputable journals and included realistic-sounding digital object identifiers, or DOIs.
But the fact that multiple articles cited did not exist “is a hallmark of AI-generated citations, which often replicate the structure of academic references without linking to actual sources,” said Oren Etzioni, a University of Washington professor emeritus and AI researcher.
PolitiFact spoke with researchers in artificial intelligence and neuroscience about the report’s AI-related red flags.
Researchers said the presence of fabricated articles is likely the result of AI “hallucinations,” or results that may sound plausible but are not real.
AI is rapidly advancing but is still prone to hallucinations. When prompted to generate academic references, generative AI models will often make something up if they find no exact matches, “especially if prompted to support a specific point,” Etzioni said.
Steven Piantadosi, a psychology and neuroscience professor at the University of California, Berkeley, who leads its computation and language lab, said AI models don’t have any way to know what is true or what counts as evidence.
“All they do is match statistical patterns in text,” he said. “It is interesting and important that they can do this very well, but statistical dependencies between characters is not what you should build a public policy around.”
The Washington Post reported that some citations included “oaicite” in their URLs, which ChatGPT users have reported as text that appears in their output. (OpenAI owns ChatGPT.)
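For illustration only, here is a short Python sketch of how such a telltale string could be flagged in a document’s links; the reference text below is invented for the example and is not taken from the MAHA report.

```python
import re

# Hypothetical snippet of reference text, used only for illustration.
references = """
1. Smith J, et al. Adolescent mental health trends. JAMA Pediatrics.
   https://example.org/article?utm_source=chatgpt.com&oaicite=3
2. Jones K. Asthma treatment guidelines. https://doi.org/10.1000/182
"""

# Flag any URL carrying the "oaicite" marker that ChatGPT users have reported.
suspect = [url for url in re.findall(r"https?://\S+", references) if "oaicite" in url]
print(suspect)
```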
Even in the report’s legitimate citations, some findings were misrepresented or exaggerated — another error common to generative AI tools, which “can confidently produce incorrect summaries of research,” Etzioni said.
The updated version of the MAHA report replaced the fake citations with sources that backed its findings, and in some places, revised how it presented the findings previously linked to the fake citations. (See our spreadsheet.)
One of the fake articles flagged by NOTUS was titled, “Changes in mental health and substance use among US adolescents during the COVID-19 pandemic.” The line that cited it read, “Approximately 20-25% of adolescents reported anxiety symptoms and 15-20% reported depressive symptoms, with girls showing significantly higher rates.”
A closer examination of the citation shows why it’s not authentic: Searching the title does not yield a real article, clicking the DOI link in the citation leads to an error page saying “DOI not found,” and looking at the JAMA Pediatrics volume and issue number referenced leads to an article with a different title and different authors.
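The DOI step of that check can be done by hand in a browser, but as a rough illustration, here is a minimal Python sketch that asks the doi.org resolver whether an identifier exists. It assumes network access and the third-party requests library, and a real workflow would need more error handling; the identifier passed at the end is just an example.

```python
# Minimal sketch: ask doi.org whether a DOI is registered.
# The resolver redirects (3xx) for real DOIs and answers 404 ("DOI not found")
# for identifiers that do not exist.
import requests

def doi_resolves(doi: str) -> bool:
    """Return True if doi.org can resolve the DOI, False if it reports 'DOI not found'."""
    response = requests.head(
        f"https://doi.org/{doi}", allow_redirects=False, timeout=10
    )
    return response.status_code != 404

# Example identifier for illustration; a fabricated DOI would return False.
print(doi_resolves("10.1000/182"))
```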
The updated MAHA report replaced the citation with a 2024 report from KFF, which said that in 2021 and 2022, 21% of adolescents reported anxiety symptoms and 17% reported depression symptoms.
The original report cited two nonexistent articles in a section about direct-to-consumer advertising and its links to ADHD drug use by children and antidepressant use by teenagers. The report said advertising for antidepressant use in teenagers showed “vague symptom lists that overlap with typical adolescent behaviors” and was linked to “inappropriate parental requests for antidepressants.”
The revised report now reads, “DTC advertising is believed to encourage greater use of psychotropic medications in adolescents, including antianxiety, antipsychotic, and antidepressant classes,” citing a 2006 study that used data from 1994 to 2001. That study’s authors believed direct-to-consumer advertising played a part in encouraging greater use of psychotropics.
Another finding in the MAHA report about asthma drug prescriptions said “an estimated 25-40% of mild cases are overprescribed,” citing a nonexistent article. Pediatric pulmonologist Dr. Harold J. Farber, who was listed as the article’s supposed first author, told NOTUS that was an “overgeneralization” of his research.
The updated report removed those figures. It now reads, “There is evidence of overprescription of oral corticosteroids for mild cases of asthma.”
The MAHA report incident highlights the risks of including AI-generated content in official government reports without human review.
“If AI tools were used to generate citations for a federal report, that raises serious accountability questions. Without clear disclosure or verification processes, there is no way for readers to know whether the information is grounded in real evidence,” Etzioni said. “In matters of public health, especially involving children and mental health, these lapses are particularly troubling.”
This fact check was originally published by PolitiFact, which is part of the Poynter Institute. See the sources for this fact check here.