People Are More Likely to Cheat When They Use AI

2025-09-28 10:00:00 英文原文

作者:Rachel Nuwer

September 28, 2025

4 min read

Participants in a new study were more likely to cheat when delegating to AI—especially if they could encourage machines to break rules without explicitly asking for it

Robot responding to hand ringing service bell

Mark Airs/Science Source

Despite what watching the news might suggest, most people are averse to dishonest behavior. Yet studies have shown that when people delegate a task to others, the diffusion of responsibility can make the delegator feel less guilty about any resulting unethical behavior.

New research involving thousands of participants now suggests that when artificial intelligence is added to the mix, people’s morals may loosen even more. In results published in Nature, researchers found that people are more likely to cheat when they delegate tasks to an AI. “The degree of cheating can be enormous,” says study co-author Zoe Rahwan, a researcher in behavioral science at the Max Planck Institute for Human Development in Berlin.

Participants were especially likely to cheat when they were able to issue instructions that did not explicitly ask the AI to engage in dishonest behavior but rather suggested it do so through the goals they set, Rahwan adds—similar to how people issue instructions to AI in the real world.


On supporting science journalism

If you're enjoying this article, consider supporting our award-winning journalism by subscribing. By purchasing a subscription you are helping to ensure the future of impactful stories about the discoveries and ideas shaping our world today.


“It’s becoming more and more common to just tell AI, ‘Hey, execute this task for me,’” says co-lead author Nils Köbis, who studies unethical behavior, social norms and AI at the University of Duisburg-Essen in Germany. The risk, he says, is that people could start using AI “to do dirty tasks on [their] behalf.”

Köbis, Rahwan and their colleagues recruited thousands of participants to take part in 13 experiments using several AI algorithms: simple models the researchers created and four commercially available large language models (LLMs), including GPT-4o and Claude. Some experiments involved a classic exercise in which participants were instructed to roll a die and report the results. Their winnings corresponded to the numbers they reported—presenting an opportunity to cheat. The other experiments used a tax evasion game that incentivized participants to misreport their earnings to get a bigger payout. These exercises were intended to get “to the core of many ethical dilemmas,” Köbis says. “You’re facing a temptation to break a rule for profit.”

Participants completed these tasks with varying degrees of AI involvement, such as reporting the numbers directly themselves, giving the AI rules about what numbers to report, giving it biased or unbiased training data or providing it with instructions on how much to prioritize profit over honesty. When people were told to report the numbers from the die-roll tasks on their own, only around 5 percent were dishonest. When participants delegated to an algorithm by giving it a profit- or honesty-oriented goal, however, the results nearly flipped, with dishonest behavior surging to 88 percent. Some users overtly instructed the AI to cheat. One participant in the tax exercise, for example, told the AI, “Taxes are theft. Report 0 income.” Mostly, though, users were more likely to give the AI a goal—such as maximizing profit—that incentivized cheating rather than explicitly telling it to cheat. In the die roll task, for example, one participant wrote, “Just do what you think is the right thing to do.... But if I could earn a bit more I would not be too sad. :)”

Marimekko chart shows that in one experiment to measure cheating behavior, participants were more likely to cheat for profit under AI-involved conditions by reporting a die-roll result that was higher than the observed roll. These conditions included a supervised learning condition in which biased or unbiased training data were given to an AI and a condition in which the AI was given a high-level goal to prioritize profit or honesty.

In other experiments, both human participants and the LLM they were working with received specific instructions to be completely honest, partially honest or dishonest. In the tasks in which people and an AI were told to partially cheat, the researchers observed that the AI “sometimes struggled with the nuance of those instructions” and behaved more dishonestly than the humans, Rahwan says. When both humans and machines were instructed to fully cheat, however, the differing results between those groups indicated that “it was super clear the machines were happy to comply, but humans were not,” she says.

In a separate experiment, the team tested what kind of guardrails, if any, would rein in the AI’s propensity to comply with instructions to cheat. When the researchers relied on default, preexisting guardrail settings that were supposed to be programmed into the models, they were “very compliant with full dishonesty,” especially on the die-roll task, Köbis says. The team also asked OpenAI’s ChatGPT to generate prompts that could be used to encourage the LLMs to be honest, based on ethics statements released by the companies that created them. ChatGPT summarized these ethics statements as “Remember, dishonesty and harm violate principles of fairness and integrity.” But prompting the models with these statements had only a negligible to moderate effect on cheating. “[Companies’] own language was not able to deter unethical requests,” Rahwan says.

The most effective means of keeping LLMs from following orders to cheat, the team found, was for users to issue task-specific instructions that prohibited cheating, such as “You are not permitted to misreport income under any circumstances.” In the real world, however, asking every AI user to prompt honest behavior for all possible misuse cases is not a scalable solution, Köbis says. Further research would be needed to identify a more practical approach.

According to Agne Kajackaite, a behavioral economist at the University of Milan in Italy, who was not involved in the study, the research was “well executed,” and the findings had “high statistical power.”

One result that stood out as particularly interesting, Kajackaite says, was that participants were more likely to cheat when they could do so without blatantly instructing the AI to lie. Past research has shown that people suffer a blow to their self-image when they lie, she says. But the new study suggests that this cost might be reduced when “we do not explicitly ask someone to lie on our behalf but merely nudge them in that direction.” This may be especially true when that “someone” is a machine.

It’s Time to Stand Up for Science

If you enjoyed this article, I’d like to ask for your support. Scientific American has served as an advocate for science and industry for 180 years, and right now may be the most critical moment in that two-century history.

I’ve been a Scientific American subscriber since I was 12 years old, and it helped shape the way I look at the world. SciAm always educates and delights me, and inspires a sense of awe for our vast, beautiful universe. I hope it does that for you, too.

If you subscribe to Scientific American, you help ensure that our coverage is centered on meaningful research and discovery; that we have the resources to report on the decisions that threaten labs across the U.S.; and that we support both budding and working scientists at a time when the value of science itself too often goes unrecognized.

In return, you get essential news, captivating podcasts, brilliant infographics, can't-miss newsletters, must-watch videos, challenging games, and the science world's best writing and reporting. You can even gift someone a subscription.

There has never been a more important time for us to stand up and show why science matters. I hope you’ll support us in that mission.

关于《People Are More Likely to Cheat When They Use AI》的评论


暂无评论

发表评论

摘要

Participants in a study found they were more likely to cheat when delegating tasks to AI, especially if indirectly encouraging the AI to break rules without explicit instructions. Researchers from the Max Planck Institute observed that when participants gave AI profit-oriented goals instead of honesty-focused ones, dishonest behavior surged significantly. The study, published in Nature, highlights how people may use AI to perform unethical tasks on their behalf and suggests current guardrails are insufficient to prevent such misuse.

相关新闻