About the author: Michael Hogan, Ph.D., is a lecturer in psychology at the National University of Ireland, Galway.
Co-authored by Aleksandra Siwek, Laura Kearney, and Michael Hogan
If you have an interest in generative artificial intelligence (AI) and its use in organisations, you may have come across the study by Dell'Acqua and colleagues (2025), "The Cybernetic Teammate: A Field Experiment on Generative AI Reshaping Teamwork and Expertise," a large-scale field study at Procter & Gamble. This highly cited study reports substantial increases in productivity and performance when employees worked with AI. In the broader context of teamwork and organisational science, it's valuable to take a closer look at the study to advance understanding of human-AI collaboration and organisational workflows.
In this study, professionals were assigned to work with or without AI on real product innovation challenges in a one-day virtual workshop. Solutions were compared across four experimental conditions: (1) individual working without AI, (2) individual working with AI, (3) two-person team without AI, and (4) two-person team with AI. The AI system was based on GPT-4 accessed through Microsoft Azure.
Although overall solution quality was rated on a relatively simple 1-10 scale, the findings suggest that working with GPT-4 increased the quality of product innovation solutions. Crucially, among the top 10 percent of solutions by quality, teams working with AI were more likely to produce these solutions than individuals working with AI. This suggests that human teamwork continues to add value even when AI is available.
While Dell’Acqua and colleagues frame GenAI as a “Cybernetic Teammate,” the study protocol, which uses a variety of innovative prompting strategies, suggests that GPT-4 acts more like an interactive tool supporting idea generation and deliberation, but not necessarily a teammate. Importantly, genuine teamwork implies a process of role negotiation, emergent coordination, interdependent goal pursuit, and bidirectional influence. As such, a focus on GenAI as a team member implies a variety of unique analytical and design requirements.
In Figure 1 below, we highlight four levels of analysis that are important to consider when evaluating human-AI teamwork. We will describe each level in turn, from top to bottom.
Before designing and evaluating human-AI teamwork, it's important to understand the task-process architecture. The Cybernetic Teammate study does not present a detailed task-process analysis related to distinct and interdependent human and AI work roles. Two established models are useful here: McGrath's Task Circumplex and Steiner's Task Taxonomy (see Forsyth, 2014). McGrath’s model asks us to define the nature of the teamwork task across cooperation-conflict and conceptual-behavioural dimensions. It highlights four task types (Generate, Choose, Execute, and Negotiate) that are relevant for human-AI collaboration. The product innovation task used by Dell’Acqua and colleagues largely involves a series of Generate task functions—iterative creative ideation and deliberation—but also a series of Choose task functions, as individuals and teams converge on solutions. However, without a detailed task-process analysis, it is unclear how Generate, Choose, Execute, and Negotiate task functions are operative in the teamwork scenario.
Steiner's model asks us to reflect further on issues of task divisibility, performance criteria, and interdependence. Divisibility determines whether human and AI team members can work on separate task components or whether teamwork centres on a unitary (i.e., indivisible) task requiring continuous coordination. In the Cybernetic Teammate study, it is unclear how GenAI might be working independently, and it is unclear how different team member inputs were coordinated.
Performance Criteria determine whether human-AI teams optimise for quantity or quality. While Dell’Acqua and colleagues focused on solution quality, it is unclear how the prompts (task instructions) given to humans and AIs aligned with the quality criteria used by experts to evaluate solutions.
Interdependence analyses clarify how human and AI contributions combine (e.g., Does it involve simple additive processes or complex coordination?). The Cybernetic Teammate study does not provide a detailed analysis of interdependence patterns (i.e., the extent to which participants coordinated their knowledge with AI input), although a substantial proportion of participants retained AI-generated content in their final solutions.
The "Big Five" Teamwork Model developed by Salas and colleagues (2005) identifies five core features of effective teamwork: team leadership, mutual performance monitoring, backup behaviour, adaptability, and team orientation. These behaviours are sustained and coordinated by mutual trust, closed-loop communication, and shared mental models. Although the Cybernetic Teammate study doesn't analyse these teamwork behaviours, future workplace studies can clarify how GenAI can function as a genuine teammate, including whether it can display these core behaviours.
Notably, Dell'Acqua and colleagues interpret positive emotional outcomes as evidence of GenAI emulating teamwork's social aspect. However, their approach to analysis does not address trust and team orientation, which are central to effective teamwork.
An important shortcoming of the Cybernetic Teammate study, and of other studies focusing on performance and productivity effects, is the limited analysis of team development dynamics. The team development model proposed by Wang and colleagues (2025) is useful here. This model highlights how human-AI teams can evolve through different developmental phases, where the focus shifts from team formation to task-role development, team development, and, ultimately, team improvement.
By focusing on a one-day workshop, the study takes a snapshot at a single point in time. The primary focus appears to be task-role development: developing role clarity, building capability awareness, and managing task allocation. However, the process that supports this development isn't fully specified.
The final level in our analytical framework, human critical leadership, is critical for effective human-AI teamwork. It includes reflective oversight (i.e., meta-cognitive monitoring of AI and team performance with strategic adaptation), ethical stewardship (i.e., bias detection, stakeholder impact assessment, and responsible AI use), strategic vision (i.e., purpose maintenance, change management, and long-term goal alignment within organisations), and relationship management (i.e., team cohesion building, member development, and social support). These essential human functions cannot be delegated to AI systems. Critical leadership functions are not central to the Cybernetic Teammate study. Nevertheless, this study represents an important step toward developing a comprehensive understanding of human-AI teamwork and toward designing, analysing, and evaluating organisational teamwork dynamics in this new era of GenAI. We have much to learn, and we need to proceed with careful, systematic analysis.