Microsoft will soon let Copilot agents drive computers through the GUI just like humans – by clicking buttons, selecting menus, and even completing forms on screen.
On Wednesday, the Windows empire said it plans to enable computer use from within Copilot Studio – Microsoft's platform for building and deploying AI agents. This will spare employees from having to click buttons and fill forms themselves, while still keeping enterprise data corralled inside Microsoft's cloud – Redmond insists none of it is used to train its models.
"Computer use enables agents to interact with websites and desktop apps by clicking buttons, selecting menus, and typing into fields on the screen," explained Charles Lamanna, corporate VP for business and industry, Copilot, in the corp's marketing bumf.
"This allows agents to handle tasks even when there is no API available to connect to the system directly. If a person can use the app, the agent can too."
AI agents are, as far as we can tell, pieces of software that talk to other pieces of software as well as users, using generative AI to make decisions and form outputs.
Today, Microsoft Copilot Studio enables customers to create AI-driven agents to automate certain tasks, but these agents only work with specific services, like SharePoint. The new type of agent should be much more flexible. For instance, you could create an agent and prompt it to carry out a series of steps that involve browsing a previously unseen website, extracting some data, and passing that data to a desktop app.
Lamanna suggested several scenarios where the new Copilot agents could come in handy, such as automating the input of large amounts of data from multiple sources to a central repository, automatically collecting market data for research, or using AI text and image recognition capabilities to process invoices.
AI automation differs from programmed instructions in that the agent can adapt on the fly when it encounters obstacles or unexpected changes in the interface. Instead of crashing with an error, it uses built-in reasoning to muddle through, at least according to Microsoft.
"Computer use adapts to changes in apps and websites automatically," Lamanna claimed. "It adjusts in real time using built-in reasoning to fix issues on its own, so work continues without interruption."
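To make the contrast with programmed instructions concrete, here is a toy Python sketch. It is not Microsoft's implementation, and nothing in it comes from Copilot Studio: the `screen` dictionaries, the control IDs, and both function names are hypothetical, and the fuzzy string match from the standard library's `difflib` is only a crude stand-in for whatever reasoning an agent's model actually performs. It simply shows a hard-coded script breaking when a button is renamed while a more forgiving lookup carries on.

```python
from difflib import get_close_matches

# Hypothetical snapshots of a UI before and after a redesign:
# visible label -> internal control ID (all values invented for illustration).
old_ui = {"Submit": "btn-17", "Cancel": "btn-18"}
new_ui = {"Submit form": "btn-42", "Cancel": "btn-43"}


def scripted_click(screen, label):
    """Classic scripted automation: the exact label must exist, or it fails."""
    if label not in screen:
        raise LookupError(f"no control labelled {label!r}")
    return screen[label]


def adaptive_click(screen, label, cutoff=0.6):
    """Try the exact label first, then fall back to the closest visible label.

    The fuzzy match is an assumption standing in for model-driven reasoning.
    """
    if label in screen:
        return screen[label]
    close = get_close_matches(label, list(screen), n=1, cutoff=cutoff)
    if not close:
        raise LookupError(f"nothing resembling {label!r} on screen")
    return screen[close[0]]


if __name__ == "__main__":
    print(scripted_click(old_ui, "Submit"))   # works against the old layout
    print(adaptive_click(new_ui, "Submit"))   # still finds the renamed button
    try:
        scripted_click(new_ui, "Submit")
    except LookupError as err:
        print("scripted run failed:", err)
```

The same forgiving behavior is also where the risk lies: a lookup that guesses can guess wrong, which is why the cutoff is exposed as a parameter here.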
With any luck, said reasoning does not involve unexpected deletions or policy violations, as one concerned user fretted in a social media thread solicited by a Copilot Studio product manager.
However, turning over computational tasks to Copilot may involve unanticipated costs. As with cloud services, the bill for AI's boil-the-ocean approach to computation isn't necessarily easy to anticipate, and there's potential for bill shock if certain tasks turn out to be computationally demanding.
Concerns about costs have been raised by users of OpenAI's computer use API, and by users of Anthropic's computer use API.
Microsoft is bringing computer use to Copilot Studio users through an early access research preview that requires a signup. Expect to hear more about this at Microsoft Build 2025 next month. ®