OpenAI unveiled Operator, its first AI agent, for ChatGPT Pro subscribers in the US.
It can autonomously complete tasks like booking reservations or buying groceries.
The agent is powered by CUA, a new model built on GPT-4o.
Experts predicted that 2025 would be the year AI agents go mainstream, and OpenAI is delivering on that forecast.
On Thursday, OpenAI unveiled Operator, a system that can use a web browser to do things like book travel reservations and buy products.
While chatbots like OpenAI's popular ChatGPT use generative AI to respond to queries, Operator is an agent designed to perform tasks autonomously.
OpenAI said Operator would be available Thursday in the US for users of ChatGPT Pro, a $200 monthly plan that provides access to its latest models, including o1. In the coming months, the company said, it will also be made available to subscribers of ChatGPT Plus, OpenAI's $20 monthly subscription tier, and to users in other countries.
During a livestream announcing Operator on Thursday, OpenAI CEO Sam Altman called the release an "early research preview," adding that it would be refined over the coming months. He said OpenAI would also have more agents to launch.
The interface is similar to ChatGPT. Users prompt Operator with a request, like "book a dinner reservation at 7 p.m." They can select a specific website through which they want to process the request, such as OpenTable, or send the request through a search engine like Google.
Operator summarizes its reasoning process in a sidebar so users can identify steps where it makes mistakes, which OpenAI says it is still prone to making.
Users can also upload a picture of a handwritten grocery list and prompt Operator to purchase the items on the list.
Users can choose a specific site, such as Instacart, for Operator to purchase the groceries from. If no site is selected, it will default to a search engine.
Reiichiro Nakano, a member of the company's technical staff, said in the livestream that Operator was powered by CUA, a new model built on GPT-4o.
It's "trained to use and control a computer in the same way that humans can, by just looking at the screen and using a mouse and keyboard to control it," he said.
Nakano said the model bypassed the need for APIs, mechanisms that allow software components to communicate with each other, and "unlocks a whole new range of software we can use that was previously inaccessible."
He added that the model removed "one more bottleneck in our path towards AGI," or artificial general intelligence.
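To make the idea of controlling software through the screen rather than an API more concrete, here is a minimal sketch of an observe-then-act loop. It is an illustration only, not OpenAI's implementation: the `FakeScreen` class, the `Action` type, and the `choose_action` policy are all hypothetical stand-ins, and a toy dictionary replaces the pixels a real agent would look at.

```python
from dataclasses import dataclass, field


@dataclass
class Action:
    kind: str         # "type", "click", or "done" (hypothetical action set)
    target: str = ""  # label of the on-screen element being acted on
    text: str = ""    # text to enter, used only by "type" actions


@dataclass
class FakeScreen:
    """Toy stand-in for a browser window; not a real browser."""
    fields: dict = field(default_factory=dict)
    submitted: bool = False

    def observe(self) -> dict:
        # A real agent would receive a screenshot; this returns a state snapshot.
        return {"fields": dict(self.fields), "submitted": self.submitted}

    def apply(self, action: Action) -> None:
        # Simulate keyboard and mouse input changing what is on screen.
        if action.kind == "type":
            self.fields[action.target] = action.text
        elif action.kind == "click" and action.target == "submit":
            self.submitted = True


def choose_action(observation: dict, goal: dict) -> Action:
    """Hypothetical policy: fill in any missing field, then click submit."""
    for name, value in goal.items():
        if observation["fields"].get(name) != value:
            return Action("type", name, value)
    if not observation["submitted"]:
        return Action("click", "submit")
    return Action("done")


def run_agent(screen: FakeScreen, goal: dict, max_steps: int = 10) -> list:
    """Repeat observe -> decide -> act until the policy reports it is done."""
    log = []
    for _ in range(max_steps):
        action = choose_action(screen.observe(), goal)
        if action.kind == "done":
            break
        screen.apply(action)
        log.append(action)  # a step-by-step record, loosely like a sidebar of steps
    return log


screen = FakeScreen()
steps = run_agent(screen, {"time": "7 p.m.", "party_size": "2"})
for step in steps:
    print(step.kind, step.target, step.text)
```

The point of the sketch is the shape of the loop: the agent only reads what is on the screen and issues keyboard and mouse actions, so the site does not need to expose an API. The logged steps are also what a user could review to spot where a run went wrong.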
Still, Operator has a way to go before it matches humans' ability to navigate the web.
OpenAI said that in a benchmark measuring how AI agents navigate common operating systems, like the open-source operating system Linux, Operator scored 38.1%, compared with 72.4% for humans. In another benchmark measuring how AI agents navigate common websites, Operator scored 58.1%, compared with 78.2% for humans.
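As a quick way to put those figures side by side, the short sketch below computes the gap between Operator and humans on each benchmark. The scores are the ones reported above; the dictionary keys and helper function names are made up for illustration.

```python
# Scores in percent, as reported by OpenAI; the labels are illustrative.
scores = {
    "operating systems": {"operator": 38.1, "human": 72.4},
    "websites": {"operator": 58.1, "human": 78.2},
}


def gap_points(operator: float, human: float) -> float:
    """Difference between the human and Operator scores, in percentage points."""
    return round(human - operator, 1)


def share_of_human(operator: float, human: float) -> float:
    """Operator's score as a percentage of the human score."""
    return round(100 * operator / human, 1)


for name, s in scores.items():
    gap = gap_points(s["operator"], s["human"])
    share = share_of_human(s["operator"], s["human"])
    print(f"{name}: {gap} points behind humans, {share}% of the human score")
```

By this arithmetic, Operator trails humans by 34.3 percentage points on the operating-system benchmark and by 20.1 points on the website benchmark.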