OpenAI is rolling out an update to the AI model that powers Operator, its autonomous agent that can browse the web and use certain software within a cloud-hosted virtual machine to complete tasks on behalf of users.
The updated Operator will be powered by a model based on o3, a cutting-edge model in OpenAI’s o series of “reasoning” models, replacing the existing custom version of GPT-4o.
Notably, o3 outperforms GPT-4o on many benchmarks, particularly those involving math and reasoning.
As stated in a blog post by OpenAI, “We are replacing the existing GPT-4o-based model for Operator with a version based on OpenAI o3. The API version of Operator will continue to be based on 4o.”
Operator is part of a growing suite of agentic tools released by AI companies in recent months, with the goal of creating sophisticated agents that can perform tasks with minimal supervision.
Similarly, Google offers a “computer use” agent through its Gemini API, which can browse the web and perform actions on behalf of users, as well as a consumer-focused offering called Mariner. Additionally, Anthropic’s models can perform computer tasks, including opening files and navigating web pages.
According to OpenAI, the new o3 Operator model has been fine-tuned with additional safety data for computer use, including datasets designed to teach the model decision boundaries on confirmations and refusals.
A technical report released by OpenAI details o3 Operator’s performance on specific safety evaluations. Compared with the GPT-4o Operator model, o3 Operator is less likely to perform “illicit” activities or search for sensitive personal data, and it is less susceptible to prompt injection attacks.
As noted in OpenAI’s blog post, “o3 Operator employs the same multi-layered approach to safety as the 4o version of Operator. Although o3 Operator inherits o3’s coding capabilities, it does not have native access to a coding environment or terminal.”