Google DeepMind is touting Gemini 2.5 as its most advanced AI model to date, boasting unparalleled intelligence.
The initial model from this latest generation, an experimental version of Gemini 2.5 Pro, has reportedly achieved state-of-the-art results across a broad spectrum of benchmarks, according to DeepMind.
Koray Kavukcuoglu, CTO of Google DeepMind, refers to the Gemini 2.5 models as “thinking models,” emphasizing their capacity to reason through complex thoughts before generating a response, thereby enhancing performance and accuracy.
According to Kavukcuoglu, the ability to “reason” extends far beyond mere classification and prediction, encompassing the system’s capacity to analyze information, deduce logical conclusions, incorporate context and nuance, and ultimately make informed decisions.
DeepMind has been actively exploring methods to augment AI’s intelligence and reasoning capabilities for some time, utilizing techniques such as reinforcement learning and chain-of-thought prompting. This groundwork led to the introduction of their first thinking model, Gemini 2.0 Flash Thinking.
“With Gemini 2.5,” Kavukcuoglu notes, “we have achieved a new level of performance by combining a significantly enhanced base model with improved post-training.”
Google plans to integrate these thinking capabilities directly into all future models, enabling them to tackle more complex problems and support more capable, context-aware agents.
Gemini 2.5 Pro secures the top spot on the LMArena leaderboard
Gemini 2.5 Pro Experimental is positioned as DeepMind’s most advanced model for handling intricate tasks. It currently tops the LMArena leaderboard, which ranks models by human preference, by a significant margin, indicating a highly capable model with a high-quality style.

Gemini 2.5 excels in maths, science, coding, and reasoning
Gemini 2.5 Pro has demonstrated state-of-the-art performance across various benchmarks that demand advanced reasoning, including those related to maths and science.
Notably, it leads in maths and science benchmarks – such as GPQA and AIME 2025 – without relying on test-time techniques that increase costs, like majority voting. Additionally, it achieved a state-of-the-art score of 18.8% on Humanity’s Last Exam, a dataset designed to evaluate the human frontier of knowledge and reasoning.
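For readers unfamiliar with the term, majority voting means sampling several answers to the same question and keeping the most frequent one, which multiplies inference cost by the number of samples. The sketch below is a generic illustration of that idea, not DeepMind’s evaluation code; the function name `majority_vote` and the sample answers are hypothetical.

```python
from collections import Counter


def majority_vote(answers):
    """Return the most frequent answer among several sampled answers.

    Hypothetical helper for illustration only: each entry in `answers`
    stands for one independent model response to the same question.
    """
    if not answers:
        raise ValueError("need at least one sampled answer")
    counts = Counter(answers)
    # most_common(1) returns a list holding the (answer, count) pair
    # with the highest count.
    winner, _ = counts.most_common(1)[0]
    return winner


# Five made-up samples for one question: five model calls instead of one.
samples = ["42", "41", "42", "42", "17"]
final_answer = majority_vote(samples)
```

A single-attempt result, as reported for Gemini 2.5 Pro, corresponds to using one sample and skipping the vote entirely.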
DeepMind has placed significant emphasis on coding performance, and Gemini 2.5 represents a substantial leap forward compared to its predecessor, Gemini 2.0, with further improvements in the pipeline. Gemini 2.5 Pro excels in creating visually compelling web applications and agentic code applications, as well as code transformation and editing.
On SWE-Bench Verified, the industry standard for agentic code evaluations, Gemini 2.5 Pro achieved a score of 63.8% using a custom agent setup. The model’s reasoning capabilities also enable it to create a video game by generating executable code from a single-line prompt.
Building on the strengths of its predecessors
Gemini 2.5 builds upon the core strengths of earlier Gemini models, including native multimodality and a long context window. Gemini 2.5 Pro launches with a one million token context window, with plans to expand this to two million tokens soon, enabling the model to comprehend vast datasets and handle complex problems from diverse information sources.
Developers and enterprises can now begin experimenting with Gemini 2.5 Pro in Google AI Studio, while Gemini Advanced users can access it via the model dropdown on desktop and mobile platforms. The model will be rolled out on Vertex AI in the coming weeks.
Google DeepMind encourages users to provide feedback, which will be used to further enhance Gemini’s capabilities.
(Photo by Anshita Nair)