Positioning it as an AI model for the agentic era, Google has introduced Gemini 2.0, which the company calls its most capable model yet.
Announced December 11, the Gemini 2.0 Flash experimental model will be available to all Gemini users. Gemini 2.0 is billed as having advances in multimodality, such as native image and audio output, as well as native tool use. Google anticipates that Gemini 2.0 will enable the development of new AI agents closer to its vision of a universal assistant. Agentic models can understand more, think multiple steps ahead, and take action on a user's behalf, with supervision, Google CEO Sundar Pichai said.
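For developers curious to try the experimental model, access is through Google's generative AI tooling. The following is a minimal sketch only, assuming the google-genai Python SDK and the "gemini-2.0-flash-exp" model ID; neither the SDK nor the exact model string is specified in the announcement.

```python
# Minimal sketch: sending a text prompt to the experimental Gemini 2.0 Flash model.
# Assumes the google-genai Python SDK (pip install google-genai) and the
# "gemini-2.0-flash-exp" model ID; both are assumptions, not taken from the announcement.
from google import genai

client = genai.Client(api_key="YOUR_API_KEY")  # placeholder API key

response = client.models.generate_content(
    model="gemini-2.0-flash-exp",
    contents="Summarize what an agentic AI model can do.",
)
print(response.text)
```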
Gemini 2.0’s advances are underpinned by decade-long investments in a differentiated full-stack approach to AI innovation, Pichai said. The technology was built on custom hardware such as Trillium, Google's sixth-generation TPUs (tensor processing units), which powered Gemini 2.0 training and inference. Trillium is also generally available to customers who want to build with it.
With this announcement, Google also introduced a new feature, Deep Research, which leverages advanced reasoning and long-context capabilities to act as a research assistant, exploring complex topics and compiling reports. Deep Research is available in Gemini Advanced.
While Gemini 1.0, introduced in December 2023, was about organizing and understanding information, Gemini 2.0 is about making that information more useful, Pichai said. In touting Gemini 2.0, Google cited Project Mariner, an early research prototype built with Gemini 2.0 that explores the future of human-agent interaction, starting with the browser. The prototype can understand and reason across information on a browser screen, including pixels and web elements such as text, code, images, and forms, and then use that information via an experimental Chrome extension to complete tasks.