
Beyond AI: How Google Gemini 2.0 is Leading the Agentic Era


Over the past year, we’ve made remarkable progress in artificial intelligence. Today, we’re launching the first model in the Google Gemini 2.0 family: an experimental version of Gemini 2.0 Flash. It’s our workhorse model, delivering high performance and low latency at the cutting edge of our technology, at scale.

We also present the latest developments in our agentic research by displaying prototypes enabled by Google Gemini’s natural multimodal capabilities.


Google Gemini API: Gemini 2.0 Flash


Gemini 2.0 Flash builds on the success of 1.5 Flash, our most popular model yet for developers, with improved performance at similarly fast response times. Notably, 2.0 Flash even outperforms 1.5 Pro on key benchmarks, at twice the speed. It also brings new capabilities: in addition to supporting multimodal inputs like images, video, and audio, 2.0 Flash supports multimodal output, such as natively generated images mixed with text and steerable multilingual text-to-speech (TTS) audio. It can also natively call tools such as Google Search, code execution, and third-party user-defined functions.
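
To make the tool-calling claim concrete, here is a minimal sketch using the built-in Google Search tool through the google-genai Python SDK. The model ID `gemini-2.0-flash-exp` and the exact SDK surface are assumptions based on the experimental release and may differ from what ships.

```python
# Minimal sketch of native tool use with the google-genai Python SDK.
# Assumptions: the "google-genai" package is installed, GOOGLE_API_KEY
# is set in the environment, and "gemini-2.0-flash-exp" is the
# experimental model ID.
from google import genai
from google.genai import types

client = genai.Client()  # reads GOOGLE_API_KEY from the environment

response = client.models.generate_content(
    model="gemini-2.0-flash-exp",
    contents="What were the headline features announced for Gemini 2.0?",
    config=types.GenerateContentConfig(
        # Enable the built-in Google Search tool so the model can
        # ground its answer in live search results.
        tools=[types.Tool(google_search=types.GoogleSearch())],
    ),
)
print(response.text)
```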

Our goal is to get our models into people’s hands quickly and safely. Over the past month, we’ve shared early experimental versions of Google Gemini 2.0 and received excellent feedback from developers.

Gemini 2.0 Flash is available now as an experimental model to developers via the Gemini API in Google AI Studio and Vertex AI. Multimodal input and text output are available to all developers, while text-to-speech and native image generation are available to early-access partners. General availability will follow in January, along with more model sizes.
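
For developers getting started, here is a hedged sketch of a first multimodal request (image in, text out) through the Gemini API, again assuming the google-genai Python SDK and the experimental `gemini-2.0-flash-exp` model ID; `photo.jpg` is a placeholder input file.

```python
# Sketch of a multimodal request: an image plus a text prompt in,
# text out. Assumes the google-genai SDK and a local "photo.jpg".
from google import genai
from google.genai import types

client = genai.Client()  # reads GOOGLE_API_KEY from the environment

with open("photo.jpg", "rb") as f:
    image_bytes = f.read()

response = client.models.generate_content(
    model="gemini-2.0-flash-exp",
    contents=[
        # Image part first, then the instruction about it.
        types.Part.from_bytes(data=image_bytes, mime_type="image/jpeg"),
        "Describe what is happening in this photo.",
    ],
)
print(response.text)
```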

To help developers build more dynamic and interactive applications, we’re also releasing a new Multimodal Live API with real-time audio and video-streaming input and the ability to use multiple, combined tools. More information about Gemini 2.0 Flash and the Multimodal Live API is available on our Developer Blog.
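
Below is a rough sketch of what a Multimodal Live API session might look like with the google-genai Python SDK. The connection and streaming method names (`client.aio.live.connect`, `session.send`, `session.receive`) reflect early versions of the SDK and may have changed since; treat this as illustrative rather than definitive.

```python
# Illustrative sketch of a text-only Multimodal Live API session.
# Method names follow early google-genai SDK releases and may differ
# in current versions.
import asyncio
from google import genai

client = genai.Client()  # reads GOOGLE_API_KEY from the environment

async def main() -> None:
    # Text-only responses keep the example self-contained; the API
    # also supports AUDIO for spoken replies.
    config = {"response_modalities": ["TEXT"]}
    async with client.aio.live.connect(
        model="gemini-2.0-flash-exp", config=config
    ) as session:
        await session.send(input="Hello! Can you hear me?", end_of_turn=True)
        # Stream the model's reply chunks as they arrive.
        async for message in session.receive():
            if message.text:
                print(message.text, end="")

asyncio.run(main())
```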

Google Gemini in the Gemini App


Starting today, Gemini users around the world can access a chat-optimized version of 2.0 Flash experimental by selecting it in the model drop-down on desktop and mobile web, and it will be available in the Gemini mobile app soon. With this new model, users can experience an even more helpful Gemini assistant.

In the coming year, we’ll be expanding Google Gemini 2.0 to include more Google products.

Google Gemini: Unlocking Agent-Based Experiences


Gemini 2.0 unlocks agent-based experiences through advances in multimodal reasoning, long-context understanding, complex instruction following, and planning. Together, these improvements form the foundation of a new class of agentic experiences powered by Google Gemini.

The practical application of AI agents is an exciting research area with fascinating potential. We’re exploring this new frontier with a series of research prototypes that can help people accomplish tasks. These include an update to Project Astra, our research prototype exploring the future capabilities of a universal AI assistant, and the newly announced Project Mariner, which explores the future of human-agent interaction, starting with your browser.

Google Gemini: Project Astra & Project Mariner

Project Astra, now built with Gemini 2.0, has improved dialogue capabilities and can converse in multiple languages and accents. It also uses Gemini 2.0 to call tools such as Google Search, Lens, and Maps, making it more useful as an assistant in everyday life.

Increased memory and reduced latency allow agents built with Gemini 2.0 to understand speech at close to the latency of human conversation, an essential part of Project Astra’s performance. Project Mariner, also built with Gemini 2.0, demonstrates the ability to navigate and complete web-based tasks through an experimental Chrome extension.

Google Gemini API for Developers

The next step is exploring how AI agents can support developers themselves using Gemini 2.0. Jules, an experimental AI-powered code agent, integrates directly into a GitHub workflow, helping developers streamline their coding processes.

Google Gemini in Gaming and Other Areas

Gemini 2.0 also extends beyond traditional applications into gaming, where it improves an agent’s ability to understand game rules and reason about virtual environments. In partnership with leading game developers like Supercell, we’re testing Gemini’s capacity to assist players in complex games like “Clash of Clans” and “Hay Day.”

Moreover, Gemini 2.0’s spatial reasoning capabilities point toward real-world applications for AI agents in robotics and the physical world.

Responsible Building with Google Gemini API

As we develop these groundbreaking technologies with Gemini 2.0, we remain committed to safety and responsibility. Our work on Project Astra and Project Mariner includes continuous testing and ethics research to ensure that Gemini 2.0 models prioritize user safety.

In conclusion, Gemini 2.0 Flash and the research prototypes announced today mark a new era in AI, offering exciting possibilities for developers, businesses, and users alike. We’re thrilled to keep exploring Gemini’s potential as we work toward AGI and new forms of human-agent collaboration.

