What is Google Gemini? (Models, Capabilities & How to Use)

Summary: Google Gemini is a family of multimodal AI models designed to understand text, code, images, audio and video. Available in three sizes, Gemini powers Google’s chatbot, also named Gemini, and integrates with several Google apps and services.

Gemini is a family of generative AI models created by Google and the name of the company’s chatbot. The models come in different variations and are being incorporated into several Google products, including Gmail, Docs and its Chrome browser.

What Is Google Gemini?

Gemini is a family of AI models created by Google to power many of its products, including its Gemini chatbot as well as Gmail, Docs and the Google search engine.

The Gemini models are multimodal, meaning their capabilities span text, image and audio applications. They can generate natural written language, transcribe speeches, create artwork, analyze videos and more. Gemini was developed by Google’s AI research labs DeepMind and Google Research, and is the culmination of nearly a decade of work. Like other AI products, Gemini is expected to get better over time as the industry continues to advance.

What Is Gemini 3?

Gemini 3, is Google’s most advanced family of large language models. Released in November 2025, it expanded on earlier model iterations, with a greater emphasis on long-term reasoning, multimodal understanding, persistent memory and more reliable agentic behavior.

Like its predecessors, Gemini 3 is multimodal, but this generation is built to manage extended tasks, adapt to user preferences and work across documents, apps and datasets over time. The Gemini 3 Pro and Flash models in particular refined Google’s tool-use and calling capabilities, enabling smoother interactions with external services, code execution environments and search tools.

Gemini 3.1 Pro, the Gemini 3 family’s latest model, is designed as a “smarter, more capable baseline” for complex problem-solving and a step forward in core reasoning. It more than doubles the reasoning performance of Gemini 3 Pro (achieving a 77.1 percent on the ARC-AGI-2 benchmark), and can execute advanced tasks like generating animated scalable vector graphics (SVGs) and building interactive 3D simulations.

Gemini 3.1 Pro is available through the Gemini app, NotebookLM, the Gemini API in Google AI Studio and Google Cloud’s Vertex AI.

Previous Gemini Models

Google has come out with several Gemini models over the years.

Gemini 2.5 Flash

Gemini 2.5 Flash is designed for both speed and cost-efficiency, featuring a 1 million-token context window for processing massive data sets like long videos and complex codebases. As the first Gemini Flash-tier model to include native “thinking” capabilities, it allows users to view its step-by-step reasoning process while maintaining the low latency required for real-time applications like conversational agents and large-scale data extraction.

Gemini 2.5 Pro

In March 2025, Google released Gemini 2.5 Pro. Like previous models, it supported multimodal processing and featured a 2 million token context window, enabling it to process extensive documents, codebases and other data-types within a single query. Gemini 2.5 Pro was designed to handle advanced reasoning tasks such as math, science and logic. It could also run external functions, execute code, format data and perform searches via APIs. Google positioned Gemini 2.5 Pro as an advanced, general-purpose model, and as a coding assistant capable of generating, modifying or debugging code with minimal supervision

Gemini 2.0

Gemini 2.0, released in February 2025, was designed with agentic AI in mind, meaning it could not only understand and generate content, but also take action, interact with external tools and complete multi-step tasks on the user’s behalf. To enable this, the models combined advanced reasoning, tool use and extended memory. They also introduced a new “calling feature,” which enabled the models to interact with external services like Google Search, APIs or execute code to complete tasks it could not handle internally.

Gemini 2.0 was made up of three main models — Pro, Flash and Flash-Lite. Gemini 2.0 Pro was built for complex prompts, coding and massive information processing through its 2 million token window. Gemini 2.0 Flash utilized smaller context windows for high volume, high frequency tasks and was the default model for the Gemini mobile app. Meanwhile, Gemini 2.0 Flash-Lite offered a cost efficient option for large text workload and long-context tasks.

Gemini 1.0

Gemini 1.0 was Google’s first family of models, which included Gemini 1.0 Ultra, Gemini 1.5 Pro, Gemini 1.0 Nano, Gemini 1.5 Flash.

Gemini 1.0 Ultra was the company’s largest model at the time and handled complex tasks. According to Google, it was also the first model to outperform human experts on a benchmark spanning physics, law and ethics. It was also incorporated into many of its most popular products including Gmail, Docs, Slides and Meet. The rest of the Gemini 1.0 models extended these core functions, for example, Gemini 1.5 Pro introduced a longer context window, helping it process more information at once.

To extend this capabilities to other environments. Google introduced more lightweight and efficient versions. Gemini 1.0 Nano for example brought generative capabilities directly onto smartphones, enabling features such as on-device summarization and smart replies without relying on external servers. Meanwhile, Gemini 1.5 Flash offered a faster, more cost-efficient alternative to Gemini 1.5 Pro. Together, the Gemini 1.0 models established Google’s foundation for high-end reasoning and flexible deployment across platforms.

What Can Google Gemini Do?

Gemini is a multimodal model, so it is capable of responding to a range of content types, whether that be text, image, video or audio.

Generate Text

Gemini can generate text, whether that’s used to engage in written conversations with users, proofread essays, write cover letters or translate content into different languages. Gemini can also understand, explain and generate code in some of the most popular programming languages, including Python, Java, C++ and Go.

Like any other LLM, though, Gemini has a tendency to hallucinate. “The results should be used with a lot of care,” Subodha Kumar, a professor of statistics, operations and data science at Temple University’s Fox School of Business, told Built In. “They can come with a lot of errors.”

Produce Images

Gemini is able to generate images from text prompts, similar to other AI art generators like Dall-E, Midjourey and Stable Diffusion.

This capability was temporarily halted to undergo retooling after Google was criticized on social media for producing images that depicted specific white figures as people of color. Image generators have developed a reputation for amplifying and perpetuating biases about certain races and genders. Google’s attempts to avoid this pitfall may have gone too far in the other direction, though.

Analyze Images and Videos

Gemini can accept image inputs and then analyze what is going on in those images and explain that information via text. For example, a user can take a photo of a flat tire and ask Gemini how to fix it, or ask Gemini for help on their physics homework by drawing out the problem. Gemini can also process and analyze videos, generate descriptions of what is going on in a given clip and answer questions about it.

Understand Audio

When fed audio inputs, Gemini can support speech recognition across more than 100 languages, and assist in various language translation tasks — as shown in this Google demonstration.

Streamline Workflows

Gemini can be integrated into several Google Workspace products, including Gmail, Docs and Drive. Users can query Gemini (through its chatbot interface) to find a document in their Drive and summarize it, or automatically generate specific emails. “It becomes a little bit of an assistant in that sense,” Gen Furukawa, an AI expert and entrepreneur, told Built In.

Within more specific business contexts, professionals can use Gemini to produce drafts for blog posts, emails and advertisements in Docs; generate images for Slides presentations by inputting a text prompt and selecting a visual style; and even tailor their virtual background in Google Meet with a detailed text prompt.

More on Generative AIA Comparison of the Top AI Models: Features Use Cases and Cost

How to Access Google Gemini

Gemini can be accessed in several ways:

For free: You can go to gemini.google.com and use it for free through the Gemini chatbot. Or you can download the Google Gemini app on your smartphone through the App Store or Google Play store.

Paid version: You can subscribe to the Google AI Plus service for $7.99 a month, the Google AI Pro service for $19.99 a month or the Google AI Ultra service for $249.99 a month.

Developers can also access the models through the Gemini API in Google AI Studio and Vertex AI.

Gemini is a work in progress, so it might generate answers that are inaccurate, unhelpful or even offensive. And it retains users’ conversations, location, feedback and usage information, according to Google’s privacy policy. So users may want to avoid consulting Gemini for professional advice on sensitive or high-stakes subjects (like health or finance), and refrain from discussing private or personal information with the AI tool.

More on Artificial IntelligenceExplore Built In’s AI Coverage

Notable Gemini Updates

Since early 2023, Google has rapidly expanded and refined its Gemini model family. The timeline below highlights key releases and enhancements that demonstrate its growing role in AI-powered search, productivity tools and developer platforms.

Gemini Able to Handle Multi-Step Tasks on Android (February 2026)

Google introduced an early preview of multi-step task automation for Gemini on Android, allowing the assistant to execute complex daily chores like booking rideshare trips or ordering food via apps like DoorDash. By long-pressing the power button and asking Gemini for help, users can delegate these errands to Gemini, which runs the necessary apps in a virtual window while providing live progress updates through notifications. This update will initially launch as a beta feature for the Samsung Galaxy S26 series and Pixel 10 devices in the United States and Korea.

Gemini 3.1 Pro Release (February 2026)

Google released Gemini 3.1 Pro, an upgraded core intelligence model designed to tackle complex problem-solving and advanced reasoning tasks where simple answers are insufficient. Gemini 3.1 Pro showed a leap in performance, achieving a verified score of 77.1 percent on the ARC-AGI-2 logic benchmark — more than double that of Gemini 3 Pro — while enabling capabilities like generating animated SVGs and building interactive 3D simulations directly from text prompts.

Music Creation Feature Added to Gemini App (February 2026)

The Gemini app allows users to create 30-second music tracks from text prompts or uploaded images in beta, all powered by Google’s Lyria 3 model. The feature enables the creation of original instrumentals and lyrics across various genres, with custom cover art generated by Nano Banana and SynthID watermarking applied for Google AI-generated content identification.

Gemini Integrates Google Developer Program (January 2026)

Following the success of Anthropic’s Claude Code, Google released new features and capabilities to entice developers. With the new update, Pro and Ultra users now have access to the Google Developer Program and cloud credits at no extra costs. With the integration, developers can build applications and software using Google AI Studio, its agentic IDE, and seamlessly host it online. With Gemini, users also get a 1-million-token context window, which is significantly higher than Claude Code’s context window.

Chrome Launches Agentic Browsing Capabilities (January 2026)

Powered by Gemini 3, the agentic browsing feature enables Pro and Ultra users in the United States to offload tasks to Chrome. According to the company, the new features can schedule appointments, fill out online forms, manage subscriptions and add shopping items to an online cart.

Apple Partners With Gemini For Enhanced Siri Assistant (January 2026)

Apple announced that it will use Gemini’s AI models to power its Siri personal assistant. The partnership comes after Apple previously used OpenAI’s GPT models to power various AI features and as it has delayed its proprietary AI features. Despite working with another company, Apple plans to continue running its AI features on user devices or a secure cloud to protect personal data.

Opal Integrated into Google Gemini (December 2025)

Google integrated Opal, its tool for building AI-powered mini apps, directly into the Gemini web app to help users create highly customized experimental Gems. Accessible via the Gems manager, this update introduces a new visual editor that organizes prompts into easy-to-follow steps for simplified editing and workflow management. While the Gemini integration streamlines the creation process, users can still access the Advanced Editor on the opal.google site for more granular control and sophisticated customization options.

Enhanced Gemini Deep Research Agent Release (December 2025)

Google released a more powerful Gemini Deep Research agent, an autonomous research tool optimized for complex, long-running context gathering and synthesis tasks. Powered by the Gemini 3 Pro model and available to developers via the Google Interactions API, the reimagined Gemini Deep Research agent iteratively plans its investigation, formulates queries and searches deep into the web to autonomously navigate information landscapes. These functions allow it to generate well-cited and comprehensive reports for advanced applications in areas like financial services, biotechnology and market research.

Gemini 3 Release (November 2025)

Google released its much anticipated Gemini 3 in mid-November. During the initial launch, the company only made one model publically available: Gemini 3 Pro. It included improvements in long-term reasoning, multimodal understanding, persistent memory and was built with Agentic AI in mind. Gemini 3 Pro is available on the Gemini web and mobile, Gemini API, AI Studio and Vertex AI.

Gemini Enterprise Release (October 2025)

Google announced the release of a dedicated AI platform for businesses. Named Gemini Enterprise, it is an entirely different platform from the Gemini chatbot, and allows businesses to securely create and deploy AI assistants using their own data from Google Cloud and other connected applications. According to Google, companies like Figma, Klarna and Virgin Voyages have already started using the platform. Pricing for an annual membership for Gemini Enterprise starts at $21 per seat per month for startups and $30 per seat per month for the “plus” membership.

Gemini Added to Google Chrome (September 2025)

Google has added more AI capabilities to Chrome, this time by integrating Gemini. The update makes Gemini more easily accessible on the browser with a new icon, and introduces features such as video and tab summarization. In the coming months, Google plans to add agentic features to Gemini in Chrome as well, which will automate tasks like online grocery shopping and booking appointments. Gemini’s integration into Chrome coincides with the roll out of Preplexity’s Comet and Atlassian’s Dia Browser, both of which could potentially eat into Google’s dominance in the online search industry.

Gemini 2.5 Release (April 2025)

Google launched Gemini 2.5, a multimodal model built on the Gemini 1.5 architecture with enhanced performance on math, reasoning and code. Gemini 2.5 Pro powers Google's AI features across Workspace, Search and the Gemini app, with Pro and Flash variants available via API through Google Cloud's Vertex AI and AI Studio.

Gemini 2 Family Release (December 2024)

Google announced Gemini 2 as the next generation of its model family, integrating real-time multimodal inputs and stronger tool use. Gemini 2 expanded the model’s capabilities across languages, code and image understanding while enhancing integrations across Android and ChromeOS. It also introduced Gemini Flash, a lightweight version optimized for speed and cost.

Gemini Advanced Subscription Launch (February 2024)

Google launched Gemini Advanced as part of a new paid tier for Google One AI Premium users. The offering gave subscribers access to Gemini 1.5 Pro in Google Workspace apps, positioning Gemini as a direct competitor to OpenAI’s ChatGPT Plus. The move also reflected Google’s strategy of embedding AI across its consumer software suite.

Gemini 1.5 Release (February 2024)

Gemini 1.5 debuted with a 1-million-token context window and strong performance across MMLU, HumanEval and other benchmarks. It introduced notable memory capabilities — allowing it to remember past interactions in user chats — and marked a shift toward long-context processing for enterprise and agentic workflows.

Gemini Rebranding and Model Consolidation (December 2023)

Google rebranded Bard as Gemini and formally introduced the Gemini 1.0 model series — Nano for mobile, Pro for general use and Ultra for complex tasks. This marked the first phase of Google DeepMind and Google Research’s LLM efforts under a unified brand. Gemini 1.0 Pro launched globally in English and began powering Bard and some Android features.

Bard Launch (February 2023)

Google introduced Bard as its conversational AI assistant, initially powered by its LaMDA language model. Bard was Google’s first major response to ChatGPT, designed to answer questions and assist with tasks using web-informed responses. Though early reviews noted limitations, the launch marked a pivotal moment in Google's generative AI trajectory and laid the groundwork for the Gemini model family.

Frequently Asked Questions

What can Google Gemini be used for?

Gemini is an AI tool that can answer questions, summarize text and generate content. It also plugs into other Google services like Gmail, Docs and Drive to serve as a productivity booster. Because Gemini is multimodal, its capabilities span across text, images and audio, meaning it can do tasks such as generating natural written language, transcribing speeches, creating artwork, analyzing videos and more, according to Google.

What are the different versions of Gemini?

Google Gemini is available in three versions or modes: Fast, Thinking and Pro.

Each Gemini mode utilizes a Gemini 3 model (Gemini 3 Flash for Fast and Thinking modes, Gemini 3.1 Pro for Pro mode) and is optimized for different tasks, ranging from complex reasoning to lightweight, fast responses.

Is Google Gemini free?

A basic version of Google Gemini is available for free at gemini.google.com and on the Google Gemini mobile app.

Users can also subscribe to the Google AI Plus service for $7.99 a month, the Google AI Pro service for $19.99 a month or the Google AI Ultra service for $249.99 a month.

Who made Google Gemini?

Google Gemini was made by Google DeepMind and Google Research — AI research labs and subsidiaries under the Google corporate umbrella.

How to access Google Gemini?

To access the free version of Google Gemini on smartphone, users can download the Google Gemini app from the App Store or Google Play store. To use Gemini in chatbot form, users can go to gemini.google.com.