Published: October 21, 2025
The built-in AI capabilities powered by Gemini Nano are designed to be seamless for both users and developers. When you use a built-in AI API, model management happens automatically in the background. This document describes how Chrome handles Gemini Nano model downloads, updates, and purges.
Initial model download
When a user downloads or updates Chrome, Gemini Nano is downloaded on demand to ensure Chrome downloads the correct model for the user's hardware.
The initial model download is triggered by the first call to a *.create() function (for example, Summarizer.create()) of any built-in AI API that depends on Gemini Nano. When this happens, Chrome runs a series of checks to determine the best course of action.
First, Chrome estimates the device's GPU performance by running a representative shader. Based on the results, it does one of the following:
- Download a larger, more capable Gemini Nano variant (such as 4B parameters).
- Download a smaller, more efficient Gemini Nano variant (such as 2B parameters).
- Fall back to CPU-based inference if the device meets separate static requirements.
If the device doesn't meet the hardware requirements, the model is not downloaded.
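For example, the very first Summarizer.create() call on a device is what kicks off this download; the returned promise only resolves once the model and the session are ready. A minimal sketch, assuming the Summarizer API (the article text is a placeholder):

```js
// The first *.create() call of any built-in AI API triggers the
// Gemini Nano download if the model isn't on the device yet.
if ('Summarizer' in self) {
  const summarizer = await Summarizer.create();
  // Resolves only once the model and the session are ready.
  const summary = await summarizer.summarize('Long article text…');
  console.log(summary);
  summarizer.destroy();
}
```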
The download process is built to be resilient:
- If the internet connection is interrupted, the download continues from where it left off once connectivity is restored.
- If the tab that triggered the download is closed, the download continues in the background.
- If the browser is closed, the download resumes on the next restart, provided Chrome is reopened within 30 days.
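Because the download is large and may span interruptions, it's worth surfacing progress to the user. A sketch using the monitor option and its downloadprogress event; treat the exact event payload (loaded reported as a 0-to-1 fraction) as an assumption based on current documentation:

```js
const summarizer = await Summarizer.create({
  monitor(m) {
    // Fires repeatedly while Chrome downloads Gemini Nano; completes
    // almost immediately when the model is already installed.
    m.addEventListener('downloadprogress', (e) => {
      console.log(`Downloaded ${Math.round(e.loaded * 100)}%`);
    });
  },
});
```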
Sometimes, calling availability() can trigger the model download. This occurs if the call happens shortly after a fresh user profile starts up and if the Gemini Nano-powered scam detection feature is active.
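In typical use, availability() only reports the current state without triggering a download, which makes it a good first check before deciding whether to call create(). A sketch of the possible states, assuming the Summarizer API's documented return values:

```js
const availability = await Summarizer.availability();
switch (availability) {
  case 'unavailable':
    // The device doesn't meet the requirements, or the API is disabled.
    break;
  case 'downloadable':
    // Supported, but the model (or API-specific resources) must be
    // downloaded first; calling create() triggers the download.
    break;
  case 'downloading':
    // A download is already in progress; create() joins it.
    break;
  case 'available':
    // The model is on the device and ready to use.
    break;
}
```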
LoRA weights download
Some APIs, like the Proofreader API, rely on Low-Rank Adaptation (LoRA) weights that are applied to the base model to specialize its function. If the API depends on LoRA, the LoRA weights are downloaded alongside the base model. LoRA weights for other APIs are not proactively downloaded.
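In practice, this means a single Proofreader.create() call fetches both the base model (if needed) and the Proofreader's LoRA weights, and one progress monitor covers the combined download. A sketch, assuming the Proofreader API follows the same create() and monitor pattern as the other built-in AI APIs:

```js
if ('Proofreader' in self) {
  // Downloads Gemini Nano (if needed) plus the Proofreader LoRA weights.
  const proofreader = await Proofreader.create({
    monitor(m) {
      m.addEventListener('downloadprogress', (e) => {
        console.log(`Downloaded ${Math.round(e.loaded * 100)}%`);
      });
    },
  });
}
```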
Automatic model updates
Gemini Nano model updates are released on a regular basis. Chrome checks for these updates when the browser starts up. Additionally, Chrome checks for updates to supplementary resources, like LoRA weights, on a daily basis. While you can't programmatically query the model version from JavaScript, you can manually check which version is installed on chrome://on-device-internals.
The update process is designed to be seamless and non-disruptive:
- Chrome keeps operating with the current model while downloading the new version in the background.
- Once the updated model is downloaded, it's hot-swapped: the models are switched with no downtime, and any new AI API call immediately uses the new model. Note: A prompt running at the exact moment of the swap can fail (see the retry sketch after this list).
- Every update is a full new model download, not a partial download. This is because model weights can be significantly different between versions, and computing and applying deltas for such large files can be slow.
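If a call does fail at the exact moment of a swap, retrying once against a freshly created session is usually enough, because the new session binds to the updated model. A sketch; the generic error handling is an assumption, since no specific error type is documented here:

```js
async function summarizeWithRetry(text, options = {}) {
  for (let attempt = 1; attempt <= 2; attempt++) {
    const summarizer = await Summarizer.create(options);
    try {
      // Binds to whichever model version is active at call time.
      return await summarizer.summarize(text);
    } catch (error) {
      // A call running at the exact moment of a hot swap can fail;
      // retry once with a fresh session, which uses the new model.
      if (attempt === 2) throw error;
    } finally {
      summarizer.destroy();
    }
  }
}
```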
Updates are subject to the same requirements as the initial download, except that the initial disk space check is waived if a model is already installed. LoRA weights can also be updated: a new version of the LoRA weights can be applied to an existing base model, but a new base model version always requires a new set of LoRA weights.
Model deletion
Chrome actively manages disk space to ensure the user doesn't run out. The Gemini Nano model is automatically deleted if the device's free disk space drops below a certain threshold. Additionally, the model is purged if an enterprise policy disables the feature, or if a user hasn't met other eligibility criteria for 30 days. Eligibility may include API usage and device capability. The purge process has the following characteristics:
- The model can be deleted at any time, even mid-session, without regard for running prompts. This means an API that was available at the start of a session could suddenly become unavailable.
- After being purged, the model is not automatically re-downloaded. A new download must be triggered by an application calling a *.create() function.
- When the base model is purged, any related LoRA weights are also purged after a 30-day grace period.
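Because of this, don't cache an availability result for the lifetime of your app. Re-check it before each use and be prepared to trigger (and explain) a fresh download when the model has been purged. A sketch, assuming the Summarizer API; askUserToConfirmDownload is a hypothetical UI helper:

```js
async function ensureSummarizer() {
  // Re-check on every use: a model that was 'available' earlier in the
  // session may have been purged in the meantime.
  const availability = await Summarizer.availability();
  if (availability === 'unavailable') {
    return null; // The device no longer qualifies, or the API is disabled.
  }
  if (availability === 'downloadable') {
    // The model was purged (or never downloaded); calling create()
    // triggers a new download, so ask the user first.
    const ok = await askUserToConfirmDownload(); // Hypothetical UI helper.
    if (!ok) return null;
  }
  return Summarizer.create();
}
```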
Your role in model management
Having a good understanding of the built-in AI model's lifecycle is key to getting the user experience right. You're not done once the model is downloaded: it can suddenly disappear again under disk space pressure, or be replaced when a new version is released. The browser takes care of all of this automatically, but your application still needs to account for it.
By following best practices around downloading the model, you'll create a good user experience on initial download, re-downloads, and updates.