Prompt API

Thomas Steiner

Alexandra Klepper

Published: May 20, 2025, Last updated: September 21, 2025

Explainer	Web	Extensions	Chrome Status	Intent
GitHub	Origin trial	Chrome 138	View	Intent to Experiment

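With the Prompt API, you can send natural language requests to Gemini Nano in the browser.

There are many ways to use the Prompt API. For example, you could build:

AI-powered search: Answer questions based on the content of a web page.
Personalized news feeds: Build a feed that dynamically classifies articles into categories and lets users filter that content.
Custom content filters: Analyze news articles and automatically blur or hide content based on user-defined topics.
Calendar event creation: Develop a Chrome Extension that automatically extracts event details from web pages, so users can create calendar entries in just a few steps.
Seamless contact extraction: Build an extension that extracts contact information from websites, making it easier for users to contact a business or add details to their list of contacts.

These are just a few possibilities, and we're excited to see what you create.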

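Review the hardware requirements

The following requirements apply to developers and to the users who operate features built with these APIs in Chrome. Other browsers may have different operating requirements.

The Language Detector and Translator APIs work in Chrome on desktop. These APIs do not work on mobile devices. The Prompt API, Summarizer API, Writer API, Rewriter API, and Proofreader API work in Chrome when the following conditions are met:

Operating system: Windows 10 or 11; macOS 13+ (Ventura and onwards); Linux; or ChromeOS (from Platform 16389.0.0 and onwards) on Chromebook Plus devices. The APIs that use Gemini Nano do not yet support Chrome for Android, Chrome for iOS, or ChromeOS on non-Chromebook Plus devices.
Storage: At least 22 GB of free space on the volume that contains your Chrome profile.
The built-in model should be significantly smaller than this. The exact size may vary slightly with updates.
GPU or CPU: Built-in models can run with GPU or CPU.
- GPU: Strictly more than 4 GB of VRAM.
- CPU: 16 GB of RAM or more and 4 CPU cores or more.
Network: Unlimited data or an unmetered connection.
Key term: A metered connection is an internet connection with a data limit. Wi-Fi and ethernet connections tend to be unmetered by default, while cellular connections are often metered.

Gemini Nano's exact size may vary as the browser updates the model. To determine the current size, visit chrome://on-device-internals.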

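Use the Prompt API

The Prompt API uses the Gemini Nano model in Chrome. While the API is built into Chrome, the model is downloaded separately the first time an origin uses the API. Before you use this API, read and acknowledge Google's Generative AI Prohibited Uses Policy.

To determine if the model is ready to use, call LanguageModel.availability().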

const availability = await LanguageModel.availability({
  // The same options in `prompt()` or `promptStreaming()`
});

To trigger the download and instantiate the language model, check for user activation. Then, call the create() function.

const session = await LanguageModel.create({
  monitor(m) {
    m.addEventListener('downloadprogress', (e) => {
      console.log(`Downloaded ${e.loaded * 100}%`);
    });
  },
});

If the response to availability() was downloading, listen for download progress and inform the user, because the download may take some time.
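
One way to combine the availability check with download monitoring is sketched below. The helper name createSessionWithProgress, its api parameter (any object that behaves like the global LanguageModel), and the onProgress callback are assumptions made for illustration and are not part of the API. The sketch also assumes that e.loaded is a fraction between 0 and 1, as the logging code above suggests.

```javascript
// Hypothetical helper, not part of the Prompt API.
// `api` is expected to behave like the global `LanguageModel` object;
// passing it in keeps the helper testable outside the browser.
async function createSessionWithProgress(api, onProgress) {
  const availability = await api.availability();
  if (availability === 'unavailable') {
    throw new Error('The language model is unavailable on this device.');
  }
  // For any other state, create() is expected to resolve once the
  // model is ready, reporting download progress along the way.
  return api.create({
    monitor(m) {
      m.addEventListener('downloadprogress', (e) => {
        // Assumes `e.loaded` is a fraction between 0 and 1.
        onProgress(Math.round(e.loaded * 100));
      });
    },
  });
}
```

In a page, this could be called from a click handler, so that user activation is present, for example as createSessionWithProgress(LanguageModel, (percent) => { progressBar.value = percent; }), where progressBar is a hypothetical progress element.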

Use on localhost

All of the built-in AI APIs are available on localhost in Chrome. Set the following flags to Enabled:

chrome://flags/#optimization-guide-on-device-model
chrome://flags/#prompt-api-for-gemini-nano-multimodal-input

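Then click Relaunch or restart Chrome. If you encounter errors, troubleshoot localhost.

Model parameters

The params() function informs you of the language model's parameters. The object has the following fields:

defaultTopK: The default top-K value.
maxTopK: The maximum top-K value.
defaultTemperature: The default temperature.
maxTemperature: The maximum temperature.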

await LanguageModel.params();
// {defaultTopK: 3, maxTopK: 128, defaultTemperature: 1, maxTemperature: 2}

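Create a session

Once the Prompt API can run, you create a session with the create() function.

Each session can be customized with topK and temperature using an optional options object. The default values for these parameters are returned from LanguageModel.params().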

const params = await LanguageModel.params();
// Initializing a new session must either specify both `topK` and
// `temperature` or neither of them.
const slightlyHighTemperatureSession = await LanguageModel.create({
  // Raise the temperature slightly, but never above the supported maximum.
  temperature: Math.min(params.defaultTemperature * 1.2, params.maxTemperature),
  topK: params.defaultTopK,
});
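
Because requested values should stay within the limits reported by params(), a small clamping helper can be useful. The following is a minimal sketch: the function name clampSamplingOptions is hypothetical, and the parameter object reuses the sample values shown in the Model parameters section rather than values read from a real device. It also assumes that 0 is a sensible lower bound for temperature and 1 for topK.

```javascript
// Hypothetical helper: keeps requested sampling options within the
// limits reported by LanguageModel.params().
function clampSamplingOptions(params, requested = {}) {
  const temperature = requested.temperature ?? params.defaultTemperature;
  const topK = requested.topK ?? params.defaultTopK;
  return {
    // Assumed lower bounds: temperature >= 0, topK >= 1.
    temperature: Math.min(Math.max(temperature, 0), params.maxTemperature),
    topK: Math.min(Math.max(Math.round(topK), 1), params.maxTopK),
  };
}

// Sample values copied from the params() output shown earlier.
const sampleParams = {
  defaultTopK: 3,
  maxTopK: 128,
  defaultTemperature: 1,
  maxTemperature: 2,
};

console.log(clampSamplingOptions(sampleParams, { temperature: 5, topK: 500 }));
// { temperature: 2, topK: 128 }
console.log(clampSamplingOptions(sampleParams));
// { temperature: 1, topK: 3 }
```

The result always contains both fields, so it can be passed directly to LanguageModel.create(), which expects either both or neither.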

The create() function's optional options object also takes a signal field, which lets you pass an AbortSignal to destroy the session.

const controller = new AbortController();
stopButton.onclick = () => controller.abort();

const session = await LanguageModel.create({
  signal: controller.signal,
});

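Add context with initial prompts

With initial prompts, you can provide the language model with context about previous interactions, for example, to let the user resume a stored session after a browser restart.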

const session = await LanguageModel.create({
  initialPrompts: [
    { role: 'system', content: 'You are a helpful and friendly assistant.' },
    { role: 'user', content: 'What is the capital of Italy?' },
    { role: 'assistant', content: 'The capital of Italy is Rome.' },
    { role: 'user', content: 'What language is spoken there?' },
    {
      role: 'assistant',
      content: 'The official language of Italy is Italian. [...]',
    },
  ],
});
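
To resume a conversation after a restart, the stored messages have to be turned back into an initialPrompts array. The sketch below shows one possible approach. serializeHistory and restoreInitialPrompts are hypothetical helper names, and where the JSON string is kept (for example, in localStorage) is left to the application. The sketch assumes text-only messages and that a system prompt, if present, belongs at the start of the array.

```javascript
const ALLOWED_ROLES = new Set(['system', 'user', 'assistant']);

// Hypothetical helper: turns a list of { role, content } messages into JSON.
function serializeHistory(messages) {
  return JSON.stringify(
    messages.map(({ role, content }) => ({ role, content }))
  );
}

// Hypothetical helper: rebuilds an `initialPrompts` array from stored JSON.
// Assumes text-only content and that a system prompt, if any, comes first.
function restoreInitialPrompts(json) {
  const messages = JSON.parse(json);
  if (!Array.isArray(messages)) {
    throw new TypeError('Stored history must be an array.');
  }
  messages.forEach(({ role, content }, index) => {
    if (!ALLOWED_ROLES.has(role) || typeof content !== 'string') {
      throw new TypeError(`Invalid message at index ${index}.`);
    }
    if (role === 'system' && index !== 0) {
      throw new TypeError('The system prompt must be the first message.');
    }
  });
  return messages;
}
```

The returned array could then be passed as initialPrompts to LanguageModel.create().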

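Constrain responses with a prefix

In addition to the previous roles, you can add an "assistant" role message to elaborate on the model's previous responses. For example: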

const followup = await session.prompt([
  {
    role: "user",
    content: "I'm nervous about my presentation tomorrow"
  },
  {
    role: "assistant",
    content: "Presentations are tough!"
  }
]);

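In some cases, instead of requesting a new response, you may want to prefill part of the "assistant"-role response message. This can help guide the language model to use a specific response format. To do this, add prefix: true to the trailing "assistant"-role message. For example: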

const characterSheet = await session.prompt([
  {
    role: 'user',
    content: 'Create a TOML character sheet for a gnome barbarian',
  },
  {
    role: 'assistant',
    content: '```toml\n',
    prefix: true,
  },
]);

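Add expected input and output

The Prompt API has multimodal capabilities and supports multiple languages. Set the expectedInputs and expectedOutputs modalities and languages when you create your session.

type: The expected modality.
- For expectedInputs, this can be text, image, or audio.
- For expectedOutputs, the Prompt API allows text only.
languages: An array that sets the expected language or languages. The Prompt API accepts "en", "ja", and "es". Support for additional languages is in development.
- For expectedInputs, set the system prompt language and one or more expected user prompt languages.
- For expectedOutputs, set one or more languages.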

const session = await LanguageModel.create({
  expectedInputs: [
    { type: "text", languages: ["en" /* system prompt */, "ja" /* user prompt */] }
  ],
  expectedOutputs: [
    { type: "text", languages: ["ja"] }
  ]
});

You may receive a "NotSupportedError" DOMException if the model encounters an unsupported input or output.
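
A defensive pattern is to catch this error and fall back to another code path. In the minimal sketch below, the helper name createSessionOrNull and its api parameter (an object that behaves like the global LanguageModel) are assumptions for illustration. The check relies on the error's name being "NotSupportedError".

```javascript
// Hypothetical helper: returns null instead of throwing when the requested
// modalities or languages are not supported.
async function createSessionOrNull(api, options) {
  try {
    return await api.create(options);
  } catch (error) {
    if (error && error.name === 'NotSupportedError') {
      return null;
    }
    throw error; // Unrelated failures are not swallowed.
  }
}
```

A caller could then retry with a narrower set of expectedInputs, or hide the feature, when the helper returns null.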

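Multimodal capabilities

With these capabilities, you could:

Allow users to transcribe audio messages sent in a chat application.
Describe an image uploaded to your website for use in a caption or alt text.

Take a look at the Mediarecorder Audio Prompt demo for using the Prompt API with audio input, and the Canvas Image Prompt demo for using the Prompt API with image input.

The Prompt API supports the following input types:

Audio:
Visual:
- HTMLImageElement
- SVGImageElement
- HTMLVideoElement (uses the video frame at the current video position)
- HTMLCanvasElement
- ImageBitmap
- OffscreenCanvas
- VideoFrame
- Blob
- ImageData

This snippet shows a multimodal session that first processes two images (one Blob and one HTMLCanvasElement) and has the AI compare them, and then lets the user respond with an audio recording (as an AudioBuffer).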

const session = await LanguageModel.create({
  expectedInputs: [
    { type: "text", languages: ["en"] },
    { type: "audio" },
    { type: "image" },
  ],
  expectedOutputs: [{ type: "text", languages: ["en"] }],
});

const referenceImage = await (await fetch("reference-image.jpeg")).blob();
const userDrawnImage = document.querySelector("canvas");

const response1 = await session.prompt([
  {
    role: "user",
    content: [
      {
        type: "text",
        value:
          "Give a helpful artistic critique of how well the second image matches the first:",
      },
      { type: "image", value: referenceImage },
      { type: "image", value: userDrawnImage },
    ],
  },
]);
console.log(response1);

const audioBuffer = await captureMicrophoneInput({ seconds: 10 });

const response2 = await session.prompt([
  {
    role: "user",
    content: [
      { type: "text", value: "My response to your critique:" },
      { type: "audio", value: audioBuffer },
    ],
  },
]);
console.log(response2);
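
The snippet above calls captureMicrophoneInput() without defining it. One possible implementation, built on the standard MediaRecorder and Web Audio APIs, is sketched below. It is an assumption about what such a helper might look like, not the helper used by the demos. It runs only in a browser and requires microphone permission.

```javascript
// Hypothetical implementation of the helper used in the snippet above.
// Records from the microphone for `seconds` and resolves with an AudioBuffer.
async function captureMicrophoneInput({ seconds }) {
  const stream = await navigator.mediaDevices.getUserMedia({ audio: true });
  const recorder = new MediaRecorder(stream);
  const chunks = [];
  recorder.addEventListener('dataavailable', (e) => chunks.push(e.data));
  const stopped = new Promise((resolve) => {
    recorder.addEventListener('stop', resolve, { once: true });
  });

  recorder.start();
  await new Promise((resolve) => setTimeout(resolve, seconds * 1000));
  recorder.stop();
  await stopped;

  // Release the microphone as soon as recording is done.
  stream.getTracks().forEach((track) => track.stop());

  // Decode the recorded container format into raw audio samples.
  const blob = new Blob(chunks, { type: recorder.mimeType });
  const audioContext = new AudioContext();
  try {
    return await audioContext.decodeAudioData(await blob.arrayBuffer());
  } finally {
    await audioContext.close();
  }
}
```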

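Append messages

Inference may take some time, especially when prompting with multimodal inputs. Sending predetermined prompts in advance to populate the session gives the model a head start on processing.

While initialPrompts are useful at session creation, the append() method can be used in addition to the prompt() or promptStreaming() methods to give additional contextual prompts after the session has been created.

For example: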

const session = await LanguageModel.create({
  initialPrompts: [
    {
      role: 'system',
      content:
        'You are a skilled analyst who correlates patterns across multiple images.',
    },
  ],
  expectedInputs: [{ type: 'image' }],
});

fileUpload.onchange = async () => {
  await session.append([
    {
      role: 'user',
      content: [
        {
          type: 'text',
          value: `Here's one image. Notes: ${fileNotesInput.value}`,
        },
        { type: 'image', value: fileUpload.files[0] },
      ],
    },
  ]);
};

analyzeButton.onclick = async (e) => {
  analysisResult.textContent = await session.prompt(userQuestionInput.value);
};

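The promise returned by append() fulfills once the prompt has been validated, processed, and appended to the session. The promise is rejected if the prompt cannot be appended.

Pass a JSON Schema

Add the responseConstraint field to the prompt() or promptStreaming() method to pass a JSON Schema as the value. You can then use structured output with the Prompt API.

In the following example, the JSON Schema makes sure the model responds with true or false to classify whether a given message is about pottery.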

const session = await LanguageModel.create();

const schema = {
  "type": "boolean"
};

// A template literal is used because the text spans several lines.
const post = `Mugs and ramen bowls, both a bit smaller than intended, but that
happens with reclaim. Glaze crawled the first time around, but pretty happy
with it after refiring.`;

const result = await session.prompt(
  `Is this post about pottery?\n\n${post}`,
  {
    responseConstraint: schema,
  }
);
console.log(JSON.parse(result));
// true

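Your implementation can include a JSON Schema or regular expression as part of the message sent to the model. This uses some of the input quota. You can measure how much of the input quota it will use by passing the responseConstraint option to session.measureInputUsage().

You can avoid this behavior with the omitResponseConstraintInput option. If you do so, we recommend that you include some guidance in the prompt: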

const result = await session.prompt(`
  Summarize this feedback into a rating between 0-5. Only output a JSON
  object { rating }, with a single property whose value is a number:
  The food was delicious, service was excellent, will recommend.
`, { responseConstraint: schema, omitResponseConstraintInput: true });
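
The snippet above passes a schema variable without showing a schema that matches the { rating } object it asks for. A possible schema and a small validation step are sketched below. The names ratingSchema and parseRating are hypothetical, and the 0 to 5 range is taken from the prompt text.

```javascript
// Hypothetical JSON Schema for the { rating } object requested in the prompt.
const ratingSchema = {
  type: 'object',
  properties: {
    rating: { type: 'number', minimum: 0, maximum: 5 },
  },
  required: ['rating'],
  additionalProperties: false,
};

// Hypothetical helper: parses the model's text output and double-checks it,
// because the result of prompt() is a string even when it is constrained.
function parseRating(resultText) {
  const { rating } = JSON.parse(resultText);
  if (typeof rating !== 'number' || rating < 0 || rating > 5) {
    throw new RangeError(`Unexpected rating: ${rating}`);
  }
  return rating;
}

console.log(parseRating('{"rating": 5}'));
// 5
```

With a session, the schema would be passed as { responseConstraint: ratingSchema }, and the returned string handed to parseRating().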

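Prompt the model

You can prompt the model with either the prompt() or the promptStreaming() function.

Request-based output

If you expect a short result, you can use the prompt() function, which returns the response once it's available.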

// Start by checking if it's possible to create a session based on the
// availability of the model, and the characteristics of the device.
const { defaultTemperature, maxTemperature, defaultTopK, maxTopK } =
  await LanguageModel.params();

const available = await LanguageModel.availability({
  expectedInputs: [{type: 'text', languages: ['en']}],
  expectedOutputs: [{type: 'text', languages: ['en']}],
});

if (available !== 'unavailable') {
  const session = await LanguageModel.create();

  // Prompt the model and wait for the whole result to come back.
  const result = await session.prompt('Write me a poem!');
  console.log(result);
}

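Streamed output

If you expect a longer response, use the promptStreaming() function, which lets you show partial results as they come in from the model. The promptStreaming() function returns a ReadableStream.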

const { defaultTemperature, maxTemperature, defaultTopK, maxTopK } =
  await LanguageModel.params();

const available = await LanguageModel.availability({
  expectedInputs: [{type: 'text', languages: ['en']}],
  expectedOutputs: [{type: 'text', languages: ['en']}],
});
if (available !== 'unavailable') {
  const session = await LanguageModel.create();

  // Prompt the model and stream the result:
  const stream = session.promptStreaming('Write me an extra-long poem!');
  for await (const chunk of stream) {
    console.log(chunk);
  }
}
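
To show partial output while also keeping the full text, the chunks can be accumulated as they arrive. The helper below is a minimal sketch with a hypothetical name, collectStream. It assumes that each chunk contains only newly generated text rather than the full response so far, and it accepts any async iterable, so it can be tried with a plain async generator in place of a real stream.

```javascript
// Hypothetical helper: consumes a stream of text chunks, reports the text
// accumulated so far after every chunk, and resolves with the full text.
async function collectStream(stream, onUpdate = () => {}) {
  let text = '';
  for await (const chunk of stream) {
    text += chunk; // Assumes each chunk holds only new text.
    onUpdate(text);
  }
  return text;
}

// Stand-in for session.promptStreaming(), used here for demonstration only.
async function* fakeStream() {
  yield 'Roses ';
  yield 'are ';
  yield 'red';
}

collectStream(fakeStream(), (partial) => console.log(partial)).then((full) => {
  console.log(`Done: ${full}`);
});
// Logs "Roses ", "Roses are ", "Roses are red", then "Done: Roses are red"
```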

Stop prompting

Both prompt() and promptStreaming() accept an optional second parameter with a signal field, which lets you stop running prompts.

const controller = new AbortController();
stopButton.onclick = () => controller.abort();

const result = await session.prompt('Write me a poem!', {
  signal: controller.signal,
});
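
The same signal mechanism can also enforce a time limit. The sketch below is one possible approach, with a hypothetical helper name promptWithTimeout. It assumes only that prompt() rejects once its signal is aborted, and it checks the signal's aborted flag rather than the shape of the error.

```javascript
// Hypothetical helper: resolves with the model's reply, or with null when
// the prompt was stopped because it ran longer than `timeoutMs`.
async function promptWithTimeout(session, text, timeoutMs) {
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), timeoutMs);
  try {
    return await session.prompt(text, { signal: controller.signal });
  } catch (error) {
    if (controller.signal.aborted) {
      return null; // Stopped by the timeout, not a real failure.
    }
    throw error;
  } finally {
    clearTimeout(timer);
  }
}
```

For example, promptWithTimeout(session, 'Write me a poem!', 10000) would give up after ten seconds.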

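Session management

Each session keeps track of the context of the conversation. Previous interactions are taken into account for future interactions until the session's context window is full.

Each session has a maximum number of tokens it can process. Check your progress towards this limit with the following: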

console.log(`${session.inputUsage}/${session.inputQuota}`);

Learn more about session management.
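
The same two properties can drive a warning before the context window fills up. The following sketch uses hypothetical names (describeUsage and its 0.8 warning threshold) and works with any object that exposes inputUsage and inputQuota. The numbers in the example are made up.

```javascript
// Hypothetical helper: summarizes how much of the input quota is used.
function describeUsage(session, warnAt = 0.8) {
  const { inputUsage, inputQuota } = session;
  const ratio = inputQuota > 0 ? inputUsage / inputQuota : 1;
  return {
    remaining: Math.max(inputQuota - inputUsage, 0),
    percentUsed: Math.round(ratio * 100),
    nearLimit: ratio >= warnAt,
  };
}

// Stand-in object with made-up numbers, for demonstration only.
console.log(describeUsage({ inputUsage: 1500, inputQuota: 6000 }));
// { remaining: 4500, percentUsed: 25, nearLimit: false }
```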

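Clone a session

To preserve resources, you can clone an existing session with the clone() function. This creates a fork of the conversation, where the context and initial prompt are preserved.

The clone() function takes an optional options object with a signal field, which lets you pass an AbortSignal to destroy the cloned session.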

const controller = new AbortController();
stopButton.onclick = () => controller.abort();

const clonedSession = await session.clone({
  signal: controller.signal,
});
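
One use of cloning is to keep a single prepared base session and answer independent questions on throwaway copies. The sketch below illustrates that idea. askOnClone is a hypothetical helper, and it relies only on the clone(), prompt(), and destroy() methods described in this document.

```javascript
// Hypothetical helper: answers one question on a copy of `baseSession`,
// so the base session's own conversation is left untouched.
async function askOnClone(baseSession, text) {
  const clone = await baseSession.clone();
  try {
    return await clone.prompt(text);
  } finally {
    clone.destroy(); // Free the copy's resources even if prompt() fails.
  }
}
```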

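End a session

Call destroy() to free resources if you no longer need a session. When a session is destroyed, it can no longer be used, and any ongoing execution is aborted. You may want to keep the session around if you intend to prompt the model often, since creating a session can take some time.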

await session.prompt(
  "You are a friendly, helpful assistant specialized in clothing choices."
);

session.destroy();

// The promise is rejected with an error explaining that
// the session is destroyed.
await session.prompt(
  "What should I wear today? It is sunny, and I am choosing between a t-shirt " +
    "and a polo."
);

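Demos

We've built multiple demos to explore the many use cases for the Prompt API. The following demos are web applications:

To test the Prompt API in Chrome Extensions, install the demo extension. The extension source code is available on GitHub.

Performance strategy

The Prompt API for the web is still being developed. While we build this API, refer to our best practices on session management for optimal performance.

Permission Policy, iframes, and Web Workers

By default, the Prompt API is only available to top-level windows and to their same-origin iframes. Access to the API can be delegated to cross-origin iframes using the Permission Policy allow="" attribute: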

<!--
  The hosting site at https://main.example.com can grant a cross-origin iframe
  at https://cross-origin.example.com/ access to the Prompt API by
  setting the `allow="language-model"` attribute.
-->
<iframe src="https://cross-origin.example.com/" allow="language-model"></iframe>

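The Prompt API isn't available in Web Workers for now, due to the complexity of establishing a responsible document for each worker in order to check the Permissions Policy status.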
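
Code that may run in a worker, or in a browser without the API, can feature-detect it before use. The check below is a minimal sketch. hasPromptApi is a hypothetical name, and the sketch assumes the API is exposed as a global named LanguageModel, as in the examples in this document.

```javascript
// Hypothetical helper: reports whether the given global scope exposes
// the Prompt API entry point.
function hasPromptApi(scope = globalThis) {
  return typeof scope.LanguageModel !== 'undefined';
}

if (!hasPromptApi()) {
  console.log('Prompt API not exposed in this context.');
}
```

Note that this only detects the presence of the global. Whether a model can actually be used still has to be confirmed with LanguageModel.availability().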

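Your input can directly impact how we build and implement future versions of this API and all built-in AI APIs.

For feedback on Chrome's implementation, file a bug report or a feature request.
To share feedback on the API shape, comment on an existing issue or open a new one in the Prompt API GitHub repository.
Join the early preview program.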