Published: January 27, 2025
The Prompt API is one of the built-in AI APIs the Chrome team is exploring. You can test it locally with your apps by joining the early preview program, or in production in your Chrome Extensions by signing up for the Prompt API origin trial for Chrome Extensions. One key feature of the Prompt API is sessions. They let you have one or multiple ongoing conversations with the AI model, without the model losing track of the context of what was said. This guide introduces best practices for session management with the language model.
Typical use cases for session management with one or more parallel sessions include classic chatbots, where a single user interacts with the AI, or customer relationship management systems, where one support agent handles multiple customers in parallel and uses AI to keep track of the various conversations.
Initialize a session with a system prompt
The first concept to be aware of is the system prompt. It sets up the overall context of a session at its start. For example, you can use the system prompt to tell the model how it should respond.
// Make this work in web apps and in extensions.
const aiNamespace = self.ai || chrome.aiOriginTrial || chrome.ai;
const languageModel = await aiNamespace.languageModel.create({
  systemPrompt: 'You are a helpful assistant and you speak like a pirate.',
});
console.log(await languageModel.prompt('Tell me a joke.'));
// 'Avast ye, matey! What do you call a lazy pirate?\n\nA **sail-bum!**\n\nAhoy there, me hearties! Want to hear another one? \n'
Clone a main session
If you have an app where, when one session is over, you want to start a new one, or if you have an app where you want to have independent conversations in different sessions in parallel, you can make use of the concept of cloning a main session. The clone inherits session parameters like the temperature or the topK from the original, as well as any session interaction history. This is useful, for example, if you have initialized the main session with a system prompt. This way, your app needs to do this work only once, and all the clones inherit it from the main session.
// Make this work in web apps and in extensions.
const aiNamespace = self.ai || chrome.aiOriginTrial || chrome.ai;
const languageModel = await aiNamespace.languageModel.create({
  systemPrompt: 'You are a helpful assistant and you speak like a pirate.',
});
// The original session `languageModel` remains unchanged, and
// the two clones can be interacted with independently from each other.
const firstClonedLanguageModel = await languageModel.clone();
const secondClonedLanguageModel = await languageModel.clone();
// Interact with the sessions independently.
await firstClonedLanguageModel.prompt('Tell me a joke about parrots.');
await secondClonedLanguageModel.prompt('Tell me a joke about treasure troves.');
// Each session keeps its own context.
// The first session's context is jokes about parrots.
await firstClonedLanguageModel.prompt('Tell me another.');
// The second session's context is jokes about treasure troves.
await secondClonedLanguageModel.prompt('Tell me another.');
Restore a past session
The third concept to learn is initial prompts. Their original purpose is n-shot prompting: priming the model with a set of n example prompts and responses so that its responses to actual prompts are more accurate. If you keep track of ongoing conversations with the model, you can "abuse" initial prompts to restore a session, for example, after a browser restart, so the user can continue with the model where they left off. The following code snippet shows how you could approach this, assuming you keep track of the session history in localStorage.
// Make this work in web apps and in extensions.
const aiNamespace = self.ai || chrome.aiOriginTrial || chrome.ai;
// Restore the session from localStorage, or initialize a new session.
// The UUID is hardcoded here, but would come from a
// session picker in your user interface.
const uuid = '7e62c0e0-6518-4658-bc38-e7a43217df87';
function getSessionData(uuid) {
  try {
    const storedSession = localStorage.getItem(uuid);
    return storedSession ? JSON.parse(storedSession) : false;
  } catch {
    return false;
  }
}
let sessionData = getSessionData(uuid);
// Initialize a new session.
if (!sessionData) {
  // Get the current default parameters so they can be restored as they were,
  // even if the default values change in the future.
  const { defaultTopK, defaultTemperature } =
    await aiNamespace.languageModel.capabilities();
  sessionData = {
    systemPrompt: '',
    initialPrompts: [],
    topK: defaultTopK,
    temperature: defaultTemperature,
  };
}
// Initialize the session with the (previously stored or new) session data.
const languageModel = await aiNamespace.languageModel.create(sessionData);
// Keep track of the ongoing conversation and store it in localStorage.
const prompt = 'Tell me a joke';
try {
  const stream = languageModel.promptStreaming(prompt);
  let result = '';
  // You can already work with each `chunk`, but then store
  // the final `result` in history.
  for await (const chunk of stream) {
    // In practice, you'd render the chunk.
    console.log(chunk);
    result = chunk;
  }
  sessionData.initialPrompts.push(
    { role: 'user', content: prompt },
    { role: 'assistant', content: result },
  );
  // To avoid growing localStorage infinitely, make sure to delete
  // no longer used sessions from time to time.
  localStorage.setItem(uuid, JSON.stringify(sessionData));
} catch (err) {
  console.error(err.name, err.message);
}
Preserve session quota by letting the user stop the model when its answer isn't useful
Each session has a context window that you can see by accessing the session's relevant fields maxTokens, tokensLeft, and tokensSoFar.
const { maxTokens, tokensLeft, tokensSoFar } = languageModel;
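Based on these fields, you could, for example, warn the user in your interface when the context window is about to fill up. Here's a minimal sketch with a hypothetical `contextStatus()` helper; the 20% warning threshold is an arbitrary assumption:

```javascript
// Hypothetical helper: report how much of the session's context
// window is still available, and whether it's running low.
function contextStatus(tokensLeft, maxTokens, warnThreshold = 0.2) {
  const ratioLeft = tokensLeft / maxTokens;
  return { ratioLeft, runningLow: ratioLeft < warnThreshold };
}

// In the browser, pass the session's fields:
// const { runningLow } = contextStatus(languageModel.tokensLeft, languageModel.maxTokens);
// if (runningLow) { /* Show a "context almost full" hint in the UI. */ }
```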
When this context window is exceeded, the session loses track of the oldest messages, which can be undesirable because this context may have been important. To preserve quota, if after submitting a prompt the user sees that an answer isn't going to be useful, allow them to stop the language model from answering by making use of the AbortController. Both the prompt() and the promptStreaming() methods accept an optional second parameter with a signal field, which lets the user stop the session from answering.
const controller = new AbortController();
stopButton.onclick = () => controller.abort();
try {
  const stream = languageModel.promptStreaming('Write me a poem!', {
    signal: controller.signal,
  });
  for await (const chunk of stream) {
    console.log(chunk);
  }
} catch (err) {
  // Ignore `AbortError` errors.
  if (err.name !== 'AbortError') {
    console.error(err.name, err.message);
  }
}
Demo
See AI session management in action in the AI session management demo. Create multiple parallel conversations with the Prompt API, reload the tab or even restart your browser, and continue where you left off. See the source code on GitHub.
Conclusions
By thoughtfully managing AI sessions with these techniques and best practices, you can unlock the full potential of the Prompt API, delivering more efficient, responsive, and user-centric applications. You can also combine these approaches, for example, by letting the user clone a restored past session, so they can run "what if" scenarios. Happy prompting!
Acknowledgements
This guide was reviewed by Sebastian Benz, Andre Bandarra, François Beaufort, and Alexandra Klepper.