Built-in AI APIs: Dos and don'ts

Maud Nalpas

Published: April 30, 2026

With built-in AI, your website or web application can perform AI-powered tasks, without needing to deploy, manage, or self-host models. You might find it challenging to move from a demo to a production-ready feature. This document covers technical and UX considerations to help you avoid common pitfalls.

Warm up the model early

Applies to: All APIs, such as Summarizer, Translator, and Writer.

Do: Initialize the session as soon as you detect user intent. Because creating a session requires user activation, you can use any interaction, such as a click anywhere on the page or on an AI-powered feature. This warms up the model and the runtime while the user interacts with the UI. If relevant, trigger the next most likely AI task as soon as the result starts rendering.

Don't: Wait until the user clicks "Generate" to initialize the session. This leads to a multi-second cold start delay, because the model must first load into memory and prepare its execution pipeline.
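
The warm-up advice can be sketched as a small helper that starts session creation on the first sign of intent and reuses the same promise afterwards. This is a minimal sketch, not an official pattern: `createWarmUp` and the element IDs `#ai-feature` and `#generate` are hypothetical names chosen for illustration, and the browser wiring only runs where the Prompt API is available.

```javascript
// Hypothetical helper (not part of any built-in AI API): starts session
// creation once and returns the same promise on every later call.
function createWarmUp(createSession) {
  let sessionPromise = null;
  return function warmUp() {
    if (sessionPromise === null) {
      sessionPromise = createSession().catch((error) => {
        // Forget the failed attempt so a later interaction can retry.
        sessionPromise = null;
        throw error;
      });
    }
    return sessionPromise;
  };
}

// Browser wiring. Assumptions: the Prompt API is available, and the page has
// elements with the illustrative IDs #ai-feature and #generate.
if (typeof document !== 'undefined' && typeof LanguageModel !== 'undefined') {
  const warmUp = createWarmUp(() => LanguageModel.create());

  // Any click inside the feature signals intent and provides user activation.
  document.querySelector('#ai-feature')?.addEventListener(
    'click',
    () => warmUp().catch(console.error),
    { once: true },
  );

  document.querySelector('#generate')?.addEventListener('click', async () => {
    const session = await warmUp(); // Usually resolved already at this point
    console.log(await session.prompt('Write a short product description.'));
  });
}
```

Because the helper caches the promise rather than the session, a "Generate" click that arrives while the model is still loading waits for the same creation call instead of starting a second one.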

Set initial instructions at creation time

Applies to: Prompt API.

Do: Provide system instructions when you create the session to speed up the first prompt.

Don't: Start with an empty session and send system instructions as part of the first prompt() call. This increases latency, because it forces the model to process those instructions at the last moment.

// ✅ DO: Create the session as early as possible (tip on warming up the model early) and use initialPrompts for system instructions in the create call
const session = await LanguageModel.create({
  initialPrompts: [
    { role: 'system', content: 'You are a helpful assistant specialized in code reviews.' }
  ]
});

// A few moments later, when the user triggers the AI feature
const review = await session.prompt(`Review the following code:\n\n${code}`);

// ❌ DON'T: Send instructions using prompt() after creation
// const slowerSession = await LanguageModel.create();
// await slowerSession.prompt(`You are a helpful assistant specialized in code reviews.\n\nReview the following code:\n\n${code}`); // Higher latency

Clone sessions for repeated tasks

Applies to: Prompt API.

With the Prompt API, each session tracks the conversation context and takes all previous interactions into account. Because a clone inherits everything from its parent session, including the initial prompts and the full interaction history up to the moment of cloning, structure your usage so that only what you need is inherited.

Do:

Create a base session: To handle unrelated tasks efficiently, create a base session that contains only your system instructions and no previous conversation context.
Clone the baseline: Use clone() on that base session for new tasks to save the overhead of re-parsing heavy system instructions. This lets you create parallel conversations or reset a task to its baseline.

Don't:

Don't reuse the same session for unrelated tasks, and avoid cloning any session that already contains unnecessary interaction history. Both patterns can cause unrelated previous context to interfere with your current task.
Don't call create() repeatedly with the same heavy system instructions. Use the cloning pattern instead to optimize performance.

// ✅ DO: Create a baseline session and clone it for each new task
const baseSession = await LanguageModel.create({
  initialPrompts: [{
    role: 'system',
    content: 'You are a technical editor...',
  }],
});

// Clone the base session once for the first task
const task1 = await baseSession.clone();
const response1 = await task1.prompt("Review this first draft...");
// ... Repeat the cloning pattern for subsequent independent tasks
// Each task starts fresh from the baseline system instructions

// ❌ DON'T:
// Bad performance pattern: repeated create() calls for identical tasks.
// This forces the model to re-parse instructions every time, increasing latency.
// const sessionA = await LanguageModel.create({ initialPrompts: [...] });
// await sessionA.prompt("Task 1...");
// const sessionB = await LanguageModel.create({ initialPrompts: [...] });
// await sessionB.prompt("Task 2...");
// Bad quality pattern: reusing the same session for unrelated tasks.
// const session = await LanguageModel.create();
// await session.prompt("Analyze this financial report...");
// Unrelated task in the same session:
// await session.prompt("Now write a children's story...");

Destroy unused sessions

Applies to: All APIs.

Do: When a feature is no longer in use, explicitly call destroy() on the sessions you no longer need to free up memory. If you use the cloning pattern, keep the base session and destroy the clones you no longer need.

Don't: Keep multiple large sessions active. Each session consumes memory, and that unnecessary resource usage can become a problem. The garbage collector eventually cleans up sessions, but calling destroy() frees memory sooner.

// ✅ DO: Use the clone and destroy it immediately after
const clone = await baseSession.clone();
const response = await clone.prompt("Quick task...");
// Free memory right away: destroy the clone, keep the baseSession
clone.destroy();
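
One way to make sure a clone is always destroyed, even when the task throws, is to wrap the clone, prompt, and destroy steps in a single function. The following is a minimal sketch under that assumption; `withClone` is a hypothetical helper name, and it relies only on the clone() and destroy() methods described in this section.

```javascript
// Hypothetical helper: runs one task on a fresh clone and always destroys
// the clone afterwards, even if the task throws.
async function withClone(baseSession, task) {
  const clone = await baseSession.clone();
  try {
    return await task(clone);
  } finally {
    clone.destroy(); // The base session stays alive for the next task
  }
}

// Usage, assuming `baseSession` was created with LanguageModel.create():
// const review = await withClone(baseSession, (s) => s.prompt('Quick task...'));
```

The try/finally block keeps the cleanup next to the creation, so a rejected prompt cannot leave a clone holding memory.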

Render streaming responses safely and efficiently

Applies to: All APIs with streaming support (Prompt, Summarizer, Writer, Rewriter, and Translator).

Do: Treat all LLM output as untrusted content. Sanitize the full combined output, not just chunks, because malicious code could be split across updates. Before rendering, use the Sanitizer API where supported. To avoid a decrease in performance, use a streaming Markdown parser like streaming-markdown.

Don't: Set innerHTML directly on every chunk update. This is slow, especially with complex formatting such as syntax highlighting, and it exposes you to injection risks.

import * as smd from "streaming-markdown";
// Set up virtual buffer and Sanitizer API
const sanitizer = new Sanitizer({
  allowElements: ['figure', 'figcaption', 'p', 'br', 'strong', 'em', 'img', 'a'],
  allowAttributes: {
    'loading': ['img'], 'decoding': ['img'], 'src': ['img'], 'href': ['a']
  }
});

// Create an off-screen fragment so the parser doesn't cause flicker
// or trigger XSS in the live DOM during the building process.
const buffer = new DocumentFragment();
const parser = smd.parser_new(buffer);

// Use sanitizer as a gatekeeper / cleaner function so we can combine it with the streaming Markdown parser
function syncSanitized(target, sourceFragment) {
  // .sanitize() returns a fresh, clean DocumentFragment
  const cleanFragment = sanitizer.sanitize(sourceFragment);
  // replaceChildren is the modern high-performance way to swap DOM content
  target.replaceChildren(cleanFragment);
}

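// Streaming logic. Assumptions: `stream` is an async iterable of text chunks
// (for example, the return value of session.promptStreaming()), and
// `container` is the on-page element that displays the response.
// `chunks` keeps track of the raw string (useful for logs/debug)
let chunks = '';
for await (const chunk of stream) {
  chunks += chunk;
  // Let the parser build the DOM incrementally in the buffer.
  // This is high-performance because the buffer is not live
  smd.parser_write(parser, chunk);
  // Use the Sanitizer API to port the content safely to the container.
  syncSanitized(container, buffer);
}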

Optimize input for speed

Applies to: All APIs.

Do: Send the model only the information it strictly needs. Strip everything that's irrelevant to the current task. For large datasets, provide a summary and a small subset of relevant items.

Don't: Send raw text, unnecessary metadata, HTML tags, or large raw lists directly to the API. Latency increases significantly as input size grows, which can make the AI feature feel broken on many devices.

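// ✅ DO: Send only relevant text
const cleanText = document.querySelector('#article').innerText;
// summarize() is an instance method, so create a summarizer first
const summarizer = await Summarizer.create();
const summary = await summarizer.summarize(cleanText);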

// ❌ DON'T: Send the entire DOM structure
// const dirtyText = document.querySelector('#article').innerHTML;
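
For the large-dataset case, the advice to send a summary plus a small subset of relevant items could look like the sketch below. It assumes the dataset is an array of strings and that a simple substring match is good enough to decide relevance; `buildCompactInput` and its parameters are hypothetical names, not part of any API.

```javascript
// Hypothetical helper: keeps only the items that mention the query and caps
// how many of them are sent to the model.
function buildCompactInput(items, query, maxItems = 5) {
  const needle = query.trim().toLowerCase();
  const relevant = items.filter((item) => item.toLowerCase().includes(needle));
  const selected = relevant.slice(0, maxItems);
  return [
    `Dataset: ${items.length} items, ${relevant.length} mention "${query.trim()}".`,
    `Showing ${selected.length}:`,
    ...selected.map((item) => `- ${item}`),
  ].join('\n');
}

// Usage: pass the compact text to the model instead of the full list.
// const input = buildCompactInput(productNames, 'cat', 3);
```

A real application would likely use a better relevance signal than substring matching, but the shape stays the same: a one-line overview followed by a capped list.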

Use structured output for predictable results

Applies to: Prompt API.

Do: When you need the model to return data in a specific format, use structured output: pass a JSON Schema in the responseConstraint field. This makes the output predictable and spares you complex post-processing or manual parsing.

Don't: Rely on natural language instructions alone (such as "Only output JSON"). Models can add conversational filler that breaks your parser.

// ✅ DO: Use a JSON Schema for predictable results
const schema = {
  type: "object",
  properties: {
    isTopicCats: { type: "boolean" }
  }
};

const result = await session.prompt(`Is this post about cats?\n\n${post}`, {
  responseConstraint: schema,
});
console.log(JSON.parse(result).isTopicCats);

Decouple generation from length constraints

Applies to: Prompt API, because it's the only API that supports structured output schemas.

Do: Let the model generate its response naturally, and then truncate the text with client-side logic to fit your UI.

Don't: Enforce strict character limits, such as maxLength: 125, with a structured output schema. When a model's response is longer than the limit you set, the model might use high-density tokens, such as foreign-language characters or emoji, to compress meaning, which results in nonsensical output.

/* ✅ DO: Handle overflow using CSS */
.result {
  overflow: hidden;
  white-space: nowrap;
  text-overflow: ellipsis; /* Displays '…' */
}

// ❌ DON'T: Force length in the prompt
const result = await session.prompt("Write a bio in exactly 50 characters.");
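
Where the text has to be shortened in script rather than with CSS, client-side truncation can be sketched as follows. `truncateForUi` is a hypothetical helper, and cutting back to the last space is one simple policy among several; it assumes `maxChars` is at least 1.

```javascript
// Hypothetical helper: shortens text to at most `maxChars` characters
// (counted as code points) and ends it with an ellipsis.
function truncateForUi(text, maxChars) {
  const chars = Array.from(text); // Code points, so emoji are not split
  if (chars.length <= maxChars) return text;
  // Reserve one character for the ellipsis.
  const clipped = chars.slice(0, maxChars - 1).join('');
  // Cut back to the last space, if there is one, to avoid a broken word.
  const lastSpace = clipped.lastIndexOf(' ');
  const base = lastSpace > 0 ? clipped.slice(0, lastSpace) : clipped;
  return base.trimEnd() + '…';
}

// Usage: let the model answer freely, then fit the result to the UI.
// bioElement.textContent = truncateForUi(await session.prompt('Write a bio.'), 125);
```

Keeping the limit in the UI layer means the model never sees it, so it has no reason to compress its wording.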

Manage user patience

Applies to: All APIs.

Do: Use animations and UI techniques to manage the user's patience. The optimal approach depends on your use case and the expected length of the API output. Some ideas:

Streaming for long content: For summaries or chat, streaming creates a typewriter effect for each token by default. This can feel natural and provide immediate feedback.
Non-streaming for short tasks (or long asynchronous tasks): For short outputs, such as alt text, non-streaming can create a more polished UI. It also gives you time to warm up the next AI task while the current one renders. This approach also works for long asynchronous or background tasks. If the user isn't blocked on the output to continue their work, there's no urgency to show the output as it's generated. Signal in the UI that the process is ongoing.
Visual transitions for updates: Use animations, such as word morphing, when translating or rewriting text.

Don't: Update the UI without visual cues.
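
For the non-streaming case, signaling that work is in progress can be as small as toggling an attribute around the call. The sketch below assumes you style or announce the busy state through the standard aria-busy attribute; `withBusyIndicator` is a hypothetical helper name.

```javascript
// Hypothetical helper: marks an element as busy while an AI task runs, so
// CSS or assistive technology can reflect that work is in progress.
async function withBusyIndicator(element, task) {
  element.setAttribute('aria-busy', 'true');
  try {
    return await task();
  } finally {
    // Clear the indicator whether the task succeeded or failed.
    element.removeAttribute('aria-busy');
  }
}

// Usage, assuming `output` is the element that shows the result and
// `summarizer` was created with Summarizer.create():
// output.textContent = await withBusyIndicator(output, () => summarizer.summarize(text));
```

A CSS rule such as `[aria-busy="true"]` can then drive a spinner or shimmer animation without extra JavaScript state.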

Align with the user's mental model of time and effort

Applies to: All APIs.

Do: Consider an artificial delay of one or two seconds if a response is nearly instant. Paradoxically, users might find results more trustworthy when they perceive a generation process that aligns with their perceived difficulty of the task. Use animations to signal that an AI process has occurred.

Don't: Startle users with instant UI changes.
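
An artificial minimum delay can be sketched by racing nothing and waiting for both the task and a timer. `withMinimumDuration` is a hypothetical helper, and the right delay depends on your own testing with users.

```javascript
// Hypothetical helper: resolves with the task's result, but never sooner
// than `minMs` milliseconds after the call.
function withMinimumDuration(task, minMs) {
  const timer = new Promise((resolve) => setTimeout(resolve, minMs));
  // Wait for both the task and the timer; keep only the task's value.
  return Promise.all([task(), timer]).then(([result]) => result);
}

// Usage, assuming `session` is an existing Prompt API session:
// const answer = await withMinimumDuration(() => session.prompt(question), 1000);
```

If the task takes longer than the minimum, the timer adds no extra wait, so slow devices are not penalized.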

Let users navigate quickly and undo AI edits

Applies to: All APIs.

Do: Equip your UI with a stepper or navigation history that lets users explore different results confidently, and let them quickly undo AI edits. This ensures that different versions are still readily available.

Don't: Overwrite the user's previous draft, or an AI result they liked, without a way to go back, undo, or compare versions.

Stepper UI element showing navigation history. UI pattern: Reject / accept suggestions (Google Docs)

Button to undo all agent edits in the Google Antigravity UI. UI pattern: Undo all agent edits (Google Antigravity)

Buttons to reject or accept suggestions in Google Docs. UI pattern: Stepper (Alt text demo)
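
The state behind a stepper can be sketched as a list of versions plus an index. In this minimal sketch, `createHistory` and its method names are hypothetical; it keeps every version, including the user's original draft, so nothing is overwritten.

```javascript
// Hypothetical helper: stores every version and lets the UI step through them.
function createHistory(initialValue) {
  const versions = [initialValue];
  let index = 0;
  return {
    current: () => versions[index],
    count: () => versions.length,
    // Appends a new version and makes it the current one. Older versions,
    // including the original draft, are kept.
    push(value) {
      versions.push(value);
      index = versions.length - 1;
    },
    // Steps back one version; stays on the first version at the start.
    previous() {
      if (index > 0) index -= 1;
      return versions[index];
    },
    // Steps forward one version; stays on the last version at the end.
    next() {
      if (index < versions.length - 1) index += 1;
      return versions[index];
    },
  };
}

// Usage: push each AI result, then wire previous() and next() to the stepper.
// const history = createHistory(userDraft);
// history.push(await session.prompt('Rewrite this draft...'));
```

Undo then becomes a call to previous(), and comparing versions only requires reading two entries from the same list.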

Provide user control and override

Applies to: All APIs.

Do: Always give the user the final say. Provide a way to edit suggestions manually. The APIs can produce incorrect results.

Don't: Force an AI-generated result as the only option.

Cache results for repeated tasks

Applies to: All APIs.

Do: Implement a local result cache (for example, with sessionStorage or IndexedDB) for repeated inputs or queries. Normalize inputs by trimming whitespace and lowercasing to increase cache hits. For heavy inputs, such as images, generate a hash to use as the cache key. Set a conservative time to live (TTL) for your cache, or serve cached results while you update them in the background. Let the user trigger a fresh inference if the result is unsatisfactory.

Don't: Re-run the same inference for a repeated search query or identical data input, for example, when a user navigates back and forth between search results. While on-device inference is free in terms of cloud costs, it is expensive in terms of user time and battery life.

// ✅ DO: Check a local cache before running inference
async function getAiResponse(userInput, forceRefresh = false) {
  // Normalize the query to increase cache hits
  const query = userInput.trim().toLowerCase();
  const cacheKey = `ai_results_${query}`;
  const TTL_MS = 3600000; // 1 hour conservative TTL

  if (!forceRefresh) {
    const itemStr = localStorage.getItem(cacheKey);
    if (itemStr) {
      const item = JSON.parse(itemStr);
      const now = Date.now();

      // Check if the item has expired
      if (now < item.expiry) {
        // Lightweight safety check before rendering.
        // isValid() is an app-defined helper that you need to provide.
        if (isValid(item.value)) return item.value;
      } else {
        // Delete the stale entry if the TTL has passed
        localStorage.removeItem(cacheKey);
      }
    }
  }

  // Fallback: Run inference if no valid cache exists
  const session = await LanguageModel.create();
  const response = await session.prompt(userInput);
  // Free the session's memory as soon as the response is available
  session.destroy();

  // Store the result for future use (with an expiration)
  const cacheData = {
    value: response,
    expiry: Date.now() + TTL_MS
  };
  localStorage.setItem(cacheKey, JSON.stringify(cacheData));

  return response;
}