Build a guessing game with the Prompt API

Brecht De Ruyte

Published: October 10, 2025

School-age children playing the game Guess Who? in 2014.

The classic board game Guess Who? is a masterclass in deductive reasoning. Each player starts with a board of faces and, through a series of yes-or-no questions, narrows down the possibilities until they can confidently identify their opponent's secret character.

After seeing a demo of built-in AI at Google I/O Connect, I wondered: what if I could play Guess Who? against an AI that lives in the browser? With client-side AI, the photos are interpreted locally, so a custom Guess Who? of friends and family stays private and secure on my device.

My background is primarily in UI and UX development, and I'm used to building pixel-perfect experiences. I hoped I could do exactly that with my interpretation.

My application, AI Guess Who?, is built with React and uses the Prompt API and a browser built-in model to create a surprisingly capable opponent. Along the way, I discovered that getting "pixel-perfect" results isn't so simple. Still, the application demonstrates how AI can be used to build thoughtful game logic, and how much prompt engineering matters for refining that logic and getting the outcomes you expect.

Keep reading to learn about the built-in AI integration, the challenges I faced, and the solutions I landed on. You can play the game and find the source code on GitHub.

Game foundation: A React app

Before looking at the AI implementation, let's review the application's structure. I built a standard React application with TypeScript, with a central App.tsx file that acts as the game's conductor. This file holds:

Game state: An enum that tracks the current phase of the game (such as PLAYER_TURN_ASKING, AI_TURN, GAME_OVER). This is the most important piece of state, because it dictates what the interface displays and which actions are available to the player.
Character lists: Multiple lists that designate the active characters, each player's secret character, and the characters that have been eliminated from the board.
Game chat: A running log of questions, answers, and system messages.
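As a rough sketch of the state listed above, the phase and the character lists could be typed as follows. Only the three phase names come from the article. The `Character`, `ChatMessage`, and `GameState` shapes, their field names, and `createInitialState` are hypothetical, and the phase is written as a const object with a matching union type (a common stand-in for a TypeScript enum) rather than the app's actual enum.

```typescript
// Phase names taken from the article; the real enum may contain more members.
const GamePhase = {
  PLAYER_TURN_ASKING: "PLAYER_TURN_ASKING",
  AI_TURN: "AI_TURN",
  GAME_OVER: "GAME_OVER",
} as const;
type GamePhase = (typeof GamePhase)[keyof typeof GamePhase];

// Hypothetical shapes, for illustration only.
interface Character {
  id: string;
  name: string;
}

interface ChatMessage {
  sender: "player" | "ai" | "system";
  text: string;
}

interface GameState {
  phase: GamePhase;
  activeCharacters: Character[]; // characters still standing on the board
  eliminatedCharacterIds: string[]; // characters that were flipped down
  playerSecretId: string;
  aiSecretId: string;
  chat: ChatMessage[]; // running log of questions, answers, system messages
}

// Builds the state for a fresh game in which the player asks first.
function createInitialState(
  characters: Character[],
  playerSecretId: string,
  aiSecretId: string,
): GameState {
  return {
    phase: GamePhase.PLAYER_TURN_ASKING,
    activeCharacters: [...characters],
    eliminatedCharacterIds: [],
    playerSecretId,
    aiSecretId,
    chat: [{ sender: "system", text: "New game started." }],
  };
}
```

Keeping the phase in a single field means the interface can decide what to render by switching on one value instead of combining several booleans.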

The interface is broken down into logical components:

GameBoard displays the grid of characters and chat controls to handle all user input.

As the game's features grew, so did its complexity. Initially, all of the game logic lived in a single, large custom React hook, useGameLogic, but it quickly became too large to navigate and debug. To improve maintainability, I refactored this hook into multiple hooks, each with a single responsibility. For example:

useGameState manages the core state
usePlayerActions is for the player's turn
useAIActions is for the AI's logic

The main useGameLogic hook now acts as a clean composer that ties these smaller hooks together. This architectural change didn't alter the game's functionality, but it made the codebase a whole lot cleaner.

Game logic with the Prompt API

The core of this project is the use of the Prompt API.

I added the AI game logic to builtInAIService.ts. These are its key responsibilities:

Allow restrictive, binary answers.
Teach the model game strategy.
Teach the model analysis.
Give the model amnesia.

Allow restrictive, binary answers

How does the player interact with the AI? When a player asks, "Does your character have a hat?", the AI needs to "look" at its secret character's image and give a clear answer.

My first attempts were a mess. The response was conversational: "No, the character I'm thinking of, Isabella, does not appear to be wearing a hat," instead of a binary yes or no. Initially, I solved this with a very strict prompt, essentially dictating that the model respond only with "Yes" or "No".

While this worked, I learned of an even better way using structured output. By providing a JSON Schema to the model, I could guarantee a true or false response.

const schema = { type: "boolean" };
// prompt() returns a Promise, so await it before parsing the result.
const result = await session.prompt(prompt, { responseConstraint: schema });

This allowed me to simplify the prompt and let my code reliably handle the response:

JSON.parse(result) ? "Yes" : "No"
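Put together, the boolean constraint and the parsing step might look like the sketch below. `askYesNo` is a hypothetical helper name, and `PromptSession` is a hand-written interface covering only the `prompt()` call used here (assumed to resolve to a string, as in the snippet above), so the example can run against a fake session outside the browser.

```typescript
// Minimal slice of a Prompt API session: only what this sketch needs.
interface PromptSession {
  prompt(
    input: string,
    options?: { responseConstraint?: object },
  ): Promise<string>;
}

// JSON Schema that restricts the model's output to `true` or `false`.
const booleanSchema = { type: "boolean" };

// Hypothetical helper: asks a yes-or-no question and maps the result to text.
async function askYesNo(
  session: PromptSession,
  question: string,
): Promise<"Yes" | "No"> {
  // prompt() is asynchronous, so the result must be awaited before parsing.
  const raw = await session.prompt(question, {
    responseConstraint: booleanSchema,
  });
  const parsed: unknown = JSON.parse(raw);
  // Guard against anything that is not a JSON boolean.
  if (typeof parsed !== "boolean") {
    throw new Error(`Expected a JSON boolean, got: ${raw}`);
  }
  return parsed ? "Yes" : "No";
}
```

The explicit type check is defensive: the constraint should already rule out other values, but failing loudly is easier to debug than silently treating an unexpected string as "Yes".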

Teach the model game strategy

Telling the model to answer a question is much simpler than having the model initiate and ask questions. A good Guess Who? player doesn't ask random questions. They ask questions that eliminate the most characters at once. An ideal yes-or-no question cuts the remaining candidates in half.

How do you teach a model that strategy? Again, prompt engineering. The prompt for generateAIQuestion() is actually a concise lesson in Guess Who? game theory.

Initially, I asked the model to "ask a good question." The results were unpredictable. To improve them, I added negative constraints. The prompt now includes instructions similar to:

"CRITICAL: Ask about existing features ONLY"
"CRITICAL: Be original. Do NOT repeat a question"

These constraints narrow the model's focus and prevent it from asking irrelevant questions, which makes it a much more enjoyable opponent. You can review the full prompt file on GitHub.
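To illustrate how such constraints can be assembled, here is a minimal sketch of a prompt builder. It is not the prompt from the repository: `buildQuestionPrompt`, its parameters, and every sentence other than the two CRITICAL lines quoted above are assumptions made for the example.

```typescript
// Hypothetical prompt builder; the app's real prompt lives in its GitHub repo.
function buildQuestionPrompt(
  remainingNames: string[],
  askedQuestions: string[],
): string {
  // List earlier questions so the model can avoid repeating them.
  const history =
    askedQuestions.length > 0
      ? askedQuestions.map((q, i) => `${i + 1}. ${q}`).join("\n")
      : "(none yet)";

  return [
    "You are playing Guess Who? and it is your turn to ask a yes-or-no question.",
    `Characters still on your board: ${remainingNames.join(", ")}.`,
    "Aim for a question that splits the remaining characters roughly in half.",
    // The two constraints quoted in the article:
    "CRITICAL: Ask about existing features ONLY.",
    "CRITICAL: Be original. Do NOT repeat a question.",
    "Questions already asked:",
    history,
  ].join("\n");
}
```

Passing the question history in explicitly gives the "do not repeat" rule something concrete to check against, instead of relying on the model to recall earlier turns.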

Teach the model analysis

This was, by far, the most difficult and important challenge. When the model asks a question, such as "Does your character have a hat?", and the player answers no, how does the model know which characters on its board to eliminate?

The model should eliminate everyone with a hat. My early attempts were plagued with logical errors, and sometimes the model eliminated the wrong characters or no characters at all. Also, what is a "hat"? Does a "beanie" count as a "hat"? Let's be honest, that is something humans could argue about too. And of course, general mistakes happen. Hair can look like a hat from an AI's perspective.

I redesigned the architecture to separate perception from code deduction:

AI is responsible for visual analysis. Models excel at visual analysis. I instructed the model to return its question and a detailed analysis in a strict JSON schema. The model analyzes each character on its board and answers the question, "Does this character have this feature?" The model returns a structured JSON object:
```
{ "character_id": "...", "has_feature": true }
```
Once again, structured data is key to a successful outcome.
Game code uses the analysis to make the final decision. The application code checks the player's answer ("Yes" or "No") and iterates through the AI's analysis. If the player said "No," the code knows to eliminate every character where has_feature is true.

I found that this division of labor is key to building reliable AI applications. Use the AI for its analytic capabilities, and leave binary decisions to your application code.
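The deduction half of that split is small enough to sketch in full. The entry shape matches the JSON object shown above; `FeatureAnalysis` and `charactersToEliminate` are hypothetical names chosen for this example, not identifiers from the app.

```typescript
// One entry of the model's per-character analysis.
interface FeatureAnalysis {
  character_id: string;
  has_feature: boolean;
}

// Returns the ids of the characters to flip down for a given player answer.
function charactersToEliminate(
  analysis: FeatureAnalysis[],
  playerSaidYes: boolean,
): string[] {
  // "Yes": the secret character has the feature, so drop everyone without it.
  // "No": the secret character lacks the feature, so drop everyone with it.
  return analysis
    .filter((entry) => entry.has_feature !== playerSaidYes)
    .map((entry) => entry.character_id);
}
```

Because this function is plain code, the same analysis and the same answer always produce the same eliminations. Any remaining mistakes can only come from the model's perception, not from the bookkeeping.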

To check the model's perception, I built a visualization of this analysis. This made it easier to confirm whether the model's perception was correct.

Prompt engineering

However, even with this separation, I noticed that the model's perception could still be flawed. It might misjudge whether a character wore glasses, leading to a frustrating, incorrect elimination. To combat this, I experimented with a two-step process: the AI would ask its question and then, after receiving the player's answer, perform a second, fresh analysis with the answer as context. The theory was that a second look might catch errors from the first.

Here's how that flow would have worked:

AI turn (API call 1): The AI asks, "Does your character have a beard?"
Player's turn: The player looks at their secret character, who is clean-shaven, and answers, "No."
AI turn (API call 2): The AI effectively asks itself to look at all of its remaining characters again and determine which ones to eliminate based on the player's answer.

In the second call, the model might still misperceive a character with light stubble as "not having a beard" and fail to eliminate them, even though the user expected it to. The core perception error wasn't fixed, and the extra step just delayed the results. When playing against a human opponent, we can agree on or clarify such a definition. In the current setup with our AI opponent, that isn't possible.

This process added the latency of a second API call without a significant boost in accuracy. If the model was wrong the first time, it was often wrong the second time, too. I reverted to a flow that analyzes the board only once.

Improve instead of adding more analysis

I relied on a UX principle: the solution wasn't more analysis, but better analysis.

I invested heavily in refining the prompt, adding explicit instructions for the model to double-check its work and focus on distinct features, which proved to be a more effective strategy for improving accuracy. Here's how the current, more reliable flow works:

AI turn (API call): The model is prompted to generate both its question and its internal analysis at the same time, returning a single JSON object.
1. Question: "Does your character wear glasses?"
2. Analysis (data):
```
[
  {"character_id": "brad", "has_feature": true},
  {"character_id": "alex", "has_feature": false},
  {"character_id": "gina", "has_feature": true},
  ...
]
```
Player's turn: The player's secret character is Alex (no glasses), so they answer, "No."
Round ends: The application's JavaScript code takes over. It doesn't need to ask the AI anything else. It iterates through the analysis data from the AI turn.
1. The player said "No."
2. The code looks for every character where has_feature is true.
3. It flips down Brad and Gina. The logic is deterministic and instant.
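A single-call turn like this needs a schema that covers both the question and the per-character analysis. The sketch below shows one way to express and validate it. The top-level field names `question` and `analysis`, along with `aiTurnSchema` and `parseAiTurn`, are assumptions. Only `character_id` and `has_feature` appear in the article.

```typescript
interface FeatureAnalysis {
  character_id: string;
  has_feature: boolean;
}

interface AiTurn {
  question: string;
  analysis: FeatureAnalysis[];
}

// Hypothetical JSON Schema for the combined response; it could be passed as
// the responseConstraint option of prompt().
const aiTurnSchema = {
  type: "object",
  properties: {
    question: { type: "string" },
    analysis: {
      type: "array",
      items: {
        type: "object",
        properties: {
          character_id: { type: "string" },
          has_feature: { type: "boolean" },
        },
        required: ["character_id", "has_feature"],
      },
    },
  },
  required: ["question", "analysis"],
};

// Parses the model's raw output and checks its shape before the game uses it.
function parseAiTurn(raw: string): AiTurn {
  const data = JSON.parse(raw) as Partial<AiTurn> | null;
  if (!data) throw new Error("AI turn is empty");
  const { question, analysis } = data;
  if (typeof question !== "string" || !Array.isArray(analysis)) {
    throw new Error("AI turn is missing a question or an analysis array");
  }
  for (const entry of analysis) {
    if (
      typeof entry?.character_id !== "string" ||
      typeof entry?.has_feature !== "boolean"
    ) {
      throw new Error("Malformed analysis entry");
    }
  }
  return { question, analysis };
}
```

Validating again in application code is a belt-and-braces choice: it keeps a malformed response from reaching the elimination logic even if the constraint is missing or ignored.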

This experimentation was crucial, but it required a lot of trial and error. I had no idea whether a change was going to make things better. Sometimes, it made them worse. Determining how to get the most consistent results isn't an exact science (yet, if ever...).

But after a few rounds with my new AI opponent, a fantastic new issue appeared: a stalemate.

Escape deadlock

When only two or three very similar characters remained, the model would get stuck in a loop. It would ask a question about a feature they all shared, such as, "Does your character wear a hat?"

My code would correctly identify this as a wasted turn, and the AI would then try another, equally broad feature that all the characters also shared, such as, "Does your character wear glasses?"

I enhanced the prompt with a new rule: if a question generation attempt fails and there are three or fewer characters left, the strategy changes.

The new instruction is explicit: "Instead of a broad feature, you must ask about a more specific, unique, or combined visual feature to find a difference." For example, instead of asking if the character wears a hat, the model is prompted to ask if they're wearing a baseball cap.

This forces the model to look much more closely at the images to find the one small detail that can finally lead to a breakthrough, making its late-game strategy work a little better, most of the time.
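The wasted-turn check and the strategy switch can be sketched as two small functions. Here a question counts as wasted when every remaining character gets the same has_feature value, which is one plausible reading of the description above. The names `isWastedQuestion` and `retryInstruction` are hypothetical, while the instruction text is the one quoted in the article.

```typescript
interface FeatureAnalysis {
  character_id: string;
  has_feature: boolean;
}

// Instruction quoted in the article for the late-game strategy.
const LATE_GAME_RULE =
  "Instead of a broad feature, you must ask about a more specific, unique, " +
  "or combined visual feature to find a difference.";

// A question is wasted when it cannot split the board: either every
// remaining character has the feature, or none of them does.
function isWastedQuestion(analysis: FeatureAnalysis[]): boolean {
  const withFeature = analysis.filter((entry) => entry.has_feature).length;
  return withFeature === 0 || withFeature === analysis.length;
}

// Returns the extra instruction to append to the prompt for a retry, or an
// empty string when the normal strategy still applies.
function retryInstruction(
  previousAttemptFailed: boolean,
  remainingCount: number,
): string {
  return previousAttemptFailed && remainingCount <= 3 ? LATE_GAME_RULE : "";
}
```

Detecting the dead end in code and only then changing the prompt keeps the broad, board-halving strategy in place for the early game, where it works well.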

Give the model amnesia

One of a language model's greatest strengths is its memory of the conversation. But in this game, that strength became a weakness. When I started a second game, the model would ask confusing or irrelevant questions. Of course: my smart AI opponent was retaining the entire chat history from the previous game. It was trying to make sense of two (or even more) games at once.

Instead of reusing the same AI session, I now explicitly destroy it at the end of each game, essentially giving the AI amnesia.

When you click Play Again, the startNewGameSession() function resets the board and creates a brand new AI session. This was an interesting lesson in managing session state not just in the app, but within the AI model itself.
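A minimal sketch of that session reset is shown below. `createSessionManager` is a hypothetical wrapper, and the session factory is injected so the example runs outside the browser. In a browser that ships the Prompt API, the factory would typically wrap LanguageModel.create(), and destroy() corresponds to the session method that frees the model's context. The board reset that the real startNewGameSession() also performs is left out.

```typescript
// Only the part of a session this sketch needs.
interface DestroyableSession {
  destroy(): void;
}

// Hypothetical manager that guarantees one fresh session per game.
function createSessionManager<S extends DestroyableSession>(
  createSession: () => Promise<S>,
) {
  let current: S | null = null;

  return {
    async startNewGameSession(): Promise<S> {
      // Destroy the old session so no chat history leaks into the new game.
      current?.destroy();
      current = await createSession();
      return current;
    },
  };
}
```

Destroying the session rather than just dropping the reference also releases the resources it holds, so repeated games don't accumulate idle sessions.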

Bells and whistles: Custom games and voice input

To make the experience more engaging, I added two extra features:

Custom characters: With getUserMedia(), players can use their camera to create their own 5-character set. I used IndexedDB to save the characters, a browser database well suited to storing binary data like image blobs. When you create a custom set, it's saved to your browser, and a replay option appears in the main menu.
Note: The model will likely have more trouble interpreting user-generated photos, because they may have varied lighting, backgrounds, and angles. The default photos are consistent, studio-quality portraits.
Voice input: The client-side model is multimodal. It can handle text, images, and also audio. Using the MediaRecorder API to capture microphone input, I could feed the resulting audio blob to the model with a prompt: "Transcribe the following audio...". This adds a fun way to play (and a fun way to see how it interprets my Flemish accent). I created this mostly to show the versatility of this new web capability, but truth be told, I was sick of typing questions over and over again.
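For the voice input, the request might be shaped like the sketch below. It assumes the Prompt API's multimodal message format (a user message whose content mixes text and audio parts) and a session created with audio declared as an expected input. The `transcribe` helper and the interfaces are hypothetical, the audio value is typed loosely so the example runs outside a browser, and the prompt string is kept exactly as truncated in the article.

```typescript
// One part of a multimodal message. In the browser the audio value would be
// the Blob produced by MediaRecorder; it is typed loosely here on purpose.
type ContentPart =
  | { type: "text"; value: string }
  | { type: "audio"; value: unknown };

interface UserMessage {
  role: "user";
  content: ContentPart[];
}

// Only the part of a session this sketch needs.
interface MultimodalSession {
  prompt(messages: UserMessage[]): Promise<string>;
}

// Hypothetical helper: sends the recording to the model and returns the text.
async function transcribe(
  session: MultimodalSession,
  audio: unknown,
): Promise<string> {
  const transcript = await session.prompt([
    {
      role: "user",
      content: [
        // Prompt text as quoted (and truncated) in the article.
        { type: "text", value: "Transcribe the following audio..." },
        { type: "audio", value: audio },
      ],
    },
  ]);
  // Trim surrounding whitespace so the text can go straight into the chat.
  return transcript.trim();
}
```

The transcript can then be treated like any typed question, so the rest of the game logic doesn't need to know whether the player spoke or typed.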

Final thoughts

Building "AI Guess Who?" was definitely a challenge. But with a bit of help from reading docs and some AI to debug AI (yeah... I did that), it turned out to be a fun experiment. It highlighted the immense potential of running a model in the browser to create a private, fast, no-internet-required experience. This is still an experiment, and sometimes the opponent just doesn't play perfectly. It's not pixel-perfect or logic-perfect. With generative AI, the results are model-dependent.

Instead of striving for perfection, I'll aim for improving the outcome.

This project also underscored the constant challenges of prompt engineering. Prompting became a huge part of the work, and not always the most fun part. But the most critical lesson I learned was to architect the application so that perception is separated from deduction, dividing the work between what AI does well and what code does well. Even with that separation, I found that the AI could still make mistakes that are obvious to a human, like confusing tattoos with make-up or losing track of whose secret character was being discussed.

Each time, the solution was to make the prompts even more explicit, adding instructions that feel obvious to a human but are essential for the model.

Sometimes, the game felt unfair. Occasionally, I felt like the AI "knew" the secret character ahead of time, even though the code never explicitly shared that information. This shows a crucial part of human versus machine:

An AI's behavior doesn't just need to be correct; it needs to feel fair.

This is why I updated the prompts with blunt instructions, such as, "You do NOT know which character I have picked," and "No cheating." I learned that when building AI agents, you should spend time defining limitations, probably even more than instructions.

The interaction with the model could still be improved. By working with a built-in model, you lose some of the power and reliability of a massive server-side model, but you gain privacy, speed, and offline capability. For a game like this, that tradeoff was really worth experimenting with. Client-side AI is getting better by the day, models are getting smaller as well, and I can't wait to see what we'll be able to build next.


Brecht De Ruyte

تاریخ انتشار: 10 اکتبر 2025

کودکان در سن مدرسه که در سال 2014 بازی حدس بزن چه کسی را بازی می کنند.

پایه بازی: یک برنامه React

وضعیت بازی : شماره‌ای که مرحله فعلی بازی را ردیابی می‌کند (مانند PLAYER_TURN_ASKING ، AI_TURN ، GAME_OVER ). این مهم‌ترین حالت است، زیرا تعیین می‌کند که رابط چه چیزی را نمایش دهد و چه اقداماتی در دسترس بازیکن باشد.
لیست شخصیت‌ها : لیست‌های متعددی وجود دارد که شخصیت‌های فعال، شخصیت مخفی هر بازیکن و شخصیت‌هایی که از صفحه حذف شده‌اند را مشخص می‌کند.
Game chat : A running log of questions, answers, and system messages.

The interface is broken down into logical components:

GameBoard displays the grid of characters and chat controls to handle all user input.

As the game's features grew, so did its complexity. Initially, the entire game's logic was managed within a single, large custom React hook , useGameLogic , but it quickly became too large to navigate and debug. To improve maintainability, I refactored this hook into multiple hooks, each with a single responsibility. به عنوان مثال:

useGameState manages the core state
usePlayerActions is for the player's turn
useAIActions is for the AI's logic

The main useGameLogic hook now acts as a clean composer, placing these smaller hooks together. This architectural change didn't alter the game's functionality, but it made the codebase a whole lot cleaner.

Game logic with the Prompt API

The core of this project is the use of the Prompt API.

I added the AI game logic to builtInAIService.ts . These are its key responsibilities:

Allow restrictive, binary answers.
Teach the model game strategy.
Teach the model analysis.
Give the model amnesia.

Allow restrictive, binary answers

How does the player interact with the AI? When a player asks, "Does your character have a hat?", the AI needs to "look" at its secret character's image and give a clear answer.

My first attempts were a mess. The response was conversational: "No, the character I'm thinking of, Isabella, does not appear to be wearing a hat," instead of offering a binary yes or no. Initially, I solved this with a very strict prompt, essentially dictating to the model to only respond with "Yes" or "No".

While this worked, I learned of an even better way using structured output . By providing the JSON Schema to the model, I could guarantee a true or false response.

const schema = { type: "boolean" };
const result = session.prompt(prompt, { responseConstraint: schema });

This allowed me to simplify the prompt and let my code reliably handle the response:

JSON.parse(result) ? "Yes" : "No"

Teach the model game strategy

Telling the model to answer a question is much simpler than having the model initiate and ask questions. A good Guess Who? player doesn't ask random questions. They ask questions that eliminate the most characters at once. An ideal question reduces the possible remaining characters in half using binary questions.

How do you teach a model that strategy? Again, prompt engineering. The prompt for generateAIQuestion() is actually a concise lesson in Guess Who? نظریه بازی

Initially, I asked the model to "ask a good question." The results were unpredictable. To improve the results, I added negative constraints. The prompt now includes instructions similar to:

"CRITICAL: Ask about existing features ONLY"
"CRITICAL: Be original. Do NOT repeat a question".

These constraints narrow the model's focus, prevent it from asking irrelevant questions, which make it a much more enjoyable opponent. You can review the full prompt file on GitHub .

Teach the model analysis

This was, by far, the most difficult and important challenge. When the model asks a question, such as, "Does your character have a hat," and the player responds no, how does the model know what characters on their board are eliminated?

The model should eliminate everyone with a hat. My early attempts were plagued with logical errors, and sometimes the model eliminated the wrong characters or no characters. Also, what is a "hat"? Does a "beanie" count as a "hat"? This is, let's be honest, also something that can happen in a human debate. And of course, general mistakes happen. Hair can look like a hat from an AI perspective.

I redesigned the architecture to separate perception from code deduction:

AI is responsible for visual analysis . Models excel at visual analysis. I instructed the model to return its question and a detailed analysis in a strict JSON schema. The model analyzes each character on its board and answers the question, "Does this character have this feature?" The model returns a structured JSON object:
```
{ "character_id": "...", "has_feature": true }
```
Once again, structured data is key to a successful outcome.
Game code uses the analysis to make the final decision . The application code checks the player's answer ("Yes" or "No") and iterates through the AI's analysis. If the player said "No," the code knows to eliminate every character where has_feature is true .

I found this division of labor is key to building reliable AI applications. Use the AI for its analytic capabilities, and leave binary decisions to your application code.

To check the model's perception, I built a visualization of this analysis. This made it easier to confirm if the model's perception was correct.

مهندسی سریع

However, even with this separation, I noticed the model's perception could still be flawed. It might misjudge whether a character wore glasses, leading to a frustrating, incorrect elimination. To combat this, I experimented with a two-step process: the AI would ask its question. After receiving the player's answer, it would perform a second, fresh analysis with the answer as context. The theory was that a second look might catch errors from the first.

Here's how that flow would have worked:

AI turn (API call 1) : AI asks, "Does your character have a beard?"
Player's turn : The player looks at their secret character, who is clean-shaven, and answers, "No."
AI turn (API call 2) : The AI effectively asks itself to look at all of its remaining characters, again, and determine which ones to eliminate based on the player's answer.

In step two, the model might still misperceive a character with a light stubble as "not having a beard" and fail to eliminate them, even though the user expected it to. The core perception error wasn't fixed, and the extra step just delayed the results. When playing against a human opponent, we can specify an agreement or clarification on this; in the current setup with our AI opponent, this isn't the case.

This process added latency from a second API call, without gaining a significant boost in accuracy. If the model was wrong the first time, it was often wrong the second time, too. I reverted the prompt to review just once.

Improve instead of adding more analysis

I relied on a UX principle: The solution wasn't more analysis, but better analysis.

I invested heavily in refining the prompt, adding explicit instructions for the model to double-check its work and focus on distinct features, which proved to be a more effective strategy for improving accuracy. Here's how the current, more reliable flow works:

AI turn (API call) : The model is prompted to generate both its question and its internal analysis at the same time, returning a single JSON object.
1. Question : "Does your character wear glasses?"
2. Analysis (data) :
```
[
  {character_id: 'brad', has_feature: true},
  {character_id: 'alex', has_feature: false},
  {character_id: 'gina', has_feature: true},
  ...
]
```
Player's turn : The player's secret character is Alex (no glasses), so they answer, "No."
Round ends : The application's JavaScript code takes over. It doesn't need to ask the AI anything else. It iterates through the analysis data from step 1.
1. The player said "No."
2. The code looks for every character where has_feature is true.
3. It flips down Brad and Gina. The logic is deterministic and instant.

This experimentation was crucial, but required a lot of trial and error. I had no idea if it was going to get better. Sometimes, it got even worse. Determining how to get the most consistent results isn't an exact science (yet, if ever...).

But after a few rounds with my new AI opponent, a fantastic new issue appeared: a stalemate.

Escape deadlock

When only two or three very similar characters remained, the model would get stuck in a loop. It would ask a question about a feature they all shared, such as, "Does your character wear a hat?"

My code would correctly identify this as a wasted turn, and the AI would try another, equally broad feature the characters also all shared, such as, "Does your character wear glasses?"

I enhanced the prompt with a new rule: if a question generation attempt fails and there are three or fewer characters left, the strategy changes.

The new instruction is explicit: "Instead of a broad feature, you must ask about a more specific, unique, or combined visual feature to find a difference." For example, instead of asking if the character wears a hat, it's prompted to ask if they're wearing a baseball cap.

This forces the model to look much closer at the images to find the one small detail that can finally lead to a breakthrough, making its late-game strategy work a little better, most of the time.

Give the model amnesia

A language model's greatest strength is its memory. But in this game, its greatest strength became a weakness. When I started a second game, it would ask confusing or irrelevant questions. Of course, my smart AI opponent was retaining the entire chat history from the previous game. It was trying to make sense of two (or even more) games at once.

Instead of reusing the same AI session, I now explicitly destroy it at the end of each game, essentially giving the AI amnesia.

When you click Play Again , the startNewGameSession() function resets the board and creates a brand new AI session. This was an interesting lesson in managing session state not just in the app, but within the AI model itself.

Bells and whistles: Custom games and voice input

To make the experience more engaging, I added two extra features:

Custom characters : With getUserMedia() , players can use their camera to create their own 5-character set. I used IndexedDB to save the characters, a browser database perfect for storing binary data like image blobs. When you create a custom set, it's saved to your browser, and a replay option appears in the main menu.
Note: The model will likely have more challenges interpreting user-generated photos, as they may have varied lighting, backgrounds, and angles. The default photos are consistent, studio-quality portraits.
Voice input : The client-side model is multi-modal . It can handle text, images, and also audio. Using the MediaRecorder API to capture microphone input, I could feed the resulting audio blob to the model with a prompt: "Transcribe the following audio...". This adds a fun way to play (and a fun way to see how it interprets my Flemish accent). I created this mostly to show the versatility of this new web capability, but truth be told, I was sick of typing questions over and over again.

افکار نهایی

Building "AI Guess Who?" was definitely a challenge. But with a bit of help from reading docs and some AI to debug AI (yeah... I did that), it turned out to be a fun experiment. It highlighted the immense potential of running a model in the browser for creating a private, fast, no-internet-required experience. This is still an experiment, and sometimes the opponent just doesn't play perfectly. It's not pixel-perfect or logic-perfect. With generative AI, the results are model-dependent.

Instead of striving for perfection, I'll aim for improving the outcome.

This project also underscored the constant challenges of prompt engineering. That prompting really became a huge part of it, and not always the most fun part. But the most critical lesson I learned was architecting the application to separate perception from deduction, dividing capabilities of AI and code. Even with that separation, I found that the AI could still make (to a human) obvious mistakes, like confusing tattoos for make-up or losing track of whose secret character was being discussed.

Each time, the solution was to make the prompts even more explicit, adding instructions that feel obvious to a human but are essential for the model.

Sometimes, the game felt unfair. Occasionally, I felt like the AI "knew" the secret character ahead of time, even though the code never explicitly shared that information. This shows a crucial part of human versus machine:

An AI's behavior doesn't just need to be correct; it needs to feel fair.

This is why I updated the prompts with blunt instructions, such as, "You do NOT know which character I have picked," and "No cheating." I learned that when building AI agents, you should spend time defining limitations, probably even more than instructions.

The interaction with the model could still be improved. By working with a built-in model, you lose some of the power and reliability of a massive server-side model, but you gain privacy, speed, and offline capability. For a game like this, that tradeoff was really worth experimenting with. Client-side AI is getting better by the day, models are getting smaller, and I can't wait to see what we'll be able to build next.

Brecht De Ruyte

Published: October 10, 2025

School age children playing the game Guess Who in 2014.

The classic board game, Guess Who?, is a masterclass in deductive reasoning. Each player starts with a board of faces and, through a series of yes or no questions, narrows down the possibilities until they can confidently identify their opponent's secret character.

After seeing a demo of built-in AI at Google I/O Connect, I wondered: what if I could play a Guess Who? game against AI that lives in the browser? With client-side AI, the photos would be interpreted locally, so a custom Guess Who? of friends and family would remain private and secure on my device.

My background is primarily in UI and UX development, and I'm used to building pixel-perfect experiences. I hoped I could do exactly that with my interpretation.

My application, AI Guess Who?, is built with React and uses the Prompt API and a browser built-in model to create a surprisingly capable opponent. In the process, I discovered it's not so simple to get "pixel-perfect" results. Still, this application demonstrates how AI can be used to build thoughtful game logic, and how important prompt engineering is for refining that logic and getting the outcomes you expect.

Keep reading to learn about the built-in AI integration, the challenges I faced, and the solutions I landed on. You can play the game and find the source code on GitHub.

Game foundation: A React app

Before you look at the AI implementation, we'll review the application's structure. I built a standard React application with TypeScript, with a central App.tsx file to act as the game's conductor. This file holds:

Game state: An enum that tracks the current phase of the game (such as PLAYER_TURN_ASKING, AI_TURN, GAME_OVER). This is the most important piece of state, as it dictates what the interface displays and what actions are available to the player.
Character lists: Multiple lists designate the active characters, each player's secret character, and which characters have been eliminated from the board.
Game chat: A running log of questions, answers, and system messages.
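As a rough sketch, the state described above could be modeled like this. Only the three phase names come from the app; the `Game` and `ChatMessage` shapes, their field names, and the `canPlayerAsk` helper are assumptions made for illustration, not the actual contents of App.tsx.

```typescript
// Phase names are taken from the article; the real enum may contain more phases.
enum GameState {
  PLAYER_TURN_ASKING = "PLAYER_TURN_ASKING",
  AI_TURN = "AI_TURN",
  GAME_OVER = "GAME_OVER",
}

// Hypothetical shape for one entry in the running game chat.
interface ChatMessage {
  sender: "player" | "ai" | "system";
  text: string;
}

// Hypothetical container for the three kinds of state listed above.
interface Game {
  state: GameState;
  activeCharacterIds: string[];
  eliminatedCharacterIds: string[];
  playerSecretId: string;
  aiSecretId: string;
  chat: ChatMessage[];
}

// The phase decides which actions are available to the player.
function canPlayerAsk(game: Game): boolean {
  return game.state === GameState.PLAYER_TURN_ASKING;
}
```

Keeping the phase in a single value like this means the interface only has to check one field to decide what to render.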

The game logic is split across custom React hooks:

useGameState manages the core state
usePlayerActions is for the player's turn
useAIActions is for the AI's logic

Game logic with the Prompt API

The core of this project is the use of the Prompt API.

I added the AI game logic to builtInAIService.ts. These are its key responsibilities:

Allow restrictive, binary answers.
Teach the model game strategy.
Teach the model analysis.
Give the model amnesia.

Allow restrictive, binary answers

How does the player interact with the AI? When a player asks, "Does your character have a hat?", the AI needs to "look" at its secret character's image and give a clear answer.

Instructing the model through the prompt alone to answer only "Yes" or "No" worked, but I learned of an even better way: structured output. By passing a JSON Schema to the model, I could constrain the response to true or false.

const schema = { type: "boolean" };
// prompt() returns a Promise that resolves to a JSON string, so await it
const result = await session.prompt(prompt, { responseConstraint: schema });

This allowed me to simplify the prompt and let my code reliably handle the response:

JSON.parse(result) ? "Yes" : "No"
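Put together, the two snippets above could be wrapped in a small helper. This is a minimal sketch: the `PromptSession` interface only describes the part of the session used here, and the `answerYesNo` name and its error handling are assumptions rather than the code in builtInAIService.ts.

```typescript
// Minimal slice of a Prompt API session that this sketch relies on.
interface PromptSession {
  prompt(input: string, options?: { responseConstraint?: object }): Promise<string>;
}

const booleanSchema = { type: "boolean" };

// Hypothetical helper: asks the model a yes/no question and maps the
// constrained JSON output ("true" or "false") to the string shown in the chat.
async function answerYesNo(session: PromptSession, question: string): Promise<"Yes" | "No"> {
  const raw = await session.prompt(question, { responseConstraint: booleanSchema });
  let parsed: unknown;
  try {
    parsed = JSON.parse(raw);
  } catch {
    throw new Error(`Model returned non-JSON output: ${raw}`);
  }
  if (typeof parsed !== "boolean") {
    throw new Error(`Expected a boolean, got: ${raw}`);
  }
  return parsed ? "Yes" : "No";
}
```

The extra type check is defensive: it keeps a malformed response from being silently coerced into "Yes" or "No".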

Teach the model game strategy

How do you teach a model strategy? Again, prompt engineering. The prompt for generateAIQuestion() is actually a concise lesson in Guess Who? game theory.

Initially, I asked the model to "ask a good question." The results were unpredictable. To improve the results, I added negative constraints. The prompt now includes instructions similar to:

"CRITICAL: Ask about existing features ONLY"
"CRITICAL: Be original. Do NOT repeat a question".

These constraints narrow the model's focus and prevent it from asking irrelevant questions, which makes it a much more enjoyable opponent. You can review the full prompt file on GitHub.
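To illustrate how such constraints might be assembled, here is a hypothetical prompt builder. Only the two "CRITICAL" lines are quoted from the article; the function name, its parameters, and the rest of the wording are assumptions and do not reproduce the real prompt file.

```typescript
// Hypothetical prompt builder. It lists the questions already asked so that the
// "do not repeat" constraint has something concrete to refer to.
function buildQuestionPrompt(characterNames: string[], askedQuestions: string[]): string {
  const history =
    askedQuestions.length > 0
      ? askedQuestions.map((question) => `- ${question}`).join("\n")
      : "- (none yet)";

  return [
    "You are playing Guess Who? and it is your turn to ask one yes or no question.",
    `Characters still on your board: ${characterNames.join(", ")}.`,
    "CRITICAL: Ask about existing features ONLY",
    "CRITICAL: Be original. Do NOT repeat a question",
    "Questions already asked:",
    history,
  ].join("\n");
}
```

For example, `buildQuestionPrompt(["brad", "alex"], ["Does your character wear glasses?"])` produces a prompt that names both characters and lists the earlier question under "Questions already asked:".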

Teach the model analysis

I redesigned the architecture to separate perception from code deduction:

AI is responsible for visual analysis. Models excel at visual analysis. I instructed the model to return its question and a detailed analysis in a strict JSON schema. The model analyzes each character on its board and answers the question, "Does this character have this feature?" The model returns a structured JSON object:
```
{ "character_id": "...", "has_feature": true }
```
Once again, structured data is key to a successful outcome.
Game code uses the analysis to make the final decision. The application code checks the player's answer ("Yes" or "No") and iterates through the AI's analysis. If the player said "No," the code knows to eliminate every character where has_feature is true.

I found this division of labor is key to building reliable AI applications. Use the AI for its analytic capabilities, and leave binary decisions to your application code.
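One way to express this contract is a JSON Schema for the model's output plus a small runtime check in application code. In this sketch, `character_id` and `has_feature` come from the article, while the top-level `question` and `analysis` keys, the schema object, and `parseAITurn` are assumptions for illustration.

```typescript
interface CharacterAnalysis {
  character_id: string;
  has_feature: boolean;
}

// Assumed envelope: the question plus one analysis entry per character.
interface AITurn {
  question: string;
  analysis: CharacterAnalysis[];
}

// A schema like this could be passed as the responseConstraint option.
const aiTurnSchema = {
  type: "object",
  properties: {
    question: { type: "string" },
    analysis: {
      type: "array",
      items: {
        type: "object",
        properties: {
          character_id: { type: "string" },
          has_feature: { type: "boolean" },
        },
        required: ["character_id", "has_feature"],
      },
    },
  },
  required: ["question", "analysis"],
};

// Parses the model output and verifies its shape before game code relies on it.
function parseAITurn(raw: string): AITurn {
  const data: unknown = JSON.parse(raw);
  if (typeof data !== "object" || data === null) {
    throw new Error("Expected a JSON object");
  }
  const { question, analysis } = data as Record<string, unknown>;
  if (typeof question !== "string" || !Array.isArray(analysis)) {
    throw new Error("Missing question or analysis");
  }
  const entries = analysis.map((entry: unknown): CharacterAnalysis => {
    const e = entry as Record<string, unknown> | null;
    if (!e || typeof e.character_id !== "string" || typeof e.has_feature !== "boolean") {
      throw new Error("Malformed analysis entry");
    }
    return { character_id: e.character_id, has_feature: e.has_feature };
  });
  return { question, analysis: entries };
}
```

Validating once at the boundary means the rest of the game code can treat the analysis as plain typed data.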

To check the model's perception, I built a visualization of this analysis. This made it easier to confirm if the model's perception was correct.

Prompt engineering

Here's how the flow would have worked if the model also handled the elimination:

AI turn (API call 1) : AI asks, "Does your character have a beard?"
Player's turn : The player looks at their secret character, who is clean-shaven, and answers, "No."
AI turn (API call 2) : The AI effectively asks itself to look at all of its remaining characters, again, and determine which ones to eliminate based on the player's answer.

Improve instead of adding more analysis

I relied on a UX principle: The solution wasn't more analysis, but better analysis.

AI turn (API call) : The model is prompted to generate both its question and its internal analysis at the same time, returning a single JSON object.
1. Question : "Does your character wear glasses?"
2. Analysis (data) :
```
[
  {character_id: 'brad', has_feature: true},
  {character_id: 'alex', has_feature: false},
  {character_id: 'gina', has_feature: true},
  ...
]
```
Player's turn : The player's secret character is Alex (no glasses), so they answer, "No."
Round ends : The application's JavaScript code takes over. It doesn't need to ask the AI anything else. It iterates through the analysis data from step 1.
1. The player said "No."
2. The code looks for every character where has_feature is true.
3. It flips down Brad and Gina. The logic is deterministic and instant.
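The elimination step described above can be sketched as a pure function. The `charactersToEliminate` name and the `Answer` type are assumptions for illustration; the rule itself follows the article: eliminate every character whose `has_feature` value contradicts the player's answer.

```typescript
interface CharacterAnalysis {
  character_id: string;
  has_feature: boolean;
}

type Answer = "Yes" | "No";

// Returns the ids of the characters to flip down. A character is eliminated
// when its has_feature value does not match what the player answered.
function charactersToEliminate(analysis: CharacterAnalysis[], answer: Answer): string[] {
  const secretHasFeature = answer === "Yes";
  return analysis
    .filter((entry) => entry.has_feature !== secretHasFeature)
    .map((entry) => entry.character_id);
}

// The glasses example from above: the player answers "No".
const glassesAnalysis: CharacterAnalysis[] = [
  { character_id: "brad", has_feature: true },
  { character_id: "alex", has_feature: false },
  { character_id: "gina", has_feature: true },
];
const flippedDown = charactersToEliminate(glassesAnalysis, "No"); // ["brad", "gina"]
```

Because no model call is involved, the result is deterministic and the same inputs always flip down the same characters.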

But after a few rounds with my new AI opponent, a fantastic new issue appeared: a stalemate.

Escape deadlock

When only two or three very similar characters remained, the model would get stuck in a loop. It would ask a question about a feature they all shared, such as, "Does your character wear a hat?"

My code would correctly identify this as a wasted turn, and the AI would try another, equally broad feature the characters also all shared, such as, "Does your character wear glasses?"

I enhanced the prompt with a new rule: if a question generation attempt fails and there are three or fewer characters left, the strategy changes.

This forces the model to look much closer at the images to find the one small detail that can finally lead to a breakthrough, making its late-game strategy work a little better, most of the time.
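A sketch of how the wasted-turn check and the strategy switch might look in code. The function names and the strategy labels are hypothetical; only the rule itself (a failed attempt with three or fewer characters left changes the strategy) comes from the article.

```typescript
interface CharacterAnalysis {
  character_id: string;
  has_feature: boolean;
}

// A question is wasted when every remaining character gets the same answer,
// because neither "Yes" nor "No" would separate any of them.
function isWastedQuestion(analysis: CharacterAnalysis[]): boolean {
  const distinctAnswers = new Set(analysis.map((entry) => entry.has_feature));
  return distinctAnswers.size < 2;
}

// Hypothetical labels for the two prompting modes.
type QuestionStrategy = "broad" | "fine-detail";

// After a failed attempt with three or fewer characters left, switch strategy.
function pickStrategy(remainingCount: number, previousAttemptFailed: boolean): QuestionStrategy {
  return previousAttemptFailed && remainingCount <= 3 ? "fine-detail" : "broad";
}
```

The application could then select a different prompt based on the returned strategy before retrying the question generation.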

Give the model amnesia

Instead of reusing the same AI session, I now explicitly destroy it at the end of each game, essentially giving the AI amnesia.
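The lifecycle could be sketched as follows. `GameSessionManager` and its factory parameter are hypothetical names; the sketch assumes only that a session exposes a `destroy()` method, as Prompt API sessions do.

```typescript
// The only capability this sketch needs from a session.
interface DestroyableSession {
  destroy(): void;
}

// Hypothetical wrapper: one session per game, destroyed when the game ends.
class GameSessionManager<S extends DestroyableSession> {
  private sessionPromise: Promise<S> | null = null;
  private readonly createSession: () => Promise<S>;

  constructor(createSession: () => Promise<S>) {
    this.createSession = createSession;
  }

  // Lazily creates the session on first use and reuses it within the game.
  getSession(): Promise<S> {
    if (this.sessionPromise === null) {
      this.sessionPromise = this.createSession();
    }
    return this.sessionPromise;
  }

  // Destroys the session so that nothing from this game carries over.
  async endGame(): Promise<void> {
    const pending = this.sessionPromise;
    this.sessionPromise = null;
    if (pending !== null) {
      (await pending).destroy();
    }
  }
}
```

In the browser, the factory would presumably be something like `() => LanguageModel.create()`. Calling `endGame()` when the game is over guarantees that the next game starts with a fresh session and no leftover context.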

Bells and whistles: Custom games and voice input

To make the experience more engaging, I added two extra features:

Custom characters : With getUserMedia() , players can use their camera to create their own 5-character set. I used IndexedDB to save the characters, a browser database perfect for storing binary data like image blobs. When you create a custom set, it's saved to your browser, and a replay option appears in the main menu.
Note: The model will likely have more challenges interpreting user-generated photos, as they may have varied lighting, backgrounds, and angles. The default photos are consistent, studio-quality portraits.
Voice input: The client-side model is multimodal. It can handle text, images, and also audio. Using the MediaRecorder API to capture microphone input, I could feed the resulting audio blob to the model with a prompt: "Transcribe the following audio...". This adds a fun way to play (and a fun way to see how it interprets my Flemish accent). I created this mostly to show the versatility of this new web capability, but truth be told, I was sick of typing questions over and over again.
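As a minimal sketch of the voice input step, the recorded audio could be sent to the model as part of a multimodal message. The message shape below (a `user` message whose content mixes `text` and `audio` parts) reflects my understanding of the Prompt API's multimodal input and should be treated as an assumption, as should the `transcribeQuestion` name. The instruction string is quoted as it appears in the article, where the rest of it is elided.

```typescript
// Assumed message shape for multimodal input; verify against the current API docs.
interface TextPart {
  type: "text";
  value: string;
}
interface AudioPart {
  type: "audio";
  value: unknown; // in the browser, for example the Blob produced by MediaRecorder
}
interface UserMessage {
  role: "user";
  content: Array<TextPart | AudioPart>;
}

// Minimal slice of a session that accepts multimodal messages.
interface MultimodalSession {
  prompt(messages: UserMessage[]): Promise<string>;
}

// The article elides the remainder of this instruction, so it is left as quoted.
const TRANSCRIBE_INSTRUCTION = "Transcribe the following audio...";

// Hypothetical helper: turns a recorded clip into the text of a question.
async function transcribeQuestion(session: MultimodalSession, audio: unknown): Promise<string> {
  const transcript = await session.prompt([
    {
      role: "user",
      content: [
        { type: "text", value: TRANSCRIBE_INSTRUCTION },
        { type: "audio", value: audio },
      ],
    },
  ]);
  return transcript.trim();
}
```

The transcript can then be handled exactly like a typed question, so the rest of the game flow stays unchanged.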
