Contact emails

[email protected], [email protected], [email protected],
[email protected]

Explainer

https://github.com/webmachinelearning/prompt-api/blob/main/README.md

Specification

http://webmachinelearning.github.io/prompt-api

Summary

The Prompt API gives web developers direct access to a browser-provided
on-device AI language model. The API design offers fine-grained control,
aligned with cloud API shapes, for progressively enhancing sites with model
interactions tailored to individualized use cases. This complements
task-based language model APIs (e.g. Summarizer API), and varied APIs and
frameworks for generalized on-device inference with developer-supplied ML
models. The initial implementation supports text, image, and audio inputs,
as well as response constraints that ensure generated text conforms with
predefined regex and JSON schema formats.

This supports a variety of use cases, from generating image captions and
performing visual searches to transcribing audio, classifying sound events,
generating text following specific instructions, and extracting information
or insights from multimodal source material.
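
As a rough illustration of the multimodal inputs and response constraints
described above, the sketch below asks for an image caption and constrains
the reply to a JSON schema. The helper name captionImage, the schema, and the
prompt text are hypothetical. The option names (expectedInputs,
responseConstraint) and the message shape follow the explainer as we read it,
and may differ in a given browser release.

```javascript
// Hypothetical JSON schema: the reply must be an object with a caption
// string and a list of tag strings.
const captionSchema = {
  type: 'object',
  properties: {
    caption: { type: 'string' },
    tags: { type: 'array', items: { type: 'string' } },
  },
  required: ['caption', 'tags'],
  additionalProperties: false,
};

// `languageModel` is normally the global `LanguageModel`; it is a parameter
// here so the helper can also be exercised against a stub. `image` is
// assumed to be a Blob or another image source the implementation accepts.
async function captionImage(languageModel, image) {
  const session = await languageModel.create({
    // Declare up front that this session will receive image input.
    expectedInputs: [{ type: 'text', languages: ['en'] }, { type: 'image' }],
  });
  try {
    const reply = await session.prompt(
      [
        {
          role: 'user',
          content: [
            { type: 'text', value: 'Describe this image for alt text.' },
            { type: 'image', value: image },
          ],
        },
      ],
      { responseConstraint: captionSchema },
    );
    // The constraint is intended to make the reply parseable JSON.
    return JSON.parse(reply);
  } finally {
    // Release the session's resources whether or not the prompt succeeded.
    session.destroy();
  }
}
```

A page would call this as captionImage(LanguageModel, someBlob) after
confirming that the API is present and the model is available.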

This API has already been shipped in Chrome Extensions; this intent tracks
the shipping on the web. An enterprise policy
GenAILocalFoundationalModelSettings
<https://chromeenterprise.google/policies/#GenAILocalFoundationalModelSettings>
is available to disable the underlying model downloading, which would
render this API unavailable. Enterprise admins can also set the
BuiltInAIAPIsEnabled policy to block Built-In AI API usage, while still
permitting other on-device GenAI features.

Language support log:

   - Chrome M139 and earlier only supported English ('en')
   - Chrome M140 added support for Spanish and Japanese ('es' and 'ja')
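
Because language support varies by milestone, a site may want to probe it at
runtime instead of hard-coding a list. The sketch below is one way to do
that, assuming availability() accepts expectedInputs/expectedOutputs entries
with a languages list and resolves to a status string such as 'unavailable'.
The helper name supportedPromptLanguages is hypothetical.

```javascript
// Returns the subset of `languages` for which the model reports anything
// other than 'unavailable'. `languageModel` is normally the global
// `LanguageModel`; it is a parameter so a stub can stand in for it.
async function supportedPromptLanguages(languageModel, languages) {
  const supported = [];
  for (const lang of languages) {
    const status = await languageModel.availability({
      expectedInputs: [{ type: 'text', languages: [lang] }],
      expectedOutputs: [{ type: 'text', languages: [lang] }],
    });
    // Any other status is treated as "supported, possibly after a download".
    if (status !== 'unavailable') supported.push(lang);
  }
  return supported;
}
```

Usage might look like
await supportedPromptLanguages(LanguageModel, ['en', 'es', 'ja']), with the
result depending on the browser version and the device.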



Blink component

Blink > AI > Prompt
<https://issues.chromium.org/issues?q=customfield1222907:%22Blink%20%3E%20AI%20%3E%20Prompt%22>

Web Feature ID

https://github.com/web-platform-dx/web-features/issues/3530

Motivation

Direct access to a language model can help web developers accomplish tasks
beyond those with dedicated APIs (e.g. Summarizer API), and tailor their
usage for site-specific requirements. Compared to the low-level API
approach (e.g. a custom AI model run via WebGPU, WASM, or WebNN), using the
built-in language model can save the user's bandwidth and disk space, and
has a lower barrier to entry. The design offers simple shorthands for
common patterns (e.g. await session.prompt('write a haiku')), and supports
more complex use cases for handling structured content sequences, streaming
responses, availability checks, session management, and response
constraints.
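
To make the streaming case concrete, here is a minimal sketch that consumes
a streamed reply chunk by chunk. It assumes promptStreaming() returns an
async-iterable stream of text chunks and that each chunk is a delta, not the
cumulative text. The helper name streamReply and the onChunk callback are
hypothetical.

```javascript
// Streams a reply, reports each chunk as it arrives, and returns the full
// text at the end. `onChunk` is a hypothetical callback, e.g. for UI updates.
async function streamReply(session, text, onChunk = () => {}) {
  let full = '';
  // Assumption: the returned stream can be consumed with `for await`.
  for await (const chunk of session.promptStreaming(text)) {
    full += chunk;
    onChunk(chunk);
  }
  return full;
}
```

A caller that only needs the final text can keep using the shorthand
await session.prompt(text) instead.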

Initial public proposal

https://github.com/webmachinelearning/charter/pull/9

Search tags

LanguageModel <https://chromestatus.com/features#tags:LanguageModel>, Language
Model <https://chromestatus.com/features#tags:Language%20Model>, Prompt
API, Built-in AI

TAG review

https://github.com/w3ctag/design-reviews/issues/1093

TAG review status

Issues addressed

WebFeature UseCounter name

kLanguageModel_Create

Risks

Interoperability and Compatibility

The Prompt API is designed to provide a stable and interoperable surface
for language model interactions, acknowledging the inherent diversity and
non-deterministic nature of underlying models. Variance in behaviors and
responses is a well-understood expectation among developers employing
this technology, and this API aims to provide an interoperable framework
for consistent web platform access across browsers and models.

The Prompt API specifically aims to maximize compatibility by:

- Codifying an interoperable API surface for generalized language model
interactions, so developers can write code that works across different
browser engines and models. This surface has demonstrated compatibility
with models from Google and Microsoft, and has been polyfilled by extensions
and JS frameworks using different backends.

- Enforcing objective response conformance with constraints that ensure
output adheres to known JSON schemas or regexes for interoperable
processing of generated text.

- Supporting progressive enhancement patterns, by offering availability
signals that encapsulate device and model support dimensions, and encourage
developers to consider this API as one option among varied compatible AI
offerings, including developer-supplied models and cloud-based services.
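
The progressive enhancement pattern in the last bullet can be sketched as
follows. This is an illustration under stated assumptions, not a recommended
implementation: cloudFallback is a hypothetical developer-supplied function
(a cloud endpoint or a bundled model), and the sketch only uses the built-in
model when availability() reports 'available'.

```javascript
// Tries the built-in model first and otherwise falls back to a
// developer-supplied function. `cloudFallback` is hypothetical and is not
// part of the Prompt API. `scope` defaults to the global object and exists
// so the function can be exercised without a browser.
async function promptWithFallback(text, cloudFallback, scope = globalThis) {
  const languageModel = scope.LanguageModel;
  if (languageModel) {
    const status = await languageModel.availability();
    // Other statuses (e.g. 'downloadable') would require a model download
    // first, so this sketch treats them as "not ready" and falls back.
    if (status === 'available') {
      const session = await languageModel.create();
      try {
        return await session.prompt(text);
      } finally {
        session.destroy();
      }
    }
  }
  return cloudFallback(text);
}
```

A site could instead choose to trigger the download and wait for it; the
fallback-first choice here simply keeps the first response fast.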

Shipping this API provides a critical opportunity to broaden real-world
implementation experience, explore future refinements, and collaborate with
the web community on interoperable model diversity within a robust,
predictable platform surface.

Gecko: Negative (https://github.com/mozilla/standards-positions/issues/1213)

WebKit: No signal (https://github.com/WebKit/standards-positions/issues/495)

Web developers: Strongly positive (
https://github.com/webmachinelearning/prompt-api/blob/main/README.md#stakeholder-feedback
)

Other signals: Microsoft Edge developers have been strong collaborators,
with notable contributions including structured output and experimental
tool use enhancements. Edge will be shipping this API using a different
underlying model.

Ergonomics

The API has deprecated parameters and renamed identifiers, while retaining
legacy access in previously launched extension contexts. We plan to align the web
and extension surfaces through careful additive changes and cautious
deprecation processes. Developers are encouraged to use the new identifier
names in both contexts and observe deprecation messages regarding planned
API alignments.
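
Code that must run in both contexts during the transition could read the
renamed attributes defensively. The sketch below prefers the new names
implied by the use counters in this message (contextUsage, contextWindow).
The legacy names inputUsage and inputQuota are an assumption based on earlier
extension documentation and should be checked against the deprecation
messages. The helper name readContextUsage is hypothetical.

```javascript
// Reads context usage from a session, preferring the newer identifier names
// and falling back to the assumed legacy names (inputUsage, inputQuota).
function readContextUsage(session) {
  // `??` only falls back on null/undefined, so a legitimate 0 is preserved.
  const used = session.contextUsage ?? session.inputUsage;
  const total = session.contextWindow ?? session.inputQuota;
  return { used, total, remaining: total - used };
}
```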

Activation

This feature would definitely benefit from having polyfills
<https://www.npmjs.com/package/prompt-api-polyfill>, backed by any of:
cloud services, lazily-loaded client-side models using WebGPU/WASM/WebNN,
or the web developer's own server. We anticipate seeing an ecosystem of
polyfills and client frameworks grow as more developers experiment with
this API.

WebView application risks

Does this intent deprecate or change behavior of existing APIs, such that
it has potentially high risk for Android WebView-based applications?

Not Applicable; this API is not available in WebView.


Debuggability

The API surface supports basic DevTools debugging. Perfetto tracing (via
optimization_guide and other events) is useful, and internal debugging
pages that give more detail on the model's status (e.g.
chrome://on-device-internals) might be suitable to port into DevTools. The
team is maintaining extensions
<https://chromewebstore.google.com/detail/webai-extension/lmjgpcigjcffnphimblhcoccjfefamcp>
of DevTools panels for improving debuggability. It is possible that giving
more insight into the nondeterministic states of the model, e.g. random
seeds, could help with debugging.

Will this feature be supported on all six Blink platforms (Windows, Mac,
Linux, ChromeOS, Android, and Android WebView)?

No

The initial launch focuses on Windows, Mac, Linux, and ChromeOS (on Chromebook
Plus <https://www.google.com/chromebook/chromebookplus/> devices). An
implementation for Android using that platform’s OS-level built-in language
model is being prototyped and will ship after the initial launch.

Is this feature fully tested by web-platform-tests
<https://chromium.googlesource.com/chromium/src/+/main/docs/testing/web_platform_tests.md>
?

No

Web platform tests cover the API surface adequately:
https://wpt.fyi/results/ai/language-model

These tests attempt to mitigate differences between execution environments,
e.g. stub/full implementations
(content_shell, chrome), and device/model states (unavailable,
downloadable, downloaded). The core responses of real models can be
unpredictable (especially without sampling parameters) and may cause
inconsistent test results, but some facets are more readily testable, e.g.
the adherence to structured output response constraints. Test coverage and
reliability improvements are ongoing, including planning for WebDriver
extensions.
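
As an example of a facet that is objectively checkable, the sketch below
prompts with a regular-expression constraint and verifies that the whole
reply matches it. It assumes responseConstraint accepts a RegExp, as the
summary suggests. The helper name checkRegexConformance is hypothetical and
this is not taken from the web-platform-tests themselves.

```javascript
// Prompts with a regular-expression constraint and checks that the reply
// fully matches it. Returns the reply and whether it conformed.
async function checkRegexConformance(session, text, pattern) {
  const reply = await session.prompt(text, { responseConstraint: pattern });
  // Anchor the pattern so the whole reply, not just a substring, must match.
  const anchored = new RegExp(`^(?:${pattern.source})$`, pattern.flags);
  return { reply, conforms: anchored.test(reply) };
}
```

For instance, checkRegexConformance(session, 'Is the sky blue?', /yes|no/)
lets a test assert on conforms without depending on which answer the model
gives.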

DevTrial instructions

https://developer.chrome.com/docs/ai/prompt-api

Flag name on about://flags

prompt-api-for-gemini-nano-multimodal-input

Finch feature name

AIPromptAPIMultimodalInput

Rollout plan

Will ship enabled for all users

Requires code in //chrome?

True

Tracking bug

https://crbug.com/417526788

Launch bug

https://launch.corp.google.com/launch/4461863

Measurement

The API has use counters for all methods and attributes, e.g.:
LanguageModel_Create, LanguageModel_Availability, LanguageModel_Prompt,
LanguageModel_PromptStreaming, LanguageModel_Append,
LanguageModel_MeasureContextUsage, LanguageModel_OnContextOverflow,
LanguageModel_ContextUsage, LanguageModel_ContextWindow, LanguageModel_Clone,
LanguageModel_Destroy
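
Several of the counted members relate to session and context management. The
sketch below shows how they might fit together, assuming the JavaScript names
are the camelCase forms of the counters above (measureContextUsage,
contextUsage, contextWindow, clone, destroy). The helper names promptIfFits
and promptInFork are hypothetical.

```javascript
// Sends `text` only if it is expected to fit in the remaining context
// window; otherwise returns null so the caller can shorten the input or
// start a new session.
async function promptIfFits(session, text) {
  const needed = await session.measureContextUsage(text);
  const remaining = session.contextWindow - session.contextUsage;
  if (needed > remaining) return null;
  return session.prompt(text);
}

// Forks a session so a side conversation does not affect the original, and
// releases the fork afterwards.
async function promptInFork(session, text) {
  const fork = await session.clone();
  try {
    return await fork.prompt(text);
  } finally {
    fork.destroy();
  }
}
```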

Non-OSS dependencies

Does the feature depend on any code or APIs outside the Chromium open
source repository and its open-source dependencies to function?

Yes: this feature depends on a language model, which is bridged to the
open-source parts of the implementation via the interfaces in
//services/on_device_model.

Estimated milestones

Shipping on desktop

148

Origin trial desktop first

139

Origin trial desktop last

144

Origin trial extension 1 end milestone

147

DevTrial on desktop

137


Anticipated spec changes

Open questions about a feature may be a source of future web compat or
interop issues. Please list open issues (e.g. links to known github issues
in the project for the feature specification) whose resolution may
introduce web compat/interop risk (e.g., changing the naming or structure of
the API in a non-backward-compatible way).

Params may be re-added after addressing interop concerns:
https://github.com/webmachinelearning/prompt-api/issues/170

Identifiers have been renamed for clarity before Web GA launch:
https://github.com/webmachinelearning/prompt-api/issues/177

Any post-launch additive changes should be backwards compatible: e.g. tool
use, multimodal sampling info/options and outputs, session history access,
model info/options, etc.

Link to entry on the Chrome Platform Status

https://chromestatus.com/feature/5134603979063296?gate=5123192519393280

Links to previous Intent discussions

Intent to Prototype:
https://groups.google.com/a/chromium.org/d/msgid/blink-dev/CAM0wra_LXU8KkcVJ0x%3DzYa4h_sC3FaHGdaoM59FNwwtRAsOALQ%40mail.gmail.com

Intent to Experiment:
https://groups.google.com/a/chromium.org/d/msgid/blink-dev/CAM0wra9oT0jygAYT00WPp0_wtZ-znrB2OdZ6GQb%2B3thFLP19pA%40mail.gmail.com

Intent to Extend Experiment 1:
https://groups.google.com/a/chromium.org/d/msgid/blink-dev/CAJcT_ZhyheBntZHMEwFJA%3DuhpkWmDx8yFieL5E5g%2Bwp5UA0mzQ%40mail.gmail.com

This intent message was generated by Chrome Platform Status
<https://chromestatus.com/>.
