This is an automated email from the ASF dual-hosted git repository.
wusheng pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/skywalking.git
The following commit(s) were added to refs/heads/master by this push:
new 5d85222baf Add SWIP-10: Support Envoy AI Gateway Observability (#13757)
5d85222baf is described below
commit 5d85222bafb35565c2d19b1276f0cc14d6968e18
Author: 吴晟 Wu Sheng <[email protected]>
AuthorDate: Thu Mar 26 09:50:56 2026 +0800
Add SWIP-10: Support Envoy AI Gateway Observability (#13757)
---
docs/en/swip/SWIP-10/SWIP.md | 767 ++++++++++++++++++++++++++
docs/en/swip/SWIP-10/kind-test-resources.yaml | 247 +++++++++
docs/en/swip/SWIP-10/kind-test-setup.sh | 108 ++++
docs/en/swip/readme.md | 3 +-
4 files changed, 1124 insertions(+), 1 deletion(-)
diff --git a/docs/en/swip/SWIP-10/SWIP.md b/docs/en/swip/SWIP-10/SWIP.md
new file mode 100644
index 0000000000..1910b140e6
--- /dev/null
+++ b/docs/en/swip/SWIP-10/SWIP.md
@@ -0,0 +1,767 @@
+# SWIP-10 Support Envoy AI Gateway Observability
+
+## Motivation
+[Envoy AI Gateway](https://aigateway.envoyproxy.io/) is a gateway/proxy for
AI/LLM API traffic (OpenAI, Anthropic,
+AWS Bedrock, Azure OpenAI, Google Gemini, etc.) built on top of Envoy Proxy.
It provides GenAI-specific observability
+following [OpenTelemetry GenAI Semantic
Conventions](https://opentelemetry.io/docs/specs/semconv/gen-ai/), including
+token usage tracking, request latency, time-to-first-token (TTFT), and
inter-token latency.
+
+SkyWalking should support monitoring Envoy AI Gateway as a first-class
integration, providing:
+1. **Metrics monitoring** via OTLP push for GenAI metrics.
+2. **Access log collection** via OTLP log sink for per-request AI metadata
analysis.
+
+This is complementary to [PR
#13745](https://github.com/apache/skywalking/pull/13745) (agent-based Virtual
GenAI
+monitoring). The agent-based approach monitors LLM calls from the client
application side, while this SWIP monitors
+from the gateway (infrastructure) side. Both can coexist — the AI Gateway
provides infrastructure-level visibility
+regardless of whether the calling application is instrumented.
+
+## Architecture Graph
+
+### Metrics Path (OTLP Push)
+```
+┌─────────────────┐ OTLP gRPC ┌─────────────────┐
+│ Envoy AI │ ──────────────────> │ SkyWalking OAP │
+│ Gateway │ (push, port 11800) │ (otel-receiver) │
+│ │ │ │
+│ 4 GenAI metrics│ │ MAL rules │
+│ + labels │ │ → aggregation │
+└─────────────────┘ └─────────────────┘
+```
+
+### Access Log Path (OTLP Push)
+```
+┌─────────────────┐ OTLP gRPC ┌─────────────────┐
+│ Envoy AI │ ──────────────────> │ SkyWalking OAP │
+│ Gateway │ (push, port 11800) │ (otel-receiver) │
+│ │ │ │
+│ access logs │ │ LAL rules │
+│ with AI meta │ │ → analysis │
+└─────────────────┘ └─────────────────┘
+```
+The AI Gateway natively supports an OTLP access log sink (via Envoy Gateway's
OpenTelemetry sink),
+pushing structured access logs directly to the OAP's OTLP receiver. No
FluentBit or external log
+collector is needed.
+
+## Proposed Changes
+
+### 1. New Layer: `ENVOY_AI_GATEWAY`
+
+Add a new layer in `Layer.java`:
+```java
+/**
+ * Envoy AI Gateway is an AI/LLM traffic gateway built on Envoy Proxy,
+ * providing observability for GenAI API traffic.
+ */
+ENVOY_AI_GATEWAY(46, true),
+```
+
+This is a **normal** layer (`isNormal=true`) because the AI Gateway is a real,
instrumented infrastructure component
+(similar to `KONG`, `APISIX`, `NGINX`), not a virtual/conjectured service.
+
+### 2. Entity Model
+
+#### `job_name` — Routing Tag for MAL/LAL Rules
+
+SkyWalking's OTel receiver maps the OTLP resource attribute `service.name` to
the internal tag `job_name`.
+This tag is used by MAL rule filters to route metrics to the correct rule set.
All Envoy AI Gateway
+deployments must use a fixed `OTEL_SERVICE_NAME` value so that SkyWalking can
identify the traffic:
+
+```bash
+OTEL_SERVICE_NAME=envoy-ai-gateway
+```
+
+This becomes `job_name=envoy-ai-gateway` in MAL, and the rules filter on it:
+```yaml
+filter: "{ tags -> tags.job_name == 'envoy-ai-gateway' }"
+```
+
+`job_name` is NOT the SkyWalking service name — it is only used for metric/log
routing.
+
+#### Service and Instance Mapping
+
+| SkyWalking Entity | Source | Example |
+|---|---|---|
+| **Service** | `aigw.service` resource attribute (K8s Deployment/Service
name, set via CRD) | `envoy-ai-gateway-basic` |
+| **Service Instance** | `service.instance.id` resource attribute (pod name,
set via CRD + Downward API) | `aigw-pod-7b9f4d8c5` |
+
+Each Kubernetes Gateway deployment is a separate SkyWalking **service**. Each
pod (ext_proc replica) is a
+**service instance**. Neither attribute is emitted by the AI Gateway by
default — both must be explicitly
+set via `OTEL_RESOURCE_ATTRIBUTES` in the `GatewayConfig` CRD (see below).
+
+The **layer** (`ENVOY_AI_GATEWAY`) is set by MAL/LAL rules based on the
`job_name` filter, not by the
+client. This follows the same pattern as other SkyWalking OTel integrations
(e.g., ActiveMQ, K8s).
+
+Provider and model are **metric-level labels**, not separate entities in this
layer. They are used for
+fine-grained metric breakdowns within the gateway service dashboards rather
than being modeled as separate
+services (unlike the agent-based `VIRTUAL_GENAI` layer where provider=service,
model=instance).
+
+The MAL `expSuffix` uses the `aigw_service` tag (dots converted to underscores
by OTel receiver) as the
+SkyWalking service name and `service_instance_id` as the instance name:
+```yaml
+expSuffix: service(['aigw_service'],
Layer.ENVOY_AI_GATEWAY).instance(['aigw_service', 'service_instance_id'])
+```
+
+#### Complete Kubernetes Setup Example
+
+The following example shows a complete Envoy AI Gateway deployment configured
for SkyWalking
+observability via OTLP metrics and access logs.
+
+```yaml
+# 1. GatewayClass — standard Envoy Gateway controller
+apiVersion: gateway.networking.k8s.io/v1
+kind: GatewayClass
+metadata:
+ name: envoy-ai-gateway
+spec:
+ controllerName: gateway.envoyproxy.io/gatewayclass-controller
+---
+# 2. GatewayConfig — OTLP configuration for SkyWalking
+# One GatewayConfig per gateway. Sets job_name, service name, instance ID,
+# and enables OTLP push for both metrics and access logs.
+apiVersion: aigateway.envoyproxy.io/v1alpha1
+kind: GatewayConfig
+metadata:
+ name: my-gateway-config
+ namespace: default
+spec:
+ extProc:
+ kubernetes:
+ env:
+ # job_name — fixed value for MAL/LAL rule routing (same for ALL AI
Gateway deployments)
+ - name: OTEL_SERVICE_NAME
+ value: "envoy-ai-gateway"
+ # OTLP endpoint — SkyWalking OAP gRPC receiver
+ - name: OTEL_EXPORTER_OTLP_ENDPOINT
+ value: "http://skywalking-oap.skywalking:11800"
+ - name: OTEL_EXPORTER_OTLP_PROTOCOL
+ value: "grpc"
+ # Enable OTLP for both metrics and access logs
+ - name: OTEL_METRICS_EXPORTER
+ value: "otlp"
+ - name: OTEL_LOGS_EXPORTER
+ value: "otlp"
+ # Gateway name = Gateway CRD metadata.name (e.g., "my-ai-gateway")
+ # Read from pod label gateway.envoyproxy.io/owning-gateway-name,
+ # which is auto-set by the Envoy Gateway controller on every envoy pod.
+ - name: GATEWAY_NAME
+ valueFrom:
+ fieldRef:
+ fieldPath:
metadata.labels['gateway.envoyproxy.io/owning-gateway-name']
+ # Pod name (e.g., "envoy-default-my-ai-gateway-76d02f2b-xxx")
+ - name: POD_NAME
+ valueFrom:
+ fieldRef:
+ fieldPath: metadata.name
+ # aigw.service → SkyWalking service name (= Gateway CRD name,
auto-resolved)
+ # service.instance.id → SkyWalking instance name (= pod name,
auto-resolved)
+ # $(VAR) substitution references the valueFrom env vars defined above.
+ - name: OTEL_RESOURCE_ATTRIBUTES
+ value: "aigw.service=$(GATEWAY_NAME),service.instance.id=$(POD_NAME)"
+---
+# 3. Gateway — references the GatewayConfig via annotation
+apiVersion: gateway.networking.k8s.io/v1
+kind: Gateway
+metadata:
+ name: my-ai-gateway
+ namespace: default
+ annotations:
+ aigateway.envoyproxy.io/gateway-config: my-gateway-config
+spec:
+ gatewayClassName: envoy-ai-gateway
+ listeners:
+ - name: http
+ protocol: HTTP
+ port: 80
+---
+# 4. AIGatewayRoute — routing rules + token metadata for access logs
+apiVersion: aigateway.envoyproxy.io/v1alpha1
+kind: AIGatewayRoute
+metadata:
+ name: my-ai-gateway-route
+ namespace: default
+spec:
+ parentRefs:
+ - name: my-ai-gateway
+ kind: Gateway
+ group: gateway.networking.k8s.io
+ # Enable token counts in access logs
+ llmRequestCosts:
+ - metadataKey: llm_input_token
+ type: InputToken
+ - metadataKey: llm_output_token
+ type: OutputToken
+ - metadataKey: llm_total_token
+ type: TotalToken
+ # Route all models to the backend
+ rules:
+ - backendRefs:
+ - name: openai-backend
+---
+# 5. AIServiceBackend + Backend — LLM provider
+apiVersion: aigateway.envoyproxy.io/v1alpha1
+kind: AIServiceBackend
+metadata:
+ name: openai-backend
+ namespace: default
+spec:
+ schema:
+ name: OpenAI
+ backendRef:
+ name: openai-backend
+ kind: Backend
+ group: gateway.envoyproxy.io
+---
+apiVersion: gateway.envoyproxy.io/v1alpha1
+kind: Backend
+metadata:
+ name: openai-backend
+ namespace: default
+spec:
+ endpoints:
+ - fqdn:
+ hostname: api.openai.com
+ port: 443
+```
+
+**Key env var mapping:**
+
+| Env Var / Resource Attribute | SkyWalking Concept | Example Value |
+|---|---|---|
+| `OTEL_SERVICE_NAME` | `job_name` (MAL/LAL rule routing) | `envoy-ai-gateway`
(fixed for all deployments) |
+| `aigw.service` | Service name | `my-ai-gateway` (auto-resolved from gateway
name label) |
+| `service.instance.id` | Instance name | `envoy-default-my-ai-gateway-...`
(auto-resolved from pod name) |
+
+**No manual per-gateway configuration needed** for service and instance names:
+- `GATEWAY_NAME` is auto-resolved from the pod label
`gateway.envoyproxy.io/owning-gateway-name`,
+ which is set automatically by the Envoy Gateway controller on every envoy
pod.
+- `POD_NAME` is auto-resolved from the pod name via the Downward API.
+- Both are injected into `OTEL_RESOURCE_ATTRIBUTES` via standard Kubernetes
`$(VAR)` substitution.
+
+The `GatewayConfig.spec.extProc.kubernetes.env` field accepts full
`corev1.EnvVar` objects (including
+`valueFrom`), merged into the ext_proc container by the gateway mutator
webhook. Verified on Kind
+cluster — the gateway label resolves correctly (e.g., `my-ai-gateway`).
+
+**Important:** The `resource.WithFromEnv()` code path in the AI Gateway
(`internal/metrics/metrics.go`)
+is conditional — it only executes when `OTEL_EXPORTER_OTLP_ENDPOINT` is set
(or `OTEL_METRICS_EXPORTER=console`).
+The ext_proc runs in-process (not as a subprocess), so there is no env var
propagation issue.
+
+### 3. MAL Rules for OTLP Metrics
+
+Create
`oap-server/server-starter/src/main/resources/otel-rules/envoy-ai-gateway/`
with MAL rules consuming
+the 4 GenAI metrics from Envoy AI Gateway.
+
+All MAL rule files use the `job_name` filter to match only AI Gateway traffic:
+```yaml
+filter: "{ tags -> tags.job_name == 'envoy-ai-gateway' }"
+```
+
+#### Source Metrics from AI Gateway
+
+| Metric | Type | Labels |
+|---|---|---|
+| `gen_ai_client_token_usage` | Histogram (Delta) | `gen_ai.token.type`
(input/output), `gen_ai.provider.name`, `gen_ai.response.model`,
`gen_ai.operation.name` |
+| `gen_ai_server_request_duration` | Histogram | `gen_ai.provider.name`,
`gen_ai.response.model`, `gen_ai.operation.name` |
+| `gen_ai_server_time_to_first_token` | Histogram | `gen_ai.provider.name`,
`gen_ai.response.model`, `gen_ai.operation.name` |
+| `gen_ai_server_time_per_output_token` | Histogram | `gen_ai.provider.name`,
`gen_ai.response.model`, `gen_ai.operation.name` |
+
+#### Proposed SkyWalking Metrics
+
+**Gateway-level (Service) metrics:**
+
+| Monitoring Panel | Unit | Metric Name | Description |
+|---|---|---|---|
+| Request CPM | count/min | `meter_envoy_ai_gw_request_cpm` | Requests per
minute |
+| Request Latency Avg | ms | `meter_envoy_ai_gw_request_latency_avg` | Average
request duration |
+| Request Latency Percentile | ms |
`meter_envoy_ai_gw_request_latency_percentile` | P50/P75/P90/P95/P99 request
duration |
+| Input Tokens Rate | tokens/min | `meter_envoy_ai_gw_input_token_rate` |
Input tokens per minute (total across all models) |
+| Output Tokens Rate | tokens/min | `meter_envoy_ai_gw_output_token_rate` |
Output tokens per minute (total across all models) |
+| Total Tokens Rate | tokens/min | `meter_envoy_ai_gw_total_token_rate` |
Total tokens per minute |
+| TTFT Avg | ms | `meter_envoy_ai_gw_ttft_avg` | Average time to first token |
+| TTFT Percentile | ms | `meter_envoy_ai_gw_ttft_percentile` |
P50/P75/P90/P95/P99 time to first token |
+| Time Per Output Token Avg | ms | `meter_envoy_ai_gw_tpot_avg` | Average
inter-token latency |
+| Time Per Output Token Percentile | ms | `meter_envoy_ai_gw_tpot_percentile`
| P50/P75/P90/P95/P99 inter-token latency |
+| Estimated Cost | cost/min | `meter_envoy_ai_gw_estimated_cost` | Estimated
cost per minute (from token counts × config pricing) |
+
+**Per-provider breakdown metrics (labeled, within gateway service):**
+
+| Monitoring Panel | Unit | Metric Name | Description |
+|---|---|---|---|
+| Provider Request CPM | count/min | `meter_envoy_ai_gw_provider_request_cpm`
| Requests per minute by provider |
+| Provider Token Usage | tokens/min | `meter_envoy_ai_gw_provider_token_rate`
| Token rate by provider and token type |
+| Provider Latency Avg | ms | `meter_envoy_ai_gw_provider_latency_avg` |
Average latency by provider |
+
+**Per-model breakdown metrics (labeled, within gateway service):**
+
+| Monitoring Panel | Unit | Metric Name | Description |
+|---|---|---|---|
+| Model Request CPM | count/min | `meter_envoy_ai_gw_model_request_cpm` |
Requests per minute by model |
+| Model Token Usage | tokens/min | `meter_envoy_ai_gw_model_token_rate` |
Token rate by model and token type |
+| Model Latency Avg | ms | `meter_envoy_ai_gw_model_latency_avg` | Average
latency by model |
+| Model TTFT Avg | ms | `meter_envoy_ai_gw_model_ttft_avg` | Average TTFT by
model |
+| Model TPOT Avg | ms | `meter_envoy_ai_gw_model_tpot_avg` | Average
inter-token latency by model |
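+
+The service-level rules above can be sketched in MAL. This is a minimal sketch, assuming the OTel
+receiver's Prometheus-style histogram conversion (`_sum`/`_count`/`_bucket` series) and standard
+MAL functions; exact expressions and label names must be validated during implementation:
+
+```yaml
+filter: "{ tags -> tags.job_name == 'envoy-ai-gateway' }"
+expSuffix: service(['aigw_service'], Layer.ENVOY_AI_GATEWAY)
+metricPrefix: meter_envoy_ai_gw
+metricsRules:
+  # Requests per minute, from the duration histogram's count series.
+  - name: request_cpm
+    exp: gen_ai_server_request_duration_count.sum(['aigw_service']).increase('PT1M')
+  # Average latency in ms; the source unit is seconds (see Appendix A).
+  - name: request_latency_avg
+    exp: (gen_ai_server_request_duration_sum / gen_ai_server_request_duration_count) * 1000
+  # Latency percentiles from the explicit bucket boundaries.
+  - name: request_latency_percentile
+    exp: gen_ai_server_request_duration_bucket.sum(['aigw_service', 'le']).histogram().histogram_percentile([50,75,90,95,99])
+  # Input tokens per minute, selected by the token type label.
+  - name: input_token_rate
+    exp: gen_ai_client_token_usage_sum.tagEqual('gen_ai_token_type', 'input').sum(['aigw_service']).increase('PT1M')
+```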
+
+#### Cost Estimation
+
+Reuse the same `gen-ai-config.yml` pricing configuration from PR #13745. The
MAL rules will:
+1. Track input and output token counts per model from the `gen_ai_client_token_usage` histogram sum.
+2. Look up per-million-token pricing from config.
+3. Compute `estimated_cost = input_tokens × input_cost_per_m / 1_000_000 +
output_tokens × output_cost_per_m / 1_000_000`.
+4. Amplify by 10^6 (same as PR #13745) to avoid floating point precision
issues.
+
+No new MAL function is needed — standard arithmetic operations on
counters/gauges are sufficient.
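+
+As a worked example of steps 1–4, using hypothetical pricing of $2.50 (input) and $10.00 (output)
+per million tokens:
+
+```
+input cost  = 6,000 tokens × 2.50  / 1,000,000 = $0.0150
+output cost =   400 tokens × 10.00 / 1,000,000 = $0.0040
+estimated_cost = $0.0190 → stored as 19,000 after 10^6 amplification
+```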
+
+#### Metrics vs Access Logs for Token Cost
+
+Both data sources provide token counts, but serve different cost analysis
purposes:
+
+| Aspect | OTLP Metrics (MAL) | Access Logs (LAL) |
+|---|---|---|
+| **Granularity** | Aggregated counters — token sums over time windows |
Per-request — exact token count for each individual call |
+| **Cost output** | Cost **rate** (e.g., $X/minute) — good for trends and
capacity planning | Cost **per request** (e.g., this call cost $0.03) — good
for attribution and audit |
+| **Precision** | Approximate (counter deltas over scrape intervals) | Exact
(individual request values) |
+| **Use case** | Dashboard trends, billing estimates, provider comparison |
Detect expensive individual requests, cost anomaly alerting,
per-user/per-session attribution |
+
+The metrics path provides aggregated cost trends. The access log path enables
per-request cost
+analysis — for example, alerting on a single request that consumed an
unusually large number of tokens
+(e.g., a runaway prompt). Both paths reuse the same `gen-ai-config.yml`
pricing data.
+
+### 4. Access Log Collection via OTLP
+
+The AI Gateway natively supports an OTLP access log sink. When
`OTEL_LOGS_EXPORTER=otlp` (or defaulting
+to OTLP when `OTEL_EXPORTER_OTLP_ENDPOINT` is set), Envoy pushes structured
access logs directly via
+OTLP gRPC to the same endpoint as metrics. No FluentBit or external log
collector is needed.
+
+#### AI Gateway Configuration
+
+The OTLP log sink shares the same `GatewayConfig` CRD env vars as metrics (see
Section 2).
+`OTEL_LOGS_EXPORTER=otlp` and `OTEL_EXPORTER_OTLP_ENDPOINT` enable the log
sink. The
+`OTEL_RESOURCE_ATTRIBUTES` (including `aigw.service` and
`service.instance.id`) are injected as
+resource attributes on each OTLP log record, ensuring consistency between
metrics and access logs.
+
+Additionally, enable token metadata population in `AIGatewayRoute` so token
counts appear in access logs:
+```yaml
+apiVersion: aigateway.envoyproxy.io/v1alpha1
+kind: AIGatewayRoute
+spec:
+ llmRequestCosts:
+ - metadataKey: llm_input_token
+ type: InputToken
+ - metadataKey: llm_output_token
+ type: OutputToken
+ - metadataKey: llm_total_token
+ type: TotalToken
+```
+
+#### OTLP Log Record Structure (Verified)
+
+Each access log record is pushed as an OTLP LogRecord with the following
structure:
+
+**Resource attributes** (from `OTEL_RESOURCE_ATTRIBUTES` + Envoy metadata):
+
+| Attribute | Example | Notes |
+|---|---|---|
+| `aigw.service` | `envoy-ai-gateway-basic` | From `OTEL_RESOURCE_ATTRIBUTES`
— SkyWalking service name |
+| `service.instance.id` | `aigw-pod-7b9f4d8c5` | From
`OTEL_RESOURCE_ATTRIBUTES` — SkyWalking instance name |
+| `service.name` | `envoy-ai-gateway` | From `OTEL_SERVICE_NAME` — mapped to
`job_name` for rule routing |
+| `node_name` | `default-aigw-run-85f8cf28` | Envoy node identifier |
+| `cluster_name` | `default/aigw-run` | Envoy cluster name |
+
+**Log record attributes** (per-request, LLM traffic):
+
+| Attribute | Example | Description |
+|---|---|---|
+| `gen_ai.request.model` | `llama3.2:latest` | Original requested model |
+| `gen_ai.response.model` | `llama3.2:latest` | Actual model from response |
+| `gen_ai.provider.name` | `openai` | Backend provider name |
+| `gen_ai.usage.input_tokens` | `31` | Input token count |
+| `gen_ai.usage.output_tokens` | `4` | Output token count |
+| `session.id` | `sess-abc123` | Session identifier (if set via header
mapping) |
+| `response_code` | `200` | HTTP status code |
+| `duration` | `1835` | Request duration (ms) |
+| `request.path` | `/v1/chat/completions` | API path |
+| `connection_termination_details` | `-` | Envoy connection termination reason
|
+| `upstream_transport_failure_reason` | `-` | Upstream failure reason |
+
+Note: `total_tokens` is not a separate field in the OTLP log — it equals
`input_tokens + output_tokens`
+and can be computed in LAL rules. `connection_termination_details` and
`upstream_transport_failure_reason`
+serve as error/timeout indicators (replacing `response_flags` from the
file-based log format).
+
+**Log record attributes** (per-request, MCP traffic):
+
+| Attribute | Example | Description |
+|---|---|---|
+| `mcp.method.name` | `tools/call` | MCP method name |
+| `mcp.provider.name` | `kiwi` | MCP provider identifier |
+| `jsonrpc.request.id` | `1` | JSON-RPC request ID |
+| `mcp.session.id` | `sess-xyz` | MCP session ID |
+
+#### LAL Rules — Sampling Policy
+
+Create
`oap-server/server-starter/src/main/resources/lal/envoy-ai-gateway.yaml` to
process the OTLP
+access logs.
+
+**Sampling strategy:** Not all access logs need to be stored — only those that
indicate abnormal or
+expensive requests. The LAL rules apply the following sampling policy:
+
+1. **High token cost** — persist logs where `input_tokens + output_tokens >=
threshold` (default 10,000).
+2. **Error responses** — always persist logs with `response_code >= 400`.
+3. **Slow/timeout requests** — always persist logs where `duration` exceeds a
configurable timeout
+ threshold, or where `connection_termination_details` /
`upstream_transport_failure_reason` indicate
+ upstream failures. LLM requests are inherently slow (especially streaming),
so timeout sampling is
+ important for diagnosing provider availability issues.
+
+All other access logs are dropped to avoid storage bloat.
+
+**Industry token usage reference** (from [OpenRouter State of AI
2025](https://openrouter.ai/state-of-ai),
+100 trillion token study):
+
+| Use Case | Avg Input Tokens | Avg Output Tokens | Avg Total |
+|---|---|---|---|
+| Simple chat/Q&A | 500–1,000 | 200–400 | ~1,000 |
+| Customer support | 500–3,000 | 300–400 | ~2,500 |
+| RAG applications | 3,000–4,000 | 300–500 | ~3,500 |
+| Programming/code | 6,000–20,000+ | 400–1,500 | ~10,000+ |
+| Overall average (2025) | ~6,000 | ~400 | ~6,400 |
+
+Note: The overall average is heavily skewed by programming workloads.
Non-programming use cases
+(chat, RAG, support) typically fall in the 1,000–3,500 total token range.
+
+**Default sampling threshold: 10,000 total tokens** (configurable). This is
approximately 3× the
+non-programming median (~3,000), which captures genuinely expensive or
abnormal requests without
+logging every routine call. The threshold is configurable to accommodate
different workload profiles:
+- Lower (e.g., 5,000) for chat-heavy deployments where most requests are short.
+- Higher (e.g., 30,000) for code-generation-heavy deployments where large
prompts are normal.
+
+The LAL rules would:
+1. Extract AI metadata (`gen_ai.usage.input_tokens`,
`gen_ai.usage.output_tokens`, `gen_ai.request.model`,
+ `gen_ai.provider.name`) from OTLP log record attributes.
+2. Compute `total_tokens = input_tokens + output_tokens`.
+3. Associate logs with the gateway service and instance using resource
attributes (`service.name`,
+ `service.instance.id`) in the `ENVOY_AI_GATEWAY` layer.
+4. **Apply sampling**: persist only logs matching at least one of:
+ - `total_tokens >= 10,000` (configurable threshold)
+ - `response_code >= 400`
+ - `duration >= timeout_threshold` or non-empty
`upstream_transport_failure_reason`
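+
+A minimal LAL sketch of this sampling policy follows. The attribute accessors are hypothetical
+(they assume the OTLP log handler exposes log-record attributes as tags), and the sink behavior
+must be validated against the LAL documentation during implementation; thresholds shown are the
+defaults discussed above:
+
+```yaml
+rules:
+  - name: envoy-ai-gateway
+    layer: ENVOY_AI_GATEWAY
+    dsl: |
+      filter {
+        // Hypothetical accessors: assumes OTLP log-record attributes
+        // arrive as tags on the log.
+        def input    = (tag('gen_ai.usage.input_tokens') ?: '0') as long
+        def output   = (tag('gen_ai.usage.output_tokens') ?: '0') as long
+        def code     = (tag('response_code') ?: '0') as int
+        def duration = (tag('duration') ?: '0') as long
+        def failure  = tag('upstream_transport_failure_reason')
+
+        def expensive = (input + output) >= 10000      // configurable threshold
+        def error     = code >= 400
+        def slow      = duration >= 60000 || (failure && failure != '-')
+
+        sink {
+          // Persist only abnormal or expensive requests; drop the rest.
+          if (expensive || error || slow) {
+            enforcer {
+            }
+          } else {
+            dropper {
+            }
+          }
+        }
+      }
+```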
+
+### 5. UI Dashboard
+
+**OAP side** — Create dashboard JSON templates under
+`oap-server/server-starter/src/main/resources/ui-initialized-templates/envoy_ai_gateway/`:
+- `envoy-ai-gateway-root.json` — Root list view of all AI Gateway services.
+- `envoy-ai-gateway-service.json` — Service dashboard: Request CPM, latency,
token rates, TTFT, TPOT,
+ estimated cost, with provider and model breakdown panels.
+- `envoy-ai-gateway-instance.json` — Instance (pod) level dashboard.
+
+**UI side** — A separate PR in
[skywalking-booster-ui](https://github.com/apache/skywalking-booster-ui)
+is needed for i18n menu entries (similar to
+[skywalking-booster-ui#534](https://github.com/apache/skywalking-booster-ui/pull/534)
for Virtual GenAI).
+The menu entry should be added under the infrastructure/gateway category.
+
+## Imported Dependencies libs and their licenses.
+No new dependencies. The AI Gateway pushes both metrics and access logs via OTLP
to SkyWalking's
+existing otel-receiver.
+
+## Compatibility
+- New layer `ENVOY_AI_GATEWAY` — no breaking change, additive only.
+- New MAL rules — opt-in via configuration.
+- New LAL rules for OTLP access logs — opt-in via configuration.
+- Reuses existing `gen-ai-config.yml` for cost estimation (shared with
agent-based GenAI from PR #13745).
+- No changes to query protocol or storage structure — uses existing meter and
log storage.
+- No external log collector (FluentBit, etc.) required — access logs are
pushed via OTLP.
+
+## General usage docs
+
+### Prerequisites
+- Envoy AI Gateway deployed with the `GatewayConfig` CRD configured (see
Section 2 for the full
+ env var setup including `OTEL_SERVICE_NAME`, `OTEL_EXPORTER_OTLP_ENDPOINT`,
`OTEL_RESOURCE_ATTRIBUTES`).
+
+### Step 1: Configure Envoy AI Gateway
+
+Apply the `GatewayConfig` CRD from Section 2 to your AI Gateway deployment.
Key env vars:
+
+| Env Var | Value | Purpose |
+|---|---|---|
+| `OTEL_SERVICE_NAME` | `envoy-ai-gateway` | Routes metrics/logs to correct
MAL/LAL rules via `job_name` (fixed for all deployments) |
+| `OTEL_EXPORTER_OTLP_ENDPOINT` | `http://skywalking-oap:11800` | SkyWalking
OAP OTLP receiver |
+| `OTEL_EXPORTER_OTLP_PROTOCOL` | `grpc` | OTLP transport |
+| `OTEL_METRICS_EXPORTER` | `otlp` | Enable OTLP metrics push |
+| `OTEL_LOGS_EXPORTER` | `otlp` | Enable OTLP access log push |
+| `GATEWAY_NAME` | (auto from label) | Auto-resolved from pod label
`gateway.envoyproxy.io/owning-gateway-name` |
+| `POD_NAME` | (auto from Downward API) | Auto-resolved from pod name |
+| `OTEL_RESOURCE_ATTRIBUTES` |
`aigw.service=$(GATEWAY_NAME),service.instance.id=$(POD_NAME)` | SkyWalking
service name (auto) + instance ID (auto) |
+
+### Step 2: Configure SkyWalking OAP
+
+Enable the OTel receiver, MAL rules, and LAL rules in `application.yml`:
+```yaml
+receiver-otel:
+ selector: ${SW_OTEL_RECEIVER:default}
+ default:
+ enabledHandlers:
${SW_OTEL_RECEIVER_ENABLED_HANDLERS:"otlp-metrics,otlp-logs"}
+ enabledOtelMetricsRules:
${SW_OTEL_RECEIVER_ENABLED_OTEL_METRICS_RULES:"envoy-ai-gateway"}
+
+log-analyzer:
+ selector: ${SW_LOG_ANALYZER:default}
+ default:
+ lalFiles: ${SW_LOG_LAL_FILES:"envoy-ai-gateway"}
+```
+
+### Cost Estimation
+
+Update `gen-ai-config.yml` with pricing for the models served through the AI
Gateway.
+The same config file is shared with agent-based GenAI monitoring.
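+
+For illustration only, a hypothetical shape of the pricing entries; the actual schema is defined
+by `gen-ai-config.yml` in PR #13745 and should be followed instead:
+
+```yaml
+# Field names here are illustrative, not the real schema.
+models:
+  - name: gpt-4o
+    inputCostPerMillionTokens: 2.50    # USD per million input tokens
+    outputCostPerMillionTokens: 10.00  # USD per million output tokens
+```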
+
+## Appendix A: OTLP Payload Verification
+
+The following data was verified by capturing raw OTLP payloads from the AI
Gateway
+(`envoyproxy/ai-gateway-cli:latest` Docker image) via an OTel Collector debug
exporter.
+
+#### Resource Attributes
+
+With `OTEL_RESOURCE_ATTRIBUTES=service.instance.id=test-instance-456` and
+`OTEL_SERVICE_NAME=aigw-test-service`:
+
+| Attribute | Value | Notes |
+|---|---|---|
+| `service.instance.id` | `test-instance-456` | Set via
`OTEL_RESOURCE_ATTRIBUTES` — **confirmed working** |
+| `service.name` | `aigw-test-service` | Set via `OTEL_SERVICE_NAME` env var |
+| `telemetry.sdk.language` | `go` | SDK metadata |
+| `telemetry.sdk.name` | `opentelemetry` | SDK metadata |
+| `telemetry.sdk.version` | `1.40.0` | SDK metadata |
+
+**Not present by default (without explicit env config):**
`service.instance.id`, `aigw.service`, `host.name`.
+These must be explicitly set via `OTEL_RESOURCE_ATTRIBUTES` in the
`GatewayConfig` CRD (see Section 2).
+
+`resource.WithFromEnv()` (source: `internal/metrics/metrics.go:35-94`) is
called inside a conditional
+block that requires `OTEL_EXPORTER_OTLP_ENDPOINT` to be set. When configured,
`OTEL_RESOURCE_ATTRIBUTES`
+is fully honored.
+
+#### Metric-Level Attributes (Labels)
+
+All 4 metrics carry:
+
+| Label | Example Value | Notes |
+|---|---|---|
+| `gen_ai.operation.name` | `chat` | Operation type |
+| `gen_ai.original.model` | `llama3.2:latest` | Original model from request |
+| `gen_ai.provider.name` | `openai` | Backend provider name. In K8s mode with
explicit backend routing, this is the configured backend name. |
+| `gen_ai.request.model` | `llama3.2:latest` | Requested model |
+| `gen_ai.response.model` | `llama3.2:latest` | Model from response |
+| `gen_ai.token.type` | `input` / `output` / `cached_input` /
`cache_creation_input` | Only on `gen_ai.client.token.usage`. **No `total`
value** — total must be computed. `cached_input` and `cache_creation_input` are
for Anthropic-style prompt caching. |
+
+#### Metric Names and Types
+
+| OTLP Metric Name | Type | Unit | Temporality |
+|---|---|---|---|
+| `gen_ai.client.token.usage` | **Histogram** (not Counter!) | `token` |
**Delta** |
+| `gen_ai.server.request.duration` | Histogram | `s` (seconds, not ms!) |
Delta |
+| `gen_ai.server.time_to_first_token` | Histogram | `s` | Delta (streaming
only) |
+| `gen_ai.server.time_per_output_token` | Histogram | `s` | Delta (streaming
only) |
+
+**Key findings:**
+1. Token usage is a **Histogram**, not a Counter — Count/Sum/Min/Max are available per data point, in addition to per-bucket counts.
+2. Duration is in **seconds** — MAL rules must multiply by 1000 for ms display.
+3. Temporality is **Delta** — MAL needs `increase()` semantics, not `rate()`.
+4. TTFT and TPOT **only appear for streaming requests** — non-streaming
produces only token.usage + request.duration.
+5. **Dots in metric names** — OTLP uses dots (`gen_ai.client.token.usage`),
Prometheus converts to underscores.
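+
+To illustrate finding 3: with Delta temporality, each export carries only the change since the
+previous export, so successive data points of the same series accumulate into a cumulative value
+before `increase()`-style logic applies:
+
+```
+export 1: delta = 31 → cumulative = 31
+export 2: delta = 45 → cumulative = 76
+export 3: delta = 12 → cumulative = 88   (increase over the window = sum of deltas)
+```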
+
+#### Histogram Bucket Boundaries (verified from source:
`internal/metrics/genai.go`)
+
+Token usage (14 boundaries, power-of-4):
+`1, 4, 16, 64, 256, 1024, 4096, 16384, 65536, 262144, 1048576, 4194304,
16777216, 67108864`
+
+Request duration (14 boundaries, power-of-2 seconds):
+`0.01, 0.02, 0.04, 0.08, 0.16, 0.32, 0.64, 1.28, 2.56, 5.12, 10.24, 20.48,
40.96, 81.92`
+
+TTFT (21 boundaries, finer granularity for streaming):
+`0.001, 0.005, 0.01, 0.02, 0.04, 0.06, 0.08, 0.1, 0.25, 0.5, 0.75, 1.0, 2.5,
5.0, 7.5, 10.0, 15.0, 20.0, 30.0, 45.0, 60.0`
+
+TPOT (13 boundaries, finest granularity):
+`0.01, 0.025, 0.05, 0.075, 0.1, 0.15, 0.2, 0.3, 0.4, 0.5, 0.75, 1.0, 2.5`
+
+#### Impact on Implementation
+
+| Finding | Impact |
+|---|---|
+| No `service.instance.id` by default |
`OTEL_RESOURCE_ATTRIBUTES=service.instance.id=<value>` **works** when OTLP
exporter is configured (verified). MAL rules should treat instance as optional
and document `OTEL_RESOURCE_ATTRIBUTES` configuration. |
+| `gen_ai.provider.name` = backend name | In K8s mode with explicit backend
config, this is the configured backend name. |
+| Token usage is Histogram | MAL uses histogram sum/count, not counter value. |
+| Delta temporality | SkyWalking OTel receiver must handle delta-to-cumulative
conversion. |
+| Duration in seconds | MAL rules multiply by 1000 for ms-based metrics. |
+| TTFT/TPOT streaming-only | Dashboard should note these metrics may be absent
for non-streaming workloads. |
+
+#### Bonus: Traces Also Pushed
+
+The AI Gateway also pushes OpenInference traces via OTLP, including full
request/response content
+in span attributes (`llm.input_messages`, `llm.output_messages`,
`llm.token_count.*`). This is a
+potential future integration point but out of scope for this SWIP.
+
+## Appendix B: Raw OTLP Metric Data (Verified)
+
+Captured from OTel Collector debug exporter. This is the actual OTLP payload
from `envoyproxy/ai-gateway-cli:latest`.
+
+### Resource Attributes
+```
+Resource SchemaURL: https://opentelemetry.io/schemas/1.39.0
+Resource attributes:
+ -> service.instance.id: Str(test-instance-456)
+ -> service.name: Str(aigw-test-service)
+ -> telemetry.sdk.language: Str(go)
+ -> telemetry.sdk.name: Str(opentelemetry)
+ -> telemetry.sdk.version: Str(1.40.0)
+```
+
+`OTEL_RESOURCE_ATTRIBUTES=service.instance.id=<value>` **is honored** when an
OTLP exporter is configured
+(i.e., `OTEL_EXPORTER_OTLP_ENDPOINT` is set). Without an OTLP endpoint, the
resource block is skipped and
+only the Prometheus reader is used (which does not carry resource attributes
per-metric).
+
+### InstrumentationScope
+```
+ScopeMetrics SchemaURL:
+InstrumentationScope envoyproxy/ai-gateway
+```
+
+### Metric 1: gen_ai.client.token.usage (input tokens)
+```
+Name: gen_ai.client.token.usage
+Description: Number of tokens processed.
+Unit: token
+DataType: Histogram
+AggregationTemporality: Delta
+
+Data point attributes:
+ -> gen_ai.operation.name: Str(chat)
+ -> gen_ai.original.model: Str(llama3.2:latest)
+ -> gen_ai.provider.name: Str(openai)
+ -> gen_ai.request.model: Str(llama3.2:latest)
+ -> gen_ai.response.model: Str(llama3.2:latest)
+ -> gen_ai.token.type: Str(input)
+Count: 1
+Sum: 31.000000
+Min: 31.000000
+Max: 31.000000
+ExplicitBounds: [1, 4, 16, 64, 256, 1024, 4096, 16384, 65536, 262144, 1048576,
4194304, 16777216, 67108864]
+```
+
+### Metric 1b: gen_ai.client.token.usage (output tokens)
+```
+Data point attributes:
+ -> gen_ai.token.type: Str(output)
+ (other attributes same as above)
+Count: 1
+Sum: 3.000000
+```
+
+### Metric 2: gen_ai.server.request.duration
+```
+Name: gen_ai.server.request.duration
+Description: Generative AI server request duration such as time-to-last byte
or last output token.
+Unit: s
+DataType: Histogram
+AggregationTemporality: Delta
+
+Data point attributes:
+ -> gen_ai.operation.name: Str(chat)
+ -> gen_ai.original.model: Str(llama3.2:latest)
+ -> gen_ai.provider.name: Str(openai)
+ -> gen_ai.request.model: Str(llama3.2:latest)
+ -> gen_ai.response.model: Str(llama3.2:latest)
+Count: 1
+Sum: 10.432428
+ExplicitBounds: [0.01, 0.02, 0.04, 0.08, 0.16, 0.32, 0.64, 1.28, 2.56, 5.12, 10.24, 20.48, 40.96, 81.92]
+```
+
+### Metric 3: gen_ai.server.time_to_first_token (streaming only)
+```
+Name: gen_ai.server.time_to_first_token
+Description: Time to receive first token in streaming responses.
+Unit: s
+DataType: Histogram
+AggregationTemporality: Delta
+(Same attributes as request.duration, excluding gen_ai.token.type)
+ExplicitBounds (from source code): [0.001, 0.005, 0.01, 0.02, 0.04, 0.06, 0.08, 0.1, 0.25, 0.5, 0.75, 1.0, 2.5, 5.0, 7.5, 10.0, 15.0, 20.0, 30.0, 45.0, 60.0]
+```
+
+### Metric 4: gen_ai.server.time_per_output_token (streaming only)
+```
+Name: gen_ai.server.time_per_output_token
+Description: Time per output token generated after the first token for
successful responses.
+Unit: s
+DataType: Histogram
+AggregationTemporality: Delta
+(Same attributes as request.duration, excluding gen_ai.token.type)
+ExplicitBounds (from source code): [0.01, 0.025, 0.05, 0.075, 0.1, 0.15, 0.2, 0.3, 0.4, 0.5, 0.75, 1.0, 2.5]
+```
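+
+The three streaming metrics are related: inter-token latency is commonly derived from request duration, TTFT, and the output token count. A hedged sketch of that relationship (the exact formula the gateway uses should be confirmed against `internal/metrics/genai.go`; the numbers here are illustrative):

```python
def time_per_output_token(duration_s, ttft_s, output_tokens):
    """Approximate inter-token latency for a streaming response, using the
    commonly assumed definition: request duration minus TTFT, averaged over
    the tokens generated after the first one."""
    if output_tokens < 2:
        raise ValueError("needs at least two output tokens")
    return (duration_s - ttft_s) / (output_tokens - 1)

# Illustrative numbers only (not taken from the dump above).
print(round(time_per_output_token(10.43, 0.52, 100), 4))  # -> 0.1001
```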
+
+## Appendix C: Access Log Format (from Envoy Config Dump)
+
+The AI Gateway auto-configures two access log entries on the listener (one for LLM, one for MCP).
+Verified from `config_dump` of the AI Gateway.
+
+### LLM Access Log Format (JSON)
+
+Filter: `request.headers['x-ai-eg-model'] != ''` (only logs requests processed by the AI Gateway ext_proc)
+
+```json
+{
+ "start_time": "%START_TIME%",
+ "method": "%REQ(:METHOD)%",
+ "request.path": "%REQ(:PATH)%",
+ "x-envoy-origin-path": "%REQ(X-ENVOY-ORIGINAL-PATH?:PATH)%",
+ "response_code": "%RESPONSE_CODE%",
+ "duration": "%DURATION%",
+ "bytes_received": "%BYTES_RECEIVED%",
+ "bytes_sent": "%BYTES_SENT%",
+ "user-agent": "%REQ(USER-AGENT)%",
+ "x-request-id": "%REQ(X-REQUEST-ID)%",
+ "x-forwarded-for": "%REQ(X-FORWARDED-FOR)%",
+ "x-envoy-upstream-service-time": "%RESP(X-ENVOY-UPSTREAM-SERVICE-TIME)%",
+ "upstream_host": "%UPSTREAM_HOST%",
+ "upstream_cluster": "%UPSTREAM_CLUSTER%",
+ "upstream_local_address": "%UPSTREAM_LOCAL_ADDRESS%",
+ "upstream_transport_failure_reason": "%UPSTREAM_TRANSPORT_FAILURE_REASON%",
+ "downstream_remote_address": "%DOWNSTREAM_REMOTE_ADDRESS%",
+ "downstream_local_address": "%DOWNSTREAM_LOCAL_ADDRESS%",
+ "connection_termination_details": "%CONNECTION_TERMINATION_DETAILS%",
+ "gen_ai.request.model": "%REQ(X-AI-EG-MODEL)%",
+  "gen_ai.response.model": "%DYNAMIC_METADATA(io.envoy.ai_gateway:model_name_override)%",
+  "gen_ai.provider.name": "%DYNAMIC_METADATA(io.envoy.ai_gateway:backend_name)%",
+  "gen_ai.usage.input_tokens": "%DYNAMIC_METADATA(io.envoy.ai_gateway:llm_input_token)%",
+  "gen_ai.usage.output_tokens": "%DYNAMIC_METADATA(io.envoy.ai_gateway:llm_output_token)%",
+ "session.id": "%DYNAMIC_METADATA(io.envoy.ai_gateway:session.id)%"
+}
+```
+
+**Code review corrections** (source: `internal/metrics/genai.go`, `examples/access-log/basic.yaml`,
+`site/docs/capabilities/observability/accesslogs.md`):
+- `response_flags` (`%RESPONSE_FLAGS%`) IS documented in the AI Gateway access log docs and used in tests,
+  but is not in the default config. It can be added via the `EnvoyProxy` resource if needed.
+- `gen_ai.usage.total_tokens` IS supported via `%DYNAMIC_METADATA(io.envoy.ai_gateway:llm_total_token)%`
+  when `AIGatewayRoute.spec.llmRequestCosts` includes `type: TotalToken`.
+- The access log format is **user-configurable** via the `EnvoyProxy` resource, not hardcoded by the AI Gateway.
+  The AI Gateway only populates dynamic metadata; users define which fields appear in logs.
+- Additional token cost types beyond input/output/total: `CachedInputToken` and `CacheCreationInputToken`
+  (for Anthropic-style prompt caching, stored as `llm_cached_input_token` and
+  `llm_cache_creation_input_token` in dynamic metadata).
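+
+When these JSON lines reach SkyWalking via the OTLP log sink, a LAL rule would extract the `gen_ai.*` fields for per-request analysis. A minimal sketch of that extraction in plain Python (the sample entry is fabricated; real entries are produced by Envoy substituting the command operators above, and numeric fields may arrive as numbers rather than strings):

```python
import json

# Fabricated log line following the LLM access log format string above.
line = json.dumps({
    "method": "POST",
    "request.path": "/v1/chat/completions",
    "response_code": "200",
    "gen_ai.request.model": "qwen2.5:0.5b",
    "gen_ai.provider.name": "ollama-backend",
    "gen_ai.usage.input_tokens": "31",
    "gen_ai.usage.output_tokens": "3",
})

entry = json.loads(line)
total = int(entry["gen_ai.usage.input_tokens"]) + int(entry["gen_ai.usage.output_tokens"])
print(entry["gen_ai.request.model"], total)  # qwen2.5:0.5b 34
```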
+
+### MCP Access Log Format (JSON)
+
+Filter: `request.headers['x-ai-eg-mcp-backend'] != ''`
+
+```json
+{
+ "start_time": "%START_TIME%",
+ "method": "%REQ(:METHOD)%",
+ "request.path": "%REQ(:PATH)%",
+ "response_code": "%RESPONSE_CODE%",
+ "duration": "%DURATION%",
+ "mcp.method.name": "%DYNAMIC_METADATA(io.envoy.ai_gateway:mcp_method)%",
+ "mcp.provider.name": "%DYNAMIC_METADATA(io.envoy.ai_gateway:mcp_backend)%",
+ "mcp.session.id": "%REQ(MCP-SESSION-ID)%",
+  "jsonrpc.request.id": "%DYNAMIC_METADATA(io.envoy.ai_gateway:mcp_request_id)%",
+ "session.id": "%DYNAMIC_METADATA(io.envoy.ai_gateway:session.id)%"
+}
+```
+
diff --git a/docs/en/swip/SWIP-10/kind-test-resources.yaml b/docs/en/swip/SWIP-10/kind-test-resources.yaml
new file mode 100644
index 0000000000..ff5d5bd790
--- /dev/null
+++ b/docs/en/swip/SWIP-10/kind-test-resources.yaml
@@ -0,0 +1,247 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements. See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership. The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License. You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied. See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# SWIP-10 Kind Test Resources
+# Deploy with: kubectl apply -f kind-test-resources.yaml
+#
+# This file contains all K8s resources for the SWIP-10 local verification:
+# - Ollama (in-cluster LLM backend)
+# - OTel Collector (debug exporter for capturing OTLP payloads)
+# - AI Gateway CRDs (GatewayClass, GatewayConfig, Gateway, AIGatewayRoute, AIServiceBackend, Backend)
+
+# --- Ollama (in-cluster) ---
+apiVersion: apps/v1
+kind: Deployment
+metadata:
+ name: ollama
+ namespace: default
+spec:
+ replicas: 1
+ selector:
+ matchLabels:
+ app: ollama
+ template:
+ metadata:
+ labels:
+ app: ollama
+ spec:
+ containers:
+ - name: ollama
+ image: ollama/ollama:latest
+ imagePullPolicy: Never
+ ports:
+ - containerPort: 11434
+ resources:
+ requests:
+ cpu: "500m"
+ memory: "2Gi"
+---
+apiVersion: v1
+kind: Service
+metadata:
+ name: ollama
+ namespace: default
+spec:
+ selector:
+ app: ollama
+ ports:
+ - port: 11434
+ targetPort: 11434
+---
+# --- OTel Collector (debug exporter) ---
+apiVersion: v1
+kind: ConfigMap
+metadata:
+ name: otel-collector-config
+ namespace: default
+data:
+ config.yaml: |
+ receivers:
+ otlp:
+ protocols:
+ grpc:
+ endpoint: 0.0.0.0:4317
+ exporters:
+ debug:
+ verbosity: detailed
+ service:
+ pipelines:
+ metrics:
+ receivers: [otlp]
+ exporters: [debug]
+ logs:
+ receivers: [otlp]
+ exporters: [debug]
+---
+apiVersion: apps/v1
+kind: Deployment
+metadata:
+ name: otel-collector
+ namespace: default
+spec:
+ replicas: 1
+ selector:
+ matchLabels:
+ app: otel-collector
+ template:
+ metadata:
+ labels:
+ app: otel-collector
+ spec:
+ containers:
+ - name: collector
+ image: otel/opentelemetry-collector:latest
+ imagePullPolicy: Never
+ ports:
+ - containerPort: 4317
+ volumeMounts:
+ - name: config
+ mountPath: /etc/otelcol/config.yaml
+ subPath: config.yaml
+ volumes:
+ - name: config
+ configMap:
+ name: otel-collector-config
+---
+apiVersion: v1
+kind: Service
+metadata:
+ name: otel-collector
+ namespace: default
+spec:
+ selector:
+ app: otel-collector
+ ports:
+ - port: 4317
+ targetPort: 4317
+---
+# --- AI Gateway CRDs ---
+# 1. GatewayClass
+apiVersion: gateway.networking.k8s.io/v1
+kind: GatewayClass
+metadata:
+ name: envoy-ai-gateway
+spec:
+ controllerName: gateway.envoyproxy.io/gatewayclass-controller
+---
+# 2. GatewayConfig — OTLP configuration for SkyWalking
+# Verified: GATEWAY_NAME auto-resolves from pod label
+# gateway.envoyproxy.io/owning-gateway-name via Downward API
+apiVersion: aigateway.envoyproxy.io/v1alpha1
+kind: GatewayConfig
+metadata:
+ name: sw-test-config
+ namespace: default
+spec:
+ extProc:
+ kubernetes:
+ env:
+ # job_name for MAL/LAL rule routing (fixed for all deployments)
+ - name: OTEL_SERVICE_NAME
+ value: "envoy-ai-gateway"
+ # OTLP endpoint — OTel Collector (or SkyWalking OAP in production)
+ - name: OTEL_EXPORTER_OTLP_ENDPOINT
+ value: "http://otel-collector.default:4317"
+ - name: OTEL_EXPORTER_OTLP_PROTOCOL
+ value: "grpc"
+ # Enable OTLP for both metrics and access logs
+ - name: OTEL_METRICS_EXPORTER
+ value: "otlp"
+ - name: OTEL_LOGS_EXPORTER
+ value: "otlp"
+ - name: OTEL_METRIC_EXPORT_INTERVAL
+ value: "5000"
+ # Gateway name = Gateway CRD metadata.name (e.g., "my-ai-gateway")
+ # Read from pod label gateway.envoyproxy.io/owning-gateway-name,
+ # which is auto-set by the Envoy Gateway controller on every envoy pod.
+ - name: GATEWAY_NAME
+ valueFrom:
+ fieldRef:
+            fieldPath: metadata.labels['gateway.envoyproxy.io/owning-gateway-name']
+ # Pod name (e.g., "envoy-default-my-ai-gateway-76d02f2b-xxx")
+ - name: POD_NAME
+ valueFrom:
+ fieldRef:
+ fieldPath: metadata.name
+      # aigw.service → SkyWalking service name (= Gateway CRD name, auto-resolved)
+      # service.instance.id → SkyWalking instance name (= pod name, auto-resolved)
+ # $(VAR) substitution references the valueFrom env vars defined above.
+ - name: OTEL_RESOURCE_ATTRIBUTES
+ value: "aigw.service=$(GATEWAY_NAME),service.instance.id=$(POD_NAME)"
+---
+# 3. Gateway — references GatewayConfig via annotation
+apiVersion: gateway.networking.k8s.io/v1
+kind: Gateway
+metadata:
+ name: my-ai-gateway
+ namespace: default
+ annotations:
+ aigateway.envoyproxy.io/gateway-config: sw-test-config
+spec:
+ gatewayClassName: envoy-ai-gateway
+ listeners:
+ - name: http
+ protocol: HTTP
+ port: 80
+---
+# 4. AIGatewayRoute — routing + token metadata for access logs
+apiVersion: aigateway.envoyproxy.io/v1alpha1
+kind: AIGatewayRoute
+metadata:
+ name: my-ai-gateway-route
+ namespace: default
+spec:
+ parentRefs:
+ - name: my-ai-gateway
+ kind: Gateway
+ group: gateway.networking.k8s.io
+ llmRequestCosts:
+ - metadataKey: llm_input_token
+ type: InputToken
+ - metadataKey: llm_output_token
+ type: OutputToken
+ - metadataKey: llm_total_token
+ type: TotalToken
+ rules:
+ - backendRefs:
+ - name: ollama-backend
+---
+# 5. AIServiceBackend + Backend — Ollama in-cluster
+apiVersion: aigateway.envoyproxy.io/v1alpha1
+kind: AIServiceBackend
+metadata:
+ name: ollama-backend
+ namespace: default
+spec:
+ schema:
+ name: OpenAI
+ prefix: "/v1"
+ backendRef:
+ name: ollama-backend
+ kind: Backend
+ group: gateway.envoyproxy.io
+---
+apiVersion: gateway.envoyproxy.io/v1alpha1
+kind: Backend
+metadata:
+ name: ollama-backend
+ namespace: default
+spec:
+ endpoints:
+ - fqdn:
+ hostname: ollama.default.svc.cluster.local
+ port: 11434
diff --git a/docs/en/swip/SWIP-10/kind-test-setup.sh b/docs/en/swip/SWIP-10/kind-test-setup.sh
new file mode 100644
index 0000000000..4fd3afcc46
--- /dev/null
+++ b/docs/en/swip/SWIP-10/kind-test-setup.sh
@@ -0,0 +1,108 @@
+#!/bin/bash
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements. See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership. The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License. You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied. See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# SWIP-10 Local Verification: Envoy AI Gateway + SkyWalking OTLP on Kind
+#
+# Prerequisites:
+# - kind, kubectl, helm, docker installed
+# - Docker images pulled (or internet access for Kind to pull)
+#
+# This script sets up a Kind cluster with:
+# - Envoy Gateway (v1.3.3) + AI Gateway controller (v0.5.0)
+# - Ollama (in-cluster) with a small model
+# - OTel Collector (debug exporter) to capture OTLP metrics and logs
+# - AI Gateway configured with SkyWalking-compatible OTLP resource attributes
+#
+# Usage:
+# ./kind-test-setup.sh # Full setup
+# ./kind-test-setup.sh cleanup # Delete the cluster
+
+set -e
+
+CLUSTER_NAME="aigw-swip10-test"
+
+if [ "$1" = "cleanup" ]; then
+ echo "Cleaning up..."
+ kind delete cluster --name $CLUSTER_NAME
+ exit 0
+fi
+
+echo "=== Step 1: Create Kind cluster ==="
+kind create cluster --name $CLUSTER_NAME
+
+echo "=== Step 2: Pre-load Docker images ==="
+IMAGES=(
+ "envoyproxy/ai-gateway-controller:v0.5.0"
+ "envoyproxy/ai-gateway-extproc:v0.5.0"
+ "envoyproxy/gateway:v1.3.3"
+ "envoyproxy/envoy:distroless-v1.33.3"
+ "otel/opentelemetry-collector:latest"
+ "ollama/ollama:latest"
+)
+for img in "${IMAGES[@]}"; do
+ echo "Pulling $img..."
+ docker pull "$img"
+ echo "Loading $img into Kind..."
+ kind load docker-image "$img" --name $CLUSTER_NAME
+done
+
+echo "=== Step 3: Install Envoy Gateway ==="
+# enableBackend is required for Backend resources used by AIServiceBackend
+helm install eg oci://docker.io/envoyproxy/gateway-helm \
+ --version v1.3.3 -n envoy-gateway-system --create-namespace \
+ --set config.envoyGateway.extensionApis.enableBackend=true
+kubectl wait --for=condition=available deployment/envoy-gateway \
+ -n envoy-gateway-system --timeout=120s
+
+echo "=== Step 4: Install AI Gateway ==="
+helm upgrade -i aieg-crd oci://docker.io/envoyproxy/ai-gateway-crds-helm \
+ --namespace envoy-ai-gateway-system --create-namespace
+helm upgrade -i aieg oci://docker.io/envoyproxy/ai-gateway-helm \
+ --namespace envoy-ai-gateway-system --create-namespace
+kubectl wait --for=condition=available deployment/ai-gateway-controller \
+ -n envoy-ai-gateway-system --timeout=120s
+
+echo "=== Step 5: Deploy test resources ==="
+kubectl apply -f kind-test-resources.yaml
+
+echo "=== Step 6: Wait for pods ==="
+sleep 10
+kubectl wait --for=condition=available deployment/ollama -n default --timeout=120s
+kubectl wait --for=condition=available deployment/otel-collector -n default --timeout=60s
+
+echo "=== Step 7: Pull Ollama model ==="
+OLLAMA_POD=$(kubectl get pod -l app=ollama -o jsonpath='{.items[0].metadata.name}')
+kubectl exec "$OLLAMA_POD" -- ollama pull qwen2.5:0.5b
+
+echo "=== Step 8: Wait for Envoy pod ==="
+sleep 30
+kubectl get pods -A
+
+echo ""
+echo "=== Setup complete ==="
+echo "To test:"
+echo "  kubectl port-forward -n envoy-gateway-system svc/envoy-default-my-ai-gateway-76d02f2b 8080:80 &"
+echo " curl -s --noproxy '*' http://localhost:8080/v1/chat/completions \\"
+echo " -H 'Content-Type: application/json' \\"
+echo "    -d '{\"model\":\"qwen2.5:0.5b\",\"messages\":[{\"role\":\"user\",\"content\":\"Say hi\"}]}'"
+echo ""
+echo "To check OTLP output:"
+echo "  kubectl logs -l app=otel-collector | grep -A 20 'ResourceMetrics\\|ResourceLog'"
+echo ""
+echo "To cleanup:"
+echo " ./kind-test-setup.sh cleanup"
diff --git a/docs/en/swip/readme.md b/docs/en/swip/readme.md
index 0cf9f8cc43..50121cbe29 100644
--- a/docs/en/swip/readme.md
+++ b/docs/en/swip/readme.md
@@ -68,10 +68,11 @@ All accepted and proposed SWIPs can be found in [here](https://github.com/apache
## Known SWIPs
-Next SWIP Number: 10
+Next SWIP Number: 11
### Accepted SWIPs
+- [SWIP-10 Support Envoy AI Gateway Observability](SWIP-10/SWIP.md)
- [SWIP-9 Support Flink Monitoring](SWIP-9.md)
- [SWIP-8 Support Kong Monitoring](SWIP-8.md)
- [SWIP-6 Support ActiveMQ Monitoring](SWIP-6.md)