This is an automated email from the ASF dual-hosted git repository.

wusheng pushed a commit to branch feature/swip-10-envoy-ai-gateway
in repository https://gitbox.apache.org/repos/asf/skywalking.git

commit 84836a8e2382368baa1b3cfba4d4af84e83254c4
Author: Wu Sheng <[email protected]>
AuthorDate: Tue Mar 31 13:39:10 2026 +0800

    Add Envoy AI Gateway observability (SWIP-10)
    
    - New layer: ENVOY_AI_GATEWAY
    - MAL rules for OTLP metrics: service and instance level aggregates,
      per-provider and per-model breakdowns (38 metrics total)
    - LAL rules for access log sampling (error responses, high token cost)
    - UI dashboard templates: root, service, instance with Log tabs
    - OTel receiver: convert data point attribute dots to underscores,
      change LABEL_MAPPINGS to fallback-only (preserve service_name tag)
    - SampleFamily: add toString() and debugDump() for debugging
    - E2e test: docker-compose with ai-gateway-cli + Ollama
    - SWIP-10 doc updated: use OTEL_SERVICE_NAME for service identity,
      explicit job_name for MAL routing
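    Two of the OTel receiver behaviors named above — dot-to-underscore conversion
    of data point attributes, and fallback-only LABEL_MAPPINGS that preserve an
    existing service_name tag — can be sketched as follows. This is an
    illustrative sketch only, not the actual OpenTelemetryMetricRequestProcessor
    code; the class and method names are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class LabelSketch {
    /** Dots are not valid in MAL label names, so attribute keys such as
     *  "service.name" are rewritten to "service_name" before rule evaluation. */
    public static Map<String, String> dotsToUnderscores(Map<String, String> attrs) {
        Map<String, String> out = new LinkedHashMap<>();
        attrs.forEach((k, v) -> out.put(k.replace('.', '_'), v));
        return out;
    }

    /** Fallback-only mapping: copy a value under the mapped name only when the
     *  target tag is absent, so a tag already set by the client survives. */
    public static void mapAsFallback(Map<String, String> labels, String from, String to) {
        String v = labels.get(from);
        if (v != null) {
            labels.putIfAbsent(to, v);
        }
    }

    public static void main(String[] args) {
        Map<String, String> labels = new LinkedHashMap<>();
        labels.put("service.name", "my-ai-gateway");
        labels = dotsToUnderscores(labels);
        mapAsFallback(labels, "service_name", "job_name");
        System.out.println(labels); // {service_name=my-ai-gateway, job_name=my-ai-gateway}
    }
}
```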
---
 docs/en/swip/SWIP-10/SWIP.md                       | 123 +++--
 docs/en/swip/SWIP-10/kind-test-resources.yaml      | 247 ----------
 docs/en/swip/SWIP-10/kind-test-setup.sh            | 108 -----
 .../oap/meter/analyzer/v2/dsl/SampleFamily.java    |  21 +
 .../skywalking/oap/server/core/analysis/Layer.java |   8 +-
 .../ui/template/UITemplateInitializer.java         |   1 +
 .../otlp/OpenTelemetryMetricRequestProcessor.java  |  35 +-
 .../src/main/resources/application.yml             |   4 +-
 .../src/main/resources/lal/envoy-ai-gateway.yaml   |  47 ++
 .../envoy-ai-gateway/gateway-instance.yaml         |  98 ++++
 .../envoy-ai-gateway/gateway-service.yaml          | 103 ++++
 .../envoy-ai-gateway-instance.json                 | 509 ++++++++++++++++++++
 .../envoy_ai_gateway/envoy-ai-gateway-root.json    |  63 +++
 .../envoy_ai_gateway/envoy-ai-gateway-service.json | 528 +++++++++++++++++++++
 .../resources/ui-initialized-templates/menu.yaml   |   5 +
 .../cases/envoy-ai-gateway/docker-compose.yml      |  85 ++++
 test/e2e-v2/cases/envoy-ai-gateway/e2e.yaml        |  61 +++
 .../envoy-ai-gateway/envoy-ai-gateway-cases.yaml   |  46 ++
 .../expected/metrics-has-value-label.yml           |  38 ++
 .../expected/metrics-has-value.yml                 |  34 ++
 .../cases/envoy-ai-gateway/expected/service.yml    |  24 +
 21 files changed, 1777 insertions(+), 411 deletions(-)

diff --git a/docs/en/swip/SWIP-10/SWIP.md b/docs/en/swip/SWIP-10/SWIP.md
index 1910b140e6..98a9e89474 100644
--- a/docs/en/swip/SWIP-10/SWIP.md
+++ b/docs/en/swip/SWIP-10/SWIP.md
@@ -81,24 +81,27 @@ filter: "{ tags -> tags.job_name == 'envoy-ai-gateway' }"
 
 | SkyWalking Entity | Source | Example |
 |---|---|---|
-| **Service** | `aigw.service` resource attribute (K8s Deployment/Service name, set via CRD) | `envoy-ai-gateway-basic` |
-| **Service Instance** | `service.instance.id` resource attribute (pod name, set via CRD + Downward API) | `aigw-pod-7b9f4d8c5` |
+| **Service** | `OTEL_SERVICE_NAME` / `service.name` (per-deployment gateway name) | `my-ai-gateway` |
+| **Service Instance** | `service.instance.id` resource attribute (pod name, set via Downward API) | `aigw-pod-7b9f4d8c5` |
 
-Each Kubernetes Gateway deployment is a separate SkyWalking **service**. Each pod (ext_proc replica) is a
-**service instance**. Neither attribute is emitted by the AI Gateway by default — both must be explicitly
-set via `OTEL_RESOURCE_ATTRIBUTES` in the `GatewayConfig` CRD (see below).
+Each Kubernetes Gateway deployment sets its own `OTEL_SERVICE_NAME` (the standard OTel env var) as the
+SkyWalking **service** name. Each pod is a **service instance** identified by `service.instance.id`.
 
-The **layer** (`ENVOY_AI_GATEWAY`) is set by MAL/LAL rules based on the `job_name` filter, not by the
-client. This follows the same pattern as other SkyWalking OTel integrations (e.g., ActiveMQ, K8s).
+The `job_name` resource attribute is set explicitly to the fixed value `envoy-ai-gateway` for MAL/LAL
+rule routing. This is separate from `service.name` — all AI Gateway deployments share the same
+`job_name` for routing, but each has its own `service.name` for entity identity.
+
+The **layer** (`ENVOY_AI_GATEWAY`) is set via `service.layer` resource attribute and used by LAL for
+log routing. MAL rules use `job_name` for metric routing.
 
 Provider and model are **metric-level labels**, not separate entities in this layer. They are used for
 fine-grained metric breakdowns within the gateway service dashboards rather than being modeled as separate
 services (unlike the agent-based `VIRTUAL_GENAI` layer where provider=service, model=instance).
 
-The MAL `expSuffix` uses the `aigw_service` tag (dots converted to underscores by OTel receiver) as the
-SkyWalking service name and `service_instance_id` as the instance name:
+The MAL `expSuffix` uses the `service_name` tag as the SkyWalking service name and `service_instance_id`
+as the instance name:
 ```yaml
-expSuffix: service(['aigw_service'], Layer.ENVOY_AI_GATEWAY).instance(['aigw_service', 'service_instance_id'])
+expSuffix: service(['service_name'], Layer.ENVOY_AI_GATEWAY).instance(['service_name', 'service_instance_id'])
 ```
 
 #### Complete Kubernetes Setup Example
@@ -127,36 +130,33 @@ spec:
   extProc:
     kubernetes:
       env:
-        # job_name — fixed value for MAL/LAL rule routing (same for ALL AI Gateway deployments)
+        # SkyWalking service name = Gateway CRD name (auto-resolved from pod label)
+        # OTEL_SERVICE_NAME is the standard OTel env var for service.name
+        - name: GATEWAY_NAME
+          valueFrom:
+            fieldRef:
+              fieldPath: metadata.labels['gateway.envoyproxy.io/owning-gateway-name']
         - name: OTEL_SERVICE_NAME
-          value: "envoy-ai-gateway"
+          value: "$(GATEWAY_NAME)"
        # OTLP endpoint — SkyWalking OAP gRPC receiver
        - name: OTEL_EXPORTER_OTLP_ENDPOINT
          value: "http://skywalking-oap.skywalking:11800"
        - name: OTEL_EXPORTER_OTLP_PROTOCOL
          value: "grpc"
-        # Enable OTLP for both metrics and access logs
        - name: OTEL_METRICS_EXPORTER
          value: "otlp"
        - name: OTEL_LOGS_EXPORTER
          value: "otlp"
-        # Gateway name = Gateway CRD metadata.name (e.g., "my-ai-gateway")
-        # Read from pod label gateway.envoyproxy.io/owning-gateway-name,
-        # which is auto-set by the Envoy Gateway controller on every envoy pod.
-        - name: GATEWAY_NAME
-          valueFrom:
-            fieldRef:
-              fieldPath: metadata.labels['gateway.envoyproxy.io/owning-gateway-name']
-        # Pod name (e.g., "envoy-default-my-ai-gateway-76d02f2b-xxx")
+        # Pod name for instance identity
        - name: POD_NAME
          valueFrom:
            fieldRef:
              fieldPath: metadata.name
-        # aigw.service → SkyWalking service name (= Gateway CRD name, auto-resolved)
-        # service.instance.id → SkyWalking instance name (= pod name, auto-resolved)
-        # $(VAR) substitution references the valueFrom env vars defined above.
+        # job_name — fixed routing tag for MAL/LAL rules (same for ALL AI Gateway deployments)
+        # service.instance.id — SkyWalking instance name (= pod name)
+        # service.layer — routes logs to ENVOY_AI_GATEWAY LAL rules
        - name: OTEL_RESOURCE_ATTRIBUTES
-          value: "aigw.service=$(GATEWAY_NAME),service.instance.id=$(POD_NAME)"
+          value: "job_name=envoy-ai-gateway,service.instance.id=$(POD_NAME),service.layer=ENVOY_AI_GATEWAY"
 ---
 # 3. Gateway — references the GatewayConfig via annotation
 apiVersion: gateway.networking.k8s.io/v1
@@ -227,15 +227,16 @@ spec:
 
 | Env Var / Resource Attribute | SkyWalking Concept | Example Value |
 |---|---|---|
-| `OTEL_SERVICE_NAME` | `job_name` (MAL/LAL rule routing) | `envoy-ai-gateway` (fixed for all deployments) |
-| `aigw.service` | Service name | `my-ai-gateway` (auto-resolved from gateway name label) |
-| `service.instance.id` | Instance name | `envoy-default-my-ai-gateway-...` (auto-resolved from pod name) |
+| `OTEL_SERVICE_NAME` | Service name | `my-ai-gateway` (auto-resolved from Gateway CRD name) |
+| `job_name` (in `OTEL_RESOURCE_ATTRIBUTES`) | MAL/LAL rule routing | `envoy-ai-gateway` (fixed for all deployments) |
+| `service.instance.id` (in `OTEL_RESOURCE_ATTRIBUTES`) | Instance name | `envoy-default-my-ai-gateway-...` (auto-resolved from pod name) |
+| `service.layer` (in `OTEL_RESOURCE_ATTRIBUTES`) | LAL log routing | `ENVOY_AI_GATEWAY` (fixed) |
 
 **No manual per-gateway configuration needed** for service and instance names:
 - `GATEWAY_NAME` is auto-resolved from the pod label `gateway.envoyproxy.io/owning-gateway-name`,
   which is set automatically by the Envoy Gateway controller on every envoy pod.
+- `OTEL_SERVICE_NAME` uses `$(GATEWAY_NAME)` substitution to set the per-deployment service name.
 - `POD_NAME` is auto-resolved from the pod name via the Downward API.
-- Both are injected into `OTEL_RESOURCE_ATTRIBUTES` via standard Kubernetes `$(VAR)` substitution.
 
 The `GatewayConfig.spec.extProc.kubernetes.env` field accepts full `corev1.EnvVar` objects (including
 `valueFrom`), merged into the ext_proc container by the gateway mutator webhook. Verified on Kind
@@ -247,8 +248,15 @@ The ext_proc runs in-process (not as a subprocess), so there is no env var propa
 
 ### 3. MAL Rules for OTLP Metrics
 
-Create `oap-server/server-starter/src/main/resources/otel-rules/envoy-ai-gateway/` with MAL rules consuming
-the 4 GenAI metrics from Envoy AI Gateway.
+Create `oap-server/server-starter/src/main/resources/otel-rules/envoy-ai-gateway/` with 2 MAL rule files
+consuming the 4 GenAI metrics from Envoy AI Gateway. Since `expSuffix` is file-level, service and
+instance scopes need separate files. Provider and model breakdowns share the same `expSuffix` as their
+parent scope, so they are included in the same file.
+
+| File | `expSuffix` | Contains |
+|---|---|---|
+| `gateway-service.yaml` | `service(['service_name'], Layer.ENVOY_AI_GATEWAY)` | Service aggregates + per-provider breakdown + per-model breakdown |
+| `gateway-instance.yaml` | `instance(['service_name'], ['service_instance_id'], Layer.ENVOY_AI_GATEWAY)` | Instance aggregates + per-provider breakdown + per-model breakdown |
 
 All MAL rule files use the `job_name` filter to match only AI Gateway traffic:
 ```yaml
@@ -282,7 +290,7 @@ filter: "{ tags -> tags.job_name == 'envoy-ai-gateway' }"
 | Time Per Output Token Percentile | ms | `meter_envoy_ai_gw_tpot_percentile` | P50/P75/P90/P95/P99 inter-token latency |
 | Estimated Cost | cost/min | `meter_envoy_ai_gw_estimated_cost` | Estimated cost per minute (from token counts × config pricing) |
 
-**Per-provider breakdown metrics (labeled, within gateway service):**
+**Per-provider breakdown metrics (service scope):**
 
 | Monitoring Panel | Unit | Metric Name | Description |
 |---|---|---|---|
@@ -290,7 +298,7 @@ filter: "{ tags -> tags.job_name == 'envoy-ai-gateway' }"
 | Provider Token Usage | tokens/min | `meter_envoy_ai_gw_provider_token_rate` | Token rate by provider and token type |
 | Provider Latency Avg | ms | `meter_envoy_ai_gw_provider_latency_avg` | Average latency by provider |
 
-**Per-model breakdown metrics (labeled, within gateway service):**
+**Per-model breakdown metrics (service scope):**
 
 | Monitoring Panel | Unit | Metric Name | Description |
 |---|---|---|---|
@@ -300,6 +308,42 @@ filter: "{ tags -> tags.job_name == 'envoy-ai-gateway' }"
 | Model TTFT Avg | ms | `meter_envoy_ai_gw_model_ttft_avg` | Average TTFT by model |
 | Model TPOT Avg | ms | `meter_envoy_ai_gw_model_tpot_avg` | Average inter-token latency by model |
 
+**Instance-level (per-pod) aggregate metrics:**
+
+Same metrics as service-level but scoped to individual pods via `expSuffix: service([...]).instance([...])`.
+
+| Monitoring Panel | Unit | Metric Name | Description |
+|---|---|---|---|
+| Request CPM | count/min | `meter_envoy_ai_gw_instance_request_cpm` | Requests per minute per pod |
+| Request Latency Avg | ms | `meter_envoy_ai_gw_instance_request_latency_avg` | Average request duration per pod |
+| Request Latency Percentile | ms | `meter_envoy_ai_gw_instance_request_latency_percentile` | P50/P75/P90/P95/P99 per pod |
+| Input Tokens Rate | tokens/min | `meter_envoy_ai_gw_instance_input_token_rate` | Input tokens per minute per pod |
+| Output Tokens Rate | tokens/min | `meter_envoy_ai_gw_instance_output_token_rate` | Output tokens per minute per pod |
+| Total Tokens Rate | tokens/min | `meter_envoy_ai_gw_instance_total_token_rate` | Total tokens per minute per pod |
+| TTFT Avg | ms | `meter_envoy_ai_gw_instance_ttft_avg` | Average TTFT per pod |
+| TTFT Percentile | ms | `meter_envoy_ai_gw_instance_ttft_percentile` | P50/P75/P90/P95/P99 TTFT per pod |
+| TPOT Avg | ms | `meter_envoy_ai_gw_instance_tpot_avg` | Average inter-token latency per pod |
+| TPOT Percentile | ms | `meter_envoy_ai_gw_instance_tpot_percentile` | P50/P75/P90/P95/P99 TPOT per pod |
+| Estimated Cost | cost/min | `meter_envoy_ai_gw_instance_estimated_cost` | Estimated cost per minute per pod |
+
+**Per-provider breakdown metrics (instance scope):**
+
+| Monitoring Panel | Unit | Metric Name | Description |
+|---|---|---|---|
+| Provider Request CPM | count/min | `meter_envoy_ai_gw_instance_provider_request_cpm` | Requests per minute by provider per pod |
+| Provider Token Usage | tokens/min | `meter_envoy_ai_gw_instance_provider_token_rate` | Token rate by provider per pod |
+| Provider Latency Avg | ms | `meter_envoy_ai_gw_instance_provider_latency_avg` | Average latency by provider per pod |
+
+**Per-model breakdown metrics (instance scope):**
+
+| Monitoring Panel | Unit | Metric Name | Description |
+|---|---|---|---|
+| Model Request CPM | count/min | `meter_envoy_ai_gw_instance_model_request_cpm` | Requests per minute by model per pod |
+| Model Token Usage | tokens/min | `meter_envoy_ai_gw_instance_model_token_rate` | Token rate by model per pod |
+| Model Latency Avg | ms | `meter_envoy_ai_gw_instance_model_latency_avg` | Average latency by model per pod |
+| Model TTFT Avg | ms | `meter_envoy_ai_gw_instance_model_ttft_avg` | Average TTFT by model per pod |
+| Model TPOT Avg | ms | `meter_envoy_ai_gw_instance_model_tpot_avg` | Average inter-token latency by model per pod |
+
 #### Cost Estimation
 
 Reuse the same `gen-ai-config.yml` pricing configuration from PR #13745. The MAL rules will:
@@ -335,7 +379,7 @@ OTLP gRPC to the same endpoint as metrics. No FluentBit or external log collecto
 
 The OTLP log sink shares the same `GatewayConfig` CRD env vars as metrics (see Section 2).
 `OTEL_LOGS_EXPORTER=otlp` and `OTEL_EXPORTER_OTLP_ENDPOINT` enable the log sink. The
-`OTEL_RESOURCE_ATTRIBUTES` (including `aigw.service` and `service.instance.id`) are injected as
+`OTEL_RESOURCE_ATTRIBUTES` (including `job_name`, `service.instance.id`, and `service.layer`) are injected as
 resource attributes on each OTLP log record, ensuring consistency between metrics and access logs.
 
 Additionally, enable token metadata population in `AIGatewayRoute` so token counts appear in access logs:
@@ -360,7 +404,7 @@ Each access log record is pushed as an OTLP LogRecord with the following structu
 
 | Attribute | Example | Notes |
 |---|---|---|
-| `aigw.service` | `envoy-ai-gateway-basic` | From `OTEL_RESOURCE_ATTRIBUTES` — SkyWalking service name |
+| `job_name` | `envoy-ai-gateway` | From `OTEL_RESOURCE_ATTRIBUTES` — MAL/LAL routing tag |
 | `service.instance.id` | `aigw-pod-7b9f4d8c5` | From `OTEL_RESOURCE_ATTRIBUTES` — SkyWalking instance name |
 | `service.name` | `envoy-ai-gateway` | From `OTEL_SERVICE_NAME` — mapped to `job_name` for rule routing |
 | `node_name` | `default-aigw-run-85f8cf28` | Envoy node identifier |
@@ -450,7 +494,8 @@ The LAL rules would:
 - `envoy-ai-gateway-root.json` — Root list view of all AI Gateway services.
 - `envoy-ai-gateway-service.json` — Service dashboard: Request CPM, latency, token rates, TTFT, TPOT,
   estimated cost, with provider and model breakdown panels.
-- `envoy-ai-gateway-instance.json` — Instance (pod) level dashboard.
+- `envoy-ai-gateway-instance.json` — Instance (pod) level dashboard: Same aggregate metrics as service
+  dashboard but scoped to a single pod, plus per-provider and per-model breakdown panels for that pod.
 
 **UI side** — A separate PR in [skywalking-booster-ui](https://github.com/apache/skywalking-booster-ui)
 is needed for i18n menu entries (similar to
@@ -481,14 +526,14 @@ Apply the `GatewayConfig` CRD from Section 2 to your AI Gateway deployment. Key
 
 | Env Var | Value | Purpose |
 |---|---|---|
-| `OTEL_SERVICE_NAME` | `envoy-ai-gateway` | Routes metrics/logs to correct MAL/LAL rules via `job_name` (fixed for all deployments) |
+| `OTEL_SERVICE_NAME` | `$(GATEWAY_NAME)` | SkyWalking service name (per-deployment, auto-resolved from Gateway CRD name) |
 | `OTEL_EXPORTER_OTLP_ENDPOINT` | `http://skywalking-oap:11800` | SkyWalking OAP OTLP receiver |
 | `OTEL_EXPORTER_OTLP_PROTOCOL` | `grpc` | OTLP transport |
 | `OTEL_METRICS_EXPORTER` | `otlp` | Enable OTLP metrics push |
 | `OTEL_LOGS_EXPORTER` | `otlp` | Enable OTLP access log push |
 | `GATEWAY_NAME` | (auto from label) | Auto-resolved from pod label `gateway.envoyproxy.io/owning-gateway-name` |
 | `POD_NAME` | (auto from Downward API) | Auto-resolved from pod name |
-| `OTEL_RESOURCE_ATTRIBUTES` | `aigw.service=$(GATEWAY_NAME),service.instance.id=$(POD_NAME)` | SkyWalking service name (auto) + instance ID (auto) |
+| `OTEL_RESOURCE_ATTRIBUTES` | `job_name=envoy-ai-gateway,service.instance.id=$(POD_NAME),service.layer=ENVOY_AI_GATEWAY` | Routing tag (fixed) + instance ID (auto) + layer for LAL routing |
 
 ### Step 2: Configure SkyWalking OAP
 
@@ -529,7 +574,7 @@ With `OTEL_RESOURCE_ATTRIBUTES=service.instance.id=test-instance-456` and
 | `telemetry.sdk.name` | `opentelemetry` | SDK metadata |
 | `telemetry.sdk.version` | `1.40.0` | SDK metadata |
 
-**Not present by default (without explicit env config):** `service.instance.id`, `aigw.service`, `host.name`.
+**Not present by default (without explicit env config):** `service.instance.id`, `job_name`, `service.layer`, `host.name`.
 These must be explicitly set via `OTEL_RESOURCE_ATTRIBUTES` in the `GatewayConfig` CRD (see Section 2).
 
 `resource.WithFromEnv()` (source: `internal/metrics/metrics.go:35-94`) is called inside a conditional
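Pulling the routing pieces of the SWIP.md changes together, a minimal service-scope MAL rule file could be sketched as below. This is an illustrative assumption, not the shipped `gateway-service.yaml`; in particular the source metric name `gen_ai_requests_total` and the rule body are hypothetical.

```yaml
# Hypothetical sketch of a service-scope MAL rule file for this layer.
# filter admits only AI Gateway samples; expSuffix binds the entity identity.
filter: "{ tags -> tags.job_name == 'envoy-ai-gateway' }"
expSuffix: service(['service_name'], Layer.ENVOY_AI_GATEWAY)
metricPrefix: meter_envoy_ai_gw
metricsRules:
  # Requests per minute per gateway service (metric name is illustrative).
  - name: request_cpm
    exp: gen_ai_requests_total.sum(['service_name']).rate('PT1M')
```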
diff --git a/docs/en/swip/SWIP-10/kind-test-resources.yaml b/docs/en/swip/SWIP-10/kind-test-resources.yaml
deleted file mode 100644
index ff5d5bd790..0000000000
--- a/docs/en/swip/SWIP-10/kind-test-resources.yaml
+++ /dev/null
@@ -1,247 +0,0 @@
-# Licensed to the Apache Software Foundation (ASF) under one
-# or more contributor license agreements.  See the NOTICE file
-# distributed with this work for additional information
-# regarding copyright ownership.  The ASF licenses this file
-# to you under the Apache License, Version 2.0 (the
-# "License"); you may not use this file except in compliance
-# with the License.  You may obtain a copy of the License at
-#
-#   http://www.apache.org/licenses/LICENSE-2.0
-#
-# Unless required by applicable law or agreed to in writing,
-# software distributed under the License is distributed on an
-# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
-# KIND, either express or implied.  See the License for the
-# specific language governing permissions and limitations
-# under the License.
-
-# SWIP-10 Kind Test Resources
-# Deploy with: kubectl apply -f kind-test-resources.yaml
-#
-# This file contains all K8s resources for the SWIP-10 local verification:
-# - Ollama (in-cluster LLM backend)
-# - OTel Collector (debug exporter for capturing OTLP payloads)
-# - AI Gateway CRDs (GatewayClass, GatewayConfig, Gateway, AIGatewayRoute, AIServiceBackend, Backend)
-
-# --- Ollama (in-cluster) ---
-apiVersion: apps/v1
-kind: Deployment
-metadata:
-  name: ollama
-  namespace: default
-spec:
-  replicas: 1
-  selector:
-    matchLabels:
-      app: ollama
-  template:
-    metadata:
-      labels:
-        app: ollama
-    spec:
-      containers:
-        - name: ollama
-          image: ollama/ollama:latest
-          imagePullPolicy: Never
-          ports:
-            - containerPort: 11434
-          resources:
-            requests:
-              cpu: "500m"
-              memory: "2Gi"
----
-apiVersion: v1
-kind: Service
-metadata:
-  name: ollama
-  namespace: default
-spec:
-  selector:
-    app: ollama
-  ports:
-    - port: 11434
-      targetPort: 11434
----
-# --- OTel Collector (debug exporter) ---
-apiVersion: v1
-kind: ConfigMap
-metadata:
-  name: otel-collector-config
-  namespace: default
-data:
-  config.yaml: |
-    receivers:
-      otlp:
-        protocols:
-          grpc:
-            endpoint: 0.0.0.0:4317
-    exporters:
-      debug:
-        verbosity: detailed
-    service:
-      pipelines:
-        metrics:
-          receivers: [otlp]
-          exporters: [debug]
-        logs:
-          receivers: [otlp]
-          exporters: [debug]
----
-apiVersion: apps/v1
-kind: Deployment
-metadata:
-  name: otel-collector
-  namespace: default
-spec:
-  replicas: 1
-  selector:
-    matchLabels:
-      app: otel-collector
-  template:
-    metadata:
-      labels:
-        app: otel-collector
-    spec:
-      containers:
-        - name: collector
-          image: otel/opentelemetry-collector:latest
-          imagePullPolicy: Never
-          ports:
-            - containerPort: 4317
-          volumeMounts:
-            - name: config
-              mountPath: /etc/otelcol/config.yaml
-              subPath: config.yaml
-      volumes:
-        - name: config
-          configMap:
-            name: otel-collector-config
----
-apiVersion: v1
-kind: Service
-metadata:
-  name: otel-collector
-  namespace: default
-spec:
-  selector:
-    app: otel-collector
-  ports:
-    - port: 4317
-      targetPort: 4317
----
-# --- AI Gateway CRDs ---
-# 1. GatewayClass
-apiVersion: gateway.networking.k8s.io/v1
-kind: GatewayClass
-metadata:
-  name: envoy-ai-gateway
-spec:
-  controllerName: gateway.envoyproxy.io/gatewayclass-controller
----
-# 2. GatewayConfig — OTLP configuration for SkyWalking
-#    Verified: GATEWAY_NAME auto-resolves from pod label
-#    gateway.envoyproxy.io/owning-gateway-name via Downward API
-apiVersion: aigateway.envoyproxy.io/v1alpha1
-kind: GatewayConfig
-metadata:
-  name: sw-test-config
-  namespace: default
-spec:
-  extProc:
-    kubernetes:
-      env:
-        # job_name for MAL/LAL rule routing (fixed for all deployments)
-        - name: OTEL_SERVICE_NAME
-          value: "envoy-ai-gateway"
-        # OTLP endpoint — OTel Collector (or SkyWalking OAP in production)
-        - name: OTEL_EXPORTER_OTLP_ENDPOINT
-          value: "http://otel-collector.default:4317"
-        - name: OTEL_EXPORTER_OTLP_PROTOCOL
-          value: "grpc"
-        # Enable OTLP for both metrics and access logs
-        - name: OTEL_METRICS_EXPORTER
-          value: "otlp"
-        - name: OTEL_LOGS_EXPORTER
-          value: "otlp"
-        - name: OTEL_METRIC_EXPORT_INTERVAL
-          value: "5000"
-        # Gateway name = Gateway CRD metadata.name (e.g., "my-ai-gateway")
-        # Read from pod label gateway.envoyproxy.io/owning-gateway-name,
-        # which is auto-set by the Envoy Gateway controller on every envoy pod.
-        - name: GATEWAY_NAME
-          valueFrom:
-            fieldRef:
-              fieldPath: metadata.labels['gateway.envoyproxy.io/owning-gateway-name']
-        # Pod name (e.g., "envoy-default-my-ai-gateway-76d02f2b-xxx")
-        - name: POD_NAME
-          valueFrom:
-            fieldRef:
-              fieldPath: metadata.name
-        # aigw.service → SkyWalking service name (= Gateway CRD name, auto-resolved)
-        # service.instance.id → SkyWalking instance name (= pod name, auto-resolved)
-        # $(VAR) substitution references the valueFrom env vars defined above.
-        - name: OTEL_RESOURCE_ATTRIBUTES
-          value: "aigw.service=$(GATEWAY_NAME),service.instance.id=$(POD_NAME)"
----
-# 3. Gateway — references GatewayConfig via annotation
-apiVersion: gateway.networking.k8s.io/v1
-kind: Gateway
-metadata:
-  name: my-ai-gateway
-  namespace: default
-  annotations:
-    aigateway.envoyproxy.io/gateway-config: sw-test-config
-spec:
-  gatewayClassName: envoy-ai-gateway
-  listeners:
-    - name: http
-      protocol: HTTP
-      port: 80
----
-# 4. AIGatewayRoute — routing + token metadata for access logs
-apiVersion: aigateway.envoyproxy.io/v1alpha1
-kind: AIGatewayRoute
-metadata:
-  name: my-ai-gateway-route
-  namespace: default
-spec:
-  parentRefs:
-    - name: my-ai-gateway
-      kind: Gateway
-      group: gateway.networking.k8s.io
-  llmRequestCosts:
-    - metadataKey: llm_input_token
-      type: InputToken
-    - metadataKey: llm_output_token
-      type: OutputToken
-    - metadataKey: llm_total_token
-      type: TotalToken
-  rules:
-    - backendRefs:
-        - name: ollama-backend
----
-# 5. AIServiceBackend + Backend — Ollama in-cluster
-apiVersion: aigateway.envoyproxy.io/v1alpha1
-kind: AIServiceBackend
-metadata:
-  name: ollama-backend
-  namespace: default
-spec:
-  schema:
-    name: OpenAI
-    prefix: "/v1"
-  backendRef:
-    name: ollama-backend
-    kind: Backend
-    group: gateway.envoyproxy.io
----
-apiVersion: gateway.envoyproxy.io/v1alpha1
-kind: Backend
-metadata:
-  name: ollama-backend
-  namespace: default
-spec:
-  endpoints:
-    - fqdn:
-        hostname: ollama.default.svc.cluster.local
-        port: 11434
diff --git a/docs/en/swip/SWIP-10/kind-test-setup.sh b/docs/en/swip/SWIP-10/kind-test-setup.sh
deleted file mode 100644
index 4fd3afcc46..0000000000
--- a/docs/en/swip/SWIP-10/kind-test-setup.sh
+++ /dev/null
@@ -1,108 +0,0 @@
-#!/bin/bash
-# Licensed to the Apache Software Foundation (ASF) under one
-# or more contributor license agreements.  See the NOTICE file
-# distributed with this work for additional information
-# regarding copyright ownership.  The ASF licenses this file
-# to you under the Apache License, Version 2.0 (the
-# "License"); you may not use this file except in compliance
-# with the License.  You may obtain a copy of the License at
-#
-#   http://www.apache.org/licenses/LICENSE-2.0
-#
-# Unless required by applicable law or agreed to in writing,
-# software distributed under the License is distributed on an
-# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
-# KIND, either express or implied.  See the License for the
-# specific language governing permissions and limitations
-# under the License.
-
-# SWIP-10 Local Verification: Envoy AI Gateway + SkyWalking OTLP on Kind
-#
-# Prerequisites:
-#   - kind, kubectl, helm, docker installed
-#   - Docker images pulled (or internet access for Kind to pull)
-#
-# This script sets up a Kind cluster with:
-#   - Envoy Gateway (v1.3.3) + AI Gateway controller (v0.5.0)
-#   - Ollama (in-cluster) with a small model
-#   - OTel Collector (debug exporter) to capture OTLP metrics and logs
-#   - AI Gateway configured with SkyWalking-compatible OTLP resource attributes
-#
-# Usage:
-#   ./kind-test-setup.sh          # Full setup
-#   ./kind-test-setup.sh cleanup  # Delete the cluster
-
-set -e
-
-CLUSTER_NAME="aigw-swip10-test"
-
-if [ "$1" = "cleanup" ]; then
-  echo "Cleaning up..."
-  kind delete cluster --name $CLUSTER_NAME
-  exit 0
-fi
-
-echo "=== Step 1: Create Kind cluster ==="
-kind create cluster --name $CLUSTER_NAME
-
-echo "=== Step 2: Pre-load Docker images ==="
-IMAGES=(
-  "envoyproxy/ai-gateway-controller:v0.5.0"
-  "envoyproxy/ai-gateway-extproc:v0.5.0"
-  "envoyproxy/gateway:v1.3.3"
-  "envoyproxy/envoy:distroless-v1.33.3"
-  "otel/opentelemetry-collector:latest"
-  "ollama/ollama:latest"
-)
-for img in "${IMAGES[@]}"; do
-  echo "Pulling $img..."
-  docker pull "$img"
-  echo "Loading $img into Kind..."
-  kind load docker-image "$img" --name $CLUSTER_NAME
-done
-
-echo "=== Step 3: Install Envoy Gateway ==="
-# enableBackend is required for Backend resources used by AIServiceBackend
-helm install eg oci://docker.io/envoyproxy/gateway-helm \
-  --version v1.3.3 -n envoy-gateway-system --create-namespace \
-  --set config.envoyGateway.extensionApis.enableBackend=true
-kubectl wait --for=condition=available deployment/envoy-gateway \
-  -n envoy-gateway-system --timeout=120s
-
-echo "=== Step 4: Install AI Gateway ==="
-helm upgrade -i aieg-crd oci://docker.io/envoyproxy/ai-gateway-crds-helm \
-  --namespace envoy-ai-gateway-system --create-namespace
-helm upgrade -i aieg oci://docker.io/envoyproxy/ai-gateway-helm \
-  --namespace envoy-ai-gateway-system --create-namespace
-kubectl wait --for=condition=available deployment/ai-gateway-controller \
-  -n envoy-ai-gateway-system --timeout=120s
-
-echo "=== Step 5: Deploy test resources ==="
-kubectl apply -f kind-test-resources.yaml
-
-echo "=== Step 6: Wait for pods ==="
-sleep 10
-kubectl wait --for=condition=available deployment/ollama -n default --timeout=120s
-kubectl wait --for=condition=available deployment/otel-collector -n default --timeout=60s
-
-echo "=== Step 7: Pull Ollama model ==="
-OLLAMA_POD=$(kubectl get pod -l app=ollama -o jsonpath='{.items[0].metadata.name}')
-kubectl exec "$OLLAMA_POD" -- ollama pull qwen2.5:0.5b
-
-echo "=== Step 8: Wait for Envoy pod ==="
-sleep 30
-kubectl get pods -A
-
-echo ""
-echo "=== Setup complete ==="
-echo "To test:"
-echo "  kubectl port-forward -n envoy-gateway-system svc/envoy-default-my-ai-gateway-76d02f2b 8080:80 &"
-echo "  curl -s --noproxy '*' http://localhost:8080/v1/chat/completions \\"
-echo "    -H 'Content-Type: application/json' \\"
-echo "    -d '{\"model\":\"qwen2.5:0.5b\",\"messages\":[{\"role\":\"user\",\"content\":\"Say hi\"}]}'"
-echo ""
-echo "To check OTLP output:"
-echo "  kubectl logs -l app=otel-collector | grep -A 20 'ResourceMetrics\\|ResourceLog'"
-echo ""
-echo "To cleanup:"
-echo "  ./kind-test-setup.sh cleanup"
diff --git a/oap-server/analyzer/meter-analyzer/src/main/java/org/apache/skywalking/oap/meter/analyzer/v2/dsl/SampleFamily.java b/oap-server/analyzer/meter-analyzer/src/main/java/org/apache/skywalking/oap/meter/analyzer/v2/dsl/SampleFamily.java
index fa392f9265..980b9d8a1e 100644
--- a/oap-server/analyzer/meter-analyzer/src/main/java/org/apache/skywalking/oap/meter/analyzer/v2/dsl/SampleFamily.java
+++ b/oap-server/analyzer/meter-analyzer/src/main/java/org/apache/skywalking/oap/meter/analyzer/v2/dsl/SampleFamily.java
@@ -98,6 +98,27 @@ public class SampleFamily {
 
     public final RunningContext context;
 
+    @Override
+    public String toString() {
+        if (samples.length == 0) {
+            return "SampleFamily{EMPTY}";
+        }
+        final StringBuilder sb = new StringBuilder("SampleFamily{samples=[\n");
+        for (final Sample s : samples) {
+            sb.append("  ").append(s.getName()).append(s.getLabels()).append(" ").append(s.getValue()).append('\n');
+        }
+        sb.append("]}");
+        return sb.toString();
+    }
+
+    /**
+     * Dump this SampleFamily for debugging.
+     */
+    public SampleFamily debugDump() {
+        log.info("{}", this);
+        return this;
+    }
+
     /**
      * Following operations are used in DSL
      */
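Because the new debugDump() logs the family at INFO level and returns this, it can be chained at any point of a MAL expression while tuning rules. A hypothetical fragment (the source metric name is illustrative, not from the shipped rules):

```yaml
metricsRules:
  - name: request_cpm
    # debugDump() prints the intermediate SampleFamily, then passes it through unchanged.
    exp: gen_ai_requests_total.sum(['service_name']).debugDump().rate('PT1M')
```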
diff --git a/oap-server/server-core/src/main/java/org/apache/skywalking/oap/server/core/analysis/Layer.java b/oap-server/server-core/src/main/java/org/apache/skywalking/oap/server/core/analysis/Layer.java
index 284192ee8d..54076b72d5 100644
--- a/oap-server/server-core/src/main/java/org/apache/skywalking/oap/server/core/analysis/Layer.java
+++ b/oap-server/server-core/src/main/java/org/apache/skywalking/oap/server/core/analysis/Layer.java
@@ -272,7 +272,13 @@ public enum Layer {
      * Virtual GenAI is a virtual layer used to represent and monitor remote, uninstrumented
      * Generative AI providers.
      */
-    VIRTUAL_GENAI(45, false);
+    VIRTUAL_GENAI(45, false),
+
+    /**
+     * Envoy AI Gateway is an AI/LLM traffic gateway built on Envoy Proxy,
+     * providing observability for GenAI API traffic.
+     */
+    ENVOY_AI_GATEWAY(46, true);
 
     private final int value;
     /**
diff --git a/oap-server/server-core/src/main/java/org/apache/skywalking/oap/server/core/management/ui/template/UITemplateInitializer.java b/oap-server/server-core/src/main/java/org/apache/skywalking/oap/server/core/management/ui/template/UITemplateInitializer.java
index 525ccf11e2..4651101e84 100644
--- a/oap-server/server-core/src/main/java/org/apache/skywalking/oap/server/core/management/ui/template/UITemplateInitializer.java
+++ b/oap-server/server-core/src/main/java/org/apache/skywalking/oap/server/core/management/ui/template/UITemplateInitializer.java
@@ -82,6 +82,7 @@ public class UITemplateInitializer {
         Layer.FLINK.name(),
         Layer.BANYANDB.name(),
         Layer.VIRTUAL_GENAI.name(),
+        Layer.ENVOY_AI_GATEWAY.name(),
         "custom"
     };
     private final UITemplateManagementService uiTemplateManagementService;
diff --git a/oap-server/server-receiver-plugin/otel-receiver-plugin/src/main/java/org/apache/skywalking/oap/server/receiver/otel/otlp/OpenTelemetryMetricRequestProcessor.java b/oap-server/server-receiver-plugin/otel-receiver-plugin/src/main/java/org/apache/skywalking/oap/server/receiver/otel/otlp/OpenTelemetryMetricRequestProcessor.java
index 6b26a86656..315892fb05 100644
--- a/oap-server/server-receiver-plugin/otel-receiver-plugin/src/main/java/org/apache/skywalking/oap/server/receiver/otel/otlp/OpenTelemetryMetricRequestProcessor.java
+++ b/oap-server/server-receiver-plugin/otel-receiver-plugin/src/main/java/org/apache/skywalking/oap/server/receiver/otel/otlp/OpenTelemetryMetricRequestProcessor.java
@@ -70,7 +70,12 @@ public class OpenTelemetryMetricRequestProcessor implements Service {
 
     private final OtelMetricReceiverConfig config;
 
-    private static final Map<String, String> LABEL_MAPPINGS =
+    /**
+     * Fallback label mappings: if the target label (value) is absent in resource attributes,
+     * copy the source label (key) as the target. The source label is always kept as-is.
+     * For example, if "job_name" is not present, "service.name" value is copied as "job_name".
+     */
+    private static final Map<String, String> FALLBACK_LABEL_MAPPINGS =
         ImmutableMap
             .<String, String>builder()
             .put("net.host.name", "node_identifier_host_name")
@@ -99,18 +104,20 @@ public class OpenTelemetryMetricRequestProcessor implements Service {
                     log.debug("Resource attributes: {}", request.getResource().getAttributesList());
                 }
 
-                final Map<String, String> nodeLabels =
-                    request
-                        .getResource()
-                        .getAttributesList()
-                        .stream()
-                        .collect(toMap(
-                            it -> LABEL_MAPPINGS
-                                .getOrDefault(it.getKey(), it.getKey())
-                                .replaceAll("\\.", "_"),
-                                it -> anyValueToString(it.getValue()),
-                        (v1, v2) -> v1
-                        ));
+                // First pass: collect all resource attributes with dots replaced by underscores
+                final Map<String, String> nodeLabels = new HashMap<>();
+                for (final var it : request.getResource().getAttributesList()) {
+                    final String key = it.getKey().replaceAll("\\.", "_");
+                    final String value = anyValueToString(it.getValue());
+                    nodeLabels.putIfAbsent(key, value);
+                }
+                // Second pass: apply fallback mappings — only if the target key is absent
+                for (final var it : request.getResource().getAttributesList()) {
+                    final String targetKey = FALLBACK_LABEL_MAPPINGS.get(it.getKey());
+                    if (targetKey != null) {
+                        nodeLabels.putIfAbsent(targetKey, anyValueToString(it.getValue()));
+                    }
+                }
 
                 ImmutableMap<String, SampleFamily> sampleFamilies = PrometheusMetricConverter.convertPromMetricToSampleFamily(
                     request.getScopeMetricsList().stream()
@@ -154,7 +161,7 @@ public class OpenTelemetryMetricRequestProcessor implements Service {
         return kvs
             .stream()
             .collect(toMap(
-                KeyValue::getKey,
+                it -> it.getKey().replaceAll("\\.", "_"),
                 it -> anyValueToString(it.getValue())
             ));
     }
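
The two-pass fallback behavior above can be sketched with plain maps. This is a simplified model under stated assumptions, not the SkyWalking API; the attribute values and the single-entry fallback table are made up for illustration:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Simplified model of the receiver's two-pass label handling:
// pass 1 keeps every resource attribute (dots -> underscores),
// pass 2 copies a source attribute to its fallback target only if the target is absent.
class FallbackMappingSketch {
    static final Map<String, String> FALLBACK = Map.of("service.name", "job_name");

    static Map<String, String> mergeLabels(Map<String, String> resourceAttrs) {
        Map<String, String> nodeLabels = new LinkedHashMap<>();
        for (Map.Entry<String, String> e : resourceAttrs.entrySet()) {
            // First pass: original key survives, with dots replaced by underscores.
            nodeLabels.putIfAbsent(e.getKey().replaceAll("\\.", "_"), e.getValue());
        }
        for (Map.Entry<String, String> e : resourceAttrs.entrySet()) {
            // Second pass: fallback target is filled only when not already set.
            String target = FALLBACK.get(e.getKey());
            if (target != null) {
                nodeLabels.putIfAbsent(target, e.getValue());
            }
        }
        return nodeLabels;
    }
}
```

With only service.name set, both service_name and job_name end up populated; an explicitly set job_name resource attribute wins over the fallback, which is the point of the LABEL_MAPPINGS change.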
diff --git a/oap-server/server-starter/src/main/resources/application.yml b/oap-server/server-starter/src/main/resources/application.yml
index 7f2223f8fc..71f7c24de9 100644
--- a/oap-server/server-starter/src/main/resources/application.yml
+++ b/oap-server/server-starter/src/main/resources/application.yml
@@ -237,7 +237,7 @@ agent-analyzer:
 log-analyzer:
   selector: ${SW_LOG_ANALYZER:default}
   default:
-    lalFiles: ${SW_LOG_LAL_FILES:envoy-als,mesh-dp,mysql-slowsql,pgsql-slowsql,redis-slowsql,k8s-service,nginx,default}
+    lalFiles: ${SW_LOG_LAL_FILES:envoy-als,mesh-dp,mysql-slowsql,pgsql-slowsql,redis-slowsql,k8s-service,nginx,envoy-ai-gateway,default}
     malFiles: ${SW_LOG_MAL_FILES:"nginx"}
 
 event-analyzer:
@@ -390,7 +390,7 @@ receiver-otel:
   selector: ${SW_OTEL_RECEIVER:default}
   default:
     enabledHandlers: ${SW_OTEL_RECEIVER_ENABLED_HANDLERS:"otlp-traces,otlp-metrics,otlp-logs"}
-    enabledOtelMetricsRules: ${SW_OTEL_RECEIVER_ENABLED_OTEL_METRICS_RULES:"apisix,nginx/*,k8s/*,istio-controlplane,vm,mysql/*,postgresql/*,oap,aws-eks/*,windows,aws-s3/*,aws-dynamodb/*,aws-gateway/*,redis/*,elasticsearch/*,rabbitmq/*,mongodb/*,kafka/*,pulsar/*,bookkeeper/*,rocketmq/*,clickhouse/*,activemq/*,kong/*,flink/*,banyandb/*"}
+    enabledOtelMetricsRules: ${SW_OTEL_RECEIVER_ENABLED_OTEL_METRICS_RULES:"apisix,nginx/*,k8s/*,istio-controlplane,vm,mysql/*,postgresql/*,oap,aws-eks/*,windows,aws-s3/*,aws-dynamodb/*,aws-gateway/*,redis/*,elasticsearch/*,rabbitmq/*,mongodb/*,kafka/*,pulsar/*,bookkeeper/*,rocketmq/*,clickhouse/*,activemq/*,kong/*,flink/*,banyandb/*,envoy-ai-gateway/*"}
 
 receiver-zipkin:
   selector: ${SW_RECEIVER_ZIPKIN:-}
diff --git a/oap-server/server-starter/src/main/resources/lal/envoy-ai-gateway.yaml b/oap-server/server-starter/src/main/resources/lal/envoy-ai-gateway.yaml
new file mode 100644
index 0000000000..0e94bd7c5d
--- /dev/null
+++ b/oap-server/server-starter/src/main/resources/lal/envoy-ai-gateway.yaml
@@ -0,0 +1,47 @@
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+# Envoy AI Gateway access log processing via OTLP.
+#
+# Sampling policy: only persist abnormal or expensive requests.
+# Normal 200 responses with low token count and no upstream failure are dropped.
+
+rules:
+  - name: envoy-ai-gateway-access-log
+    layer: ENVOY_AI_GATEWAY
+    dsl: |
+      filter {
+        // Drop normal logs: response < 400, no upstream failure, low token count
+        if (tag("response_code") as Integer < 400) {
+          if (tag("upstream_transport_failure_reason") == "" || tag("upstream_transport_failure_reason") == "-") {
+            if ((tag("gen_ai.usage.input_tokens") as Integer) + (tag("gen_ai.usage.output_tokens") as Integer) < 10000) {
+              abort {}
+            }
+          }
+        }
+
+        extractor {
+          tag 'gen_ai.request.model': tag("gen_ai.request.model")
+          tag 'gen_ai.response.model': tag("gen_ai.response.model")
+          tag 'gen_ai.provider.name': tag("gen_ai.provider.name")
+          tag 'gen_ai.usage.input_tokens': tag("gen_ai.usage.input_tokens")
+          tag 'gen_ai.usage.output_tokens': tag("gen_ai.usage.output_tokens")
+          tag 'response_code': tag("response_code")
+          tag 'duration': tag("duration")
+        }
+
+        sink {
+        }
+      }
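
Inverting the drop condition gives the keep predicate the rule implements. A standalone sketch of the same policy, with the thresholds copied from the rule above (the class and method names are made up for illustration):

```java
// Hypothetical sketch of the LAL sampling policy: persist a log only if it is
// an error response, an upstream transport failure, or an expensive request
// (input + output tokens >= 10000). Everything else is dropped.
class AiGatewayLogSampler {
    static boolean shouldPersist(int responseCode, String upstreamFailure,
                                 int inputTokens, int outputTokens) {
        boolean error = responseCode >= 400;
        boolean upstreamFailed = upstreamFailure != null
            && !upstreamFailure.isEmpty()
            && !"-".equals(upstreamFailure);
        boolean expensive = inputTokens + outputTokens >= 10000;
        return error || upstreamFailed || expensive;
    }
}
```

A normal 200 response with a small token count matches none of the three branches and is dropped, which keeps storage bounded while retaining every abnormal or costly request.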
diff --git a/oap-server/server-starter/src/main/resources/otel-rules/envoy-ai-gateway/gateway-instance.yaml b/oap-server/server-starter/src/main/resources/otel-rules/envoy-ai-gateway/gateway-instance.yaml
new file mode 100644
index 0000000000..d0509d087e
--- /dev/null
+++ b/oap-server/server-starter/src/main/resources/otel-rules/envoy-ai-gateway/gateway-instance.yaml
@@ -0,0 +1,98 @@
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+# Envoy AI Gateway — Instance-level (per-pod) metrics
+#
+# Same metrics as gateway-service.yaml but scoped to individual pods.
+# All durations are in seconds from the AI Gateway; multiply by 1000 for ms display.
+
+filter: "{ tags -> tags.job_name == 'envoy-ai-gateway' }"
+expSuffix: instance(['service_name'], ['service_instance_id'], Layer.ENVOY_AI_GATEWAY)
+metricPrefix: meter_envoy_ai_gw_instance
+
+metricsRules:
+  # ===================== Aggregate metrics =====================
+
+  # Request CPM
+  - name: request_cpm
+    exp: gen_ai_server_request_duration_count.sum(['service_name', 'service_instance_id']).increase('PT1M')
+
+  # Request latency average (ms)
+  - name: request_latency_avg
+    exp: gen_ai_server_request_duration_sum.sum(['service_name', 'service_instance_id']).increase('PT1M') / gen_ai_server_request_duration_count.sum(['service_name', 'service_instance_id']).increase('PT1M') * 1000
+
+  # Request latency percentile (ms)
+  - name: request_latency_percentile
+    exp: gen_ai_server_request_duration.sum(['le', 'service_name', 'service_instance_id']).increase('PT1M').histogram().histogram_percentile([50,75,90,95,99]) * 1000
+
+  # Input token rate (tokens/min)
+  - name: input_token_rate
+    exp: gen_ai_client_token_usage_sum.tagEqual('gen_ai_token_type', 'input').sum(['service_name', 'service_instance_id']).increase('PT1M')
+
+  # Output token rate (tokens/min)
+  - name: output_token_rate
+    exp: gen_ai_client_token_usage_sum.tagEqual('gen_ai_token_type', 'output').sum(['service_name', 'service_instance_id']).increase('PT1M')
+
+  # TTFT average (ms)
+  - name: ttft_avg
+    exp: gen_ai_server_time_to_first_token_sum.sum(['service_name', 'service_instance_id']).increase('PT1M') / gen_ai_server_time_to_first_token_count.sum(['service_name', 'service_instance_id']).increase('PT1M') * 1000
+
+  # TTFT percentile (ms)
+  - name: ttft_percentile
+    exp: gen_ai_server_time_to_first_token.sum(['le', 'service_name', 'service_instance_id']).increase('PT1M').histogram().histogram_percentile([50,75,90,95,99]) * 1000
+
+  # TPOT average (ms)
+  - name: tpot_avg
+    exp: gen_ai_server_time_per_output_token_sum.sum(['service_name', 'service_instance_id']).increase('PT1M') / gen_ai_server_time_per_output_token_count.sum(['service_name', 'service_instance_id']).increase('PT1M') * 1000
+
+  # TPOT percentile (ms)
+  - name: tpot_percentile
+    exp: gen_ai_server_time_per_output_token.sum(['le', 'service_name', 'service_instance_id']).increase('PT1M').histogram().histogram_percentile([50,75,90,95,99]) * 1000
+
+  # ===================== Per-provider breakdown =====================
+
+  # Provider request CPM
+  - name: provider_request_cpm
+    exp: gen_ai_server_request_duration_count.sum(['gen_ai_provider_name', 'service_name', 'service_instance_id']).increase('PT1M')
+
+  # Provider token rate
+  - name: provider_token_rate
+    exp: gen_ai_client_token_usage_sum.sum(['gen_ai_provider_name', 'gen_ai_token_type', 'service_name', 'service_instance_id']).increase('PT1M')
+
+  # Provider latency average (ms)
+  - name: provider_latency_avg
+    exp: gen_ai_server_request_duration_sum.sum(['gen_ai_provider_name', 'service_name', 'service_instance_id']).increase('PT1M') / gen_ai_server_request_duration_count.sum(['gen_ai_provider_name', 'service_name', 'service_instance_id']).increase('PT1M') * 1000
+
+  # ===================== Per-model breakdown =====================
+
+  # Model request CPM
+  - name: model_request_cpm
+    exp: gen_ai_server_request_duration_count.sum(['gen_ai_response_model', 'service_name', 'service_instance_id']).increase('PT1M')
+
+  # Model token rate
+  - name: model_token_rate
+    exp: gen_ai_client_token_usage_sum.sum(['gen_ai_response_model', 'gen_ai_token_type', 'service_name', 'service_instance_id']).increase('PT1M')
+
+  # Model latency average (ms)
+  - name: model_latency_avg
+    exp: gen_ai_server_request_duration_sum.sum(['gen_ai_response_model', 'service_name', 'service_instance_id']).increase('PT1M') / gen_ai_server_request_duration_count.sum(['gen_ai_response_model', 'service_name', 'service_instance_id']).increase('PT1M') * 1000
+
+  # Model TTFT average (ms)
+  - name: model_ttft_avg
+    exp: gen_ai_server_time_to_first_token_sum.sum(['gen_ai_response_model', 'service_name', 'service_instance_id']).increase('PT1M') / gen_ai_server_time_to_first_token_count.sum(['gen_ai_response_model', 'service_name', 'service_instance_id']).increase('PT1M') * 1000
+
+  # Model TPOT average (ms)
+  - name: model_tpot_avg
+    exp: gen_ai_server_time_per_output_token_sum.sum(['gen_ai_response_model', 'service_name', 'service_instance_id']).increase('PT1M') / gen_ai_server_time_per_output_token_count.sum(['gen_ai_response_model', 'service_name', 'service_instance_id']).increase('PT1M') * 1000
diff --git a/oap-server/server-starter/src/main/resources/otel-rules/envoy-ai-gateway/gateway-service.yaml b/oap-server/server-starter/src/main/resources/otel-rules/envoy-ai-gateway/gateway-service.yaml
new file mode 100644
index 0000000000..a23dd2a8cb
--- /dev/null
+++ b/oap-server/server-starter/src/main/resources/otel-rules/envoy-ai-gateway/gateway-service.yaml
@@ -0,0 +1,103 @@
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+# Envoy AI Gateway — Service-level metrics
+#
+# Source OTLP metrics (dots → underscores by OTel receiver):
+#   gen_ai_client_token_usage          — Histogram (Delta), labels: gen_ai_token_type, gen_ai_provider_name, gen_ai_response_model
+#   gen_ai_server_request_duration     — Histogram (Delta), unit: seconds
+#   gen_ai_server_time_to_first_token  — Histogram (Delta), unit: seconds, streaming only
+#   gen_ai_server_time_per_output_token — Histogram (Delta), unit: seconds, streaming only
+#
+# All durations are in seconds from the AI Gateway; multiply by 1000 for ms display.
+
+filter: "{ tags -> tags.job_name == 'envoy-ai-gateway' }"
+expSuffix: service(['service_name'], Layer.ENVOY_AI_GATEWAY)
+metricPrefix: meter_envoy_ai_gw
+
+metricsRules:
+  # ===================== Aggregate metrics =====================
+
+  # Request CPM — count of requests per minute
+  - name: request_cpm
+    exp: gen_ai_server_request_duration_count.sum(['service_name', 'service_instance_id']).increase('PT1M')
+
+  # Request latency average (ms)
+  - name: request_latency_avg
+    exp: gen_ai_server_request_duration_sum.sum(['service_name', 'service_instance_id']).increase('PT1M') / gen_ai_server_request_duration_count.sum(['service_name', 'service_instance_id']).increase('PT1M') * 1000
+
+  # Request latency percentile (ms)
+  - name: request_latency_percentile
+    exp: gen_ai_server_request_duration.sum(['le', 'service_name', 'service_instance_id']).increase('PT1M').histogram().histogram_percentile([50,75,90,95,99]) * 1000
+
+  # Input token rate (tokens/min)
+  - name: input_token_rate
+    exp: gen_ai_client_token_usage_sum.tagEqual('gen_ai_token_type', 'input').sum(['service_name', 'service_instance_id']).increase('PT1M')
+
+  # Output token rate (tokens/min)
+  - name: output_token_rate
+    exp: gen_ai_client_token_usage_sum.tagEqual('gen_ai_token_type', 'output').sum(['service_name', 'service_instance_id']).increase('PT1M')
+
+  # TTFT average (ms) — streaming requests only
+  - name: ttft_avg
+    exp: gen_ai_server_time_to_first_token_sum.sum(['service_name', 'service_instance_id']).increase('PT1M') / gen_ai_server_time_to_first_token_count.sum(['service_name', 'service_instance_id']).increase('PT1M') * 1000
+
+  # TTFT percentile (ms)
+  - name: ttft_percentile
+    exp: gen_ai_server_time_to_first_token.sum(['le', 'service_name', 'service_instance_id']).increase('PT1M').histogram().histogram_percentile([50,75,90,95,99]) * 1000
+
+  # TPOT average (ms) — time per output token, streaming only
+  - name: tpot_avg
+    exp: gen_ai_server_time_per_output_token_sum.sum(['service_name', 'service_instance_id']).increase('PT1M') / gen_ai_server_time_per_output_token_count.sum(['service_name', 'service_instance_id']).increase('PT1M') * 1000
+
+  # TPOT percentile (ms)
+  - name: tpot_percentile
+    exp: gen_ai_server_time_per_output_token.sum(['le', 'service_name', 'service_instance_id']).increase('PT1M').histogram().histogram_percentile([50,75,90,95,99]) * 1000
+
+  # ===================== Per-provider breakdown =====================
+
+  # Provider request CPM — labeled by gen_ai_provider_name
+  - name: provider_request_cpm
+    exp: gen_ai_server_request_duration_count.sum(['gen_ai_provider_name', 'service_name', 'service_instance_id']).increase('PT1M')
+
+  # Provider token rate — labeled by gen_ai_provider_name and gen_ai_token_type
+  - name: provider_token_rate
+    exp: gen_ai_client_token_usage_sum.sum(['gen_ai_provider_name', 'gen_ai_token_type', 'service_name', 'service_instance_id']).increase('PT1M')
+
+  # Provider latency average (ms) — labeled by gen_ai_provider_name
+  - name: provider_latency_avg
+    exp: gen_ai_server_request_duration_sum.sum(['gen_ai_provider_name', 'service_name', 'service_instance_id']).increase('PT1M') / gen_ai_server_request_duration_count.sum(['gen_ai_provider_name', 'service_name', 'service_instance_id']).increase('PT1M') * 1000
+
+  # ===================== Per-model breakdown =====================
+
+  # Model request CPM — labeled by gen_ai_response_model
+  - name: model_request_cpm
+    exp: gen_ai_server_request_duration_count.sum(['gen_ai_response_model', 'service_name', 'service_instance_id']).increase('PT1M')
+
+  # Model token rate — labeled by gen_ai_response_model and gen_ai_token_type
+  - name: model_token_rate
+    exp: gen_ai_client_token_usage_sum.sum(['gen_ai_response_model', 'gen_ai_token_type', 'service_name', 'service_instance_id']).increase('PT1M')
+
+  # Model latency average (ms) — labeled by gen_ai_response_model
+  - name: model_latency_avg
+    exp: gen_ai_server_request_duration_sum.sum(['gen_ai_response_model', 'service_name', 'service_instance_id']).increase('PT1M') / gen_ai_server_request_duration_count.sum(['gen_ai_response_model', 'service_name', 'service_instance_id']).increase('PT1M') * 1000
+
+  # Model TTFT average (ms) — labeled by gen_ai_response_model
+  - name: model_ttft_avg
+    exp: gen_ai_server_time_to_first_token_sum.sum(['gen_ai_response_model', 'service_name', 'service_instance_id']).increase('PT1M') / gen_ai_server_time_to_first_token_count.sum(['gen_ai_response_model', 'service_name', 'service_instance_id']).increase('PT1M') * 1000
+
+  # Model TPOT average (ms) — labeled by gen_ai_response_model
+  - name: model_tpot_avg
+    exp: gen_ai_server_time_per_output_token_sum.sum(['gen_ai_response_model', 'service_name', 'service_instance_id']).increase('PT1M') / gen_ai_server_time_per_output_token_count.sum(['gen_ai_response_model', 'service_name', 'service_instance_id']).increase('PT1M') * 1000
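
A quick sanity check on the seconds-to-ms conversion used by the *_avg rules above (the class name and the sample numbers are made up):

```java
// The avg expressions divide the per-minute increase of a histogram's _sum
// (recorded in seconds) by the _count increase, then multiply by 1000 for ms.
class LatencyAvgCheck {
    static double latencyAvgMs(double sumIncreaseSeconds, double countIncrease) {
        return sumIncreaseSeconds / countIncrease * 1000.0;
    }
}
```

For example, 2.4 s of accumulated latency over 12 requests in one minute yields a 200 ms average, which matches the `* 1000` factor in every avg rule.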
diff --git a/oap-server/server-starter/src/main/resources/ui-initialized-templates/envoy_ai_gateway/envoy-ai-gateway-instance.json b/oap-server/server-starter/src/main/resources/ui-initialized-templates/envoy_ai_gateway/envoy-ai-gateway-instance.json
new file mode 100644
index 0000000000..7cc0e375c6
--- /dev/null
+++ b/oap-server/server-starter/src/main/resources/ui-initialized-templates/envoy_ai_gateway/envoy-ai-gateway-instance.json
@@ -0,0 +1,509 @@
+[
+  {
+    "id": "Envoy-AI-Gateway-Instance",
+    "configuration": {
+      "children": [
+        {
+          "x": 0,
+          "y": 0,
+          "w": 24,
+          "h": 42,
+          "i": "0",
+          "type": "Tab",
+          "children": [
+            {
+              "name": "Overview",
+              "children": [
+                {
+                  "x": 0,
+                  "y": 0,
+                  "w": 8,
+                  "h": 13,
+                  "i": "0",
+                  "type": "Widget",
+                  "expressions": [
+                    "meter_envoy_ai_gw_instance_request_cpm"
+                  ],
+                  "graph": {
+                    "type": "Line",
+                    "showXAxis": true,
+                    "showYAxis": true
+                  },
+                  "metricConfig": [
+                    {
+                      "label": "Request CPM",
+                      "unit": "calls/min"
+                    }
+                  ],
+                  "widget": {
+                    "title": "Request CPM",
+                    "tips": "Calls Per Minute — total requests through this pod"
+                  }
+                },
+                {
+                  "x": 8,
+                  "y": 0,
+                  "w": 8,
+                  "h": 13,
+                  "i": "1",
+                  "type": "Widget",
+                  "expressions": [
+                    "meter_envoy_ai_gw_instance_request_latency_avg"
+                  ],
+                  "graph": {
+                    "type": "Line",
+                    "showXAxis": true,
+                    "showYAxis": true
+                  },
+                  "metricConfig": [
+                    {
+                      "label": "Avg Latency",
+                      "unit": "ms"
+                    }
+                  ],
+                  "widget": {
+                    "title": "Request Latency Avg"
+                  }
+                },
+                {
+                  "x": 16,
+                  "y": 0,
+                  "w": 8,
+                  "h": 13,
+                  "i": "2",
+                  "type": "Widget",
+                  "expressions": [
+                    "meter_envoy_ai_gw_instance_request_latency_percentile"
+                  ],
+                  "graph": {
+                    "type": "Line",
+                    "showXAxis": true,
+                    "showYAxis": true
+                  },
+                  "metricConfig": [
+                    {
+                      "label": "Latency Percentile",
+                      "unit": "ms"
+                    }
+                  ],
+                  "widget": {
+                    "title": "Request Latency Percentile",
+                    "tips": "P50 / P75 / P90 / P95 / P99"
+                  }
+                },
+                {
+                  "x": 0,
+                  "y": 13,
+                  "w": 8,
+                  "h": 13,
+                  "i": "3",
+                  "type": "Widget",
+                  "expressions": [
+                    "meter_envoy_ai_gw_instance_input_token_rate"
+                  ],
+                  "graph": {
+                    "type": "Line",
+                    "showXAxis": true,
+                    "showYAxis": true
+                  },
+                  "metricConfig": [
+                    {
+                      "label": "Input Tokens",
+                      "unit": "tokens/min"
+                    }
+                  ],
+                  "widget": {
+                    "title": "Input Token Rate",
+                    "tips": "Input (prompt) tokens per minute sent to LLM providers"
+                  }
+                },
+                {
+                  "x": 8,
+                  "y": 13,
+                  "w": 8,
+                  "h": 13,
+                  "i": "4",
+                  "type": "Widget",
+                  "expressions": [
+                    "meter_envoy_ai_gw_instance_output_token_rate"
+                  ],
+                  "graph": {
+                    "type": "Line",
+                    "showXAxis": true,
+                    "showYAxis": true
+                  },
+                  "metricConfig": [
+                    {
+                      "label": "Output Tokens",
+                      "unit": "tokens/min"
+                    }
+                  ],
+                  "widget": {
+                    "title": "Output Token Rate",
+                    "tips": "Output (completion) tokens per minute generated by LLM providers"
+                  }
+                },
+                {
+                  "x": 16,
+                  "y": 13,
+                  "w": 8,
+                  "h": 13,
+                  "i": "5",
+                  "type": "Widget",
+                  "expressions": [
+                    "meter_envoy_ai_gw_instance_ttft_avg"
+                  ],
+                  "graph": {
+                    "type": "Line",
+                    "showXAxis": true,
+                    "showYAxis": true
+                  },
+                  "metricConfig": [
+                    {
+                      "label": "TTFT Avg",
+                      "unit": "ms"
+                    }
+                  ],
+                  "widget": {
+                    "title": "Time to First Token (Avg)",
+                    "tips": "Average time to first token for streaming requests"
+                  }
+                },
+                {
+                  "x": 0,
+                  "y": 26,
+                  "w": 8,
+                  "h": 13,
+                  "i": "6",
+                  "type": "Widget",
+                  "expressions": [
+                    "meter_envoy_ai_gw_instance_ttft_percentile"
+                  ],
+                  "graph": {
+                    "type": "Line",
+                    "showXAxis": true,
+                    "showYAxis": true
+                  },
+                  "metricConfig": [
+                    {
+                      "label": "TTFT Percentile",
+                      "unit": "ms"
+                    }
+                  ],
+                  "widget": {
+                    "title": "TTFT Percentile",
+                    "tips": "P50 / P75 / P90 / P95 / P99"
+                  }
+                },
+                {
+                  "x": 8,
+                  "y": 26,
+                  "w": 8,
+                  "h": 13,
+                  "i": "7",
+                  "type": "Widget",
+                  "expressions": [
+                    "meter_envoy_ai_gw_instance_tpot_avg"
+                  ],
+                  "graph": {
+                    "type": "Line",
+                    "showXAxis": true,
+                    "showYAxis": true
+                  },
+                  "metricConfig": [
+                    {
+                      "label": "TPOT Avg",
+                      "unit": "ms"
+                    }
+                  ],
+                  "widget": {
+                    "title": "Time Per Output Token (Avg)",
+                    "tips": "Average inter-token latency for streaming requests"
+                  }
+                },
+                {
+                  "x": 16,
+                  "y": 26,
+                  "w": 8,
+                  "h": 13,
+                  "i": "8",
+                  "type": "Widget",
+                  "expressions": [
+                    "meter_envoy_ai_gw_instance_tpot_percentile"
+                  ],
+                  "graph": {
+                    "type": "Line",
+                    "showXAxis": true,
+                    "showYAxis": true
+                  },
+                  "metricConfig": [
+                    {
+                      "label": "TPOT Percentile",
+                      "unit": "ms"
+                    }
+                  ],
+                  "widget": {
+                    "title": "TPOT Percentile",
+                    "tips": "P50 / P75 / P90 / P95 / P99"
+                  }
+                }
+              ]
+            },
+            {
+              "name": "Providers",
+              "children": [
+                {
+                  "x": 0,
+                  "y": 0,
+                  "w": 8,
+                  "h": 13,
+                  "i": "0",
+                  "type": "Widget",
+                  "expressions": [
+                    "aggregate_labels(meter_envoy_ai_gw_instance_provider_request_cpm,sum(gen_ai_provider_name))"
+                  ],
+                  "graph": {
+                    "type": "Line",
+                    "showXAxis": true,
+                    "showYAxis": true
+                  },
+                  "metricConfig": [
+                    {
+                      "label": "Provider CPM",
+                      "unit": "calls/min"
+                    }
+                  ],
+                  "widget": {
+                    "title": "Request CPM by Provider"
+                  }
+                },
+                {
+                  "x": 8,
+                  "y": 0,
+                  "w": 8,
+                  "h": 13,
+                  "i": "1",
+                  "type": "Widget",
+                  "expressions": [
+                    "aggregate_labels(meter_envoy_ai_gw_instance_provider_token_rate,sum(gen_ai_provider_name))"
+                  ],
+                  "graph": {
+                    "type": "Line",
+                    "showXAxis": true,
+                    "showYAxis": true
+                  },
+                  "metricConfig": [
+                    {
+                      "label": "Provider Tokens",
+                      "unit": "tokens/min"
+                    }
+                  ],
+                  "widget": {
+                    "title": "Token Rate by Provider"
+                  }
+                },
+                {
+                  "x": 16,
+                  "y": 0,
+                  "w": 8,
+                  "h": 13,
+                  "i": "2",
+                  "type": "Widget",
+                  "expressions": [
+                    "aggregate_labels(meter_envoy_ai_gw_instance_provider_latency_avg,avg(gen_ai_provider_name))"
+                  ],
+                  "graph": {
+                    "type": "Line",
+                    "showXAxis": true,
+                    "showYAxis": true
+                  },
+                  "metricConfig": [
+                    {
+                      "label": "Provider Latency",
+                      "unit": "ms"
+                    }
+                  ],
+                  "widget": {
+                    "title": "Latency Avg by Provider"
+                  }
+                }
+              ]
+            },
+            {
+              "name": "Models",
+              "children": [
+                {
+                  "x": 0,
+                  "y": 0,
+                  "w": 8,
+                  "h": 13,
+                  "i": "0",
+                  "type": "Widget",
+                  "expressions": [
+                    "aggregate_labels(meter_envoy_ai_gw_instance_model_request_cpm,sum(gen_ai_response_model))"
+                  ],
+                  "graph": {
+                    "type": "Line",
+                    "showXAxis": true,
+                    "showYAxis": true
+                  },
+                  "metricConfig": [
+                    {
+                      "label": "Model CPM",
+                      "unit": "calls/min"
+                    }
+                  ],
+                  "widget": {
+                    "title": "Request CPM by Model"
+                  }
+                },
+                {
+                  "x": 8,
+                  "y": 0,
+                  "w": 8,
+                  "h": 13,
+                  "i": "1",
+                  "type": "Widget",
+                  "expressions": [
+                    "aggregate_labels(meter_envoy_ai_gw_instance_model_token_rate,sum(gen_ai_response_model))"
+                  ],
+                  "graph": {
+                    "type": "Line",
+                    "showXAxis": true,
+                    "showYAxis": true
+                  },
+                  "metricConfig": [
+                    {
+                      "label": "Model Tokens",
+                      "unit": "tokens/min"
+                    }
+                  ],
+                  "widget": {
+                    "title": "Token Rate by Model"
+                  }
+                },
+                {
+                  "x": 16,
+                  "y": 0,
+                  "w": 8,
+                  "h": 13,
+                  "i": "2",
+                  "type": "Widget",
+                  "expressions": [
+                    "aggregate_labels(meter_envoy_ai_gw_instance_model_latency_avg,avg(gen_ai_response_model))"
+                  ],
+                  "graph": {
+                    "type": "Line",
+                    "showXAxis": true,
+                    "showYAxis": true
+                  },
+                  "metricConfig": [
+                    {
+                      "label": "Model Latency",
+                      "unit": "ms"
+                    }
+                  ],
+                  "widget": {
+                    "title": "Latency Avg by Model"
+                  }
+                },
+                {
+                  "x": 0,
+                  "y": 13,
+                  "w": 12,
+                  "h": 13,
+                  "i": "3",
+                  "type": "Widget",
+                  "expressions": [
+                    "aggregate_labels(meter_envoy_ai_gw_instance_model_ttft_avg,avg(gen_ai_response_model))"
+                  ],
+                  "graph": {
+                    "type": "Line",
+                    "showXAxis": true,
+                    "showYAxis": true
+                  },
+                  "metricConfig": [
+                    {
+                      "label": "Model TTFT",
+                      "unit": "ms"
+                    }
+                  ],
+                  "widget": {
+                    "title": "TTFT Avg by Model"
+                  }
+                },
+                {
+                  "x": 12,
+                  "y": 13,
+                  "w": 12,
+                  "h": 13,
+                  "i": "4",
+                  "type": "Widget",
+                  "expressions": [
+                    "aggregate_labels(meter_envoy_ai_gw_instance_model_tpot_avg,avg(gen_ai_response_model))"
+                  ],
+                  "graph": {
+                    "type": "Line",
+                    "showXAxis": true,
+                    "showYAxis": true
+                  },
+                  "metricConfig": [
+                    {
+                      "label": "Model TPOT",
+                      "unit": "ms"
+                    }
+                  ],
+                  "widget": {
+                    "title": "TPOT Avg by Model"
+                  }
+                }
+              ]
+            },
+            {
+              "name": "Log",
+              "children": [
+                {
+                  "x": 0,
+                  "y": 0,
+                  "w": 24,
+                  "h": 48,
+                  "i": "0",
+                  "type": "Log"
+                }
+              ]
+            }
+          ]
+        }
+      ],
+      "layer": "ENVOY_AI_GATEWAY",
+      "entity": "ServiceInstance",
+      "name": "Envoy-AI-Gateway-Instance",
+      "id": "Envoy-AI-Gateway-Instance",
+      "isRoot": false,
+      "expressions": [
+        "avg(meter_envoy_ai_gw_instance_request_latency_avg)",
+        "avg(meter_envoy_ai_gw_instance_request_cpm)",
+        "avg(meter_envoy_ai_gw_instance_input_token_rate)",
+        "avg(meter_envoy_ai_gw_instance_output_token_rate)"
+      ],
+      "expressionsConfig": [
+        {
+          "unit": "ms",
+          "label": "Latency"
+        },
+        {
+          "label": "CPM",
+          "unit": "calls/min"
+        },
+        {
+          "label": "Input Tokens",
+          "unit": "tokens/min"
+        },
+        {
+          "label": "Output Tokens",
+          "unit": "tokens/min"
+        }
+      ]
+    }
+  }
+]
diff --git a/oap-server/server-starter/src/main/resources/ui-initialized-templates/envoy_ai_gateway/envoy-ai-gateway-root.json b/oap-server/server-starter/src/main/resources/ui-initialized-templates/envoy_ai_gateway/envoy-ai-gateway-root.json
new file mode 100644
index 0000000000..37796f89b9
--- /dev/null
+++ b/oap-server/server-starter/src/main/resources/ui-initialized-templates/envoy_ai_gateway/envoy-ai-gateway-root.json
@@ -0,0 +1,63 @@
+[
+  {
+    "id": "Envoy-AI-Gateway-Root",
+    "configuration": {
+      "children": [
+        {
+          "x": 0,
+          "y": 0,
+          "w": 24,
+          "h": 52,
+          "i": "0",
+          "type": "Widget",
+          "widget": {
+            "title": "Envoy AI Gateway"
+          },
+          "graph": {
+            "type": "ServiceList",
+            "dashboardName": "Envoy-AI-Gateway-Service",
+            "fontSize": 12,
+            "showXAxis": false,
+            "showYAxis": false,
+            "showGroup": false
+          },
+          "expressions": [
+            "avg(meter_envoy_ai_gw_request_cpm)",
+            "avg(meter_envoy_ai_gw_request_latency_avg)",
+            "avg(meter_envoy_ai_gw_input_token_rate)",
+            "avg(meter_envoy_ai_gw_output_token_rate)"
+          ],
+          "subExpressions": [
+            "meter_envoy_ai_gw_request_cpm",
+            "meter_envoy_ai_gw_request_latency_avg",
+            "meter_envoy_ai_gw_input_token_rate",
+            "meter_envoy_ai_gw_output_token_rate"
+          ],
+          "metricConfig": [
+            {
+              "label": "CPM",
+              "unit": "calls/min"
+            },
+            {
+              "unit": "ms",
+              "label": "Latency"
+            },
+            {
+              "label": "Input Tokens",
+              "unit": "tokens/min"
+            },
+            {
+              "label": "Output Tokens",
+              "unit": "tokens/min"
+            }
+          ]
+        }
+      ],
+      "id": "Envoy-AI-Gateway-Root",
+      "layer": "ENVOY_AI_GATEWAY",
+      "entity": "All",
+      "name": "Envoy-AI-Gateway-Root",
+      "isRoot": true
+    }
+  }
+]
diff --git a/oap-server/server-starter/src/main/resources/ui-initialized-templates/envoy_ai_gateway/envoy-ai-gateway-service.json b/oap-server/server-starter/src/main/resources/ui-initialized-templates/envoy_ai_gateway/envoy-ai-gateway-service.json
new file mode 100644
index 0000000000..8acd215c07
--- /dev/null
+++ b/oap-server/server-starter/src/main/resources/ui-initialized-templates/envoy_ai_gateway/envoy-ai-gateway-service.json
@@ -0,0 +1,528 @@
+[
+  {
+    "id": "Envoy-AI-Gateway-Service",
+    "configuration": {
+      "children": [
+        {
+          "x": 0,
+          "y": 0,
+          "w": 24,
+          "h": 42,
+          "i": "0",
+          "type": "Tab",
+          "children": [
+            {
+              "name": "Overview",
+              "children": [
+                {
+                  "x": 0,
+                  "y": 0,
+                  "w": 8,
+                  "h": 13,
+                  "i": "0",
+                  "type": "Widget",
+                  "expressions": [
+                    "meter_envoy_ai_gw_request_cpm"
+                  ],
+                  "graph": {
+                    "type": "Line",
+                    "showXAxis": true,
+                    "showYAxis": true
+                  },
+                  "metricConfig": [
+                    {
+                      "label": "Request CPM",
+                      "unit": "calls/min"
+                    }
+                  ],
+                  "widget": {
+                    "title": "Request CPM",
+                    "tips": "Calls Per Minute — total requests through the AI Gateway"
+                  }
+                },
+                {
+                  "x": 8,
+                  "y": 0,
+                  "w": 8,
+                  "h": 13,
+                  "i": "1",
+                  "type": "Widget",
+                  "expressions": [
+                    "meter_envoy_ai_gw_request_latency_avg"
+                  ],
+                  "graph": {
+                    "type": "Line",
+                    "showXAxis": true,
+                    "showYAxis": true
+                  },
+                  "metricConfig": [
+                    {
+                      "label": "Avg Latency",
+                      "unit": "ms"
+                    }
+                  ],
+                  "widget": {
+                    "title": "Request Latency Avg"
+                  }
+                },
+                {
+                  "x": 16,
+                  "y": 0,
+                  "w": 8,
+                  "h": 13,
+                  "i": "2",
+                  "type": "Widget",
+                  "expressions": [
+                    "meter_envoy_ai_gw_request_latency_percentile"
+                  ],
+                  "graph": {
+                    "type": "Line",
+                    "showXAxis": true,
+                    "showYAxis": true
+                  },
+                  "metricConfig": [
+                    {
+                      "label": "Latency Percentile",
+                      "unit": "ms"
+                    }
+                  ],
+                  "widget": {
+                    "title": "Request Latency Percentile",
+                    "tips": "P50 / P75 / P90 / P95 / P99"
+                  }
+                },
+                {
+                  "x": 0,
+                  "y": 13,
+                  "w": 8,
+                  "h": 13,
+                  "i": "3",
+                  "type": "Widget",
+                  "expressions": [
+                    "meter_envoy_ai_gw_input_token_rate"
+                  ],
+                  "graph": {
+                    "type": "Line",
+                    "showXAxis": true,
+                    "showYAxis": true
+                  },
+                  "metricConfig": [
+                    {
+                      "label": "Input Tokens",
+                      "unit": "tokens/min"
+                    }
+                  ],
+                  "widget": {
+                    "title": "Input Token Rate",
+                    "tips": "Input (prompt) tokens per minute sent to LLM providers"
+                  }
+                },
+                {
+                  "x": 8,
+                  "y": 13,
+                  "w": 8,
+                  "h": 13,
+                  "i": "4",
+                  "type": "Widget",
+                  "expressions": [
+                    "meter_envoy_ai_gw_output_token_rate"
+                  ],
+                  "graph": {
+                    "type": "Line",
+                    "showXAxis": true,
+                    "showYAxis": true
+                  },
+                  "metricConfig": [
+                    {
+                      "label": "Output Tokens",
+                      "unit": "tokens/min"
+                    }
+                  ],
+                  "widget": {
+                    "title": "Output Token Rate",
+                    "tips": "Output (completion) tokens per minute generated by LLM providers"
+                  }
+                },
+                {
+                  "x": 16,
+                  "y": 13,
+                  "w": 8,
+                  "h": 13,
+                  "i": "5",
+                  "type": "Widget",
+                  "expressions": [
+                    "meter_envoy_ai_gw_ttft_avg"
+                  ],
+                  "graph": {
+                    "type": "Line",
+                    "showXAxis": true,
+                    "showYAxis": true
+                  },
+                  "metricConfig": [
+                    {
+                      "label": "TTFT Avg",
+                      "unit": "ms"
+                    }
+                  ],
+                  "widget": {
+                    "title": "Time to First Token (Avg)",
+                    "tips": "Average time to first token for streaming requests"
+                  }
+                },
+                {
+                  "x": 0,
+                  "y": 26,
+                  "w": 8,
+                  "h": 13,
+                  "i": "6",
+                  "type": "Widget",
+                  "expressions": [
+                    "meter_envoy_ai_gw_ttft_percentile"
+                  ],
+                  "graph": {
+                    "type": "Line",
+                    "showXAxis": true,
+                    "showYAxis": true
+                  },
+                  "metricConfig": [
+                    {
+                      "label": "TTFT Percentile",
+                      "unit": "ms"
+                    }
+                  ],
+                  "widget": {
+                    "title": "TTFT Percentile",
+                    "tips": "P50 / P75 / P90 / P95 / P99"
+                  }
+                },
+                {
+                  "x": 8,
+                  "y": 26,
+                  "w": 8,
+                  "h": 13,
+                  "i": "7",
+                  "type": "Widget",
+                  "expressions": [
+                    "meter_envoy_ai_gw_tpot_avg"
+                  ],
+                  "graph": {
+                    "type": "Line",
+                    "showXAxis": true,
+                    "showYAxis": true
+                  },
+                  "metricConfig": [
+                    {
+                      "label": "TPOT Avg",
+                      "unit": "ms"
+                    }
+                  ],
+                  "widget": {
+                    "title": "Time Per Output Token (Avg)",
+                    "tips": "Average inter-token latency for streaming requests"
+                  }
+                },
+                {
+                  "x": 16,
+                  "y": 26,
+                  "w": 8,
+                  "h": 13,
+                  "i": "8",
+                  "type": "Widget",
+                  "expressions": [
+                    "meter_envoy_ai_gw_tpot_percentile"
+                  ],
+                  "graph": {
+                    "type": "Line",
+                    "showXAxis": true,
+                    "showYAxis": true
+                  },
+                  "metricConfig": [
+                    {
+                      "label": "TPOT Percentile",
+                      "unit": "ms"
+                    }
+                  ],
+                  "widget": {
+                    "title": "TPOT Percentile",
+                    "tips": "P50 / P75 / P90 / P95 / P99"
+                  }
+                }
+              ]
+            },
+            {
+              "name": "Providers",
+              "children": [
+                {
+                  "x": 0,
+                  "y": 0,
+                  "w": 8,
+                  "h": 13,
+                  "i": "0",
+                  "type": "Widget",
+                  "expressions": [
+                    "aggregate_labels(meter_envoy_ai_gw_provider_request_cpm,sum(gen_ai_provider_name))"
+                  ],
+                  "graph": {
+                    "type": "Line",
+                    "showXAxis": true,
+                    "showYAxis": true
+                  },
+                  "metricConfig": [
+                    {
+                      "label": "Provider CPM",
+                      "unit": "calls/min"
+                    }
+                  ],
+                  "widget": {
+                    "title": "Request CPM by Provider"
+                  }
+                },
+                {
+                  "x": 8,
+                  "y": 0,
+                  "w": 8,
+                  "h": 13,
+                  "i": "1",
+                  "type": "Widget",
+                  "expressions": [
+                    "aggregate_labels(meter_envoy_ai_gw_provider_token_rate,sum(gen_ai_provider_name))"
+                  ],
+                  "graph": {
+                    "type": "Line",
+                    "showXAxis": true,
+                    "showYAxis": true
+                  },
+                  "metricConfig": [
+                    {
+                      "label": "Provider Tokens",
+                      "unit": "tokens/min"
+                    }
+                  ],
+                  "widget": {
+                    "title": "Token Rate by Provider"
+                  }
+                },
+                {
+                  "x": 16,
+                  "y": 0,
+                  "w": 8,
+                  "h": 13,
+                  "i": "2",
+                  "type": "Widget",
+                  "expressions": [
+                    "aggregate_labels(meter_envoy_ai_gw_provider_latency_avg,avg(gen_ai_provider_name))"
+                  ],
+                  "graph": {
+                    "type": "Line",
+                    "showXAxis": true,
+                    "showYAxis": true
+                  },
+                  "metricConfig": [
+                    {
+                      "label": "Provider Latency",
+                      "unit": "ms"
+                    }
+                  ],
+                  "widget": {
+                    "title": "Latency Avg by Provider"
+                  }
+                }
+              ]
+            },
+            {
+              "name": "Models",
+              "children": [
+                {
+                  "x": 0,
+                  "y": 0,
+                  "w": 8,
+                  "h": 13,
+                  "i": "0",
+                  "type": "Widget",
+                  "expressions": [
+                    "aggregate_labels(meter_envoy_ai_gw_model_request_cpm,sum(gen_ai_response_model))"
+                  ],
+                  "graph": {
+                    "type": "Line",
+                    "showXAxis": true,
+                    "showYAxis": true
+                  },
+                  "metricConfig": [
+                    {
+                      "label": "Model CPM",
+                      "unit": "calls/min"
+                    }
+                  ],
+                  "widget": {
+                    "title": "Request CPM by Model"
+                  }
+                },
+                {
+                  "x": 8,
+                  "y": 0,
+                  "w": 8,
+                  "h": 13,
+                  "i": "1",
+                  "type": "Widget",
+                  "expressions": [
+                    "aggregate_labels(meter_envoy_ai_gw_model_token_rate,sum(gen_ai_response_model))"
+                  ],
+                  "graph": {
+                    "type": "Line",
+                    "showXAxis": true,
+                    "showYAxis": true
+                  },
+                  "metricConfig": [
+                    {
+                      "label": "Model Tokens",
+                      "unit": "tokens/min"
+                    }
+                  ],
+                  "widget": {
+                    "title": "Token Rate by Model"
+                  }
+                },
+                {
+                  "x": 16,
+                  "y": 0,
+                  "w": 8,
+                  "h": 13,
+                  "i": "2",
+                  "type": "Widget",
+                  "expressions": [
+                    "aggregate_labels(meter_envoy_ai_gw_model_latency_avg,avg(gen_ai_response_model))"
+                  ],
+                  "graph": {
+                    "type": "Line",
+                    "showXAxis": true,
+                    "showYAxis": true
+                  },
+                  "metricConfig": [
+                    {
+                      "label": "Model Latency",
+                      "unit": "ms"
+                    }
+                  ],
+                  "widget": {
+                    "title": "Latency Avg by Model"
+                  }
+                },
+                {
+                  "x": 0,
+                  "y": 13,
+                  "w": 12,
+                  "h": 13,
+                  "i": "3",
+                  "type": "Widget",
+                  "expressions": [
+                    "aggregate_labels(meter_envoy_ai_gw_model_ttft_avg,avg(gen_ai_response_model))"
+                  ],
+                  "graph": {
+                    "type": "Line",
+                    "showXAxis": true,
+                    "showYAxis": true
+                  },
+                  "metricConfig": [
+                    {
+                      "label": "Model TTFT",
+                      "unit": "ms"
+                    }
+                  ],
+                  "widget": {
+                    "title": "TTFT Avg by Model"
+                  }
+                },
+                {
+                  "x": 12,
+                  "y": 13,
+                  "w": 12,
+                  "h": 13,
+                  "i": "4",
+                  "type": "Widget",
+                  "expressions": [
+                    "aggregate_labels(meter_envoy_ai_gw_model_tpot_avg,avg(gen_ai_response_model))"
+                  ],
+                  "graph": {
+                    "type": "Line",
+                    "showXAxis": true,
+                    "showYAxis": true
+                  },
+                  "metricConfig": [
+                    {
+                      "label": "Model TPOT",
+                      "unit": "ms"
+                    }
+                  ],
+                  "widget": {
+                    "title": "TPOT Avg by Model"
+                  }
+                }
+              ]
+            },
+            {
+              "name": "Log",
+              "children": [
+                {
+                  "x": 0,
+                  "y": 0,
+                  "w": 24,
+                  "h": 48,
+                  "i": "0",
+                  "type": "Log"
+                }
+              ]
+            },
+            {
+              "name": "Instances",
+              "children": [
+                {
+                  "x": 0,
+                  "y": 0,
+                  "w": 24,
+                  "h": 17,
+                  "i": "0",
+                  "type": "Widget",
+                  "graph": {
+                    "type": "InstanceList",
+                    "dashboardName": "Envoy-AI-Gateway-Instance",
+                    "fontSize": 12
+                  }
+                }
+              ]
+            }
+          ]
+        }
+      ],
+      "layer": "ENVOY_AI_GATEWAY",
+      "entity": "Service",
+      "name": "Envoy-AI-Gateway-Service",
+      "id": "Envoy-AI-Gateway-Service",
+      "isRoot": false,
+      "isDefault": true,
+      "expressions": [
+        "avg(meter_envoy_ai_gw_request_latency_avg)",
+        "avg(meter_envoy_ai_gw_request_cpm)",
+        "avg(meter_envoy_ai_gw_input_token_rate)/1000000",
+        "avg(meter_envoy_ai_gw_output_token_rate)/1000000"
+      ],
+      "expressionsConfig": [
+        {
+          "unit": "ms",
+          "label": "Latency"
+        },
+        {
+          "label": "CPM",
+          "unit": "calls/min"
+        },
+        {
+          "label": "Input Tokens",
+          "unit": "M tokens/min"
+        },
+        {
+          "label": "Output Tokens",
+          "unit": "M tokens/min"
+        }
+      ]
+    }
+  }
+]
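The `gen_ai_provider_name` and `gen_ai_response_model` labels referenced by the `aggregate_labels` expressions above originate from OTLP data point attributes with dotted keys (e.g. `gen_ai.provider.name`); per the OTel receiver change in this commit, dots are converted to underscores before MAL sees the labels. A minimal sketch of that conversion (illustrative only, not the OAP implementation):

```python
def to_mal_label(attribute_key: str) -> str:
    """Convert an OTLP data point attribute key to a MAL-safe label name.

    MAL label names cannot contain dots, so "gen_ai.provider.name"
    becomes "gen_ai_provider_name".
    """
    return attribute_key.replace(".", "_")

print(to_mal_label("gen_ai.provider.name"))   # gen_ai_provider_name
print(to_mal_label("gen_ai.response.model"))  # gen_ai_response_model
```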
diff --git a/oap-server/server-starter/src/main/resources/ui-initialized-templates/menu.yaml b/oap-server/server-starter/src/main/resources/ui-initialized-templates/menu.yaml
index 2057f8b74a..feb37405af 100644
--- a/oap-server/server-starter/src/main/resources/ui-initialized-templates/menu.yaml
+++ b/oap-server/server-starter/src/main/resources/ui-initialized-templates/menu.yaml
@@ -252,6 +252,11 @@ menus:
        description: Observe the virtual GenAI providers and models which are conjectured by language agents through various plugins.
        documentLink: https://skywalking.apache.org/docs/main/next/en/setup/service-agent/virtual-genai/
         i18nKey: virtual_gen_ai
+      - title: Envoy AI Gateway
+        layer: ENVOY_AI_GATEWAY
+        description: Observe Envoy AI Gateway traffic including token usage, latency, TTFT, and per-provider/model breakdowns via OTLP metrics and access logs.
+        documentLink: https://skywalking.apache.org/docs/main/next/en/setup/backend/backend-envoy-ai-gateway-monitoring/
+        i18nKey: envoy_ai_gateway
   - title: Self Observability
     icon: self_observability
     description: Self Observability provides the observabilities for running 
components and servers from the SkyWalking ecosystem.
diff --git a/test/e2e-v2/cases/envoy-ai-gateway/docker-compose.yml b/test/e2e-v2/cases/envoy-ai-gateway/docker-compose.yml
new file mode 100644
index 0000000000..89669cf6fd
--- /dev/null
+++ b/test/e2e-v2/cases/envoy-ai-gateway/docker-compose.yml
@@ -0,0 +1,85 @@
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+# Envoy AI Gateway e2e — ai-gateway-cli + Ollama + SkyWalking OAP
+#
+# Architecture:
+#   trigger → ai-gateway-cli (port 1975) → ollama (port 11434)
+#                  ↓ OTLP gRPC
+#             oap (port 11800) → banyandb
+
+services:
+  banyandb:
+    extends:
+      file: ../../script/docker-compose/base-compose.yml
+      service: banyandb
+    networks:
+      - e2e
+
+  oap:
+    extends:
+      file: ../../script/docker-compose/base-compose.yml
+      service: oap
+    environment:
+      SW_STORAGE: banyandb
+    ports:
+      - 12800
+    depends_on:
+      banyandb:
+        condition: service_healthy
+
+  ollama:
+    image: ollama/ollama:0.6.2
+    networks:
+      - e2e
+    expose:
+      - 11434
+    healthcheck:
+      test: ["CMD", "ollama", "list"]
+      interval: 5s
+      timeout: 60s
+      retries: 120
+
+  aigw:
+    # TODO: pin to a release version once ai-gateway-cli HTTP listener is available in a release
+    image: envoyproxy/ai-gateway-cli:latest
+    command: run --run-id=0
+    environment:
+      OPENAI_API_KEY: "dummy-key-not-used"
+      OPENAI_BASE_URL: "http://ollama:11434/v1"
+      OTEL_SERVICE_NAME: e2e-ai-gateway
+      OTEL_EXPORTER_OTLP_ENDPOINT: http://oap:11800
+      OTEL_EXPORTER_OTLP_PROTOCOL: grpc
+      OTEL_METRICS_EXPORTER: otlp
+      OTEL_LOGS_EXPORTER: otlp
+      OTEL_METRIC_EXPORT_INTERVAL: "5000"
+      OTEL_RESOURCE_ATTRIBUTES: "job_name=envoy-ai-gateway,service.instance.id=aigw-1,service.layer=ENVOY_AI_GATEWAY"
+    ports:
+      - 1975
+    networks:
+      - e2e
+    healthcheck:
+      test: ["CMD", "aigw", "healthcheck"]
+      interval: 5s
+      timeout: 60s
+      retries: 120
+    depends_on:
+      oap:
+        condition: service_healthy
+      ollama:
+        condition: service_healthy
+
+networks:
+  e2e:
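The `OTEL_RESOURCE_ATTRIBUTES` value set on the `aigw` service is the standard comma-separated `key=value` list defined by the OpenTelemetry SDK environment-variable spec. A minimal parser sketch, for illustration (real SDKs additionally percent-decode values):

```python
def parse_resource_attributes(value: str) -> dict:
    """Split an OTEL_RESOURCE_ATTRIBUTES string into a key/value dict."""
    attrs = {}
    for pair in value.split(","):
        if not pair.strip():
            continue  # tolerate empty segments / trailing commas
        key, _, val = pair.partition("=")
        attrs[key.strip()] = val.strip()
    return attrs

attrs = parse_resource_attributes(
    "job_name=envoy-ai-gateway,service.instance.id=aigw-1,service.layer=ENVOY_AI_GATEWAY"
)
print(attrs["service.layer"])  # ENVOY_AI_GATEWAY
```

The `job_name` key here is what the commit's explicit MAL routing relies on, and `service.layer` assigns telemetry to the new `ENVOY_AI_GATEWAY` layer.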
diff --git a/test/e2e-v2/cases/envoy-ai-gateway/e2e.yaml b/test/e2e-v2/cases/envoy-ai-gateway/e2e.yaml
new file mode 100644
index 0000000000..71c40c8b75
--- /dev/null
+++ b/test/e2e-v2/cases/envoy-ai-gateway/e2e.yaml
@@ -0,0 +1,61 @@
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+# Envoy AI Gateway e2e test (docker-compose)
+#
+# Validates ENVOY_AI_GATEWAY layer metrics and logs via OTLP from ai-gateway-cli.
+#
+# Architecture:
+#   trigger (curl) → ai-gateway-cli (port 1975) → Ollama (port 11434)
+#                          ↓ OTLP gRPC
+#                     SkyWalking OAP (port 11800)
+#                          ↓
+#                     BanyanDB
+
+setup:
+  env: compose
+  file: docker-compose.yml
+  timeout: 20m
+  init-system-environment: ../../script/env
+  steps:
+    - name: set PATH
+      command: export PATH=/tmp/skywalking-infra-e2e/bin:$PATH
+    - name: install yq
+      command: bash test/e2e-v2/script/prepare/setup-e2e-shell/install.sh yq
+    - name: install swctl
+      command: bash test/e2e-v2/script/prepare/setup-e2e-shell/install.sh swctl
+    - name: Pull Ollama model
+      command: docker compose -f test/e2e-v2/cases/envoy-ai-gateway/docker-compose.yml exec ollama ollama pull qwen2.5:0.5b
+
+trigger:
+  action: http
+  interval: 3s
+  times: 10
+  url: http://${aigw_host}:${aigw_1975}/v1/chat/completions
+  method: POST
+  headers:
+    Content-Type: application/json
+  body: '{"model":"qwen2.5:0.5b","stream":true,"messages":[{"role":"user","content":"Say hi"}]}'
+
+verify:
+  retry:
+    count: 30
+    interval: 10s
+  cases:
+    - includes:
+        - ./envoy-ai-gateway-cases.yaml
+
+cleanup:
+  on: always
diff --git a/test/e2e-v2/cases/envoy-ai-gateway/envoy-ai-gateway-cases.yaml b/test/e2e-v2/cases/envoy-ai-gateway/envoy-ai-gateway-cases.yaml
new file mode 100644
index 0000000000..f28e5b3083
--- /dev/null
+++ b/test/e2e-v2/cases/envoy-ai-gateway/envoy-ai-gateway-cases.yaml
@@ -0,0 +1,46 @@
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+# Envoy AI Gateway e2e verification cases
+# Service name = "e2e-ai-gateway" (from OTEL_RESOURCE_ATTRIBUTES aigw.service)
+
+cases:
+  # Service exists in ENVOY_AI_GATEWAY layer
+  - query: swctl --display yaml --base-url=http://${oap_host}:${oap_12800}/graphql service ls
+    expected: expected/service.yml
+
+  # Service-level aggregate metrics
+  - query: swctl --display yaml --base-url=http://${oap_host}:${oap_12800}/graphql metrics exec --expression=meter_envoy_ai_gw_request_cpm --service-name=e2e-ai-gateway
+    expected: expected/metrics-has-value.yml
+  - query: swctl --display yaml --base-url=http://${oap_host}:${oap_12800}/graphql metrics exec --expression=meter_envoy_ai_gw_request_latency_avg --service-name=e2e-ai-gateway
+    expected: expected/metrics-has-value.yml
+  - query: swctl --display yaml --base-url=http://${oap_host}:${oap_12800}/graphql metrics exec --expression=meter_envoy_ai_gw_request_latency_percentile --service-name=e2e-ai-gateway
+    expected: expected/metrics-has-value-label.yml
+  - query: swctl --display yaml --base-url=http://${oap_host}:${oap_12800}/graphql metrics exec --expression=meter_envoy_ai_gw_input_token_rate --service-name=e2e-ai-gateway
+    expected: expected/metrics-has-value.yml
+  - query: swctl --display yaml --base-url=http://${oap_host}:${oap_12800}/graphql metrics exec --expression=meter_envoy_ai_gw_output_token_rate --service-name=e2e-ai-gateway
+    expected: expected/metrics-has-value.yml
+
+  # Provider breakdown
+  - query: swctl --display yaml --base-url=http://${oap_host}:${oap_12800}/graphql metrics exec --expression=meter_envoy_ai_gw_provider_request_cpm --service-name=e2e-ai-gateway
+    expected: expected/metrics-has-value.yml
+  - query: swctl --display yaml --base-url=http://${oap_host}:${oap_12800}/graphql metrics exec --expression=meter_envoy_ai_gw_provider_latency_avg --service-name=e2e-ai-gateway
+    expected: expected/metrics-has-value.yml
+
+  # Model breakdown
+  - query: swctl --display yaml --base-url=http://${oap_host}:${oap_12800}/graphql metrics exec --expression=meter_envoy_ai_gw_model_request_cpm --service-name=e2e-ai-gateway
+    expected: expected/metrics-has-value.yml
+  - query: swctl --display yaml --base-url=http://${oap_host}:${oap_12800}/graphql metrics exec --expression=meter_envoy_ai_gw_model_latency_avg --service-name=e2e-ai-gateway
+    expected: expected/metrics-has-value.yml
diff --git a/test/e2e-v2/cases/envoy-ai-gateway/expected/metrics-has-value-label.yml b/test/e2e-v2/cases/envoy-ai-gateway/expected/metrics-has-value-label.yml
new file mode 100644
index 0000000000..4b2001de51
--- /dev/null
+++ b/test/e2e-v2/cases/envoy-ai-gateway/expected/metrics-has-value-label.yml
@@ -0,0 +1,38 @@
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+debuggingtrace: null
+type: TIME_SERIES_VALUES
+results:
+  {{- contains .results }}
+  - metric:
+      labels:
+        {{- contains .metric.labels }}
+        - key: "p"
+          value: {{ notEmpty .value }}
+        {{- end}}
+    values:
+      {{- contains .values }}
+      - id: {{ notEmpty .id }}
+        value: {{ .value }}
+        traceid: null
+        owner: null
+      - id: {{ notEmpty .id }}
+        value: null
+        traceid: null
+        owner: null
+      {{- end}}
+  {{- end}}
+error: null
diff --git a/test/e2e-v2/cases/envoy-ai-gateway/expected/metrics-has-value.yml b/test/e2e-v2/cases/envoy-ai-gateway/expected/metrics-has-value.yml
new file mode 100644
index 0000000000..979b9b2577
--- /dev/null
+++ b/test/e2e-v2/cases/envoy-ai-gateway/expected/metrics-has-value.yml
@@ -0,0 +1,34 @@
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+debuggingtrace: null
+type: TIME_SERIES_VALUES
+results:
+  {{- contains .results }}
+  - metric:
+      labels: []
+    values:
+      {{- contains .values }}
+      - id: {{ notEmpty .id }}
+        value: {{ notEmpty .value }}
+        traceid: null
+        owner: null
+      - id: {{ notEmpty .id }}
+        value: null
+        traceid: null
+        owner: null
+      {{- end}}
+  {{- end}}
+error: null
diff --git a/test/e2e-v2/cases/envoy-ai-gateway/expected/service.yml b/test/e2e-v2/cases/envoy-ai-gateway/expected/service.yml
new file mode 100644
index 0000000000..c97c7e2997
--- /dev/null
+++ b/test/e2e-v2/cases/envoy-ai-gateway/expected/service.yml
@@ -0,0 +1,24 @@
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+{{- contains . }}
+- id: {{ b64enc "e2e-ai-gateway" }}.1
+  name: e2e-ai-gateway
+  group: ""
+  shortname: e2e-ai-gateway
+  layers:
+    - ENVOY_AI_GATEWAY
+  normal: true
+{{- end }}
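
The `{{ b64enc "e2e-ai-gateway" }}.1` id in expected/service.yml follows SkyWalking's service-id convention: the base64-encoded service name, then a `.1` suffix marking a normal service. The encoded part can be checked quickly in a shell (assuming a `base64` utility on PATH):

```shell
# Reproduce the b64enc("e2e-ai-gateway") portion of the expected service id.
# printf (not echo) avoids a trailing newline altering the encoding.
printf '%s' 'e2e-ai-gateway' | base64
# → ZTJlLWFpLWdhdGV3YXk=   (full expected id: ZTJlLWFpLWdhdGV3YXk=.1)
```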
