Re: [PR] Support Virtual-GenAI monitoring [skywalking]

via GitHub Sun, 22 Mar 2026 06:53:52 -0700


Copilot commented on code in PR #13745:
URL: https://github.com/apache/skywalking/pull/13745#discussion_r2971554021



##########
oap-server/analyzer/gen-ai-analyzer/src/main/java/org/apache/skywalking/oap/analyzer/genai/service/GenAIMeterAnalyzer.java:
##########
@@ -0,0 +1,130 @@
+/*
+ *   Licensed to the Apache Software Foundation (ASF) under one or more
+ *   contributor license agreements.  See the NOTICE file distributed with
+ *   this work for additional information regarding copyright ownership.
+ *   The ASF licenses this file to You under the Apache License, Version 2.0
+ *   (the "License"); you may not use this file except in compliance with
+ *   the License.  You may obtain a copy of the License at
+ *
+ *       http://www.apache.org/licenses/LICENSE-2.0
+ *
+ *   Unless required by applicable law or agreed to in writing, software
+ *   distributed under the License is distributed on an "AS IS" BASIS,
+ *   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ *   See the License for the specific language governing permissions and
+ *   limitations under the License.
+ */
+
+package org.apache.skywalking.oap.analyzer.genai.service;
+
+import org.apache.skywalking.apm.network.common.v3.KeyStringValuePair;
+import org.apache.skywalking.apm.network.language.agent.v3.SegmentObject;
+import org.apache.skywalking.apm.network.language.agent.v3.SpanObject;
+import org.apache.skywalking.oap.analyzer.genai.config.GenAIConfig;
+import org.apache.skywalking.oap.analyzer.genai.config.GenAITagKey;
+import 
org.apache.skywalking.oap.analyzer.genai.matcher.GenAIProviderPrefixMatcher;
+import org.apache.skywalking.oap.server.core.analysis.IDManager;
+import org.apache.skywalking.oap.server.core.analysis.Layer;
+import org.apache.skywalking.oap.server.core.analysis.TimeBucket;
+import org.apache.skywalking.oap.server.core.source.GenAIMetrics;
+import org.apache.skywalking.oap.server.library.util.StringUtil;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.util.Map;
+
+import static java.util.stream.Collectors.toMap;
+
+public class GenAIMeterAnalyzer implements IGenAIMeterAnalyzerService {
+
+    private static final Logger LOG = 
LoggerFactory.getLogger(GenAIMeterAnalyzer.class);
+
+    private final GenAIProviderPrefixMatcher matcher;
+
+    public GenAIMeterAnalyzer(GenAIProviderPrefixMatcher matcher) {
+        this.matcher = matcher;
+    }
+
+    @Override
+    public GenAIMetrics extractMetricsFromSWSpan(SpanObject span, 
SegmentObject segment) {
+        Map<String, String> tags = span.getTagsList().stream()
+                .collect(toMap(
+                        KeyStringValuePair::getKey,
+                        KeyStringValuePair::getValue,
+                        (v1, v2) -> v1
+                ));
+
+        String modelName = tags.get(GenAITagKey.RESPONSE_MODEL);
+
+        if (StringUtil.isBlank(modelName)) {
+            if (LOG.isDebugEnabled()) {
+                LOG.debug("Model name is missing in span [{}], skipping GenAI analysis", span.getOperationName());
+            }
+            return null;
+        }
+        String provider = tags.get(GenAITagKey.PROVIDER_NAME);
+        GenAIProviderPrefixMatcher.MatchResult matchResult = 
matcher.match(modelName);
+
+        if (StringUtil.isBlank(provider)) {
+            provider = matchResult.getProvider();
+        }
+
+        GenAIConfig.Model modelConfig = matchResult.getModelConfig();
+
+        long inputTokens = parseSafeLong(tags.get(GenAITagKey.INPUT_TOKENS));
+        long outputTokens = parseSafeLong(tags.get(GenAITagKey.OUTPUT_TOKENS));
+
+        // calculate the total cost by the cost configs
+        double totalCost = 0.0;
+        if (modelConfig != null) {
+            if (modelConfig.getInputCostPerM() > 0) {
+                totalCost += inputTokens * modelConfig.getInputCostPerM();
+            }
+            if (modelConfig.getOutputCostPerM() > 0) {
+                totalCost += outputTokens * modelConfig.getOutputCostPerM();

Review Comment:
   The cost calculation is inconsistent with the documented meaning of 
`*-cost-per-m` (“price per 1,000,000 tokens”). Multiplying `tokens * costPerM` 
makes the recorded cost ~1,000,000× too large. Compute `tokens / 1_000_000.0 * 
costPerM` (for both input and output), and then keep the UI expressions 
consistent (i.e., remove `/1000000` in dashboards if the stored cost is already 
in currency units).
   ```suggestion
                   // costPerM is the price per 1,000,000 tokens
                   totalCost += (inputTokens / 1_000_000.0) * 
modelConfig.getInputCostPerM();
               }
               if (modelConfig.getOutputCostPerM() > 0) {
                   // costPerM is the price per 1,000,000 tokens
                   totalCost += (outputTokens / 1_000_000.0) * 
modelConfig.getOutputCostPerM();
   ```
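
As a quick sanity check on the arithmetic, here is a minimal standalone sketch of the per-million computation. The class and method names (`CostSketch`, `estimateCost`) are hypothetical and not part of the PR; the prices are the example values from `virtual-genai.md` (2.5 and 10 per 1,000,000 tokens).

```java
// Hypothetical helper, not part of the PR: it only illustrates per-million pricing.
final class CostSketch {

    private static final double TOKENS_PER_MILLION = 1_000_000.0;

    // The costPerM arguments are prices per 1,000,000 tokens. Non-positive prices
    // are skipped, mirroring the `> 0` guards in GenAIMeterAnalyzer.
    static double estimateCost(long inputTokens, long outputTokens,
                               double inputCostPerM, double outputCostPerM) {
        double totalCost = 0.0;
        if (inputCostPerM > 0) {
            totalCost += (inputTokens / TOKENS_PER_MILLION) * inputCostPerM;
        }
        if (outputCostPerM > 0) {
            totalCost += (outputTokens / TOKENS_PER_MILLION) * outputCostPerM;
        }
        return totalCost;
    }

    public static void main(String[] args) {
        // 1,200 input tokens at 2.5 per million plus 300 output tokens at 10 per million.
        double cost = estimateCost(1_200, 300, 2.5, 10.0);
        System.out.println("estimated cost: " + cost);
    }
}
```

With these inputs the scaled result is about 0.006 currency units, whereas the unscaled `tokens * costPerM` form would record 6,000 for the same span.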



##########
oap-server/analyzer/gen-ai-analyzer/src/main/java/org/apache/skywalking/oap/analyzer/genai/service/GenAIMeterAnalyzer.java:
##########
@@ -0,0 +1,130 @@
+/*
+ *   Licensed to the Apache Software Foundation (ASF) under one or more
+ *   contributor license agreements.  See the NOTICE file distributed with
+ *   this work for additional information regarding copyright ownership.
+ *   The ASF licenses this file to You under the Apache License, Version 2.0
+ *   (the "License"); you may not use this file except in compliance with
+ *   the License.  You may obtain a copy of the License at
+ *
+ *       http://www.apache.org/licenses/LICENSE-2.0
+ *
+ *   Unless required by applicable law or agreed to in writing, software
+ *   distributed under the License is distributed on an "AS IS" BASIS,
+ *   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ *   See the License for the specific language governing permissions and
+ *   limitations under the License.
+ */
+
+package org.apache.skywalking.oap.analyzer.genai.service;
+
+import org.apache.skywalking.apm.network.common.v3.KeyStringValuePair;
+import org.apache.skywalking.apm.network.language.agent.v3.SegmentObject;
+import org.apache.skywalking.apm.network.language.agent.v3.SpanObject;
+import org.apache.skywalking.oap.analyzer.genai.config.GenAIConfig;
+import org.apache.skywalking.oap.analyzer.genai.config.GenAITagKey;
+import 
org.apache.skywalking.oap.analyzer.genai.matcher.GenAIProviderPrefixMatcher;
+import org.apache.skywalking.oap.server.core.analysis.IDManager;
+import org.apache.skywalking.oap.server.core.analysis.Layer;
+import org.apache.skywalking.oap.server.core.analysis.TimeBucket;
+import org.apache.skywalking.oap.server.core.source.GenAIMetrics;
+import org.apache.skywalking.oap.server.library.util.StringUtil;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.util.Map;
+
+import static java.util.stream.Collectors.toMap;
+
+public class GenAIMeterAnalyzer implements IGenAIMeterAnalyzerService {
+
+    private static final Logger LOG = 
LoggerFactory.getLogger(GenAIMeterAnalyzer.class);
+
+    private final GenAIProviderPrefixMatcher matcher;
+
+    public GenAIMeterAnalyzer(GenAIProviderPrefixMatcher matcher) {
+        this.matcher = matcher;
+    }
+
+    @Override
+    public GenAIMetrics extractMetricsFromSWSpan(SpanObject span, 
SegmentObject segment) {
+        Map<String, String> tags = span.getTagsList().stream()
+                .collect(toMap(
+                        KeyStringValuePair::getKey,
+                        KeyStringValuePair::getValue,
+                        (v1, v2) -> v1
+                ));
+
+        String modelName = tags.get(GenAITagKey.RESPONSE_MODEL);
+
+        if (StringUtil.isBlank(modelName)) {
+            if (LOG.isDebugEnabled()) {
+                LOG.debug("Model name is missing in span [{}], skipping GenAI analysis", span.getOperationName());
+            }
+            return null;
+        }
+        String provider = tags.get(GenAITagKey.PROVIDER_NAME);
+        GenAIProviderPrefixMatcher.MatchResult matchResult = 
matcher.match(modelName);
+
+        if (StringUtil.isBlank(provider)) {
+            provider = matchResult.getProvider();
+        }
+
+        GenAIConfig.Model modelConfig = matchResult.getModelConfig();
+
+        long inputTokens = parseSafeLong(tags.get(GenAITagKey.INPUT_TOKENS));
+        long outputTokens = parseSafeLong(tags.get(GenAITagKey.OUTPUT_TOKENS));
+
+        // calculate the total cost by the cost configs
+        double totalCost = 0.0;
+        if (modelConfig != null) {
+            if (modelConfig.getInputCostPerM() > 0) {
+                totalCost += inputTokens * modelConfig.getInputCostPerM();
+            }
+            if (modelConfig.getOutputCostPerM() > 0) {
+                totalCost += outputTokens * modelConfig.getOutputCostPerM();
+            }
+        }
+
+        GenAIMetrics metrics = new GenAIMetrics();
+
+        metrics.setServiceId(IDManager.ServiceID.buildId(provider, 
Layer.VIRTUAL_GENAI.isNormal()));
+        metrics.setProviderName(provider);
+        metrics.setModelName(modelName);
+        metrics.setInputTokens(inputTokens);
+        metrics.setOutputTokens(outputTokens);
+
+        
metrics.setTimeToFirstToken(parseSafeInt(tags.get(GenAITagKey.SERVER_TIME_TO_FIRST_TOKEN)));
+        metrics.setTotalCost(totalCost);
+
+        long latency = span.getEndTime() - span.getStartTime();
+        metrics.setLatency(latency);
+        metrics.setStatus(!span.getIsError());
+        
metrics.setTimeBucket(TimeBucket.getMinuteTimeBucket(span.getStartTime()));
+
+        return metrics;
+    }
+
+    private long parseSafeLong(String value) {
+        if (StringUtil.isEmpty(value)) {
+            return 0;
+        }
+        try {
+            return Long.parseLong(value);
+        } catch (NumberFormatException e) {
+            LOG.warn("Failed to parse token count: {}", value);
+            return 0;
+        }
+    }
+
+    private int parseSafeInt(String value) {
+        if (StringUtil.isEmpty(value)) {
+            return 0;
+        }
+        try {
+            return Integer.parseInt(value);
+        } catch (NumberFormatException e) {
+            LOG.warn("Failed to parse token count: {}", value);

Review Comment:
   `parseSafeInt` is used for `gen_ai.server.time_to_first_token`, but the 
warning message says “token count”, which is inaccurate and confusing for 
troubleshooting. Adjust the message to reflect the actual field being parsed 
(e.g., TTFT / time-to-first-token).
   ```suggestion
               LOG.warn("Failed to parse time-to-first-token (TTFT) value: {}", 
value);
   ```
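
One way to keep such messages accurate as more tags are parsed is to pass the tag key into the helper. The sketch below is hypothetical (the `TagParseSketch` class and `parseIntOrZero` method are not in the PR) and uses `java.util.logging` only to stay self-contained; the PR itself logs through SLF4J.

```java
import java.util.logging.Logger;

// Hypothetical sketch, not the PR's code: the caller names the tag being parsed
// so the warning identifies the field that actually failed.
final class TagParseSketch {

    private static final Logger LOG = Logger.getLogger(TagParseSketch.class.getName());

    static int parseIntOrZero(String tagKey, String value) {
        if (value == null || value.isEmpty()) {
            return 0;
        }
        try {
            return Integer.parseInt(value);
        } catch (NumberFormatException e) {
            LOG.warning("Failed to parse tag " + tagKey + ": " + value);
            return 0;
        }
    }

    public static void main(String[] args) {
        // A well-formed value is returned as-is; a malformed one falls back to 0.
        System.out.println(parseIntOrZero("gen_ai.server.time_to_first_token", "120"));
        System.out.println(parseIntOrZero("gen_ai.server.time_to_first_token", "12ms"));
    }
}
```

The same signature would also work for the token-count tags, which avoids keeping two helpers whose messages can drift apart.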



##########
oap-server/oal-rt/src/test/java/org/apache/skywalking/oal/v2/generator/RuntimeOALGenerationTest.java:
##########
@@ -98,6 +98,8 @@ public static void setup() {
         // DisableOALDefine - no catalog
         registerOALDefine("disable", createOALDefine("oal/disable.oal", 
SOURCE_PACKAGE, ""));
 
+        registerOALDefine("disable", createOALDefine("oal/virtual-gen-ai.oal", 
SOURCE_PACKAGE, ""));

Review Comment:
   The second `registerOALDefine` call reuses the key `"disable"` from the earlier 
registration, so it overwrites the first OAL define and the runtime generation 
test likely loses coverage for `oal/disable.oal`. Use a distinct 
key for `virtual-gen-ai.oal` (e.g., `"virtual-gen-ai"`).
   ```suggestion
           registerOALDefine("virtual-gen-ai", 
createOALDefine("oal/virtual-gen-ai.oal", SOURCE_PACKAGE, ""));
   ```
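
The overwrite concern can be illustrated with a plain map. This sketch assumes the test's registry behaves like a `Map` keyed by the first argument; that is an assumption about `registerOALDefine`, not something the diff confirms, and `RegistrySketch` is a hypothetical name.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: models the registry as a Map to show the effect of a reused key.
final class RegistrySketch {

    // Each entry is {key, oalFile}; a later entry with the same key replaces the earlier one.
    static Map<String, String> register(String[][] entries) {
        Map<String, String> registry = new LinkedHashMap<>();
        for (String[] entry : entries) {
            registry.put(entry[0], entry[1]);
        }
        return registry;
    }

    public static void main(String[] args) {
        Map<String, String> reusedKey = register(new String[][] {
            {"disable", "oal/disable.oal"},
            {"disable", "oal/virtual-gen-ai.oal"},
        });
        Map<String, String> distinctKeys = register(new String[][] {
            {"disable", "oal/disable.oal"},
            {"virtual-gen-ai", "oal/virtual-gen-ai.oal"},
        });
        System.out.println("reused key:    " + reusedKey);
        System.out.println("distinct keys: " + distinctKeys);
    }
}
```

Under that assumption, only the distinct-key variant keeps both OAL files registered.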



##########
docs/en/setup/service-agent/virtual-genai.md:
##########
@@ -0,0 +1,62 @@
+# Virtual GenAI
+
+Virtual GenAI represents the Generative AI service nodes detected by [server 
agents' plugins](server-agents.md). The performance
+metrics of the GenAI operations are from the GenAI client-side perspective.
+
+For example, a Spring AI plugin in the Java agent could detect the latency of 
a chat completion request.
+As a result, SkyWalking would show traffic, latency, success rate, token usage 
(input/output), and estimated cost in the GenAI dashboard.
+
+## Span Contract
+
+The GenAI operation span should have the following properties:
+- It is an **Exit** span
+- **Span's layer == GENAI**
+- Tag key = `gen_ai.provider.name`, value = The Generative AI provider, e.g. 
openai, anthropic, ollama
+- Tag key = `gen_ai.response.model`, value = The name of the GenAI model, e.g. 
gpt-4o, claude-3-5-sonnet
+- Tag key = `gen_ai.usage.input_tokens`, value = The number of tokens used in 
the GenAI input (prompt)
+- Tag key = `gen_ai.usage.output_tokens`, value = The number of tokens used in 
the GenAI response (completion)
+- Tag key = `gen_ai.server.time_to_first_token`, value = The duration in 
milliseconds until the first token is received (streaming requests only)
+- If the GenAI service is a remote API (e.g. OpenAI), the span's peer should 
be the network address (IP or domain) of the GenAI server.
+
+## Provider Configuration
+
+SkyWalking uses `gen-ai-config.yml` to map model names to providers and 
configure cost estimation.
+
+When the `gen_ai.provider.name` tag is present in the span, it is used 
directly. Otherwise, SkyWalking matches the model name
+against `prefix-match` rules to identify the provider. For example, a model 
name starting with `gpt` is mapped to `openai`.
+
+To configure cost estimation, add `models` with pricing under the provider:
+
+```yaml
+providers:
+- provider: openai
+  prefix-match:
+    - gpt
+      models:
+    - name: gpt-4o
+      input-cost-per-m: 2.5    # cost per 1,000,000 input tokens
+      output-cost-per-m: 10    # cost per 1,000,000 output tokens
+      ```
+
+## Metrics
+
+The following metrics are available at the **provider** (service) level:
+- `gen_ai_provider_cpm` - Calls per minute
+- `gen_ai_provider_sla` - Success rate
+- `gen_ai_provider_resp_time` - Average response time
+- `gen_ai_provider_latency_percentile` - Latency percentiles
+- `gen_ai_provider_input_tokens_sum / avg` - Input token usage
+- `gen_ai_provider_output_tokens_sum / avg` - Output token usage
+- `gen_ai_provider_total_cost / avg_cost` - Estimated cost
+
+The following metrics are available at the **model** (service instance) level:
+- `gen_ai_model_call_cpm` - Calls per minute
+- `gen_ai_model_sla` - Success rate
+- `gen_ai_model_latency_avg / percentile` - Latency
+- `gen_ai_model_ttft_avg / percentile` - Time to first token (streaming only)
+- `gen_ai_model_input_tokens_sum / avg` - Input token usage
+- `gen_ai_model_output_tokens_sum / avg` - Output token usage
+- `gen_ai_model_total_cost / avg_cost` - Estimated cost
+
+## Requirement
+`skwaylking java agent` version >= 9.7

Review Comment:
   Correct the misspelling `skwaylking` to `SkyWalking`, using the project's standard capitalization.
   ```suggestion
   `SkyWalking Java agent` version >= 9.7
   ```



##########
docs/en/setup/service-agent/virtual-genai.md:
##########
@@ -0,0 +1,62 @@
+# Virtual GenAI
+
+Virtual GenAI represents the Generative AI service nodes detected by [server 
agents' plugins](server-agents.md). The performance
+metrics of the GenAI operations are from the GenAI client-side perspective.
+
+For example, a Spring AI plugin in the Java agent could detect the latency of 
a chat completion request.
+As a result, SkyWalking would show traffic, latency, success rate, token usage 
(input/output), and estimated cost in the GenAI dashboard.
+
+## Span Contract
+
+The GenAI operation span should have the following properties:
+- It is an **Exit** span
+- **Span's layer == GENAI**
+- Tag key = `gen_ai.provider.name`, value = The Generative AI provider, e.g. 
openai, anthropic, ollama
+- Tag key = `gen_ai.response.model`, value = The name of the GenAI model, e.g. 
gpt-4o, claude-3-5-sonnet
+- Tag key = `gen_ai.usage.input_tokens`, value = The number of tokens used in 
the GenAI input (prompt)
+- Tag key = `gen_ai.usage.output_tokens`, value = The number of tokens used in 
the GenAI response (completion)
+- Tag key = `gen_ai.server.time_to_first_token`, value = The duration in 
milliseconds until the first token is received (streaming requests only)
+- If the GenAI service is a remote API (e.g. OpenAI), the span's peer should 
be the network address (IP or domain) of the GenAI server.
+
+## Provider Configuration
+
+SkyWalking uses `gen-ai-config.yml` to map model names to providers and 
configure cost estimation.
+
+When the `gen_ai.provider.name` tag is present in the span, it is used 
directly. Otherwise, SkyWalking matches the model name
+against `prefix-match` rules to identify the provider. For example, a model 
name starting with `gpt` is mapped to `openai`.
+
+To configure cost estimation, add `models` with pricing under the provider:
+
+```yaml

Review Comment:
   The YAML example is malformed: `models:` is indented under the `- gpt` 
prefix entry instead of being a sibling of `prefix-match`, and the fenced code 
block markers appear to contain stray or zero-width characters. This confuses 
readers and prevents the example from being copied as-is; please fix the 
indentation and use a standard Markdown fenced block.
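
For reference, a corrected layout might look like the fragment below. It is a sketch that assumes `prefix-match` and `models` are sibling keys of each provider entry, as the comment above describes; the actual schema of `gen-ai-config.yml` should be checked against the PR. The values are the ones from the original example.

```yaml
providers:
  - provider: openai
    prefix-match:
      - gpt
    models:
      - name: gpt-4o
        input-cost-per-m: 2.5    # cost per 1,000,000 input tokens
        output-cost-per-m: 10    # cost per 1,000,000 output tokens
```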



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
