Copilot commented on code in PR #1018:
URL: 
https://github.com/apache/skywalking-banyandb/pull/1018#discussion_r2963457938


##########
banyand/query/processor.go:
##########
@@ -484,7 +386,7 @@ func (p *measureInternalQueryProcessor) Rev(ctx 
context.Context, message bus.Mes
        var span *query.Span
        if queryCriteria.Trace {
                tracer, ctx = query.NewTracer(ctx, n.Format(time.RFC3339Nano))
-               span, ctx = tracer.StartSpan(ctx, "data-%s", 
p.queryService.nodeID)
+               span, _ = tracer.StartSpan(ctx, "data-%s", 
p.queryService.nodeID)

Review Comment:
   `span, _ = tracer.StartSpan(...)` drops the returned context that carries 
the active span. Keeping and using the returned context (as done in other 
processors) preserves correct parent/child relationships if any downstream code 
starts additional spans, and makes the tracing behavior consistent across query 
processors.
   ```suggestion
                span, ctx = tracer.StartSpan(ctx, "data-%s", 
p.queryService.nodeID)
   ```



##########
pkg/query/logical/measure/measure_analyzer.go:
##########
@@ -181,7 +181,7 @@ func DistributedAnalyze(criteria *measurev1.QueryRequest, 
ss []logical.Schema) (
                        criteria.GetAgg().GetFunction(),
                        criteria.GetGroupBy() != nil,
                        false,       // emitPartial: liaison does not emit 
partial
-                       pushDownAgg, // reduceMode: only reduce partials when 
push-down is active (no TopN)
+                       pushDownAgg, // reduceMode: reduce partials from data 
nodes when push-down is active
                )
                pushedLimit = math.MaxInt

Review Comment:
   With `pushDownAgg := criteria.GetAgg() != nil`, distributed Agg+Top queries 
will now push aggregation to data nodes and perform Top on the liaison. Because 
`pushedLimit` is set to `math.MaxInt` whenever `Agg` is present, 
`PushDownMaxSize` effectively removes the per-node limit, so data nodes may 
return *all* aggregated groups. For high-cardinality group-by this can 
substantially increase network/memory/CPU usage on both data nodes and the 
liaison. Consider adding a practical cap/heuristic for the pushed-down limit 
when `Top` is present (or a configurable safety limit) to avoid pathological 
fan-in while still keeping correctness.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to