eye-gu commented on issue #13811:
URL: https://github.com/apache/skywalking/issues/13811#issuecomment-4230932227

> * Entity5's all data points land on exactly one node
> * That node's streaming processor computes the correct TopN ranking for entity5 based on all its data
> * If entity5 doesn't make the local top-N, it genuinely has a lower value than the N entities that did - no missing partial data exists on other nodes

1. Shard tag and entity tag can differ, scattering the same entity across nodes.
   Entity and ShardingKey are independent fields in Measure. When ShardingKey is set, it overrides the shard routing from Entity. For example, with entity.tag_names=["service_id"] and sharding_key.tag_names=["instance_id"], data for the same service_id lands on different shards/nodes under different instance_id values. Each node only sees a partial view of that entity.

2. Even on a single node, agg=UNSPECIFIED still truncates incorrectly.
   The coordinator sends agg=AGGREGATION_FUNCTION_UNSPECIFIED to data nodes, which prevents proper aggregation. For a COUNT TopN with TopN=2, a node holding entity-A(5 points), entity-B(3 points), entity-C(1 point) cannot compute COUNT(entity-A)=5. It simply truncates raw results by the TopN limit, returning incorrect partial data.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
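
The two numbered points in the comment above can be illustrated with a small toy model. This is only a sketch under stated assumptions, not BanyanDB code: `route`, `NUM_NODES`, the byte-sum hash, and the "keep the first N raw rows" behaviour are all hypothetical stand-ins chosen to make the effect visible.

```python
from collections import Counter

# Hypothetical toy model; none of these names are BanyanDB APIs.
NUM_NODES = 2


def route(point, shard_tags):
    """Pick a node from the values of the given tags (stand-in for shard routing)."""
    key = "|".join(point[tag] for tag in shard_tags)
    return sum(key.encode()) % NUM_NODES


# Point 1: the same service_id reported by two different instances.
points = [
    {"service_id": "svc-1", "instance_id": "inst-a"},
    {"service_id": "svc-1", "instance_id": "inst-b"},
]

# Routing by the entity tag keeps the entity on a single node.
nodes_by_entity = {route(p, ["service_id"]) for p in points}
# Routing by a different sharding key scatters the same entity.
nodes_by_sharding_key = {route(p, ["instance_id"]) for p in points}

# Point 2: one node holds 5 points for entity-A, 3 for entity-B, 1 for entity-C.
raw = (
    [{"entity": "entity-A"}] * 5
    + [{"entity": "entity-B"}] * 3
    + [{"entity": "entity-C"}] * 1
)
TOP_N = 2

# Assumed behaviour without aggregation: raw rows are cut at the TopN limit.
truncated = raw[:TOP_N]
# Behaviour with COUNT aggregation applied before the TopN cut.
aggregated = Counter(p["entity"] for p in raw).most_common(TOP_N)

print("nodes when routed by entity tag:", nodes_by_entity)
print("nodes when routed by sharding key:", nodes_by_sharding_key)
print("truncated raw rows:", truncated)
print("aggregated COUNT TopN:", aggregated)
```

In this sketch, routing by `instance_id` places the two points for `svc-1` on different nodes, and the truncated result contains only two raw rows for entity-A rather than the per-entity counts.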
