Copilot commented on code in PR #1034:
URL: 
https://github.com/apache/skywalking-banyandb/pull/1034#discussion_r3009524936


##########
banyand/internal/sidx/sidx.go:
##########
@@ -366,7 +368,7 @@ func (b *blockCursorBuilder) processWithFilter(req 
QueryRequest, log *logger.Log
                        continue
                }
 
-               tags = b.collectTagsForFilter(tags, decoder, i)
+               tags = b.collectTagsForFilter(tags, decoder, i, 
req.TagProjection)

Review Comment:
   `collectTagsForFilter` now uses `req.TagProjection` to decide both the 
*order* and the *set* of tags passed into `req.TagFilter.Match`. However 
`TagFilterMatcher` (via `logical.tagFilterAdapter`) resolves tag values by 
`(TagFamilyIdx, TagIdx)` from its schema, so the slice passed to `Match` must 
align exactly with the schema used to build the matcher. If `TagProjection` 
contains any projected tags that are not part of the matcher schema (e.g. 
user-selected projection tags placed before condition tags), the indices shift 
and filters can be evaluated against the wrong tag values (or error on nil 
values), turning the previous flake into a deterministic wrong result. Consider 
passing an explicit ordered list of condition-tag names/refs for matching, or 
building the matcher schema using the same projection order that SIDX uses here.
   ```suggestion
                tags = b.collectTagsForFilter(tags, decoder, i, nil)
   ```



##########
test/cases/stream/stream.go:
##########
@@ -43,6 +43,9 @@ var _ = g.DescribeTable("Scanning Streams", func(args 
helpers.Args) {
        gm.Eventually(func(innerGm gm.Gomega) {
                verify(innerGm, args)
        }, flags.EventuallyTimeout).Should(gm.Succeed())
+       gm.Consistently(func(innerGm gm.Gomega) {
+               verify(innerGm, args)
+       }, flags.ConsistentlyTimeout).Should(gm.Succeed())

Review Comment:
   `Consistently` is called without an explicit polling interval, which means 
Gomega will poll very frequently (default ~10ms). Since `verify` performs 
integration queries, this can generate hundreds of requests per test case over 
`flags.ConsistentlyTimeout`, increasing suite runtime and load. Consider 
setting an explicit polling interval (similar to the `topn` suite) for the 
`Consistently` assertion.



##########
test/cases/property/property.go:
##########
@@ -34,6 +34,9 @@ var (
                gm.Eventually(func(innerGm gm.Gomega) {
                        propertyTestData.VerifyFn(innerGm, SharedContext, args)
                }, flags.EventuallyTimeout).Should(gm.Succeed())
+               gm.Consistently(func(innerGm gm.Gomega) {
+                       propertyTestData.VerifyFn(innerGm, SharedContext, args)
+               }, flags.ConsistentlyTimeout).Should(gm.Succeed())

Review Comment:
   `Consistently` is called without an explicit polling interval, so it will 
poll very frequently by default. Because `VerifyFn` performs integration 
queries, this can create a large number of requests over 
`flags.ConsistentlyTimeout`. Consider setting a larger polling interval (e.g. 
200ms–2s) for the `Consistently` assertion to reduce load and runtime.



##########
banyand/internal/sidx/sidx.go:
##########
@@ -395,22 +397,66 @@ func (b *blockCursorBuilder) processWithoutFilter() {
        }
 }
 
-func (b *blockCursorBuilder) collectTagsForFilter(buf []*modelv1.Tag, decoder 
func(pbv1.ValueType, []byte, [][]byte) *modelv1.TagValue, index int) 
[]*modelv1.Tag {
+func (b *blockCursorBuilder) collectTagsForFilter(
+       buf []*modelv1.Tag,
+       decoder func(pbv1.ValueType, []byte, [][]byte) *modelv1.TagValue,
+       index int,
+       projections []model.TagProjection,
+) []*modelv1.Tag {
        buf = buf[:0]
 
-       for tagName, tagData := range b.block.tags {
+       // IMPORTANT: The logical tag filter matcher relies on stable tag 
ordering,
+       // because it uses schema tag refs to locate tag values by index 
(TagFamily.Tags[idx]).
+       // Iterating a Go map is randomized, which makes results flaky.
+       if len(projections) > 0 {
+               seenTagNames := make(map[string]struct{}, len(b.block.tags))
+               for _, proj := range projections {
+                       for _, tagName := range proj.Names {
+                               if _, exists := seenTagNames[tagName]; exists {
+                                       continue
+                               }

Review Comment:
   `collectTagsForFilter` allocates a fresh `seenTagNames` map on every row 
processed (this function is called inside the per-element loop in 
`processWithFilter`). With large blocks this can become allocation/GC heavy. 
Consider precomputing a deduplicated, ordered `[]string` of tag names once per 
block/request and reusing it for all rows, instead of rebuilding the dedupe map 
each time.



##########
test/cases/trace/trace.go:
##########
@@ -41,6 +41,9 @@ var _ = g.DescribeTable("Scanning Traces", func(args 
helpers.Args) {
        gm.Eventually(func(innerGm gm.Gomega) {
                verify(innerGm, args)
        }, flags.EventuallyTimeout).Should(gm.Succeed())
+       gm.Consistently(func(innerGm gm.Gomega) {
+               verify(innerGm, args)
+       }, flags.ConsistentlyTimeout).Should(gm.Succeed())

Review Comment:
   `Consistently` is called without an explicit polling interval, which means 
Gomega will poll very frequently (default ~10ms). Since `verify` performs an 
integration query/compare, this can result in hundreds of backend queries per 
table entry over `flags.ConsistentlyTimeout`, significantly increasing suite 
runtime and load. Consider setting a saner polling interval (e.g. 200ms–2s) for 
the `Consistently` assertion.
   ```suggestion
        }, flags.ConsistentlyTimeout, 500*time.Millisecond).Should(gm.Succeed())
   ```



##########
test/cases/measure/measure.go:
##########
@@ -36,6 +36,9 @@ var (
                gm.Eventually(func(innerGm gm.Gomega) {
                        measureTestData.VerifyFn(innerGm, SharedContext, args)
                }, flags.EventuallyTimeout).Should(gm.Succeed())
+               gm.Consistently(func(innerGm gm.Gomega) {
+                       measureTestData.VerifyFn(innerGm, SharedContext, args)
+               }, flags.ConsistentlyTimeout).Should(gm.Succeed())

Review Comment:
   `Consistently` is called without an explicit polling interval, so it will 
poll very frequently by default. Given `VerifyFn` is an integration query, this 
can generate many requests over `flags.ConsistentlyTimeout` and slow the suite. 
Consider specifying a polling interval (e.g. 200ms–2s) for the `Consistently` 
assertion.
   ```suggestion
                }, flags.ConsistentlyTimeout, 
500*time.Millisecond).Should(gm.Succeed())
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to