laskoviymishka commented on code in PR #1098:
URL: https://github.com/apache/iceberg-go/pull/1098#discussion_r3272278011


##########
table/evaluators.go:
##########
@@ -1568,12 +1568,36 @@ func (m *strictMetricsEval) canContainNulls(fieldID 
int) bool {
        return exists && cnt > 0
 }
 
+func (m *strictMetricsEval) mayContainNulls(field iceberg.NestedField) bool {
+       cnt, exists := m.nullCounts[field.ID]
+       if !exists {
+               return !field.Required
+       }
+
+       return cnt > 0
+}
+
 func (m *strictMetricsEval) canContainNans(fieldID int) bool {
        cnt, exists := m.nanCounts[fieldID]
 
        return exists && cnt > 0
 }
 
+func (m *strictMetricsEval) mayContainNans(field iceberg.NestedField) bool {
+       switch field.Type.(type) {
+       case iceberg.Float32Type, iceberg.Float64Type:
+       default:
+               return false
+       }

Review Comment:
   The empty `case iceberg.Float32Type, iceberg.Float64Type:` arm with 
`default: return false` below is valid but unusual — a reader pauses to confirm 
the float arm is intentionally empty-and-fall-through rather than an oversight. 
A one-line comment in the arm, or inverting to a positive guard, would save the 
double-take.



##########
table/evaluators.go:
##########
@@ -1568,12 +1568,36 @@ func (m *strictMetricsEval) canContainNulls(fieldID 
int) bool {
        return exists && cnt > 0
 }
 
+func (m *strictMetricsEval) mayContainNulls(field iceberg.NestedField) bool {
+       cnt, exists := m.nullCounts[field.ID]
+       if !exists {
+               return !field.Required

Review Comment:
   `VisitNotEqual` and `VisitNotIn` still call `canContainNulls(fieldID) || 
canContainNans(fieldID)` — same absent-count short-circuit this PR is replacing 
everywhere else. Under their inverted return semantics (`rowsMustMatch` instead 
of `rowsMightNotMatch`), the bug class is identical: for an optional field with 
no per-key null count, `canContainNulls` returns `false`, the guard 
short-circuits, and the bounds check can wrongly conclude all rows satisfy `col 
!= x` when nulls may actually be present. Both methods already have `field := 
t.Ref().Field()` in scope, so this is the same two-line substitution to 
`m.mayContainNulls(field) || m.mayContainNans(field)`. Worth folding in here to 
fully close #1091.



##########
table/evaluators_test.go:
##########
@@ -2346,6 +2346,87 @@ func (suite *StrictMetricsTestSuite) TestMissingStats() {
        }
 }
 
+func (suite *StrictMetricsTestSuite) TestMissingNullCounts() {

Review Comment:
   Both new tests exercise only `GreaterThan`. The six migrated visitors share 
the same early guard so coverage is defensible, but one `EqualTo` + one `In` 
case (or a comment noting `GreaterThan` is representative) would make the 
regression intent explicit. Once `VisitNotEqual` / `VisitNotIn` are addressed, 
a case there closes the loop.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to