youngxinler commented on code in PR #6554:
URL: https://github.com/apache/iceberg/pull/6554#discussion_r1071834035
##########
data/src/test/java/org/apache/iceberg/data/TestMetricsRowGroupFilter.java:
##########
@@ -341,6 +345,25 @@ public void testNotNaN() {
Assert.assertTrue("Should read: NaN counts are not tracked in Parquet
metrics", shouldRead);
}
+
+ @Test
+ public void testDoubleWithNan() {
Review Comment:
thanks for @RussellSpitzer suggestion, but when i run test as your
suggestion.
I found ORC also can't skip read. so i think the negative test is work for
parquet and ORC?
as for #6517, "// Only ORC should be able to distinguish using min/max when
NaN is present", i add these new tests for ORC skip.
```java
// just for "format == FileFormat.ORC"
// record.setField("_some_double_nans", (i % 10 == 0) ? Double.NaN :
2D); // includes some nan values
shouldRead = shouldRead(equal("some_double_nans", 10.0));
Assert.assertFalse("Should skip: column with some nans contains target
value", shouldRead);
shouldRead = shouldRead(greaterThan("some_double_nans", 10.0));
Assert.assertFalse("Should skip: column with some nans contains target
value", shouldRead);
```
bacause the some_double_nans Max Value is 2D, assert showRead will return
false, but all shouldRead is true, it looks strange?
If my guess is correct, the pushdown of ORC containing nans will not work.
Is this as expected? Should we open a new issue to discuss?

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]