wombatu-kun opened a new pull request, #16609:
URL: https://github.com/apache/iceberg/pull/16609

   ## What
   
   Pushes Iceberg `timestamp_ns` and `timestamptz_ns` predicates down into the 
ORC reader. Before this change, any filter on a nanosecond-timestamp column 
threw `UnsupportedOperationException: Type timestamp_ns not supported in ORC 
SearchArguments` and failed the read.
   
   ## Why
   
   `ExpressionToSearchArgument` maps Iceberg types to an ORC 
`PredicateLeaf.Type` in `type()` and to predicate values in `literal()`, but 
`TIMESTAMP_NANO` was handled in neither switch, and it was also missing from 
`UNSUPPORTED_TYPES` (the set of types that degrade gracefully to 
`YES_NO_NULL`). So a nanosecond-timestamp predicate fell through to the 
`default:` branch and threw, crashing any filtered read of such a column. This 
affects both `timestamp_ns` and `timestamptz_ns` (both have type id 
`TIMESTAMP_NANO`) and every predicate kind, since they all call `type()`.
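
   The failure mode described above can be sketched with a toy dispatcher. 
This is only an illustration: the enums, the `UNSUPPORTED` set, and the 
`convertBefore`/`convertAfter` methods are hypothetical stand-ins, not the 
actual `ExpressionToSearchArgument` code.

```java
import java.util.EnumSet;
import java.util.Optional;
import java.util.Set;

class TypeDispatchSketch {
  // Hypothetical stand-ins for Iceberg type ids and ORC predicate leaf types.
  enum TypeId { LONG, TIMESTAMP, TIMESTAMP_NANO, OTHER }
  enum LeafType { LONG, TIMESTAMP }

  // Types in this set are skipped gracefully instead of being converted.
  static final Set<TypeId> UNSUPPORTED = EnumSet.of(TypeId.OTHER);

  // Before: TIMESTAMP_NANO is in neither the set nor the switch, so it throws.
  static Optional<LeafType> convertBefore(TypeId id) {
    if (UNSUPPORTED.contains(id)) {
      return Optional.empty(); // no pushdown, but the read still succeeds
    }
    switch (id) {
      case LONG:
        return Optional.of(LeafType.LONG);
      case TIMESTAMP:
        return Optional.of(LeafType.TIMESTAMP);
      default:
        throw new UnsupportedOperationException("Type " + id + " not supported");
    }
  }

  // After: TIMESTAMP_NANO shares the TIMESTAMP leaf type.
  static Optional<LeafType> convertAfter(TypeId id) {
    if (UNSUPPORTED.contains(id)) {
      return Optional.empty();
    }
    switch (id) {
      case LONG:
        return Optional.of(LeafType.LONG);
      case TIMESTAMP:
      case TIMESTAMP_NANO:
        return Optional.of(LeafType.TIMESTAMP);
      default:
        throw new UnsupportedOperationException("Type " + id + " not supported");
    }
  }
}
```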
   
   ORC 1.9.8 represents timestamp predicates with `java.sql.Timestamp`, which 
carries full nanosecond precision, and evaluates row-level filters at that 
precision, so the predicate can be pushed down exactly rather than skipped.
   
   ## Changes
   
   `ExpressionToSearchArgument` now maps `TIMESTAMP_NANO` to 
`PredicateLeaf.Type.TIMESTAMP` and converts the nanos-from-epoch literal to a 
`java.sql.Timestamp`, mirroring the existing micros `TIMESTAMP` handling.
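
   As a rough sketch of the literal conversion, assuming the literal is a 
`long` count of nanoseconds from the Unix epoch: the helper name 
`nanosToTimestamp` is hypothetical and the PR's actual code may differ. Floor 
division keeps pre-epoch values correct.

```java
import java.sql.Timestamp;
import java.time.Instant;

class NanoTimestampSketch {
  private static final long NANOS_PER_SECOND = 1_000_000_000L;

  // Hypothetical helper: splits nanos-from-epoch into whole seconds plus a
  // non-negative nanosecond remainder, then builds a java.sql.Timestamp.
  static Timestamp nanosToTimestamp(long nanosFromEpoch) {
    long seconds = Math.floorDiv(nanosFromEpoch, NANOS_PER_SECOND);
    long nanoOfSecond = Math.floorMod(nanosFromEpoch, NANOS_PER_SECOND);
    return Timestamp.from(Instant.ofEpochSecond(seconds, nanoOfSecond));
  }

  public static void main(String[] args) {
    // One second and one nanosecond after the epoch.
    Timestamp ts = nanosToTimestamp(1_000_000_001L);
    System.out.println(ts.toInstant() + " nanos=" + ts.getNanos());
  }
}
```

   Using `Math.floorDiv` and `Math.floorMod` rather than `/` and `%` means a 
value such as `-1` maps to second `-1` with `999999999` nanoseconds instead of 
producing a negative nanosecond field.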
   
   ## Tests
   
   - `TestExpressionToSearchArgument#testTimestampNanoTypes` asserts the 
converted `SearchArgument` for both `timestamp_ns` and `timestamptz_ns`.
   - `TestOrcDataReader#testTimestampNanoFilterPushdownRespectsNanoseconds` 
writes rows that differ only by sub-microsecond nanoseconds and verifies that 
row-level SARG filtering returns exactly the rows past a sub-microsecond 
boundary, proving nanosecond precision is honored rather than truncated to 
micros.
   - `TestOrcDataReader#testTimestampTzNanoFilterAcrossTimezones` writes 
`timestamptz_ns` values in several different zone offsets and filters with a 
boundary expressed in yet another zone, verifying the comparison is by instant 
at nanosecond precision.
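
   The property the two reader tests check can be illustrated in isolation 
with plain JDK types. This is not the test code from the PR; the class and 
method names are hypothetical. It only shows that values in different zone 
offsets compare by instant, and that truncating to microseconds would erase a 
sub-microsecond difference.

```java
import java.sql.Timestamp;
import java.time.OffsetDateTime;
import java.time.temporal.ChronoUnit;

class NanoComparisonSketch {
  // Converts an offset date-time to a Timestamp for the same instant.
  static Timestamp toTimestamp(OffsetDateTime value) {
    return Timestamp.from(value.toInstant());
  }

  // Simulates a reader that only keeps microsecond precision.
  static Timestamp truncatedToMicros(Timestamp ts) {
    return Timestamp.from(ts.toInstant().truncatedTo(ChronoUnit.MICROS));
  }

  public static void main(String[] args) {
    // Same wall-clock instant in two offsets, differing by one nanosecond.
    Timestamp utc = toTimestamp(OffsetDateTime.parse("2024-01-01T00:00:00.000000001+00:00"));
    Timestamp ist = toTimestamp(OffsetDateTime.parse("2024-01-01T05:30:00.000000002+05:30"));

    // At nanosecond precision the first value sorts strictly before the second.
    System.out.println(utc.compareTo(ist) < 0);
    // After truncation to micros the two values become indistinguishable.
    System.out.println(truncatedToMicros(utc).equals(truncatedToMicros(ist)));
  }
}
```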


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

