cccs-jc opened a new issue, #10022: URL: https://github.com/apache/iceberg/issues/10022
### Apache Iceberg version

1.5.0 (latest release)

### Query engine

Spark

### Please describe the bug 🐞

I have a table of IP flow data. The src_ipv4 and dst_ipv4 columns are of type long. I can zorder sort this table like so:

```sql
CALL users.system.rewrite_data_files(
  table => 'users.jcc.flow',
  options => map('max-concurrent-file-group-rewrites', '20', 'partial-progress.enabled', 'true', 'rewrite-all', 'true'),
  strategy => 'sort',
  sort_order => 'zorder(src_ipv4, dst_ipv4)'
)
```

However, IP can be IPv6, so the table also has a src_ipv6 and dst_ipv6 column of type BINARY.

When the src IP is a v4, then src_ipv6 is null. When the src IP is a v6, then src_ipv4 is null.

zordering with null v4 works just fine. But not with the v6. Running this statement fails. This is the same statement as above with an added zorder(src_ipv6, dst_ipv6).

```sql
CALL users.system.rewrite_data_files(
  table => 'users.jcc.flow',
  options => map('max-concurrent-file-group-rewrites', '20', 'partial-progress.enabled', 'true', 'rewrite-all', 'true'),
  strategy => 'sort',
  sort_order => 'zorder(src_ipv4, dst_ipv4), zorder(src_ipv6, dst_ipv6)'
)
```

When I run that statement I get the following error.

```
[FAILED_EXECUTE_UDF] Failed to execute user defined function (`BYTE-TRUNCATE (functions$$$Lambda$1091/0x00007f88b08cccc8)`: (binary) => binary).
Caused by: java.lang.NullPointerException: Cannot read the array length because "val" is null
	at org.apache.iceberg.util.ZOrderByteUtils.byteTruncateOrFill(ZOrderByteUtils.java:148)
	at org.apache.iceberg.spark.actions.SparkZOrderUDF.lambda$bytesTruncateUDF$f2cd8334$1(SparkZOrderUDF.java:269)
```

Seems like there is a check missing for null BINARY values.
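For illustration, the kind of null check the stack trace points to could look roughly like the sketch below. This is not Iceberg's code: `NullSafeZOrderBytes` and `truncateOrFill` are hypothetical names, and treating a null value as an all-zero byte array is an assumption made here, not a confirmed policy of the project.

```java
// Hypothetical helper, not Iceberg's actual implementation: a null-tolerant
// version of a "truncate or zero-fill to a fixed width" step, as used when
// building fixed-width inputs for z-order interleaving.
final class NullSafeZOrderBytes {

  private NullSafeZOrderBytes() {}

  /**
   * Returns exactly `length` bytes: `val` truncated to `length`, or
   * right-padded with 0x00 when it is shorter.
   *
   * ASSUMPTION: a null value is handled like an empty array (all zeros).
   * A real fix may choose a different policy for nulls.
   */
  static byte[] truncateOrFill(byte[] val, int length) {
    byte[] out = new byte[length]; // Java zero-initializes new arrays
    if (val == null) {
      return out; // guard: never dereference a null BINARY value
    }
    // Copy at most `length` bytes; any remaining bytes stay 0x00.
    System.arraycopy(val, 0, out, 0, Math.min(val.length, length));
    return out;
  }
}
```

With a guard of this shape, rows where src_ipv6 or dst_ipv6 is null would produce a fixed-width zero value instead of throwing a NullPointerException inside the UDF.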