cccs-jc opened a new issue, #10022:
URL: https://github.com/apache/iceberg/issues/10022

   ### Apache Iceberg version
   
   1.5.0 (latest release)
   
   ### Query engine
   
   Spark
   
   ### Please describe the bug 🐞
   
   I have a table of IP flow data in which the src_ipv4 and dst_ipv4 columns are of type long.
   I can z-order this table like so:
   ```sql
   CALL users.system.rewrite_data_files(
           table => 'users.jcc.flow',
           options => map('max-concurrent-file-group-rewrites', '20',
                          'partial-progress.enabled', 'true',
                          'rewrite-all', 'true'),
           strategy => 'sort',
           sort_order => 'zorder(src_ipv4, dst_ipv4)'
           )
   ```
   However, an IP address can also be IPv6, so the table also has src_ipv6 and dst_ipv6 columns of type BINARY. When the source IP is a v4 address, src_ipv6 is null; when it is a v6 address, src_ipv4 is null. Z-ordering on the long v4 columns works fine when they contain nulls, but z-ordering on the BINARY v6 columns does not.
   
   The following statement fails. It is the same statement as above, with zorder(src_ipv6, dst_ipv6) added to the sort order.
   
   ```sql
   CALL users.system.rewrite_data_files(
           table => 'users.jcc.flow',
           options => map('max-concurrent-file-group-rewrites', '20',
                          'partial-progress.enabled', 'true',
                          'rewrite-all', 'true'),
           strategy => 'sort',
           sort_order => 'zorder(src_ipv4, dst_ipv4), zorder(src_ipv6, dst_ipv6)'
           )
   ```
   When I run that statement I get the following error.
   
   ```
   [FAILED_EXECUTE_UDF] Failed to execute user defined function (`BYTE-TRUNCATE (functions$$$Lambda$1091/0x00007f88b08cccc8)`: (binary) => binary).
   Caused by: java.lang.NullPointerException: Cannot read the array length because "val" is null
           at org.apache.iceberg.util.ZOrderByteUtils.byteTruncateOrFill(ZOrderByteUtils.java:148)
           at org.apache.iceberg.spark.actions.SparkZOrderUDF.lambda$bytesTruncateUDF$f2cd8334$1(SparkZOrderUDF.java:269)
   ```
   
   It seems that a null check is missing for BINARY values: per the stack trace, ZOrderByteUtils.byteTruncateOrFill reads the length of a null array.
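
   To illustrate the kind of guard I mean, here is a minimal, self-contained sketch. It is not Iceberg's code: the class `NullSafeByteTruncate` and the method `truncateOrFill` are hypothetical names, and treating a null value as all-zero bytes (so null rows sort first) is my assumption about the desired ordering, not confirmed behavior.

   ```java
   import java.util.Arrays;

   // Hypothetical illustration only; this is NOT Iceberg's ZOrderByteUtils API.
   final class NullSafeByteTruncate {
       private NullSafeByteTruncate() {}

       // Returns exactly `length` bytes: the input truncated or zero-padded on the right.
       // Assumption: a null input is treated as all-zero bytes instead of throwing.
       static byte[] truncateOrFill(byte[] val, int length) {
           byte[] out = new byte[length]; // Java zero-initializes new arrays
           if (val == null) {
               return out; // the guard that appears to be missing in the failing path
           }
           System.arraycopy(val, 0, out, 0, Math.min(val.length, length));
           return out;
       }

       public static void main(String[] args) {
           byte[] v6 = new byte[16]; // stand-in for a 16-byte IPv6 address
           v6[0] = 0x20;
           v6[1] = 0x01;
           System.out.println(Arrays.toString(truncateOrFill(v6, 8)));
           System.out.println(Arrays.toString(truncateOrFill(null, 8)));
       }
   }
   ```

   With a guard like this, rows whose src_ipv6 is null would still yield a fixed-width byte array for interleaving, instead of raising the NullPointerException shown above.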
    


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

