ludlows opened a new issue, #6759:
URL: https://github.com/apache/iceberg/issues/6759

   ### Feature Request / Improvement
   
   
   ### Improvement
   
   ### Background
   
   We know that rewriteDataFiles is recommended to run periodically.
   In our production environment, we would like to run rewriteDataFiles on an Iceberg table
   once a month using the Spark SQL procedure rewrite_data_files.
   
   For convenience, we add the following SQL command to each daily ETL job.
   `
   call catalog.system.rewrite_data_files(table=>'hive.iceberg_table', where => "truncated(load_date, 6) = '$LASTMONTH' and substr('$TODAY', 7,2) = '03'" )
   `
   For instance, when $TODAY = '20230208', substr('$TODAY', 7,2) evaluates to '08',
   so the where condition is always false. We therefore expected rewrite_data_files
   to exit directly without doing any work.
   
   However, we get an exception instead. The same failure can be reproduced by executing the SQL:
   `
   call catalog.system.rewrite_data_files(table=>'hive.iceberg_table', where =>" '01'='03' ")
   `
   An AnalysisException is thrown by the Scala code linked below, because the Option
   object obtained by filtering on the where condition is empty.
   
https://github.com/apache/iceberg/blob/32a8ef52ddf20aa2068dfff8f9e73bd5d27ef610/spark/v3.3/spark/src/main/scala/org/apache/spark/sql/execution/datasources/SparkExpressionConverter.scala#L47
 
   
   ### Our Request
   Would it be possible to make rewrite_data_files exit directly, without throwing an
   exception, when the where condition is deterministically false?
   
   
   
   
   
   ### Query engine
   
   Spark


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

