sandugood commented on issue #4723:
URL: 
https://github.com/apache/datafusion-comet/issues/4723#issuecomment-4799748849

   Returning with additional info:
   1. While debugging and rerunning the pipeline, I once got: `WARN 
CometIcebergNativeScan: Failed to serialize delete file: null`
   2. Additionally, a single `.count()` over the data returns a higher row 
count with Comet than with vanilla Spark.
   
   For context: I am using spark-4.0.3 (official image) plus additional .jar 
files inside the image (all of them pulled from a maven-central proxy):
   - compiled Comet (from `main` branch, with the `iceberg-rust` crate also 
pulled from `main` branch)
   - `iceberg-spark-runtime-4.0_2.13-1.11.0.jar`
   - `iceberg-aws-bundle-1.11.0.jar`
   
   For comparison, vanilla Spark has all of the .jar files listed above except 
the Comet one.
   
   So, right now, I would say that these wrong values come purely from reading 
with the native Iceberg scan on. When I enabled Comet but did not enable the 
native scan, the results were the same as with vanilla Spark.
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

