wfxxh opened a new issue, #13635:
URL: https://github.com/apache/iceberg/issues/13635

   ### Apache Iceberg version
   
   1.9.2 (latest release)
   
   ### Query engine
   
   Spark
   
   ### Please describe the bug 🐞
   
   ## 🧭 Problem Summary
   
   My environment:
   - iceberg: 1.7.x ~ 1.9.x
   - spark: 3.5.3
   - jdk: openjdk version "17.0.2" 2022-01-18
   - linux: 5.12.5-1.el7.elrepo.x86_64
   
   When querying an Iceberg table partitioned by (year, month) with a Parquet 
bloom filter enabled on a STRING column (`resource_id`), the query returns **0 
rows** on **Linux** (Iceberg 1.9.x + Spark 3.5.3) but returns the correct rows on 
**Windows** or after downgrading to **Iceberg 1.7.x**.
   
   This discrepancy leads to **incorrect query results** and is 
platform-dependent.
   
   ---
   
   ## 📦 Table DDL
   
   > CREATE TABLE IF NOT EXISTS iceberg_catalog.test.xxh (
     date_time TIMESTAMP,
     operate_type INT,
     resource_id STRING,
     year INT,
     month INT,
     day INT
   )
   USING iceberg
   PARTITIONED BY (year, month)
   TBLPROPERTIES (
     'write.distribution-mode' = 'hash',
     'write.metadata.delete-after-commit.enabled' = 'true',
     'write.metadata.previous-versions-max' = '2',
     'write.parquet.bloom-filter-enabled.column.resource_id' = 'true',
     'write.parquet.compression-codec' = 'zstd',
     'write.target-file-size-bytes' = '4294967296'
   ); 
   
   ---
   
   ## On Linux, Iceberg v1.7.2 returns the correct result; v1.9.2 does not
   
   <img width="1899" height="649" alt="Image" 
src="https://github.com/user-attachments/assets/8a86cea2-11a4-4902-a504-6abe4fae9210"
 />
   
   ---
   
   ## On Windows, Iceberg v1.9.2 returns the correct result
   
   <img width="1301" height="711" alt="Image" 
src="https://github.com/user-attachments/assets/7f4bb7a6-52f2-432b-9223-92ac22aab5af"
 />
   
   ---
   
   ## Spark versions > 3.5.3 with Iceberg 1.7.1 produce a different error
   
   <img width="1783" height="581" alt="Image" 
src="https://github.com/user-attachments/assets/39383885-9cc5-4dc4-9c65-8f5d782e367e"
 />
   
   **When I use this code, it works well. But I know how to set 
vectorization-enabled in SQL.**
   
   <img width="1084" height="602" alt="Image" 
src="https://github.com/user-attachments/assets/b42ff87b-501e-4b9c-9458-cfacef95089a"
 />
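   
   For reference, a minimal sketch of how vectorized Parquet reads could be toggled from SQL, assuming the workaround in the screenshot above amounts to turning vectorization off (the screenshot's code is not reproduced here). `read.parquet.vectorization.enabled` is Iceberg's table-level read property and `spark.sql.iceberg.vectorization.enabled` is the Spark session-level setting. Whether either one avoids the empty result on Linux has not been verified here.
   
   ```sql
   -- Option 1 (table level): disable vectorized Parquet reads for this table only.
   ALTER TABLE iceberg_catalog.test.xxh
   SET TBLPROPERTIES ('read.parquet.vectorization.enabled' = 'false');
   
   -- Option 2 (session level): disable Iceberg vectorized reads for the current Spark session.
   SET spark.sql.iceberg.vectorization.enabled = false;
   ```
   
   The table-level property persists in table metadata and affects every reader, while the session-level setting only lasts for the current Spark session, so the second option is the less invasive one for a quick comparison.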
   
   ---
   
   ## This is my data file
   
   [data.zip](https://github.com/user-attachments/files/21378978/data.zip)
   
   
   ### Willingness to contribute
   
   - [ ] I can contribute a fix for this bug independently
   - [ ] I would be willing to contribute a fix for this bug with guidance from 
the Iceberg community
   - [x] I cannot contribute a fix for this bug at this time


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

