ypsah opened a new issue, #1937:
URL: https://github.com/apache/iceberg-python/issues/1937

   ### Apache Iceberg version
   
   0.9.0 (latest release)
   
   ### Please describe the bug 🐞
   
   Hi, thanks for writing `pyiceberg`.
   
   The bug is pretty much described in the title: `table.scan(row_filter="x IN (0, 1)")` does not return the rows where `x = 0` when `x` is a `DoubleType` partition column.
   
   Here is a reproducer:
   
   ```bash
   pip install "pyiceberg[sql-sqlite,pyarrow]"
   ```
   
   ```python
   from pathlib import Path
   from tempfile import TemporaryDirectory
   
   import pyarrow
   from pyiceberg.catalog.sql import SqlCatalog
   from pyiceberg.schema import Schema
   from pyiceberg.transforms import IdentityTransform
   from pyiceberg.types import DoubleType, NestedField
   from pyiceberg.partitioning import PartitionSpec, PartitionField
   
   schema = Schema(
       NestedField(field_id=1, name="x", field_type=DoubleType()),
       NestedField(field_id=2, name="y", field_type=DoubleType()),
   )
   partition_spec = PartitionSpec(
       PartitionField(source_id=1, field_id=1001, transform=IdentityTransform(), name="x")
   )
   
   with TemporaryDirectory() as tmpdir:
       catalog = SqlCatalog(
           "local",
           uri=f"sqlite:///{tmpdir}/catalog.db",
           warehouse=f"file://{tmpdir}/warehouse",
       )
       catalog.create_namespace("test")
       table = catalog.create_table(
           "test.test", schema=schema, partition_spec=partition_spec
       )
   
       data = pyarrow.table(
           {
               "x": [0.0, 1.0, 2.0],
               "y": [0.0, 0.0, 0.0],
           }
       )
       table.overwrite(data)
   
       print("=== no filter ===")
       print(table.scan().to_arrow())
       print("=== x IN (0) ===")
       print(table.scan(row_filter="x IN (0)").to_arrow())
       print("=== x IN (0, 1, 2) ===")
       print(table.scan(row_filter="x IN (0, 1, 2)").to_arrow())
   ```
   
   Output:
   
   ```
   
   /tmp/tmp.l2MLQFjC7C-05duO9h5/lib/python3.13/site-packages/pyiceberg/table/__init__.py:686: UserWarning: Delete operation did not match any records
     warnings.warn("Delete operation did not match any records")
   === no filter ===
   pyarrow.Table
   x: double
   y: double
   ----
   x: [[0],[1],[2]]
   y: [[0],[0],[0]]
   === x IN (0) ===
   pyarrow.Table
   x: double
   y: double
   ----
   x: [[0]]
   y: [[0]]
   === x IN (0, 1, 2) ===
   pyarrow.Table
   x: double
   y: double
   ----
   x: [[1],[2]]
   y: [[0],[0]]
   ```
   
   I expect the output for `x IN (0, 1, 2)` to match that of the `no filter` scan, but the row with `x = 0` is missing, even though `x IN (0)` on its own returns it.
   
   Note that I could not reproduce the problem when `x` is a `LongType` instead of a `DoubleType`.
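
To turn the expectation above into a check, a small helper along these lines could list which rows a filtered scan dropped. This is only a sketch: `missing_rows` is a hypothetical name, not part of `pyiceberg`, and the sample rows are copied by hand from the output above rather than read from a live table.

```python
def missing_rows(expected, actual):
    """Return the rows of `expected` that do not appear in `actual`.

    Both arguments are lists of dicts, the shape returned by
    pyarrow.Table.to_pylist(). Duplicate rows are matched one-to-one.
    """
    remaining = [tuple(sorted(row.items())) for row in actual]
    missing = []
    for row in expected:
        key = tuple(sorted(row.items()))
        if key in remaining:
            remaining.remove(key)
        else:
            missing.append(row)
    return missing


# Rows copied from the reported output (assumption: not a live scan).
unfiltered = [{"x": 0.0, "y": 0.0}, {"x": 1.0, "y": 0.0}, {"x": 2.0, "y": 0.0}]
filtered = [{"x": 1.0, "y": 0.0}, {"x": 2.0, "y": 0.0}]

print(missing_rows(unfiltered, filtered))  # [{'x': 0.0, 'y': 0.0}]
```

Inside the reproducer, the two lists would come from `table.scan().to_arrow().to_pylist()` and `table.scan(row_filter="x IN (0, 1, 2)").to_arrow().to_pylist()`; an empty result would mean the filtered scan returned every expected row.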
   
   ### Willingness to contribute
   
   - [ ] I can contribute a fix for this bug independently
   - [x] I would be willing to contribute a fix for this bug with guidance from the Iceberg community
   - [ ] I cannot contribute a fix for this bug at this time


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

