ryzhyk opened a new issue, #811: URL: https://github.com/apache/iceberg-rust/issues/811
I ran into a performance issue querying an Iceberg table in S3 via the DataFusion table provider. The table was created using pyiceberg with the following schema:

```python
from pyiceberg.schema import Schema
from pyiceberg.types import (
    BooleanType, DateType, LongType, NestedField, StringType, TimestampType,
)

schema = Schema(
    NestedField(1, "id", LongType(), required=True),
    NestedField(2, "name", StringType(), required=False),
    NestedField(3, "b", BooleanType(), required=True),
    NestedField(4, "ts", TimestampType(), required=True),
    NestedField(5, "dt", DateType(), required=True),
)
```

The table is partitioned by the date extracted from the `ts` column:

```python
from pyiceberg.partitioning import PartitionField, PartitionSpec
from pyiceberg.transforms import DayTransform

# source_id=4 refers to the `ts` field in the schema above
partition_spec = PartitionSpec(
    PartitionField(
        source_id=4, field_id=1000, transform=DayTransform(), name="date"
    )
)
```

There are 10,000,000 records in the table, spread evenly across ~200 partitions for dates between 2023-01-01 and 2023-08-02.

I query the table using `iceberg-rust` via the DataFusion table provider, using range queries of the form:

```sql
select * from my_table
where ts >= timestamp '2023-01-05T00:00:00' and ts < timestamp '2023-01-06T00:00:00'
```

I expect this query to be very efficient, as it only needs to read one partition. In practice, however, it takes about as long as scanning the entire table with `select * from my_table` (approximately 10 seconds). It looks like predicate pushdown doesn't work here for some reason.

Questions:

* Is this a performance issue in `iceberg-rust`, or am I doing something wrong?
* Is there a better way to perform this query efficiently?

I am using the latest `main` branch of this repo.

Thanks in advance!
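To make the expectation concrete, here is a minimal pure-Python sketch (not pyiceberg or `iceberg-rust` code) of why the range above should touch a single partition. It assumes the day transform maps a timestamp to whole days since 1970-01-01, as the Iceberg spec describes; the helper names `day_transform` and `partitions_for_range` are hypothetical and exist only for illustration.

```python
from datetime import date, datetime, timedelta

EPOCH = date(1970, 1, 1)


def day_transform(ts: datetime) -> int:
    """Whole days since 1970-01-01 (assumed definition of the day transform)."""
    return (ts.date() - EPOCH).days


def partitions_for_range(lo: datetime, hi: datetime) -> list[int]:
    """Day-partition values that can contain rows with lo <= ts < hi."""
    first = day_transform(lo)
    # hi is exclusive, so step back by the smallest timestamp unit (1 microsecond)
    last = day_transform(hi - timedelta(microseconds=1))
    return list(range(first, last + 1))


# The range from the query above
days = partitions_for_range(datetime(2023, 1, 5), datetime(2023, 1, 6))
print([EPOCH + timedelta(days=d) for d in days])
```

If pruning worked the way I expect, only the data files belonging to the partition values returned here would need to be opened.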