Fokko commented on issue #7067: URL: https://github.com/apache/iceberg/issues/7067#issuecomment-1464229388
@asheeshgarg Ah nice, that works, but it has some caveats that you need to be aware of. Iceberg tracks columns by ID instead of by name. For example, if you rename a column, we do this on the table schema, and the existing data files are not rewritten. When we read the files and encounter one that still has the old column name, we update the name based on the ID of the column. Things like deletes need to be handled as well. This makes it quite an effort to implement Iceberg support in engines like Polars (mostly because there is no Rust implementation yet). With the upcoming 0.4.0 version we'll get even more performance, because we now also have metrics evaluation (skipping Parquet files based on their upper and lower bounds) and positional deletes. I suggested creating a Polars dataframe from an Arrow table because then you get things like the projection and deletes for free :)
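
To make the ID-based column tracking a bit more concrete, here is a minimal sketch in plain Python. It is not PyIceberg's actual code: `Field` and `resolve_names_by_id` are hypothetical names made up for illustration, and the schemas are toy data. It only shows the idea that a file written under an old column name is mapped to the current name through the field ID.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Field:
    # Hypothetical stand-in for a schema field: a stable ID plus a name.
    field_id: int
    name: str


def resolve_names_by_id(file_fields, table_fields):
    """Map each column name found in a data file to its current table name."""
    current_name = {f.field_id: f.name for f in table_fields}
    # Columns whose ID is not in the table schema are left out of the mapping.
    return {
        f.name: current_name[f.field_id]
        for f in file_fields
        if f.field_id in current_name
    }


# A file written before the column with ID 2 was renamed from "ts" to "event_time".
file_schema = [Field(1, "id"), Field(2, "ts")]
table_schema = [Field(1, "id"), Field(2, "event_time")]

mapping = resolve_names_by_id(file_schema, table_schema)
print(mapping)  # {'id': 'id', 'ts': 'event_time'}
```

An engine that reads the Parquet files directly by name would miss the `ts` column here, which is why going through the Arrow table is the easier route.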