asheeshgarg commented on issue #208: URL: https://github.com/apache/iceberg-python/issues/208#issuecomment-1912636932
@jqin61 I have also seen this behavior with pyarrow.dataset.write_dataset(): it removes the partition columns from the written-out Parquet files. @syun64, the approach above looks reasonable to me. Ideally, the partitioned write could be done directly through the Arrow dataset API, with metadata-based hidden partitioning handled through the PyIceberg API, but that would take a good amount of lifting. I also haven't seen support for bucket partitioning. I think we can add the write directly using the PyArrow API as suggested above.