pvary commented on PR #12629: URL: https://github.com/apache/iceberg/pull/12629#issuecomment-2841493295
> I tried to move `PartitionStatsHandler` to `core` and use `InternalData`, I can't make test case run as it needs `iceberg-parquet` dependencies. The tests could run with Avro in core.

How is `InternalData` tested with Parquet? We could use the same approach here as well.

> Other way is to move `PartitionStatsUtil` to `data` module, but since computing stats is core functionality, I don't want to move it to data.

I have some concerns about the size of the `core` module. I have had very bad experiences with ever-growing big modules. In Hive, everything was in the `ql` module, and the dependencies got messed up. That is why I was debating whether we could find a better place for this feature, but I don't yet have a strong opinion on this.

> So, we need to drop the idea of this movement. We need to work with having `PartitionStatsUtil` in core and `PartitionStatsHandler` in data, what can be done to keep interface simple. I will work it and also finish the compaction test.

I really want to avoid this. It would create public interfaces that are public only because we cannot organize our code better.

-- 
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org
For queries about this service, please contact Infrastructure at: us...@infra.apache.org