liurenjie1024 commented on PR #1455:
URL: https://github.com/apache/iceberg-rust/pull/1455#issuecomment-2994077975
> But, do you mean that this procedure fits in this repo, but belongs in a different crate? Or that it doesn’t fit this repository?

The procedure's implementation relies on a distributed (or parallel) compute engine. You may argue that expiring snapshots doesn't, but for others, such as compaction and rewriting data files, we do need a compute engine. The reason I suggest implementing it in DataFusion is that, as described in your issue, we would provide a working version for the community. It could also serve as an example for other compute engines of how to integrate with the core library.

> Implementing this in datafusion (say, as a UDF) does make sense, but I don’t think it should be the only way to call this. This procedure should exist with public rust apis for integration into systems that don’t depend on datafusion.

Ideally we could implement one, but it's difficult in Rust. It involves designing an abstraction over different async runtimes, memory management, etc. This is a complicated task that requires careful design and collaboration across the different compute engine communities. For now, I don't see much value in this.