Re: [I] Delete Files in Table Scans [iceberg-rust]

2025-03-05 Thread via GitHub
sdd commented on issue #630: URL: https://github.com/apache/iceberg-rust/issues/630#issuecomment-2702008341 Update: The implementation is ready for review, split across a series of PRs: * https://github.com/apache/iceberg-rust/pull/652 * https://github.com/apache/iceberg-rust/pull/9

Re: [I] Delete Files in Table Scans [iceberg-rust]

2025-02-23 Thread via GitHub
sdd commented on issue #630: URL: https://github.com/apache/iceberg-rust/issues/630#issuecomment-2676912068 I've worked on an improved design for loading and parsing of delete files by the `DeleteFileManager`. The code for this can be seen in https://github.com/apache/iceberg-rust/pull/982.

Re: [I] Delete Files in Table Scans [iceberg-rust]

2025-02-18 Thread via GitHub
sdd commented on issue #630: URL: https://github.com/apache/iceberg-rust/issues/630#issuecomment-2667616082 @ZENOTME I think that we'll want to do that at some point but it feels more of a day 2 task. We're not touching the disk anywhere in the library so far, as far as I know, and so it wo

Re: [I] Delete Files in Table Scans [iceberg-rust]

2025-02-11 Thread via GitHub
ZENOTME commented on issue #630: URL: https://github.com/apache/iceberg-rust/issues/630#issuecomment-2651384092 Thanks for the great work, @sdd! Should we also consider the case where `enum Deletes` occupies too much space, so that we need to support spilling it to disk?

Re: [I] Delete Files in Table Scans [iceberg-rust]

2025-02-08 Thread via GitHub
sdd commented on issue #630: URL: https://github.com/apache/iceberg-rust/issues/630#issuecomment-2644861873 OK, I have an improved design for loading of delete files in the read phase that I'll share shortly. We introduce a DeleteFileManager, constructed when ArrowReader gets built a

Re: [I] Delete Files in Table Scans [iceberg-rust]

2025-02-07 Thread via GitHub
sdd commented on issue #630: URL: https://github.com/apache/iceberg-rust/issues/630#issuecomment-2644037450 Hi all. I'm resurrecting this issue now that @Fokko has kindly helped get the first part of this over the line by reviewing and merging https://github.com/apache/iceberg-rust/pull/652

Re: [I] Delete Files in Table Scans [iceberg-rust]

2024-09-27 Thread via GitHub
sdd commented on issue #630: URL: https://github.com/apache/iceberg-rust/issues/630#issuecomment-2378632325 Thanks for taking a look at the above, @liurenjie1024. I've just submitted a draft PR which outlines the second part of the approach - how we extend the filtering in the arrow reader

Re: [I] Delete Files in Table Scans [iceberg-rust]

2024-09-26 Thread via GitHub
sdd commented on issue #630: URL: https://github.com/apache/iceberg-rust/issues/630#issuecomment-2376563426 Thanks - I have some skeleton code for the required changes to reader.rs that I'm going to share over the next few days as well.

Re: [I] Delete Files in Table Scans [iceberg-rust]

2024-09-26 Thread via GitHub
liurenjie1024 commented on issue #630: URL: https://github.com/apache/iceberg-rust/issues/630#issuecomment-2376486699 Thanks @sdd for raising this. The general approach looks good to me. The challenging part of deletion file processing is filtering out unnecessary deletion files in each task, which

Re: [I] Delete Files in Table Scans [iceberg-rust]

2024-09-17 Thread via GitHub
sdd commented on issue #630: URL: https://github.com/apache/iceberg-rust/issues/630#issuecomment-2357605851 I'm happy to add the partitioning result to the task. This is useful to the executor node when deciding how to distribute tasks, as it enables the use of a few different strategies, t

Re: [I] Delete Files in Table Scans [iceberg-rust]

2024-09-14 Thread via GitHub
xxhZs commented on issue #630: URL: https://github.com/apache/iceberg-rust/issues/630#issuecomment-2350911899 Hi, I've recently implemented merge on read in my library using iceberg rust and submitted a working simplified version of the code, which looks somewhat similar to the `A naive app