sdd commented on issue #630: URL: https://github.com/apache/iceberg-rust/issues/630#issuecomment-2644861873
OK, I have an improved design for loading delete files in the read phase that I'll share shortly.

We introduce a DeleteFileManager, constructed when the ArrowReader is built and provided with a FileIO. The reader keeps an Arc of it, which it clones and passes to process_file_scan_task. process_file_scan_task calls an async method on DeleteFileManager, passing in the delete file list for its file scan task. DeleteFileManager loads and processes the delete files, deduplicating across file scan tasks that reference the same delete files.

DeleteFileManager exposes two methods that process_file_scan_task calls later on: one to retrieve the list of positional delete row indices that apply to a specified data file, and another to get a filter predicate derived from the applicable delete files.
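To make the shape of this concrete, here is a rough, std-only sketch of the deduplication and the positional-delete lookup. It is not the proposed implementation. It is synchronous rather than async, FakeFileIo is a hypothetical stand-in for FileIO, the method names load_deletes and positional_deletes_for are invented for illustration, and the filter-predicate method is left out entirely.

```rust
use std::collections::{HashMap, HashSet};
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::{Arc, Mutex};

/// Hypothetical parsed form of a positional delete file:
/// (data file path, deleted row position) pairs.
type PositionalDeletes = Vec<(String, u64)>;

/// Hypothetical stand-in for FileIO. It serves in-memory "files" and
/// counts reads so that deduplication can be observed.
#[derive(Default)]
pub struct FakeFileIo {
    files: HashMap<String, PositionalDeletes>,
    reads: AtomicUsize,
}

impl FakeFileIo {
    pub fn with_file(mut self, path: &str, deletes: &[(&str, u64)]) -> Self {
        let rows = deletes.iter().map(|(f, p)| (f.to_string(), *p)).collect();
        self.files.insert(path.to_string(), rows);
        self
    }

    fn read(&self, path: &str) -> PositionalDeletes {
        self.reads.fetch_add(1, Ordering::SeqCst);
        self.files.get(path).cloned().unwrap_or_default()
    }
}

#[derive(Default)]
struct State {
    /// Delete files that have already been loaded.
    loaded: HashSet<String>,
    /// Data file path -> sorted, deduplicated deleted row positions.
    positions: HashMap<String, Vec<u64>>,
}

pub struct DeleteFileManager {
    file_io: FakeFileIo,
    state: Mutex<State>,
}

impl DeleteFileManager {
    pub fn new(file_io: FakeFileIo) -> Self {
        Self { file_io, state: Mutex::new(State::default()) }
    }

    /// Load the given delete files, skipping any that an earlier task
    /// already loaded. Holding the lock across the read keeps this sketch
    /// simple; an async version would likely avoid that.
    pub fn load_deletes(&self, delete_files: &[&str]) {
        let mut state = self.state.lock().unwrap();
        for &path in delete_files {
            if !state.loaded.insert(path.to_string()) {
                continue; // already loaded for another file scan task
            }
            for (data_file, pos) in self.file_io.read(path) {
                state.positions.entry(data_file).or_default().push(pos);
            }
        }
        for rows in state.positions.values_mut() {
            rows.sort_unstable();
            rows.dedup();
        }
    }

    /// Positional delete row indices that apply to one data file.
    pub fn positional_deletes_for(&self, data_file: &str) -> Vec<u64> {
        let state = self.state.lock().unwrap();
        state.positions.get(data_file).cloned().unwrap_or_default()
    }

    /// Number of delete-file reads performed so far.
    pub fn read_count(&self) -> usize {
        self.file_io.reads.load(Ordering::SeqCst)
    }
}

/// Hypothetical, simplified stand-in for process_file_scan_task: it
/// receives a cloned Arc of the manager, loads the task's delete files,
/// then asks for the positions that apply to its data file.
pub fn process_file_scan_task(
    manager: Arc<DeleteFileManager>,
    data_file: &str,
    delete_files: &[&str],
) -> Vec<u64> {
    manager.load_deletes(delete_files);
    manager.positional_deletes_for(data_file)
}
```

In this sketch, two tasks that list the same delete file cause only one read, because the second call finds the path in the loaded set and skips it.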