koenvo commented on PR #1995: URL: https://github.com/apache/iceberg-python/pull/1995#issuecomment-2932336792
> fwiw I think we should try to get this merged in at some point. Some ideas:
>
> 1. Make it a flag to use the batchreader or not, some users might have basically infinite memory
> 2. Is there a more optimal way to batch data? Thinking along the lines of using partitions although that may already happen under the hood

I've been thinking about what I (as a developer) want. The answer is: set a maximum memory usage. Some ideas:

1. Determine which partitions can fit together in memory and batch-load those together
2. Fetch parquet files in parallel and only do the loading sequentially
3. Combine 1 and 2

--
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at: us...@infra.apache.org
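The memory-capped batching described in ideas 1–3 could be sketched roughly as below. This is a hypothetical illustration, not PyIceberg code: `plan_batches`, `read_batches`, and the `fetch`/`load` callables are made-up names, and in practice the file sizes would come from the table scan's manifest entries or scan tasks.

```python
from concurrent.futures import ThreadPoolExecutor

def plan_batches(file_sizes, max_memory):
    """Greedily group files so each batch's total size stays under max_memory.

    A file larger than max_memory ends up alone in its own batch.
    """
    batches, current, current_size = [], [], 0
    for size in file_sizes:
        if current and current_size + size > max_memory:
            batches.append(current)
            current, current_size = [], 0
        current.append(size)
        current_size += size
    if current:
        batches.append(current)
    return batches

def read_batches(paths_with_sizes, max_memory, fetch, load):
    """Idea 3: fetch each batch's files in parallel, then load sequentially.

    fetch/load are placeholders for downloading a parquet file and
    decoding it into an in-memory table.
    """
    paths = [p for p, _ in paths_with_sizes]
    sizes = [s for _, s in paths_with_sizes]
    i = 0
    with ThreadPoolExecutor(max_workers=4) as pool:
        for batch in plan_batches(sizes, max_memory):
            group = paths[i:i + len(batch)]
            i += len(batch)
            buffers = list(pool.map(fetch, group))  # parallel download
            for buf in buffers:                     # sequential decode
                yield load(buf)
```

The greedy grouping keeps each batch's resident set under the user-supplied cap, while the thread pool overlaps network I/O across files in the same batch; only the memory-heavy decode step stays sequential.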