singhpk234 commented on PR #13400: URL: https://github.com/apache/iceberg/pull/13400#issuecomment-3251375107
> client/server backpressure and how this would fit in for engines which immediately start task consumption/execution during planning (like Trino)

My understanding was that `ParallelIterable` could be helpful here, since it is aware of both the consumer and the producer and handles backpressure via yields:
https://github.com/apache/iceberg/blob/be03c998d96d0d1fae13aa8c53d6c7c87e2d60ba/core/src/main/java/org/apache/iceberg/util/ParallelIterable.java#L200-L205

Agreed, we need to think this through more thoroughly, including from the server's point of view. From my cursory reading of the Trino source code (I am fairly new to it), split generation and consumption should mostly work, since this appears to be built into the engine itself:
https://github.com/trinodb/trino/blob/master/plugin/trino-iceberg/src/main/java/io/trino/plugin/iceberg/IcebergSplitManager.java#L149

For engines that need all the splits computed first, we have no choice but to consume everything.
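
To make the "backpressure via yields" idea concrete, here is a minimal, single-threaded sketch. It is not the `ParallelIterable` code linked above; the class name `YieldingPlanner`, the `maxQueueSize` parameter, and the counters are all hypothetical and exist only for illustration. The producer step fills a bounded queue and returns (yields) once the queue is full; it is resumed only after the consumer has drained the queue, so planning never runs more than `maxQueueSize` items ahead of consumption.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.Queue;

/** Hypothetical sketch of yield-based backpressure; not Iceberg's ParallelIterable. */
final class YieldingPlanner<T> implements Iterator<T> {
  private final Iterator<T> source;   // stands in for task/split planning
  private final int maxQueueSize;     // assumed backpressure threshold
  private final Queue<T> queue = new ArrayDeque<>();
  private int yields = 0;             // how often the producer backed off
  private int maxQueued = 0;          // largest queue size ever observed

  YieldingPlanner(Iterator<T> source, int maxQueueSize) {
    if (maxQueueSize < 1) {
      throw new IllegalArgumentException("maxQueueSize must be >= 1");
    }
    this.source = source;
    this.maxQueueSize = maxQueueSize;
  }

  /** Producer step: plan until the queue is full, then yield to the consumer. */
  private void produce() {
    while (source.hasNext()) {
      if (queue.size() >= maxQueueSize) {
        yields++;
        return; // yield: stop planning until the consumer catches up
      }
      queue.add(source.next());
      maxQueued = Math.max(maxQueued, queue.size());
    }
  }

  @Override
  public boolean hasNext() {
    if (queue.isEmpty()) {
      produce(); // resume the producer only when the consumer has drained the queue
    }
    return !queue.isEmpty();
  }

  @Override
  public T next() {
    if (!hasNext()) {
      throw new NoSuchElementException();
    }
    return queue.poll();
  }

  int yields() {
    return yields;
  }

  int maxQueued() {
    return maxQueued;
  }

  public static void main(String[] args) {
    List<Integer> splits = new ArrayList<>();
    for (int i = 0; i < 10; i++) {
      splits.add(i);
    }
    YieldingPlanner<Integer> planner = new YieldingPlanner<>(splits.iterator(), 3);
    List<Integer> consumed = new ArrayList<>();
    while (planner.hasNext()) {
      consumed.add(planner.next());
    }
    System.out.println(consumed + " maxQueued=" + planner.maxQueued()
        + " yields=" + planner.yields());
  }
}
```

An engine that starts executing tasks while planning is still running would drive `hasNext()`/`next()` incrementally, so the queue bound holds. An engine that needs every split up front would simply drain the iterator in one go, which matches the "no choice but to consume everything" case. The sketch is kept single-threaded so it stays deterministic; a real implementation would run the producer side concurrently.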
