kinolaev opened a new pull request, #15792:
URL: https://github.com/apache/iceberg/pull/15792

   During vectorized Parquet reads, `S3InputStream` opens an unbounded HTTP 
range request (`bytes=pos-`) and reads one row group eagerly into memory. While 
Spark processes that in-memory row group (which can take several minutes for 
large batches), the client stops reading from S3. The TCP receive buffer fills 
up, and S3 eventually tears down the stalled connection.
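   To make the first step concrete, the open-ended range header looks like this (a plain-Java sketch; `unboundedRange` is a hypothetical helper for illustration, not Iceberg's actual code):

   ```java
   // Sketch of the open-ended HTTP Range header used by an unbounded ranged GET.
   // "bytes=<pos>-" asks S3 for everything from byte offset pos to the end of
   // the object, so the connection stays open until the client drains it.
   final class RangeHeaderSketch {
     static String unboundedRange(long pos) {
       return "bytes=" + pos + "-";
     }
   }
   ```

   For example, a read positioned at 128 MiB would send `Range: bytes=134217728-`.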
   
   When the next row group read begins, the connection is already dead, and the 
Apache HTTP client throws `ConnectionClosedException: Premature end of 
Content-Length delimited message body (expected: x; received: y)` (raised by 
[ContentLengthInputStream](https://github.com/apache/httpcomponents-core/blob/rel/v5.4.2/httpcore5/src/main/java/org/apache/hc/core5/http/impl/io/ContentLengthInputStream.java#L176-L178)).
 This only affects files with multiple row groups (typically >128 MB).
   
   The existing retry policy handles `SSLException`, `SocketTimeoutException`, 
and `SocketException`, but not this case. This PR extends the retry predicate 
to reopen the stream at the saved position when this specific exception is 
encountered, while leaving all other `ConnectionClosedException` variants (e.g. 
from `abort()`) unaffected.
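   A minimal sketch of the kind of message-based predicate this describes (class and method names are illustrative, not Iceberg's actual code, and a local stand-in replaces the httpcore5 exception so the sketch compiles without that dependency):

   ```java
   import java.io.IOException;

   // Stand-in for org.apache.hc.core5.http.ConnectionClosedException (which
   // extends IOException) so this sketch needs no httpcore5 dependency.
   class ConnectionClosedException extends IOException {
     ConnectionClosedException(String message) {
       super(message);
     }
   }

   // Illustrative retry predicate: treat only the "premature end of body"
   // variant as retryable, leaving other ConnectionClosedExceptions
   // (e.g. from abort()) non-retryable.
   final class RetryPredicateSketch {
     private static final String PREMATURE_EOF =
         "Premature end of Content-Length delimited message body";

     static boolean shouldRetry(Throwable failure) {
       // Walk the cause chain: the SDK often wraps the HTTP client's exception.
       for (Throwable t = failure; t != null; t = t.getCause()) {
         if (t instanceof ConnectionClosedException
             && t.getMessage() != null
             && t.getMessage().startsWith(PREMATURE_EOF)) {
           return true; // stalled connection was torn down; reopen at saved position
         }
       }
       return false;
     }
   }
   ```

   Matching on the exception message is brittle but deliberate here: it is the only way to distinguish the stalled-connection case from an intentional `abort()`, which raises the same exception type.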
   
   Fixes https://github.com/apache/iceberg/issues/9674 and 
https://github.com/apache/iceberg/issues/9679.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

