SandeepSinghGahir commented on PR #10433:
URL: https://github.com/apache/iceberg/pull/10433#issuecomment-2353901444

   > > > > The S3 team (@ookumuso) just published what they have developed 
internally and is now used in Amazon EMR, Athena, GlueETL distributions of 
Iceberg: #11052, is there a way we can see how we could collaborate on this?
   > > > 
   > > > 
   > > > Sorry for the late reply, Absolutely @jackye1995 ! So it seems like 
the overlap between this and #11052 is primarily the retrying of the reading of 
the input stream. #11052 also includes some configurable properties and 
different configurations for the S3Client.
   > > > For the retrying of the input stream, I think the key thing I was 
trying to achieve was adding the RetryableInputStream primitive so that other 
FileIOs like GCS/Azure can also use that. How do we feel about that @jackye1995 
@ookumuso ? And then the other PR we can talk about the configuration/default 
updates?
   > > 
   > > 
   > > No worries and thanks for following up! I am okay with using the 
RetryableInputStream as long as we can we can close this out soonish since 
there seems to be a lot of people hitting the same issue. Have 2 options here:
   > > 
   > > * I can exclude the InputStream changes entirely from my PR and just 
keep the s3-retry config changes. Might be good to use the same retry configs 
that we provide in your RetryableInputStream as well
   > > * I can keep both depending on your timelines but can be replaced by 
your change, tests are mostly same but might give you some conflicts on 
InputStream.
   > > 
   > > What do you think @jackye1995 @amogh-jahagirdar ?
   > 
   > Sounds good, yeah let me get this to closure soon since it has been open 
for a while. I think I'd propose option 1 and we rebase #11052 on top of that. 
My only take is we really should try and minimize as much configuration as 
possible in Iceberg since the S3 properties is already quite complex. But I 
think we can address that in the other PR since not retrying the input stream 
is a concrete issue that quite a few folks are hitting.
   
   Sounds good. Yes, a lot of folks are hitting this issue, also mentioned 
here: https://github.com/apache/iceberg/issues/10340
   
   Do we have any expected time to merge this PR?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

Reply via email to