cgpoh opened a new issue, #6606:
URL: https://github.com/apache/iceberg/issues/6606

   ### Apache Iceberg version
   
   1.1.0 (latest release)
   
   ### Query engine
   
   Flink
   
   ### Please describe the bug 🐞
   
   Operating environment:
   
   - Flink 1.15.2
   - Iceberg 1.1.0
   - Hadoop AWS 2.10.1
   - MinIO S3 storage
   
   When running a Flink job that streams from an Iceberg table, Flink throws the following exception after a few hours and is then unable to restart the job (a minimal sketch of the job wiring follows the trace):
   
   ```
   2023-01-17 05:34:55,864 WARN  org.apache.iceberg.util.Tasks [] - Retrying task after failure: Failed to open input stream for file: s3a://recordings/raw_2019/fpl/metadata/04726-837067cc-0731-44a9-a99c-68b4c7c9e8f8.metadata.json
   org.apache.iceberg.exceptions.RuntimeIOException: Failed to open input stream for file: s3a://recordings/raw_2019/fpl/metadata/04726-837067cc-0731-44a9-a99c-68b4c7c9e8f8.metadata.json
       at org.apache.iceberg.hadoop.HadoopInputFile.newStream(HadoopInputFile.java:187) ~[blob_p-ef9f7a65353e12b1d19241c408ca3df4fbd64570-24f467b241db1d55f5345b551c2cf4ed:?]
       at org.apache.iceberg.TableMetadataParser.read(TableMetadataParser.java:273) ~[blob_p-ef9f7a65353e12b1d19241c408ca3df4fbd64570-24f467b241db1d55f5345b551c2cf4ed:?]
       at org.apache.iceberg.TableMetadataParser.read(TableMetadataParser.java:267) ~[blob_p-ef9f7a65353e12b1d19241c408ca3df4fbd64570-24f467b241db1d55f5345b551c2cf4ed:?]
       at org.apache.iceberg.BaseMetastoreTableOperations.lambda$refreshFromMetadataLocation$0(BaseMetastoreTableOperations.java:183) ~[blob_p-ef9f7a65353e12b1d19241c408ca3df4fbd64570-24f467b241db1d55f5345b551c2cf4ed:?]
       at org.apache.iceberg.BaseMetastoreTableOperations.lambda$refreshFromMetadataLocation$1(BaseMetastoreTableOperations.java:202) ~[blob_p-ef9f7a65353e12b1d19241c408ca3df4fbd64570-24f467b241db1d55f5345b551c2cf4ed:?]
       at org.apache.iceberg.util.Tasks$Builder.runTaskWithRetry(Tasks.java:402) ~[blob_p-ef9f7a65353e12b1d19241c408ca3df4fbd64570-24f467b241db1d55f5345b551c2cf4ed:?]
       at org.apache.iceberg.util.Tasks$Builder.runSingleThreaded(Tasks.java:212) ~[blob_p-ef9f7a65353e12b1d19241c408ca3df4fbd64570-24f467b241db1d55f5345b551c2cf4ed:?]
       at org.apache.iceberg.util.Tasks$Builder.run(Tasks.java:196) ~[blob_p-ef9f7a65353e12b1d19241c408ca3df4fbd64570-24f467b241db1d55f5345b551c2cf4ed:?]
       at org.apache.iceberg.util.Tasks$Builder.run(Tasks.java:189) ~[blob_p-ef9f7a65353e12b1d19241c408ca3df4fbd64570-24f467b241db1d55f5345b551c2cf4ed:?]
       at org.apache.iceberg.BaseMetastoreTableOperations.refreshFromMetadataLocation(BaseMetastoreTableOperations.java:202) ~[blob_p-ef9f7a65353e12b1d19241c408ca3df4fbd64570-24f467b241db1d55f5345b551c2cf4ed:?]
       at org.apache.iceberg.BaseMetastoreTableOperations.refreshFromMetadataLocation(BaseMetastoreTableOperations.java:179) ~[blob_p-ef9f7a65353e12b1d19241c408ca3df4fbd64570-24f467b241db1d55f5345b551c2cf4ed:?]
       at org.apache.iceberg.BaseMetastoreTableOperations.refreshFromMetadataLocation(BaseMetastoreTableOperations.java:174) ~[blob_p-ef9f7a65353e12b1d19241c408ca3df4fbd64570-24f467b241db1d55f5345b551c2cf4ed:?]
       at org.apache.iceberg.hive.HiveTableOperations.doRefresh(HiveTableOperations.java:243) ~[blob_p-ef9f7a65353e12b1d19241c408ca3df4fbd64570-24f467b241db1d55f5345b551c2cf4ed:?]
       at org.apache.iceberg.BaseMetastoreTableOperations.refresh(BaseMetastoreTableOperations.java:97) ~[blob_p-ef9f7a65353e12b1d19241c408ca3df4fbd64570-24f467b241db1d55f5345b551c2cf4ed:?]
       at org.apache.iceberg.BaseMetastoreTableOperations.current(BaseMetastoreTableOperations.java:80) ~[blob_p-ef9f7a65353e12b1d19241c408ca3df4fbd64570-24f467b241db1d55f5345b551c2cf4ed:?]
       at org.apache.iceberg.BaseMetastoreCatalog.loadTable(BaseMetastoreCatalog.java:44) ~[blob_p-ef9f7a65353e12b1d19241c408ca3df4fbd64570-24f467b241db1d55f5345b551c2cf4ed:?]
       at org.apache.iceberg.flink.TableLoader$CatalogTableLoader.loadTable(TableLoader.java:109) ~[blob_p-ef9f7a65353e12b1d19241c408ca3df4fbd64570-24f467b241db1d55f5345b551c2cf4ed:?]
       at org.apache.iceberg.flink.source.IcebergSource.lazyTable(IcebergSource.java:125) ~[blob_p-ef9f7a65353e12b1d19241c408ca3df4fbd64570-24f467b241db1d55f5345b551c2cf4ed:?]
       at org.apache.iceberg.flink.source.IcebergSource.createReader(IcebergSource.java:142) ~[blob_p-ef9f7a65353e12b1d19241c408ca3df4fbd64570-24f467b241db1d55f5345b551c2cf4ed:?]
       at org.apache.flink.streaming.api.operators.SourceOperator.initReader(SourceOperator.java:286) ~[flink-dist-1.15.2.jar:1.15.2]
       at org.apache.flink.streaming.runtime.tasks.SourceOperatorStreamTask.init(SourceOperatorStreamTask.java:94) ~[flink-dist-1.15.2.jar:1.15.2]
       at org.apache.flink.streaming.runtime.tasks.StreamTask.restoreInternal(StreamTask.java:666) ~[flink-dist-1.15.2.jar:1.15.2]
       at org.apache.flink.streaming.runtime.tasks.StreamTask.restore(StreamTask.java:643) ~[flink-dist-1.15.2.jar:1.15.2]
       at org.apache.flink.runtime.taskmanager.Task.runWithSystemExitMonitoring(Task.java:948) [flink-dist-1.15.2.jar:1.15.2]
       at org.apache.flink.runtime.taskmanager.Task.restoreAndInvoke(Task.java:917) [flink-dist-1.15.2.jar:1.15.2]
       at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:741) [flink-dist-1.15.2.jar:1.15.2]
       at org.apache.flink.runtime.taskmanager.Task.run(Task.java:563) [flink-dist-1.15.2.jar:1.15.2]
       at java.lang.Thread.run(Unknown Source) [?:?]
   Caused by: java.io.InterruptedIOException: getFileStatus on s3a://recordings/raw_2019/fpl/metadata/04726-837067cc-0731-44a9-a99c-68b4c7c9e8f8.metadata.json: com.amazonaws.SdkClientException: Unable to execute HTTP request: Timeout waiting for connection from pool
       at org.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:141) ~[blob_p-ef9f7a65353e12b1d19241c408ca3df4fbd64570-24f467b241db1d55f5345b551c2cf4ed:?]
       at org.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:117) ~[blob_p-ef9f7a65353e12b1d19241c408ca3df4fbd64570-24f467b241db1d55f5345b551c2cf4ed:?]
       at org.apache.hadoop.fs.s3a.S3AFileSystem.s3GetFileStatus(S3AFileSystem.java:1926) ~[blob_p-ef9f7a65353e12b1d19241c408ca3df4fbd64570-24f467b241db1d55f5345b551c2cf4ed:?]
       at org.apache.hadoop.fs.s3a.S3AFileSystem.innerGetFileStatus(S3AFileSystem.java:1876) ~[blob_p-ef9f7a65353e12b1d19241c408ca3df4fbd64570-24f467b241db1d55f5345b551c2cf4ed:?]
       at org.apache.hadoop.fs.s3a.S3AFileSystem.getFileStatus(S3AFileSystem.java:1812) ~[blob_p-ef9f7a65353e12b1d19241c408ca3df4fbd64570-24f467b241db1d55f5345b551c2cf4ed:?]
       at org.apache.hadoop.fs.s3a.S3AFileSystem.open(S3AFileSystem.java:611) ~[blob_p-ef9f7a65353e12b1d19241c408ca3df4fbd64570-24f467b241db1d55f5345b551c2cf4ed:?]
       at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:787) ~[flink-shaded-hadoop-2-uber-2.8.3-10.0.jar:2.8.3-10.0]
       at org.apache.iceberg.hadoop.HadoopInputFile.newStream(HadoopInputFile.java:183) ~[blob_p-ef9f7a65353e12b1d19241c408ca3df4fbd64570-24f467b241db1d55f5345b551c2cf4ed:?]
       ... 27 more
   Caused by: com.amazonaws.SdkClientException: Unable to execute HTTP request: Timeout waiting for connection from pool
       at com.amazonaws.http.AmazonHttpClient$RequestExecutor.execute(AmazonHttpClient.java:699) ~[blob_p-ef9f7a65353e12b1d19241c408ca3df4fbd64570-24f467b241db1d55f5345b551c2cf4ed:?]
       at com.amazonaws.http.AmazonHttpClient$RequestExecutor.access$500(AmazonHttpClient.java:667) ~[blob_p-ef9f7a65353e12b1d19241c408ca3df4fbd64570-24f467b241db1d55f5345b551c2cf4ed:?]
       at com.amazonaws.http.AmazonHttpClient$RequestExecutionBuilderImpl.execute(AmazonHttpClient.java:649) ~[blob_p-ef9f7a65353e12b1d19241c408ca3df4fbd64570-24f467b241db1d55f5345b551c2cf4ed:?]
       at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:513) ~[blob_p-ef9f7a65353e12b1d19241c408ca3df4fbd64570-24f467b241db1d55f5345b551c2cf4ed:?]
       at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:4325) ~[blob_p-ef9f7a65353e12b1d19241c408ca3df4fbd64570-24f467b241db1d55f5345b551c2cf4ed:?]
       at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:4272) ~[blob_p-ef9f7a65353e12b1d19241c408ca3df4fbd64570-24f467b241db1d55f5345b551c2cf4ed:?]
       at com.amazonaws.services.s3.AmazonS3Client.getObjectMetadata(AmazonS3Client.java:1264) ~[blob_p-ef9f7a65353e12b1d19241c408ca3df4fbd64570-24f467b241db1d55f5345b551c2cf4ed:?]
       at org.apache.hadoop.fs.s3a.S3AFileSystem.getObjectMetadata(S3AFileSystem.java:1086) ~[blob_p-ef9f7a65353e12b1d19241c408ca3df4fbd64570-24f467b241db1d55f5345b551c2cf4ed:?]
       at org.apache.hadoop.fs.s3a.S3AFileSystem.s3GetFileStatus(S3AFileSystem.java:1912) ~[blob_p-ef9f7a65353e12b1d19241c408ca3df4fbd64570-24f467b241db1d55f5345b551c2cf4ed:?]
       at org.apache.hadoop.fs.s3a.S3AFileSystem.innerGetFileStatus(S3AFileSystem.java:1876) ~[blob_p-ef9f7a65353e12b1d19241c408ca3df4fbd64570-24f467b241db1d55f5345b551c2cf4ed:?]
       at org.apache.hadoop.fs.s3a.S3AFileSystem.getFileStatus(S3AFileSystem.java:1812) ~[blob_p-ef9f7a65353e12b1d19241c408ca3df4fbd64570-24f467b241db1d55f5345b551c2cf4ed:?]
       at org.apache.hadoop.fs.s3a.S3AFileSystem.open(S3AFileSystem.java:611) ~[blob_p-ef9f7a65353e12b1d19241c408ca3df4fbd64570-24f467b241db1d55f5345b551c2cf4ed:?]
       at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:787) ~[flink-shaded-hadoop-2-uber-2.8.3-10.0.jar:2.8.3-10.0]
       at org.apache.iceberg.hadoop.HadoopInputFile.newStream(HadoopInputFile.java:183) ~[blob_p-ef9f7a65353e12b1d19241c408ca3df4fbd64570-24f467b241db1d55f5345b551c2cf4ed:?]
       ... 27 more
   Caused by: com.amazonaws.thirdparty.apache.http.conn.ConnectionPoolTimeoutException: Timeout waiting for connection from pool
       at com.amazonaws.thirdparty.apache.http.impl.conn.PoolingHttpClientConnectionManager.leaseConnection(PoolingHttpClientConnectionManager.java:286) ~[blob_p-ef9f7a65353e12b1d19241c408ca3df4fbd64570-24f467b241db1d55f5345b551c2cf4ed:?]
       at com.amazonaws.thirdparty.apache.http.impl.conn.PoolingHttpClientConnectionManager$1.get(PoolingHttpClientConnectionManager.java:263) ~[blob_p-ef9f7a65353e12b1d19241c408ca3df4fbd64570-24f467b241db1d55f5345b551c2cf4ed:?]
       at jdk.internal.reflect.GeneratedMethodAccessor21.invoke(Unknown Source) ~[?:?]
       at jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source) ~[?:?]
       at java.lang.reflect.Method.invoke(Unknown Source) ~[?:?]
       at com.amazonaws.http.conn.ClientConnectionRequestFactory$Handler.invoke(ClientConnectionRequestFactory.java:70) ~[blob_p-ef9f7a65353e12b1d19241c408ca3df4fbd64570-24f467b241db1d55f5345b551c2cf4ed:?]
       at com.amazonaws.http.conn.$Proxy61.get(Unknown Source) ~[?:?]
       at com.amazonaws.thirdparty.apache.http.impl.execchain.MainClientExec.execute(MainClientExec.java:190) ~[blob_p-ef9f7a65353e12b1d19241c408ca3df4fbd64570-24f467b241db1d55f5345b551c2cf4ed:?]
       at com.amazonaws.thirdparty.apache.http.impl.execchain.ProtocolExec.execute(ProtocolExec.java:184) ~[blob_p-ef9f7a65353e12b1d19241c408ca3df4fbd64570-24f467b241db1d55f5345b551c2cf4ed:?]
       at com.amazonaws.thirdparty.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:184) ~[blob_p-ef9f7a65353e12b1d19241c408ca3df4fbd64570-24f467b241db1d55f5345b551c2cf4ed:?]
       at com.amazonaws.thirdparty.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:82) ~[blob_p-ef9f7a65353e12b1d19241c408ca3df4fbd64570-24f467b241db1d55f5345b551c2cf4ed:?]
       at com.amazonaws.thirdparty.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:55) ~[blob_p-ef9f7a65353e12b1d19241c408ca3df4fbd64570-24f467b241db1d55f5345b551c2cf4ed:?]
       at com.amazonaws.http.apache.client.impl.SdkHttpClient.execute(SdkHttpClient.java:72) ~[blob_p-ef9f7a65353e12b1d19241c408ca3df4fbd64570-24f467b241db1d55f5345b551c2cf4ed:?]
       at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeOneRequest(AmazonHttpClient.java:1236) ~[blob_p-ef9f7a65353e12b1d19241c408ca3df4fbd64570-24f467b241db1d55f5345b551c2cf4ed:?]
       at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeHelper(AmazonHttpClient.java:1056) ~[blob_p-ef9f7a65353e12b1d19241c408ca3df4fbd64570-24f467b241db1d55f5345b551c2cf4ed:?]
       at com.amazonaws.http.AmazonHttpClient$RequestExecutor.doExecute(AmazonHttpClient.java:743) ~[blob_p-ef9f7a65353e12b1d19241c408ca3df4fbd64570-24f467b241db1d55f5345b551c2cf4ed:?]
       at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeWithTimer(AmazonHttpClient.java:717) ~[blob_p-ef9f7a65353e12b1d19241c408ca3df4fbd64570-24f467b241db1d55f5345b551c2cf4ed:?]
       at com.amazonaws.http.AmazonHttpClient$RequestExecutor.execute(AmazonHttpClient.java:699) ~[blob_p-ef9f7a65353e12b1d19241c408ca3df4fbd64570-24f467b241db1d55f5345b551c2cf4ed:?]
       at com.amazonaws.http.AmazonHttpClient$RequestExecutor.access$500(AmazonHttpClient.java:667) ~[blob_p-ef9f7a65353e12b1d19241c408ca3df4fbd64570-24f467b241db1d55f5345b551c2cf4ed:?]
       at com.amazonaws.http.AmazonHttpClient$RequestExecutionBuilderImpl.execute(AmazonHttpClient.java:649) ~[blob_p-ef9f7a65353e12b1d19241c408ca3df4fbd64570-24f467b241db1d55f5345b551c2cf4ed:?]
       at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:513) ~[blob_p-ef9f7a65353e12b1d19241c408ca3df4fbd64570-24f467b241db1d55f5345b551c2cf4ed:?]
       at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:4325) ~[blob_p-ef9f7a65353e12b1d19241c408ca3df4fbd64570-24f467b241db1d55f5345b551c2cf4ed:?]
       at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:4272) ~[blob_p-ef9f7a65353e12b1d19241c408ca3df4fbd64570-24f467b241db1d55f5345b551c2cf4ed:?]
       at com.amazonaws.services.s3.AmazonS3Client.getObjectMetadata(AmazonS3Client.java:1264) ~[blob_p-ef9f7a65353e12b1d19241c408ca3df4fbd64570-24f467b241db1d55f5345b551c2cf4ed:?]
       at org.apache.hadoop.fs.s3a.S3AFileSystem.getObjectMetadata(S3AFileSystem.java:1086) ~[blob_p-ef9f7a65353e12b1d19241c408ca3df4fbd64570-24f467b241db1d55f5345b551c2cf4ed:?]
       at org.apache.hadoop.fs.s3a.S3AFileSystem.s3GetFileStatus(S3AFileSystem.java:1912) ~[blob_p-ef9f7a65353e12b1d19241c408ca3df4fbd64570-24f467b241db1d55f5345b551c2cf4ed:?]
       at org.apache.hadoop.fs.s3a.S3AFileSystem.innerGetFileStatus(S3AFileSystem.java:1876) ~[blob_p-ef9f7a65353e12b1d19241c408ca3df4fbd64570-24f467b241db1d55f5345b551c2cf4ed:?]
       at org.apache.hadoop.fs.s3a.S3AFileSystem.getFileStatus(S3AFileSystem.java:1812) ~[blob_p-ef9f7a65353e12b1d19241c408ca3df4fbd64570-24f467b241db1d55f5345b551c2cf4ed:?]
       at org.apache.hadoop.fs.s3a.S3AFileSystem.open(S3AFileSystem.java:611) ~[blob_p-ef9f7a65353e12b1d19241c408ca3df4fbd64570-24f467b241db1d55f5345b551c2cf4ed:?]
       at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:787) ~[flink-shaded-hadoop-2-uber-2.8.3-10.0.jar:2.8.3-10.0]
       at org.apache.iceberg.hadoop.HadoopInputFile.newStream(HadoopInputFile.java:183) ~[blob_p-ef9f7a65353e12b1d19241c408ca3df4fbd64570-24f467b241db1d55f5345b551c2cf4ed:?]
       ... 27 more
   ```
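   
   For reproduction context, the failing source is wired roughly as below. This is a minimal sketch, not the exact production job: the metastore URI, warehouse path, and monitor interval are placeholders, and the database/table names are only inferred from the `s3a://recordings/raw_2019/fpl` location in the trace (Hive catalog, per `HiveTableOperations` above).
   
   ```java
   import java.time.Duration;
   import java.util.Map;
   
   import org.apache.flink.api.common.eventtime.WatermarkStrategy;
   import org.apache.flink.api.common.typeinfo.TypeInformation;
   import org.apache.flink.streaming.api.datastream.DataStream;
   import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
   import org.apache.flink.table.data.RowData;
   import org.apache.hadoop.conf.Configuration;
   import org.apache.iceberg.catalog.TableIdentifier;
   import org.apache.iceberg.flink.CatalogLoader;
   import org.apache.iceberg.flink.TableLoader;
   import org.apache.iceberg.flink.source.IcebergSource;
   
   public class StreamingReadJob {
     public static void main(String[] args) throws Exception {
       StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
   
       // Hive catalog backed by MinIO via s3a://; URI and warehouse are placeholders.
       Configuration hadoopConf = new Configuration();
       CatalogLoader catalogLoader = CatalogLoader.hive(
           "hive",
           hadoopConf,
           Map.of(
               "uri", "thrift://hive-metastore:9083",
               "warehouse", "s3a://recordings"));
       TableLoader tableLoader = TableLoader.fromCatalog(
           catalogLoader, TableIdentifier.of("raw_2019", "fpl"));
   
       // Streaming read: continuously monitor the table for new snapshots.
       IcebergSource<RowData> source = IcebergSource.forRowData()
           .tableLoader(tableLoader)
           .streaming(true)
           .monitorInterval(Duration.ofSeconds(30))
           .build();
   
       DataStream<RowData> stream = env.fromSource(
           source,
           WatermarkStrategy.noWatermarks(),
           "iceberg-source",
           TypeInformation.of(RowData.class));
   
       stream.print();
       env.execute("iceberg-streaming-read");
     }
   }
   ```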
   
   Setting `s3.connection.maximum` to 100 in the Flink configuration does not help.
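   
   One observation that may help triage: the failure surfaces in `org.apache.hadoop.fs.s3a.S3AFileSystem`, i.e. the table is read through `HadoopFileIO`/S3A, while `s3.connection.maximum` is a property of Iceberg's `S3FileIO`. If it is the S3A client that exhausts its pool, the relevant knob would presumably be the Hadoop key `fs.s3a.connection.maximum`, e.g. applied to the `Configuration` in the sketch above:
   
   ```java
   // Hypothetical tweak: raise the S3A connection pool on the Hadoop Configuration
   // before any s3a:// FileSystem is created. fs.s3a.connection.maximum is the
   // standard S3A key; 100 mirrors the value already tried for s3.connection.maximum.
   Configuration hadoopConf = new Configuration();
   hadoopConf.setInt("fs.s3a.connection.maximum", 100);
   ```
   
   The same key can also be set in a `core-site.xml` on the Flink classpath; whether raising it actually resolves the timeout here is unverified.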

