thunderlike opened a new issue, #504:
URL: https://github.com/apache/doris-flink-connector/issues/504

   ### Search before asking
   
   - [X] I had searched in the [issues](https://github.com/apache/incubator-doris/issues?q=is%3Aissue) and found no similar issues.
   
   
   ### Version
   
   Flink 1.14.4 with flink-doris-connector-1.14 version 1.1.1, writing to Doris 2.1.6
   
   ### What's Wrong?
   
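   Using Flink 1.14.4 with flink-doris-connector-1.14 version 1.1.1 to write to Doris 2.1.6: when one BE node is unreachable for a long time, the Flink job fails and cannot recover from its checkpoint.
   The core error is as follows: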
   2024-10-12 14:13:48,049 ERROR org.apache.doris.flink.sink.committer.DorisCommitter         [] - commit transaction failed:
   org.apache.http.conn.HttpHostConnectException: Connect to 10.126.72.64:8040 [/10.126.72.64] failed: Connection refused (Connection refused)
           at org.apache.http.impl.conn.DefaultHttpClientConnectionOperator.connect(DefaultHttpClientConnectionOperator.java:156) ~[blob_p-61f0320fe7cc111cc844c6575ef7cc48af8a4b1f-139883e44a5481fda2030ca9306e6ec8:?]
           at org.apache.http.impl.conn.PoolingHttpClientConnectionManager.connect(PoolingHttpClientConnectionManager.java:374) ~[blob_p-61f0320fe7cc111cc844c6575ef7cc48af8a4b1f-139883e44a5481fda2030ca9306e6ec8:?]
           at org.apache.http.impl.execchain.MainClientExec.establishRoute(MainClientExec.java:393) ~[blob_p-61f0320fe7cc111cc844c6575ef7cc48af8a4b1f-139883e44a5481fda2030ca9306e6ec8:?]
           at org.apache.http.impl.execchain.MainClientExec.execute(MainClientExec.java:236) ~[blob_p-61f0320fe7cc111cc844c6575ef7cc48af8a4b1f-139883e44a5481fda2030ca9306e6ec8:?]
           at org.apache.http.impl.execchain.ProtocolExec.execute(ProtocolExec.java:186) ~[blob_p-61f0320fe7cc111cc844c6575ef7cc48af8a4b1f-139883e44a5481fda2030ca9306e6ec8:?]
           at org.apache.http.impl.execchain.RetryExec.execute(RetryExec.java:89) ~[blob_p-61f0320fe7cc111cc844c6575ef7cc48af8a4b1f-139883e44a5481fda2030ca9306e6ec8:?]
           at org.apache.http.impl.execchain.RedirectExec.execute(RedirectExec.java:110) ~[blob_p-61f0320fe7cc111cc844c6575ef7cc48af8a4b1f-139883e44a5481fda2030ca9306e6ec8:?]
           at org.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:185) ~[blob_p-61f0320fe7cc111cc844c6575ef7cc48af8a4b1f-139883e44a5481fda2030ca9306e6ec8:?]
           at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:83) ~[blob_p-61f0320fe7cc111cc844c6575ef7cc48af8a4b1f-139883e44a5481fda2030ca9306e6ec8:?]
           at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:108) ~[blob_p-61f0320fe7cc111cc844c6575ef7cc48af8a4b1f-139883e44a5481fda2030ca9306e6ec8:?]
           at org.apache.doris.flink.sink.committer.DorisCommitter.commitTransaction(DorisCommitter.java:91) ~[blob_p-61f0320fe7cc111cc844c6575ef7cc48af8a4b1f-139883e44a5481fda2030ca9306e6ec8:?]
           at org.apache.doris.flink.sink.committer.DorisCommitter.commit(DorisCommitter.java:71) ~[blob_p-61f0320fe7cc111cc844c6575ef7cc48af8a4b1f-139883e44a5481fda2030ca9306e6ec8:?]
           at org.apache.flink.streaming.runtime.operators.sink.StreamingCommitterHandler.commit(StreamingCommitterHandler.java:54) ~[flink-dist_2.11-1.14.4.jar:1.14.4]
           at org.apache.flink.streaming.runtime.operators.sink.AbstractStreamingCommitterHandler.retry(AbstractStreamingCommitterHandler.java:99) ~[flink-dist_2.11-1.14.4.jar:1.14.4]
           at org.apache.flink.streaming.runtime.operators.sink.AbstractCommitterHandler.retry(AbstractCommitterHandler.java:66) ~[flink-dist_2.11-1.14.4.jar:1.14.4]
           at org.apache.flink.streaming.runtime.operators.sink.CommitRetrier.retry(CommitRetrier.java:80) ~[flink-dist_2.11-1.14.4.jar:1.14.4]
           at org.apache.flink.streaming.runtime.operators.sink.CommitRetrier.lambda$retryAt$0(CommitRetrier.java:63) ~[flink-dist_2.11-1.14.4.jar:1.14.4]
           at org.apache.flink.streaming.runtime.tasks.StreamTask.invokeProcessingTimeCallback(StreamTask.java:1693) ~[flink-dist_2.11-1.14.4.jar:1.14.4]
           at org.apache.flink.streaming.runtime.tasks.StreamTask.lambda$null$22(StreamTask.java:1684) ~[flink-dist_2.11-1.14.4.jar:1.14.4]
           at org.apache.flink.streaming.runtime.tasks.StreamTaskActionExecutor$1.runThrowing(StreamTaskActionExecutor.java:50) ~[flink-dist_2.11-1.14.4.jar:1.14.4]
           at org.apache.flink.streaming.runtime.tasks.mailbox.Mail.run(Mail.java:90) ~[flink-dist_2.11-1.14.4.jar:1.14.4]
           at org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.processMailsWhenDefaultActionUnavailable(MailboxProcessor.java:338) ~[flink-dist_2.11-1.14.4.jar:1.14.4]
           at org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.processMail(MailboxProcessor.java:324) ~[flink-dist_2.11-1.14.4.jar:1.14.4]
           at org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.runMailboxLoop(MailboxProcessor.java:201) ~[flink-dist_2.11-1.14.4.jar:1.14.4]
           at org.apache.flink.streaming.runtime.tasks.StreamTask.runMailboxLoop(StreamTask.java:809) ~[flink-dist_2.11-1.14.4.jar:1.14.4]
           at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:761) ~[flink-dist_2.11-1.14.4.jar:1.14.4]
           at org.apache.flink.runtime.taskmanager.Task.runWithSystemExitMonitoring(Task.java:958) [flink-dist_2.11-1.14.4.jar:1.14.4]
           at org.apache.flink.runtime.taskmanager.Task.restoreAndInvoke(Task.java:937) [flink-dist_2.11-1.14.4.jar:1.14.4]
           at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:766) [flink-dist_2.11-1.14.4.jar:1.14.4]
           at org.apache.flink.runtime.taskmanager.Task.run(Task.java:575) [flink-dist_2.11-1.14.4.jar:1.14.4]
           at java.lang.Thread.run(Thread.java:750) [?:1.8.0_322]
   Caused by: java.net.ConnectException: Connection refused (Connection refused)
           at java.net.PlainSocketImpl.socketConnect(Native Method) ~[?:1.8.0_322]
           at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350) ~[?:1.8.0_322]
           at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206) ~[?:1.8.0_322]
           at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188) ~[?:1.8.0_322]
           at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392) ~[?:1.8.0_322]
           at java.net.Socket.connect(Socket.java:607) ~[?:1.8.0_322]
           at org.apache.http.conn.socket.PlainConnectionSocketFactory.connectSocket(PlainConnectionSocketFactory.java:75) ~[blob_p-61f0320fe7cc111cc844c6575ef7cc48af8a4b1f-139883e44a5481fda2030ca9306e6ec8:?]
           at org.apache.http.impl.conn.DefaultHttpClientConnectionOperator.connect(DefaultHttpClientConnectionOperator.java:142) ~[blob_p-61f0320fe7cc111cc844c6575ef7cc48af8a4b1f-139883e44a5481fda2030ca9306e6ec8:?]
   
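   It looks like, when recovering from the checkpoint, the host that is read back is still the IP of the failed node, so the connection attempt fails.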
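
If the suspected cause is right (the committer keeps using a host restored from checkpoint state even though that node is down), one possible mitigation is to fall back to another backend when the stored host refuses the connection. The following is only a minimal sketch of that idea, not the connector's actual code: `CommitFailover`, `CommitAttempt`, and `commitWithFailover` are hypothetical names, and it assumes that the transaction can be committed through any reachable backend and that a list of live backends is available, neither of which this report confirms.

```java
import java.io.IOException;
import java.net.ConnectException;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical helper for illustration; not part of flink-doris-connector. */
final class CommitFailover {

    /** One commit attempt against a single "host:port" backend (hypothetical abstraction). */
    @FunctionalInterface
    interface CommitAttempt {
        void commit(String hostPort) throws IOException;
    }

    /**
     * Tries the host restored from the checkpoint first, then every other live
     * backend, and returns the host on which the commit succeeded.
     * Only connection failures trigger failover; any other IOException propagates.
     */
    static String commitWithFailover(String storedHost,
                                     List<String> liveBackends,
                                     CommitAttempt attempt) throws IOException {
        // LinkedHashSet keeps the stored host first and drops duplicates.
        Set<String> candidates = new LinkedHashSet<>();
        candidates.add(storedHost);
        candidates.addAll(liveBackends);

        IOException lastFailure = null;
        for (String host : candidates) {
            try {
                attempt.commit(host);
                return host;
            } catch (ConnectException e) {
                // HttpClient 4.x HttpHostConnectException is a subclass of
                // java.net.ConnectException, so it is caught here as well.
                lastFailure = e;
            }
        }
        throw new IOException("commit failed on all backends: " + candidates, lastFailure);
    }
}
```

In a real committer, the `CommitAttempt` lambda would wrap the HTTP request that currently fails at `DorisCommitter.commitTransaction`, and the backend list would have to be refreshed at recovery time rather than read from the checkpoint.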
   
   ### What You Expected?
   
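   We cannot upgrade this Flink 1.14 environment. Could this problem be fixed in flink-doris-connector-1.14 version 1.1.1? Our system has strict high-availability requirements, and a BE node with a disk failure cannot be restored quickly, so this problem is certain to recur.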
   
   ### How to Reproduce?
   
   _No response_
   
   ### Anything Else?
   
   _No response_
   
   ### Are you willing to submit PR?
   
   - [ ] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [X] I agree to follow this project's [Code of Conduct](https://www.apache.org/foundation/policies/conduct)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

