ossDataEngineer opened a new issue, #17375:
URL: https://github.com/apache/dolphinscheduler/issues/17375

   ### Search before asking
   
   - [x] I had searched in the 
[issues](https://github.com/apache/dolphinscheduler/issues?q=is%3Aissue) and 
found no similar issues.
   
   
   ### What happened
   
   **When a worker-server service on a cluster node goes down, we receive multiple duplicate alert emails.** 
   
   Problem 1. The email alert is sent by every node that runs the alert server (multiple-sender problem).
   Problem 2. Each individual alert server also tries to send the same alert multiple times (duplicate problem), and sometimes fails with a database unique-constraint violation.
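
   If it helps triage: the very unique constraint that is failing could serve as the dedup mechanism, if each alert server first tried to insert the `(alert_id, alert_plugin_instance_id)` status row and only sent the email when that insert succeeded. The sketch below models the constraint with an in-memory map to illustrate the claim-first ordering; the class and method names are hypothetical, not DolphinScheduler's actual API.

   ```java
   import java.util.Map;
   import java.util.concurrent.ConcurrentHashMap;

   // Hypothetical sketch (not DolphinScheduler's real API): treat the
   // (alert_id, alert_plugin_instance_id) uniqueness as an atomic claim,
   // so that of all alert servers racing on the same alert, exactly one sends.
   public class AlertClaimSketch {

       // Stands in for the INSERT INTO t_ds_alert_send_status protected by
       // "alert_send_status_unique"; putIfAbsent is atomic, like the DB
       // insert succeeding for exactly one caller.
       private final Map<String, Boolean> sendStatus = new ConcurrentHashMap<>();

       /** Returns true only for the first server to claim this alert. */
       public boolean tryClaim(int alertId, int pluginInstanceId) {
           String key = alertId + ":" + pluginInstanceId;
           return sendStatus.putIfAbsent(key, Boolean.TRUE) == null;
       }

       /** Sends only if this server won the claim; losers skip quietly. */
       public boolean sendIfClaimed(int alertId, int pluginInstanceId) {
           if (!tryClaim(alertId, pluginInstanceId)) {
               return false; // another alert server already handled this alert
           }
           // ... invoke the email channel here ...
           return true;
       }
   }
   ```

   With PostgreSQL the claim could be the insert itself, performed *before* sending rather than after (e.g. `INSERT ... ON CONFLICT DO NOTHING` and checking the affected-row count), so a duplicate key means "someone else is sending" instead of an error in the sender thread.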
   
   **Node 1 Alert Server logs**
   
   ```
   [INFO] 2025-07-25 15:19:32.225 +0000 org.apache.dolphinscheduler.plugin.alert.email.EmailAlertChannel:[57] - alert send success
   [ERROR] 2025-07-25 15:19:32.281 +0000 org.apache.dolphinscheduler.alert.AlertSenderService:[84] - alert sender thread error
   org.springframework.dao.DuplicateKeyException:
   ### Error updating database.  Cause: org.postgresql.util.PSQLException: ERROR: duplicate key value violates unique constraint "alert_send_status_unique"
     Detail: Key (alert_id, alert_plugin_instance_id)=(39, 1) already exists.
   ### The error may exist in org/apache/dolphinscheduler/dao/mapper/AlertSendStatusMapper.java (best guess)
   ### The error may involve org.apache.dolphinscheduler.dao.mapper.AlertSendStatusMapper.insert-Inline
   ### The error occurred while setting parameters
   ### SQL: INSERT INTO t_ds_alert_send_status  ( alert_id, alert_plugin_instance_id, send_status, log, create_time )  VALUES  ( ?, ?, ?, ?, ? )
   ### Cause: org.postgresql.util.PSQLException: ERROR: duplicate key value violates unique constraint "alert_send_status_unique"
     Detail: Key (alert_id, alert_plugin_instance_id)=(39, 1) already exists.
   ; ERROR: duplicate key value violates unique constraint "alert_send_status_unique"
     Detail: Key (alert_id, alert_plugin_instance_id)=(39, 1) already exists.; nested exception is org.postgresql.util.PSQLException: ERROR: duplicate key value violates unique constraint "alert_send_status_unique"
     Detail: Key (alert_id, alert_plugin_instance_id)=(39, 1) already exists.
         at org.springframework.jdbc.support.SQLErrorCodeSQLExceptionTranslator.doTranslate(SQLErrorCodeSQLExceptionTranslator.java:247)
         at org.springframework.jdbc.support.AbstractFallbackSQLExceptionTranslator.translate(AbstractFallbackSQLExceptionTranslator.java:70)
         at org.mybatis.spring.MyBatisExceptionTranslator.translateExceptionIfPossible(MyBatisExceptionTranslator.java:91)
         at org.mybatis.spring.SqlSessionTemplate$SqlSessionInterceptor.invoke(SqlSessionTemplate.java:441)
         at com.sun.proxy.$Proxy124.insert(Unknown Source)
         at org.mybatis.spring.SqlSessionTemplate.insert(SqlSessionTemplate.java:272)
         at com.baomidou.mybatisplus.core.override.MybatisMapperMethod.execute(MybatisMapperMethod.java:59)
         at com.baomidou.mybatisplus.core.override.MybatisMapperProxy$PlainMethodInvoker.invoke(MybatisMapperProxy.java:148)
         at com.baomidou.mybatisplus.core.override.MybatisMapperProxy.invoke(MybatisMapperProxy.java:89)
         at com.sun.proxy.$Proxy130.insert(Unknown Source)
         at org.apache.dolphinscheduler.dao.AlertDao.addAlertSendStatus(AlertDao.java:136)
         at org.apache.dolphinscheduler.alert.AlertSenderService.send(AlertSenderService.java:119)
         at org.apache.dolphinscheduler.alert.AlertSenderService.run(AlertSenderService.java:81)
   ```
   
   
   **Node 2 Alert Server Logs**
   
   
   ```
   [INFO] 2025-07-25 15:19:31.860 +0000 org.apache.dolphinscheduler.plugin.alert.email.EmailAlertChannel:[57] - alert send success
   [INFO] 2025-07-25 15:19:32.392 +0000 org.apache.dolphinscheduler.plugin.alert.email.EmailAlertChannel:[57] - alert send success
   ```
   
   
   ### What you expected to happen
   
   Only one alert email for each event.
   E.g. when one worker server crashes, I should get only one alert email.
   
   
   
   ### How to reproduce
   
   1. Run ZooKeeper on a 4-node cluster
   2. Run master + api + alert servers on nodes 1 & 2
   3. Run alert + worker servers on the other two nodes, i.e. 3 & 4
   4. Stop the worker service on one node, i.e. node 4
   
   ### Anything else
   
   _No response_
   
   ### Version
   
   3.1.x
   
   ### Are you willing to submit PR?
   
   - [ ] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [x] I agree to follow this project's [Code of 
Conduct](https://www.apache.org/foundation/policies/conduct)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
