[I] Embedded Jobs Addon Job Execution retry attempts are stuck [incubator-kie-issues]

via GitHub Wed, 04 Feb 2026 07:23:32 -0800


jstastny-cz opened a new issue, #2237:
URL: https://github.com/apache/incubator-kie-issues/issues/2237


   There is an apparent concurrency issue: scheduled timers are deleted 
before they can be executed.
   
   In the following BPMN process, a boundary timer event with a PT30S timeout 
on the User Task triggers a Script Task which throws an exception. In that 
case the Job created for the boundary timer should be retried according to 
the relevant jobs-service configuration:
   ```
   kogito.jobs-service.maxNumberOfRetries=5
   kogito.jobs-service.retryMillis=1000
   ```
   and, to reproduce the issue in a timely manner, perhaps also:
   ```
   kogito.jobs-service.schedulerChunkInMinutes=5
   ```
   
   <img width="1537" height="915" alt="Image" src="https://github.com/user-attachments/assets/dd0e4ba3-4b12-44f9-a5ba-5885d0dbe084" />
   
   
   In the setup described above, the observed behavior is that retry 
execution is blocked after the first retry attempt.
   1. initial attempt:
      ```
      1. 13:26:03 DEBUG  [or.ki.ko.ap.jo.im.VertxJobScheduler] 
(vert.x-eventloop-thread-1) Executing timeout with timer Id 1 and jobId 
ee4ad0e6-0147-4fe1-8eab-a4f1acef1395
      2. 13:26:03 TRACE  [or.ki.ko.ap.jo.im.VertxJobScheduler] (Jobs-1) Timeout 
task 1 with jobId ee4ad0e6-0147-4fe1-8eab-a4f1acef1395 newTimeoutTask
      (exception thrown from task and following)
      3. 13:26:03 TRACE  [or.ki.ko.ap.jo.im.VertxJobScheduler] (Jobs-1) 
doRetryIfAny JobDetails ... retries=1
      4. 13:26:03 TRACE  [or.ki.ko.ap.jo.im.VertxJobScheduler] (Jobs-1) Timeout 
1 with jobId ee4ad0e6-0147-4fe1-8eab-a4f1acef1395 will be updated and scheduled
      5. 13:26:03 TRACE  [or.ki.ko.ap.jo.im.VertxJobScheduler] (Jobs-1) 
removeTimerInfo TimerInfo[jobId=ee4ad0e6-0147-4fe1-8eab-a4f1acef1395, 
timerId=1, timeout=Wed Feb 04 13:26:03 CET 2026]
      6. 13:26:03 TRACE  [or.ki.ko.ap.jo.im.VertxJobScheduler] (Jobs-1) 
addTimerInfo JobDetails ... retries=1
      7. only now or.ki.ko.ap.jo.in.ErrorHandlingJobTimeoutInterceptor kicks in 
to report the failure.
      ```
   2. first retry attempt:
      ```
      1. 13:26:03 DEBUG  [or.ki.ko.ap.jo.im.VertxJobScheduler] 
(vert.x-eventloop-thread-1) Executing timeout with timer Id 2 and jobId 
ee4ad0e6-0147-4fe1-8eab-a4f1acef1395
      2. 13:26:03 TRACE  [or.ki.ko.ap.jo.im.VertxJobScheduler] (Jobs-2) Timeout 
task 2 with jobId ee4ad0e6-0147-4fe1-8eab-a4f1acef1395 newTimeoutTask
      (exception thrown from task and following)
      3. 13:26:03 TRACE  [or.ki.ko.ap.jo.im.VertxJobScheduler] (Jobs-2) 
doRetryIfAny JobDetails ... retries=1 (original)
          13:26:03 TRACE  [or.ki.ko.ap.jo.im.VertxJobScheduler] (Jobs-2) 
doRetryIfAny JobDetails ... retries=2 (rescheduled)
      4. 13:26:03 TRACE  [or.ki.ko.ap.jo.im.VertxJobScheduler] (Jobs-2) Timeout 
2 with jobId ee4ad0e6-0147-4fe1-8eab-a4f1acef1395 will be updated and scheduled
      5. 13:26:03 TRACE  [or.ki.ko.ap.jo.im.VertxJobScheduler] (Jobs-2) 
removeTimerInfo TimerInfo[jobId=ee4ad0e6-0147-4fe1-8eab-a4f1acef1395, 
timerId=2, timeout=Wed Feb 04 13:26:03 CET 2026]
      6. 13:26:03 TRACE  [or.ki.ko.ap.jo.im.VertxJobScheduler] (Jobs-2) 
addTimerInfo JobDetails ... retries=2
      7. 13:26:03 TRACE  [or.ki.ko.ap.jo.im.VertxJobScheduler] (Jobs-2) 
removeTimerInfo TimerInfo[jobId=ee4ad0e6-0147-4fe1-8eab-a4f1acef1395, 
timerId=3, timeout=Wed Feb 04 13:26:03 CET 2026]
      ```
       * See the last TRACE log: it immediately removes the timer that was 
just created (timerId=3).
   
   Investigation led to the conclusion that the unintended removeTimerInfo 
call belongs to the "previous" state of jobDetails, i.e. to the last execution 
attempt, and should NOT cancel the newly scheduled timer.
   
   The most straightforward fix is to extend the TimerInfo record with the 
retry attempt ordinal, so that timers belonging to different retry attempts 
can be distinguished.
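
A minimal sketch of that idea, not the actual VertxJobScheduler code. The class `TimerRegistrySketch`, the `retryAttempt` field, the `find` helper and the method signatures are all hypothetical; only the `TimerInfo` name and its `jobId`, `timerId` and `timeout` components come from the log output above, and their types here are guesses. The point is that a removal issued on behalf of an older attempt becomes a no-op once a newer attempt has registered its timer.

```java
import java.util.Date;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch; names and signatures are assumptions, not Kogito APIs.
class TimerRegistrySketch {

    // TimerInfo extended with the retry attempt ordinal (the proposed new field).
    record TimerInfo(String jobId, long timerId, Date timeout, int retryAttempt) {}

    // One registered timer per job, keyed by jobId.
    private final Map<String, TimerInfo> timers = new ConcurrentHashMap<>();

    void addTimerInfo(TimerInfo info) {
        timers.put(info.jobId(), info);
    }

    // Removes the timer only if it still belongs to the given retry attempt.
    // Returns false (and leaves the map untouched) for a stale attempt.
    boolean removeTimerInfo(String jobId, int retryAttempt) {
        TimerInfo current = timers.get(jobId);
        if (current == null || current.retryAttempt() != retryAttempt) {
            return false;
        }
        // Conditional remove: does nothing if another thread replaced the entry.
        return timers.remove(jobId, current);
    }

    TimerInfo find(String jobId) {
        return timers.get(jobId);
    }
}
```

With this shape, the sequence from the first retry attempt above (addTimerInfo for retries=2, followed by a late removeTimerInfo coming from the retries=1 state) would leave the timer with timerId=3 registered, because the ordinal of the stale call no longer matches.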


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

