NguyenDinhThien-future-aavn opened a new issue, #4002:
URL: https://github.com/apache/incubator-kie-kogito-runtimes/issues/4002

   ### Describe the bug
   
   I am having an issue with SonataFlow not being able to handle callback events 
for a workflow instance when multiple pods/instances of SonataFlow are 
running.
   
   For context, I am using the latest snapshot version of SonataFlow, built 
manually from the GitHub repositories.
   I am deploying my SonataFlow app on a Kubernetes cluster and scaling it 
horizontally to multiple pods.
   
   <img width="565" height="287" alt="Image" src="https://github.com/user-attachments/assets/8a03d853-6df0-46cb-8333-ad49fdc135b9" />
   
   My workflow consists of multiple "callback" states. My SonataFlow app 
orchestrates other services by sending messages to them via a messaging 
system.
   Each callback state sends a message to another service and then waits for 
a callback event.
   Once the other service finishes its work, it sends a request for the 
callback state's corresponding event so the workflow can proceed.
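   As an illustration of what the other service sends back, here is a minimal 
sketch that builds the callback as a structured-mode CloudEvent in JSON. The 
event type, the source, and the `workflowinstanceid` extension attribute are 
hypothetical names chosen for this example, not the ones used in my workflow 
or required by SonataFlow. The point is only that the event has to carry 
something the workflow can correlate on.

```python
import json
import uuid
from datetime import datetime, timezone


def build_callback_event(event_type: str, source: str, instance_id: str, data: dict) -> str:
    """Return a structured-mode CloudEvent (JSON string) for a callback.

    `workflowinstanceid` is a hypothetical extension attribute, used here
    only to show that the event needs a correlation value.
    """
    event = {
        # Required CloudEvents 1.0 context attributes
        "specversion": "1.0",
        "id": str(uuid.uuid4()),
        "source": source,
        "type": event_type,
        # Optional attributes
        "time": datetime.now(timezone.utc).isoformat(),
        "datacontenttype": "application/json",
        # Hypothetical correlation attribute (assumption)
        "workflowinstanceid": instance_id,
        "data": data,
    }
    return json.dumps(event)


# Example: the other service reports that its work is done.
payload = build_callback_event("serviceDone", "/other-service", "wf-1", {"result": "ok"})
```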
   
   Sometimes, after the workflow has published the message, it has still not 
finished processing the original request that started the workflow,
   and during that in-between time the other service sends the callback event 
to the workflow.
   
   With just 1 pod/instance of SonataFlow running, what I see is that when 
SonataFlow receives the callback event call, it waits for the previous 
request to finish and then processes the callback event request.
   This lets the workflow proceed through the next states until it 
completes.
   
   But when my SonataFlow deployment scales horizontally to multiple 
pods/instances, and the callback event call reaches a pod other than the 
one that initialized the first state and published the message, the 
pod that receives the callback event request cannot continue with the next 
states of the workflow. The callback event request still responds with a 202 
HTTP code, however.
   This leads the messaging system to consider the message for the callback 
event request successfully delivered and acknowledged.
   
   <img width="1119" height="686" alt="Image" src="https://github.com/user-attachments/assets/8a546cef-f061-4fb3-942d-772dd1448cbc" />
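   The timing described above can be pictured with a small toy model. This is 
not SonataFlow code and makes no claim about its internals. It assumes that the 
"waiting for callback" state only becomes visible to other pods once the 
original request finishes, and all class and method names are made up for the 
illustration.

```python
class SharedStore:
    """Stands in for whatever state all pods can see (assumption)."""

    def __init__(self):
        self.waiting = {}  # instance id -> event type the instance waits for


class Pod:
    def __init__(self, store):
        self.store = store
        self.in_flight = {}  # waiting state known only to this pod so far

    def start_callback_state(self, instance_id, event_type):
        # The message is published here, but the original request is still
        # running, so the waiting state exists only in this pod.
        self.in_flight[instance_id] = event_type

    def finish_request(self, instance_id):
        # The original request completes; the waiting state becomes shared.
        self.store.waiting[instance_id] = self.in_flight.pop(instance_id)

    def receive_callback(self, instance_id, event_type):
        """Return (http_status, workflow_advanced)."""
        if self.in_flight.get(instance_id) == event_type:
            # Same pod: the callback waits for the original request to finish.
            self.finish_request(instance_id)
        if self.store.waiting.get(instance_id) == event_type:
            del self.store.waiting[instance_id]
            return 202, True
        # Reported behaviour: the event is accepted but nothing advances.
        return 202, False


store = SharedStore()
pod_a, pod_b = Pod(store), Pod(store)
pod_a.start_callback_state("wf-1", "serviceDone")

# The callback lands on the other pod before pod A has finished its request.
print(pod_b.receive_callback("wf-1", "serviceDone"))  # (202, False)
# The same callback on the original pod goes through.
print(pod_a.receive_callback("wf-1", "serviceDone"))  # (202, True)
```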
   
   
   ### Expected behavior
   
   From my point of view, the callback event request should fail with a 500 
HTTP response code, because the event could not be processed.
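   A minimal sketch of the behavior I would expect, under the assumption that 
the messaging system retries a delivery when it gets a 5xx response. The 
function names and the in-memory `waiting` dictionary are hypothetical and only 
illustrate the status decision, not how SonataFlow would implement it.

```python
def callback_status(waiting: dict, instance_id: str, event_type: str) -> int:
    """Return 202 only if an instance is actually waiting for this event."""
    if waiting.get(instance_id) == event_type:
        del waiting[instance_id]  # the workflow moves on to the next state
        return 202
    # Nothing could be advanced, so do not acknowledge the event.
    return 500


def deliver_with_retry(send, max_attempts: int = 3):
    """Call send() until it returns a non-5xx status.

    Returns the attempt number that succeeded, or None if all attempts failed.
    """
    for attempt in range(1, max_attempts + 1):
        if send() < 500:
            return attempt
    return None


waiting = {}


def send():
    status = callback_status(waiting, "wf-1", "serviceDone")
    # Simulate the original request finishing between delivery attempts.
    waiting.setdefault("wf-1", "serviceDone")
    return status


print(deliver_with_retry(send))  # 2
```

   With a 500 on the first attempt the event is not lost: the second delivery 
arrives after the waiting state exists and the workflow can continue.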
   
   ### Actual behavior
   
   The callback event request to the 2nd pod still responds with a 202 HTTP 
code, meaning the event was consumed, but the workflow does not transition to 
the next state.
   
   ### How to Reproduce?
   
   To reproduce, I have prepared an example zip file containing the workflows 
and a README with the steps.
   
   
[callback-workflow.zip](https://github.com/user-attachments/files/21503434/callback-workflow.zip)
   
   ### Output of `uname -a` or `ver`
   
   _No response_
   
   ### Output of `java -version`
   
   17
   
   ### GraalVM version (if different from Java)
   
   _No response_
   
   ### Kogito version or git rev (or at least Quarkus version if you are using 
Kogito via Quarkus platform BOM)
   
   _No response_
   
   ### Build tool (ie. output of `mvnw --version` or `gradlew --version`)
   
   _No response_
   
   ### Additional information
   
   _No response_


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

