Hi Torsten, Yuri,

I'm not a core developer of ZMQ, but I've been a ZMQ user for many years... here's my take on this:
On Sun, Oct 9, 2022 at 09:49, Torsten Wierschin <[email protected]> wrote:

> Yuri <[email protected]> wrote on Sat, Oct 8, 2022 at 20:46:
>
>> On 10/8/22 07:09, orzodk wrote:
>> > My understanding of ZMQ is that the implementation details of the
>> > "under the hood" socket are hidden from you intentionally. I'm not sure
>> > how one would catch that. Hopefully someone else can answer.
>>
>> But the abstraction level is too deep and it prevents access to
>> important and relevant information.
>
> I agree.
>
> The scenario is:
> - connection established and working
> - server now vanishes unintentionally
> - client side is not able to reestablish the connection if the server reappears

At first sight the abstraction level may indeed seem unable to handle such a simple scenario. But one of the concepts the ZMQ guide conveys is that you need to build a protocol on top of the ZMQ transport that fulfills all your application's needs.

In other words: if you need to ensure that, 100% of the time, there is a point-to-point (server-client) connection "working" (usable to move bytes/information between the two endpoints), then you should design e.g. "keep alive" frames (or "ping/pong") into your protocol, so that both sides are able to detect an unhealthy connection and react.

Applied to your scenario above: if you have ping/pong frames plus logic that checks how much time has elapsed since the last "ping" (or "pong") was received, both application sides will be able to detect that the TCP server has vanished.

You might argue that just handling the "listening TCP socket error" would be easier than building ping/pong frames, timeout logic, etc. However, consider that handling such TCP-server-level errors is not enough to detect "stale connections" or dysfunctional networking. Let me give a very practical example from my own experience: I've written applications that are deployed inside a Kubernetes cluster using the Istio service mesh (https://istio.io/latest/docs/ops/deployment/architecture/):

[image: Istio service mesh architecture diagram]

In such a context, all TCP connections between servers/clients are transparently redirected to the Envoy sidecars, and the real data flow happens only between two Envoy sidecars. Sometimes it happens (for a number of reasons) that the TCP connection between two Envoys breaks. My app would never notice that by just handling the "listening TCP socket error": the TCP listen socket of "Service A" in that Istio architecture picture keeps running just fine; if the problem happens on the green "Mesh traffic" line, the only way to detect it is to have ping/pong frames (or some other protocol-level indication).

This is just one example of a problem that does not directly impact the TCP sockets of the two hosts where your applications are running, but that still results in their inability to communicate.

So in some sense the ZMQ abstraction forces you to write reliable protocols/applications that take a "holistic" approach to networking, instead of restricting your focus to just the most obvious networking issues (like e.g. a server that cannot start because of an "address already in use" errno).

HTH,
Francesco
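P.S.: to make the "ping/pong frames + timeout logic" idea concrete, here is a minimal client-side sketch in C against libzmq, roughly in the spirit of the "Lazy Pirate" pattern from the ZMQ guide. The endpoint, the PING/PONG payloads and the timeout values are assumptions I'm making up for the example, not anything ZMQ mandates:

    #include <stdio.h>
    #include <unistd.h>
    #include <zmq.h>

    #define ENDPOINT        "tcp://localhost:5555"  /* assumed server address */
    #define PONG_TIMEOUT_MS 3000  /* declare the link dead after 3 s of silence */

    /* (Re)create the client socket. LINGER=0 means zmq_close() discards any
     * unsent PING instead of blocking when we tear a stale connection down. */
    static void *connect_client(void *ctx)
    {
        void *sock = zmq_socket(ctx, ZMQ_REQ);
        int linger = 0;
        zmq_setsockopt(sock, ZMQ_LINGER, &linger, sizeof linger);
        zmq_connect(sock, ENDPOINT);
        return sock;
    }

    int main(void)
    {
        void *ctx = zmq_ctx_new();
        void *sock = connect_client(ctx);

        while (1) {
            zmq_send(sock, "PING", 4, 0);

            zmq_pollitem_t items[] = { { sock, 0, ZMQ_POLLIN, 0 } };
            if (zmq_poll(items, 1, PONG_TIMEOUT_MS) > 0
                && (items[0].revents & ZMQ_POLLIN)) {
                char buf[16];
                int n = zmq_recv(sock, buf, sizeof buf - 1, 0);
                if (n >= 0) {
                    if (n > (int) (sizeof buf - 1))  /* zmq_recv may truncate */
                        n = (int) (sizeof buf - 1);
                    buf[n] = '\0';
                    printf("peer alive: %s\n", buf);
                }
                sleep(1);  /* pause before the next PING */
            } else {
                /* No PONG in time: the server vanished, or something in the
                 * middle (e.g. an Envoy-to-Envoy link) broke. Recreate the
                 * socket so libzmq starts reconnecting from scratch. */
                fprintf(stderr, "peer unreachable, reconnecting...\n");
                zmq_close(sock);
                sock = connect_client(ctx);
            }
        }
    }

And a matching toy responder; a real server would additionally track how long ago the last PING arrived, so that it too can declare the peer dead:

    #include <zmq.h>

    int main(void)
    {
        void *ctx = zmq_ctx_new();
        void *sock = zmq_socket(ctx, ZMQ_REP);
        zmq_bind(sock, "tcp://*:5555");
        while (1) {
            char buf[16];
            zmq_recv(sock, buf, sizeof buf, 0);  /* whatever frame arrives... */
            zmq_send(sock, "PONG", 4, 0);        /* ...answer with liveness */
        }
    }

The exact frames don't matter: the point is that the liveness check lives in your application protocol, so it also catches failures (like the broken Envoy-to-Envoy link above) that never touch the TCP sockets of your own two hosts.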
