Luca: I hear what you’re saying but … I think I’m talking about a different situation.

If I understand your explanation correctly, you’re saying that setting ZMQ_RECONNECT_IVL to -1 should prevent a disconnected endpoint from *ever* reconnecting, under any set of circumstances. I would read the doc (4.2.2) more like the following (with addition in *bold*):

> The ZMQ_RECONNECT_IVL option shall set the initial reconnection interval for
> the specified socket. The reconnection interval is the period ØMQ shall wait
> between attempts to *automatically* reconnect disconnected peers when using
> connection-oriented transports. The value -1 means no reconnection.

What I’m questioning is the interaction between ZMQ_RECONNECT_IVL == -1 and the behavior enforced by https://github.com/zeromq/libzmq/issues/788. (Also see here: https://www.mail-archive.com/[email protected]/msg21484.html). That change is intended to prevent *duplicate* connections from the same endpoint, for certain socket types (e.g., pub/sub), where multiple connections (and their associated duplicate messages) don’t make sense.

One scenario I’m concerned about is the one where:

1. Endpoint connects to us
2. Endpoint is disconnected for some reason
3. Setting ZMQ_RECONNECT_IVL=-1 disables *automatic* reconnect, so as far as we’re concerned the endpoint is dead
4. Subsequently the endpoint connects to us again (e.g., following a restart)
5. Because we still have a record of the endpoint, we will refuse the connection, even though the endpoint is dead from our point of view.

In this scenario that endpoint can NEVER reconnect.

So I get that setting ZMQ_RECONNECT_IVL to -1 should prevent us from reconnecting (automatically) to the disconnected endpoint, but I don’t see the benefit of preventing that endpoint from actively reconnecting at a later time. In this case, we’ve essentially blacklisted that endpoint (forever), and I’m having trouble coming up with a scenario where that would be intended behavior.

Does this make sense? Am I missing something here?
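
To make the five steps concrete, here is a minimal sketch in plain Python. It is a toy model only, not libzmq code: the class `ToySocket` and its methods are hypothetical names invented for illustration. It assumes, as discussed elsewhere in this thread but not confirmed, that the endpoint record is kept after a terminal disconnect and that a later connect to an already-recorded endpoint is silently ignored (the issue 788 behavior). It models the bookkeeping from the connecting side and ignores socket types.

```python
class ToySocket:
    """Toy model of endpoint bookkeeping (hypothetical, NOT libzmq internals)."""

    def __init__(self, reconnect_ivl=100):
        # -1 stands in for ZMQ_RECONNECT_IVL == -1 (no automatic reconnect).
        self.reconnect_ivl = reconnect_ivl
        # Maps endpoint string -> "up" or "dead".
        self.endpoints = {}

    def connect(self, endpoint):
        """Return True if a new connection is made, False if ignored."""
        # Assumed issue-788 behavior: a connect to an endpoint we already
        # have a record for is silently ignored.
        if endpoint in self.endpoints:
            return False
        self.endpoints[endpoint] = "up"
        return True

    def peer_lost(self, endpoint):
        """Model the underlying connection going away."""
        if endpoint in self.endpoints and self.reconnect_ivl == -1:
            # Assumption: the record is kept, but nothing will revive it.
            self.endpoints[endpoint] = "dead"
        # With reconnect enabled, this model treats the automatic
        # reconnect as succeeding, so the state stays "up".

    def disconnect(self, endpoint):
        """Explicitly forget the endpoint, like zmq_disconnect(endpoint)."""
        self.endpoints.pop(endpoint, None)

    def is_live(self, endpoint):
        return self.endpoints.get(endpoint) == "up"


ep = "tcp://127.0.0.1:12345"
sock = ToySocket(reconnect_ivl=-1)

first = sock.connect(ep)        # steps 1: connection established
sock.peer_lost(ep)              # steps 2-3: peer gone, no auto reconnect
second = sock.connect(ep)       # steps 4-5: ignored, record still present
stuck = not sock.is_live(ep)    # endpoint is recorded but dead

sock.disconnect(ep)             # explicitly drop the stale record
third = sock.connect(ep)        # now the connect goes through
```

In this model the endpoint stays unreachable until something explicitly removes the stale record, which is the situation I am asking about.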

Also, to your point about adding a protocol layer on top of 0MQ - I would MUCH prefer to let 0MQ handle as much of the underlying connect/disconnect logic as possible. I’m concerned about the potential for the protocol’s view of the connection state getting out of sync with 0MQ’s view (there is also a bunch of additional work on the protocol layer, but my main concern is synchronization).

Thanks for listening ...

Bill

> On Sep 17, 2017, at 6:39 AM, Luca Boccassi <[email protected]> wrote:
>
> On Sat, 2017-09-16 at 14:34 -0400, Bill Torpey wrote:
>> Hi Luca:
>>
>> Just a gentle reminder to add an issue so this can be tracked (or let
>> me know if you’d prefer that I do that).
>>
>> Thanks!
>>
>> Bill
>
> Thinking about this a bit more, I think it's expected behaviour after
> all. From the doc:
>
> "The 'ZMQ_RECONNECT_IVL' option shall set the initial reconnection
> interval for the specified 'socket'. The reconnection interval is the
> period 0MQ shall wait between attempts to reconnect disconnected peers
> when using connection-oriented transports. The value -1 means no
> reconnection."
>
> So it is working as intended - if a peer goes away, it will never be
> reconnected if that option is set.
>
> And it makes sense - in the context of a TCP connection, a dead peer is
> a dead peer. If for an application a dead peer might be resurrected
> after X amount of time, there's no way to know that. It needs to be
> handled by the application.
>
> There are various tools you can use:
>
> 1) ZMTP heartbeats - see ZMQ_HEARTBEAT* socket options
> 2) socket monitoring events (including connects and disconnects) - see
> zmq_socket_monitor documentation
> 3) Enhance your protocol - call zmq_disconnect(endpoint) on your
> sockets when a particular message is received, or heartbeats are
> missed, or a disconnect event happens.
> This way when you later call zmq_connect(endpoint) and it happens to
> match a previous, dead peer, it will work as expected
>
>>> On Sep 2, 2017, at 1:21 PM, Luca Boccassi <[email protected]>
>>> wrote:
>>>
>>> On Sat, 2017-09-02 at 12:02 -0400, Bill Torpey wrote:
>>>> Thanks again, Luca!
>>>>
>>>> For now, I’m going to go with disabling reconnect on the “data”
>>>> sockets - that seems to be the best solution for my use case
>>>> (connecting to endpoints that were returned by the peer binding to
>>>> an unspecified (“wildcard”) port - e.g., "tcp://<interface>:*" in
>>>> ZMQ).
>>>>
>>>> This assumes that ZMQ will completely forget about the endpoint
>>>> if/when it is disconnected, if it is set not to reconnect. Otherwise
>>>> I might run afoul of ZMQ’s silently ignoring connections to
>>>> endpoints that it already knows about:
>>>> https://github.com/zeromq/libzmq/issues/788
>>>> (e.g., in the case where another process later happens to be
>>>> assigned the same ephemeral port).
>>>>
>>>> I’ve done a quick scan of the libzmq code (v4.2.2) and it doesn’t
>>>> appear that the endpoint is removed in the case of a (terminal)
>>>> disconnect. If you can confirm/deny this behavior, that would be
>>>> helpful. Failing that, I guess I’ll need to test this in the
>>>> debugger - any hints on how best to do this would also be much
>>>> appreciated.
>>>>
>>>> Regards,
>>>>
>>>> Bill
>>>
>>> Yes it doesn't look like it removes the endpoint - I guess it's a
>>> corner case that's missed. I'll open an issue.
>>>
>>> BTW all these things are very quick and easy to try with Python on
>>> Linux.
>>> Just install pyzmq, open a python3 terminal and:
>>>
>>> import zmq
>>> ctx = zmq.Context.instance()
>>> # Default reconnect settings: REQ/REP round trip
>>> rep = ctx.socket(zmq.REP)
>>> rep.bind("tcp://127.0.0.1:12345")
>>> req = ctx.socket(zmq.REQ)
>>> req.connect("tcp://127.0.0.1:12345")
>>> req.send_string("hello")
>>> rep.recv()
>>> rep.send_string("hallo")
>>> req.recv()
>>> # Re-create the REP side on the same port; REQ reconnects by itself
>>> rep.unbind("tcp://127.0.0.1:12345")
>>> rep.close()
>>> rep = ctx.socket(zmq.REP)
>>> rep.bind("tcp://127.0.0.1:12345")
>>> req.send_string("hello")
>>> rep.recv()
>>> rep.send_string("hallo")
>>> req.recv()
>>> rep.unbind("tcp://127.0.0.1:12345")
>>> rep.close()
>>> req.close()
>>> # Same sequence again, but with reconnect disabled on the REQ socket
>>> rep = ctx.socket(zmq.REP)
>>> rep.bind("tcp://127.0.0.1:12345")
>>> req = ctx.socket(zmq.REQ)
>>> req.setsockopt(zmq.RECONNECT_IVL, -1)
>>> req.connect("tcp://127.0.0.1:12345")
>>> req.send_string("hello")
>>> rep.recv()
>>> rep.send_string("hallo")
>>> req.recv()
>>> rep.unbind("tcp://127.0.0.1:12345")
>>> rep.close()
>>> rep = ctx.socket(zmq.REP)
>>> rep.bind("tcp://127.0.0.1:12345")
>>> req.send_string("hello")
>>> rep.recv()
>>>
>>> This last one won't receive the message
>>>
>>>>> On Sep 1, 2017, at 6:19 PM, Luca Boccassi <luca.boccassi@gmail.com>
>>>>> wrote:
>>>>>
>>>>> On Fri, 2017-09-01 at 18:03 -0400, Bill Torpey wrote:
>>>>>> Thanks Luca! That was very helpful.
>>>>>>
>>>>>> Although it leads to a couple of other questions:
>>>>>>
>>>>>> - Can I assume that a ZMQ disconnect of a tcp endpoint would only
>>>>>> occur if the underlying TCP socket is closed by the OS? Or are
>>>>>> there conditions in which ZMQ will proactively disconnect the TCP
>>>>>> socket and try to reconnect?
>>>>>
>>>>> Normally that's the case - you can set up heartbeating with the
>>>>> appropriate options and that will kill a connection if there's no
>>>>> answer
>>>>>
>>>>>> - I see that there is a sockopt (ZMQ_RECONNECT_IVL) that can be
>>>>>> set to -1 to disable reconnection entirely.
>>>>>> In my case, the “data” socket pair will *always* connect to an
>>>>>> ephemeral port, so I *never* want to reconnect. Would this be a
>>>>>> reasonable option in my case, do you think?
>>>>>
>>>>> If that makes sense for your application, go for it - in these
>>>>> cases the only way to be sure is to test it and see how it works
>>>>>
>>>>>> - Would there be any interest in a patch that would disable
>>>>>> reconnects (controlled by sockopt) for ephemeral ports only? I’m
>>>>>> guessing that reconnecting mostly makes sense with well-known
>>>>>> ports, so something like this may be of general interest?
>>>>>
>>>>> If by ephemeral port you mean anything over 1024, then actually in
>>>>> most applications I've seen it's always useful to reconnect, and
>>>>> the existing option should be enough for those cases where it's not
>>>>> desired - we don't want to duplicate functionality
>>>>>
>>>>>> Thanks again!
>>>>>>
>>>>>> Bill
>>>>>>
>>>>>>> On Sep 1, 2017, at 5:30 PM, Luca Boccassi
>>>>>>> <luca.boccassi@gmail.com> wrote:
>>>>>>>
>>>>>>> On Fri, 2017-09-01 at 16:59 -0400, Bill Torpey wrote:
>>>>>>>> I'm curious about how ZMQ handles re-connection. I understand
>>>>>>>> that re-connection is supposed to happen "automagically" under
>>>>>>>> the covers, but that poses an interesting question.
>>>>>>>>
>>>>>>>> To make a long story short, the application I'm working on uses
>>>>>>>> pub/sub sockets over TCP, and works as follows:
>>>>>>>>
>>>>>>>> At startup:
>>>>>>>> 1. connects to a proxy/broker at a well-known address, using a
>>>>>>>> pub/sub socket pair ("discovery");
>>>>>>>> 2. subscribes to a well-known topic using the "discovery" sub
>>>>>>>> socket;
>>>>>>>> 3. binds a different pub/sub socket pair ("data") and retrieves
>>>>>>>> the actual endpoints assigned;
>>>>>>>> 4. publishes the "data" endpoints from step 3 on the "discovery"
>>>>>>>> pub socket;
>>>>>>>>
>>>>>>>> When the application receives a message on the "discovery" sub
>>>>>>>> socket, it connects the "data" socket pair to the endpoints
>>>>>>>> specified in the "discovery" message.
>>>>>>>>
>>>>>>>> So far, this seems to be working relatively well, and allows the
>>>>>>>> high-volume, low-latency "data" messages to be sent/received
>>>>>>>> directly between peers, avoiding the extra hop caused by a
>>>>>>>> proxy/broker connection. The discovery messages use the
>>>>>>>> proxy/broker, but since these are (very) low-volume the extra
>>>>>>>> hop doesn't matter. The use of the proxy also eliminates the
>>>>>>>> "slow joiner" problem that can happen with other configurations.
>>>>>>>>
>>>>>>>> My question is what happens when one of the "data" peer sockets
>>>>>>>> disconnects. Since ZMQ (apparently) keeps trying to reconnect,
>>>>>>>> what would prevent another process from binding to the same
>>>>>>>> ephemeral port?
>>>>>>>>
>>>>>>>> - Can I assume that if the new application at that port is not a
>>>>>>>> ZMQ application, that the reconnect will (silently) fail, and
>>>>>>>> continue to be retried?
>>>>>>>
>>>>>>> The ZMTP handshake would fail, so yes.
>>>>>>>
>>>>>>>> - What if the new application at that port *IS* a ZMQ
>>>>>>>> application? Would the reconnect succeed? And if so, what would
>>>>>>>> happen if it's a *DIFFERENT* ZMQ application, and the messages
>>>>>>>> that it's sending/receiving don't match what the original
>>>>>>>> application expects?
>>>>>>>
>>>>>>> Depends on how you handle it in your application. If you have
>>>>>>> security concerns, then use CURVE with authentication so that
>>>>>>> only authorised peers can connect.
>>>>>>>
>>>>>>>> It's reasonable for the application to publish a disconnect
>>>>>>>> message when it terminates normally, and the connected peers can
>>>>>>>> disconnect that endpoint. But, applications don't always
>>>>>>>> terminate normally ;-)
>>>>>>>
>>>>>>> That's a common pattern. But the application needs to handle
>>>>>>> unexpected data somewhat gracefully. What that means is entirely
>>>>>>> up to the application - as far as the library is concerned, if
>>>>>>> the handshake succeeds then it's all good (hence the use case for
>>>>>>> CURVE).
>>>>>>>
>>>>>>>> Any guidance, hints or tips would be much appreciated -- thanks
>>>>>>>> in advance!
>>>>>>>
>>>>>>> --
>>>>>>> Kind regards,
>>>>>>> Luca Boccassi
>>>>>>> _______________________________________________
>>>>>>> zeromq-dev mailing list
>>>>>>> [email protected]
>>>>>>> https://lists.zeromq.org/mailman/listinfo/zeromq-dev
>>>>>
>>>>> --
>>>>> Kind regards,
>>>>> Luca Boccassi
>>>
>>> --
>>> Kind regards,
>>> Luca Boccassi
