Hi Luca: Sorry for not getting back sooner, but thanks again for listening, and the PR looks good to me!
Best Regards,
Bill Torpey

> On Sep 19, 2017, at 9:13 AM, Luca Boccassi <[email protected]> wrote:
>
> On Sun, 2017-09-17 at 12:29 -0400, Bill Torpey wrote:
>> Luca:
>>
>> I hear what you’re saying but … I think I’m talking about a different
>> situation.
>>
>> If I understand your explanation correctly, you’re saying that
>> setting ZMQ_RECONNECT_IVL to -1 should prevent a disconnected
>> endpoint from *ever* reconnecting, under any set of circumstances.
>>
>> I would read the doc (4.2.2) more like the following (with addition
>> in *bold*):
>>
>>> The ZMQ_RECONNECT_IVL option shall set the initial reconnection
>>> interval for the specified socket. The reconnection interval is the
>>> period ØMQ shall wait between attempts to *automatically* reconnect
>>> disconnected peers when using connection-oriented transports. The
>>> value -1 means no reconnection.
>>
>> What I’m questioning is the interaction between ZMQ_RECONNECT_IVL ==
>> -1 and the behavior enforced by
>> https://github.com/zeromq/libzmq/issues/788. (Also see here:
>> https://www.mail-archive.com/zeromq-[email protected]/msg21484.html).
>> That commit is intended to prevent *duplicate* connections from the
>> same endpoint, for certain socket types (e.g., pub/sub), where
>> multiple connections (and their associated duplicate messages) don’t
>> make sense.
>>
>> One scenario I’m concerned about is the one where:
>>
>> 1. Endpoint connects to us
>> 2. Endpoint is disconnected for some reason
>> 3. Setting ZMQ_RECONNECT_IVL=-1 disables *automatic* reconnect, so
>>    as far as we’re concerned the endpoint is dead
>> 4. Subsequently the endpoint connects to us again (e.g., following a
>>    restart)
>> 5. Because we still have a record of the endpoint, we will refuse the
>>    connection -- even though the endpoint is dead from our point of
>>    view. In this scenario that endpoint can NEVER reconnect.
>>
>> So I get that setting ZMQ_RECONNECT_IVL should prevent us from
>> reconnecting (automatically) to the disconnected endpoint, but I
>> don’t see the benefit of preventing that endpoint from actively
>> reconnecting at a later time. In this case, we’ve essentially
>> blacklisted that endpoint (forever), and I’m having trouble coming up
>> with a scenario where that would be intended behavior.
>>
>> Does this make sense? Am I missing something here?
>>
>> Also, to your point about adding a protocol layer on top of 0MQ -- I
>> would MUCH prefer to let 0MQ handle as much of the underlying
>> connect/disconnect logic as possible. I’m concerned about the
>> potential for the protocol’s view of the connection state getting out
>> of sync with 0MQ’s view (not to mention a bunch of additional work on
>> the protocol layer, but more about synchronization).
>>
>> Thanks for listening ...
>>
>> Bill
>
> I see. I guess there's a terminology confusion issue here - when I
> wrote about connections and disconnections, I meant the automated ones
> that happen in the background in the I/O thread. But I guess it makes
> sense that a manual call to zmq_connect should work as expected.
>
> A workaround for this behaviour would be for the application to
> manually call zmq_disconnect before doing a connect to the same
> endpoint.
>
> But it turns out fixing it to automatically do it is not too hard
> (unless I've made some silly mistake):
>
> https://github.com/zeromq/libzmq/pull/2756
>
>>> On Sep 17, 2017, at 6:39 AM, Luca Boccassi <[email protected]> wrote:
>>>
>>> On Sat, 2017-09-16 at 14:34 -0400, Bill Torpey wrote:
>>>> Hi Luca:
>>>>
>>>> Just a gentle reminder to add an issue so this can be tracked (or
>>>> let me know if you’d prefer that I do that).
>>>>
>>>> Thanks!
>>>>
>>>> Bill
>>>
>>> Thinking about this a bit more, I think it's expected behaviour
>>> after all.
>>> From the doc:
>>>
>>> "The 'ZMQ_RECONNECT_IVL' option shall set the initial reconnection
>>> interval for the specified 'socket'. The reconnection interval is
>>> the period 0MQ shall wait between attempts to reconnect disconnected
>>> peers when using connection-oriented transports. The value -1 means
>>> no reconnection."
>>>
>>> So it is working as intended - if a peer goes away, it will never be
>>> reconnected if that option is set.
>>>
>>> And it makes sense - in the context of a TCP connection, a dead peer
>>> is a dead peer. If for an application a dead peer might be
>>> resurrected after X amount of time, there's no way to know that. It
>>> needs to be handled by the application.
>>>
>>> There are various tools you can use:
>>>
>>> 1) ZMTP heartbeats - see ZMQ_HEARTBEAT* socket options
>>> 2) socket monitoring events (including connects and disconnects) -
>>>    see zmq_socket_monitor documentation
>>> 3) Enhance your protocol - call zmq_disconnect(endpoint) on your
>>>    sockets when a particular message is received, or heartbeats are
>>>    missed, or a disconnect event happens. This way when you later
>>>    call zmq_connect(endpoint) and it happens to match a previous,
>>>    dead peer, it will work as expected
>>>
>>>>> On Sep 2, 2017, at 1:21 PM, Luca Boccassi <luca.boccassi@gmail.com> wrote:
>>>>>
>>>>> On Sat, 2017-09-02 at 12:02 -0400, Bill Torpey wrote:
>>>>>> Thanks again, Luca!
>>>>>>
>>>>>> For now, I’m going to go with disabling reconnect on the “data”
>>>>>> sockets -- that seems to be the best solution for my use case
>>>>>> (connecting to endpoints that were returned by the peer binding
>>>>>> to an unspecified (“wildcard”) port -- e.g.,
>>>>>> "tcp://<interface>:*" in ZMQ).
>>>>>>
>>>>>> This assumes that ZMQ will completely forget about the endpoint
>>>>>> if/when it is disconnected, if it is set not to reconnect.
>>>>>> Otherwise I might run afoul of ZMQ’s silently ignoring
>>>>>> connections to endpoints that it already knows about:
>>>>>> https://github.com/zeromq/libzmq/issues/788 (e.g., in the case
>>>>>> where another process later happens to be assigned the same
>>>>>> ephemeral port).
>>>>>>
>>>>>> I’ve done a quick scan of the libzmq code (v4.2.2) and it doesn’t
>>>>>> appear that the endpoint is removed in the case of a (terminal)
>>>>>> disconnect. If you can confirm/deny this behavior, that would be
>>>>>> helpful. Failing that, I guess I’ll need to test this in the
>>>>>> debugger -- any hints on how best to do this would also be much
>>>>>> appreciated.
>>>>>>
>>>>>> Regards,
>>>>>>
>>>>>> Bill
>>>>>
>>>>> Yes it doesn't look like it removes the endpoint - I guess it's a
>>>>> corner case that's missed. I'll open an issue.
>>>>>
>>>>> BTW all these things are very quick and easy to try with Python on
>>>>> Linux.
>>>>> Just install pyzmq, open a python3 terminal and:
>>>>>
>>>>> import zmq
>>>>> ctx = zmq.Context.instance()
>>>>> rep = ctx.socket(zmq.REP)
>>>>> rep.bind("tcp://127.0.0.1:12345")
>>>>> req = ctx.socket(zmq.REQ)
>>>>> req.connect("tcp://127.0.0.1:12345")
>>>>> req.send_string("hello")
>>>>> rep.recv()
>>>>> rep.send_string("hallo")
>>>>> req.recv()
>>>>> # second round trip after REP is rebound (default reconnect settings)
>>>>> rep.unbind("tcp://127.0.0.1:12345")
>>>>> rep.close()
>>>>> rep = ctx.socket(zmq.REP)
>>>>> rep.bind("tcp://127.0.0.1:12345")
>>>>> req.send_string("hello")
>>>>> rep.recv()
>>>>> rep.send_string("hallo")
>>>>> req.recv()
>>>>> rep.unbind("tcp://127.0.0.1:12345")
>>>>> rep.close()
>>>>> req.close()
>>>>> # same again, but with reconnection disabled on the REQ socket
>>>>> rep = ctx.socket(zmq.REP)
>>>>> rep.bind("tcp://127.0.0.1:12345")
>>>>> req = ctx.socket(zmq.REQ)
>>>>> req.setsockopt(zmq.RECONNECT_IVL, -1)
>>>>> req.connect("tcp://127.0.0.1:12345")
>>>>> req.send_string("hello")
>>>>> rep.recv()
>>>>> rep.send_string("hallo")
>>>>> req.recv()
>>>>> rep.unbind("tcp://127.0.0.1:12345")
>>>>> rep.close()
>>>>> rep = ctx.socket(zmq.REP)
>>>>> rep.bind("tcp://127.0.0.1:12345")
>>>>> req.send_string("hello")
>>>>> rep.recv()
>>>>>
>>>>> This last one won't receive the message
>>>>>
>>>>>>> On Sep 1, 2017, at 6:19 PM, Luca Boccassi <luca.boccassi@gmail.com> wrote:
>>>>>>>
>>>>>>> On Fri, 2017-09-01 at 18:03 -0400, Bill Torpey wrote:
>>>>>>>> Thanks Luca! That was very helpful.
>>>>>>>>
>>>>>>>> Although it leads to a couple of other questions:
>>>>>>>>
>>>>>>>> - Can I assume that a ZMQ disconnect of a tcp endpoint would
>>>>>>>> only occur if the underlying TCP socket is closed by the OS?
>>>>>>>> Or are there conditions in which ZMQ will proactively
>>>>>>>> disconnect the TCP socket and try to reconnect?
>>>>>>>
>>>>>>> Normally that's the case - you can set up heartbeating with the
>>>>>>> appropriate options and that will kill a connection if there's
>>>>>>> no answer
>>>>>>>
>>>>>>>> - I see that there is a sockopt (ZMQ_RECONNECT_IVL) that can be
>>>>>>>> set to -1 to disable reconnection entirely. In my case, the
>>>>>>>> “data” socket pair will *always* connect to an ephemeral port,
>>>>>>>> so I *never* want to reconnect. Would this be a reasonable
>>>>>>>> option in my case, do you think?
>>>>>>>
>>>>>>> If that makes sense for your application, go for it - in these
>>>>>>> cases the only way to be sure is to test it and see how it works
>>>>>>>
>>>>>>>> - Would there be any interest in a patch that would disable
>>>>>>>> reconnects (controlled by sockopt) for ephemeral ports only?
>>>>>>>> I’m guessing that reconnecting mostly makes sense with
>>>>>>>> well-known ports, so something like this may be of general
>>>>>>>> interest?
>>>>>>>
>>>>>>> If by ephemeral port you mean anything over 1024, then actually
>>>>>>> in most applications I've seen it's always useful to reconnect,
>>>>>>> and the existing option should be enough for those cases where
>>>>>>> it's not desired - we don't want to duplicate functionality
>>>>>>>
>>>>>>>> Thanks again!
>>>>>>>>
>>>>>>>> Bill
>>>>>>>>
>>>>>>>>> On Sep 1, 2017, at 5:30 PM, Luca Boccassi <luca.boccassi@gmail.com> wrote:
>>>>>>>>>
>>>>>>>>> On Fri, 2017-09-01 at 16:59 -0400, Bill Torpey wrote:
>>>>>>>>>> I'm curious about how ZMQ handles re-connection. I
>>>>>>>>>> understand that re-connection is supposed to happen
>>>>>>>>>> "automagically" under the covers, but that poses an
>>>>>>>>>> interesting question.
>>>>>>>>>>
>>>>>>>>>> To make a long story short, the application I'm working on
>>>>>>>>>> uses pub/sub sockets over TCP, and works as follows:
>>>>>>>>>>
>>>>>>>>>> At startup:
>>>>>>>>>> 1. connects to a proxy/broker at a well-known address, using
>>>>>>>>>>    a pub/sub socket pair ("discovery");
>>>>>>>>>> 2. subscribes to a well-known topic using the "discovery" sub
>>>>>>>>>>    socket;
>>>>>>>>>> 3. binds a different pub/sub socket pair ("data") and
>>>>>>>>>>    retrieves the actual endpoints assigned;
>>>>>>>>>> 4. publishes the "data" endpoints from step 3 on the
>>>>>>>>>>    "discovery" pub socket;
>>>>>>>>>>
>>>>>>>>>> When the application receives a message on the "discovery"
>>>>>>>>>> sub socket, it connects the "data" socket pair to the
>>>>>>>>>> endpoints specified in the "discovery" message.
>>>>>>>>>>
>>>>>>>>>> So far, this seems to be working relatively well, and allows
>>>>>>>>>> the high-volume, low-latency "data" messages to be
>>>>>>>>>> sent/received directly between peers, avoiding the extra hop
>>>>>>>>>> caused by a proxy/broker connection. The discovery messages
>>>>>>>>>> use the proxy/broker, but since these are (very) low-volume
>>>>>>>>>> the extra hop doesn't matter. The use of the proxy also
>>>>>>>>>> eliminates the "slow joiner" problem that can happen with
>>>>>>>>>> other configurations.
>>>>>>>>>>
>>>>>>>>>> My question is what happens when one of the "data" peer
>>>>>>>>>> sockets disconnects. Since ZMQ (apparently) keeps trying to
>>>>>>>>>> reconnect, what would prevent another process from binding to
>>>>>>>>>> the same ephemeral port?
>>>>>>>>>>
>>>>>>>>>> - Can I assume that if the new application at that port is
>>>>>>>>>> not a ZMQ application, that the reconnect will (silently)
>>>>>>>>>> fail, and continue to be retried?
>>>>>>>>>
>>>>>>>>> The ZMTP handshake would fail, so yes.
>>>>>>>>>
>>>>>>>>>> - What if the new application at that port *IS* a ZMQ
>>>>>>>>>> application? Would the reconnect succeed? And if so, what
>>>>>>>>>> would happen if it's a *DIFFERENT* ZMQ application, and the
>>>>>>>>>> messages that it's sending/receiving don't match what the
>>>>>>>>>> original application expects?
>>>>>>>>>
>>>>>>>>> Depends on how you handle it in your application. If you have
>>>>>>>>> security concerns, then use CURVE with authentication so that
>>>>>>>>> only authorised peers can connect.
>>>>>>>>>
>>>>>>>>>> It's reasonable for the application to publish a disconnect
>>>>>>>>>> message when it terminates normally, and the connected peers
>>>>>>>>>> can disconnect that endpoint. But, applications don't always
>>>>>>>>>> terminate normally ;-)
>>>>>>>>>
>>>>>>>>> That's a common pattern. But the application needs to handle
>>>>>>>>> unexpected data somewhat gracefully. What that means is
>>>>>>>>> entirely up to the application - as far as the library is
>>>>>>>>> concerned, if the handshake succeeds then it's all good (hence
>>>>>>>>> the use case for CURVE).
>>>>>>>>>
>>>>>>>>>> Any guidance, hints or tips would be much appreciated --
>>>>>>>>>> thanks in advance!
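
For reference, the workaround mentioned in the thread above (call zmq_disconnect before connecting again to the same endpoint) might look roughly like this with pyzmq. This is only a sketch: the helper name fresh_connect and the demo endpoint are made up for illustration, and it assumes that disconnecting an endpoint the socket does not know about fails with ENOENT, as the zmq_disconnect documentation describes. Per Luca's note, the linked PR is meant to make this manual step unnecessary.

```python
import errno


def fresh_connect(sock, endpoint):
    """Forget any stale record of `endpoint` on `sock`, then connect to it.

    `sock` is expected to be a pyzmq socket, but anything with
    disconnect()/connect() methods works (hypothetical helper, not part
    of pyzmq or libzmq).
    """
    try:
        sock.disconnect(endpoint)
    except Exception as exc:
        # Assumption: pyzmq reports "endpoint was not connected" as a
        # zmq.ZMQError whose errno is ENOENT; anything else is re-raised.
        if getattr(exc, "errno", None) != errno.ENOENT:
            raise
    sock.connect(endpoint)


def demo(endpoint="tcp://127.0.0.1:12345"):
    """Illustrative use with reconnection disabled (requires pyzmq)."""
    import zmq

    ctx = zmq.Context.instance()
    req = ctx.socket(zmq.REQ)
    req.setsockopt(zmq.RECONNECT_IVL, -1)  # no automatic reconnects
    fresh_connect(req, endpoint)  # first connect: nothing to forget yet
    # ... later, e.g. after the peer restarted and re-announced itself:
    fresh_connect(req, endpoint)  # drop the old record, connect again
    req.close(linger=0)
```

The helper only touches the socket it is given, so the application still decides when a peer counts as dead (missed heartbeats, a monitor event, or a new discovery message).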
_______________________________________________ zeromq-dev mailing list [email protected] https://lists.zeromq.org/mailman/listinfo/zeromq-dev
