On 8/8/12 9:59 AM, Steven McCoy wrote:
On 7 August 2012 13:33, Stuart Levy <[email protected]> wrote:
Thanks - could you explain a bit more? In the
PUB+SUB-on-same-port-on-same-host situation Pierre described -
which applies to me too - *all* NAKs will be lost, not just some
small fraction of them, if the SUB socket gets opened before PUB,
IIUC. Does that imply that recovery can never happen, regardless
of rate limiting, etc.? I don't want the loss of a single
multicast packet to mean that the whole system will need to be
restarted...
With multiple senders on the same host on the same PGM session only
the first will receive NAKs due to unicast routing. It's an OS and
protocol design feature.
I believe with 0MQ only one PGM publisher should ever be active on the
network: a bus topology is not supported by the 0MQ socket. This is
probably the more pertinent issue that voids a lot of the previous
discussion.
Aha - thanks *very* much for that clarification. A bus is exactly what
I was hoping to achieve, so if it's not intended to work with 0MQ,
that's good to know.
If so, then I don't see how I can use zeromq multicast to create a
barrier primitive, and I'd be better off just doing all the
low-level networking myself. That's a pity, if true.
I need a bit more clarification on what you are looking for. Atomic
multicast? I think there was a brief mention about this previously.
What I want: a distributed barrier primitive. I want to synchronize
graphical displays driven by a bunch of machines, so ~30Hz update rate
across maybe N=20 hosts. All hosts are on a common LAN, so IP multicast
works fine.
I want the barrier to introduce minimal extra delay. So I would rather
avoid the straightforward scheme:
a) all N nodes send Ready to central coordinator host, which waits
to hear from all of them
b) coordinator then sends Go to all N nodes
What I could do with raw multicast (and had hoped to do with 0MQ
PUB/SUB) is:
a) all N nodes bind to a single common multicast address/port (yep,
a bus!)
b) at barrier time, each node sends a message "node <i> is ready at
time <T>" to the mcast group
c) all nodes listen for Ready messages on the bus
d) when each node has heard from all others, with <T> matching its
own clock, the barrier is done
There's additional processing for newly-attached nodes (unexpected <i>),
for aberrant clocks (ignore old <T>; resynchronize our own clock if we
hear a future <T>), and for timeouts if some expected nodes either drop
out or aren't heard from.
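
A minimal sketch of the bookkeeping behind steps (c) and (d), plus the
special cases just mentioned. It is pure logic with no networking. The
class name BarrierState, its methods, and the policy choices (start
expecting a new node immediately, jump to a future <T> and restart the
round, drop silent nodes on timeout) are all assumptions for
illustration, not something 0MQ or PGM provides.

```python
from dataclasses import dataclass, field


@dataclass
class BarrierState:
    """Tracks Ready messages for one barrier round (hypothetical helper)."""
    me: int                 # this node's id <i>
    expected: set           # ids of all nodes we expect, including ourselves
    now: int = 0            # our current barrier time <T>
    heard: set = field(default_factory=set)

    def start_round(self, t):
        """Begin waiting at barrier time t; we count ourselves as ready."""
        self.now = t
        self.heard = {self.me}

    def on_ready(self, node, t):
        """Handle 'node <node> is ready at time <t>'; True once the barrier is done."""
        # Assumed policy: a newly attached node is expected from now on.
        self.expected.add(node)
        if t < self.now:
            return self.done()          # old <T>: ignore the message
        if t > self.now:
            # Future <T>: resynchronize our clock and restart the round.
            self.now = t
            self.heard = {self.me}
        self.heard.add(node)
        return self.done()

    def done(self):
        """The barrier is done when every expected node has been heard."""
        return self.expected <= self.heard

    def on_timeout(self):
        """Assumed policy: stop expecting nodes that stayed silent."""
        self.expected &= self.heard
        return self.done()
```

Each node would call start_round(T) when it reaches the barrier, feed
every Ready message received from the bus into on_ready(), and call
on_timeout() if the round takes too long.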
There's one process per host node, so the app could have exactly one
Unix/Windows UDP multicast socket => no ambiguity in packet delivery.
I'm not sure about atomic multicast. The messages needed would be short
enough that they would easily be single unfragmented UDP packets; is
that what you mean?
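
For concreteness, a sketch of the single raw UDP multicast socket per
host and a Ready message small enough to fit one unfragmented packet.
The group address, port, and 12-byte wire format are made-up values for
illustration, and <T> is assumed to be an integer tick count.

```python
import socket
import struct

GROUP = "239.1.2.3"     # hypothetical multicast group
PORT = 5007             # hypothetical port
READY_FMT = "!IQ"       # network byte order: node id <i>, then time <T>


def encode_ready(node, t):
    """Pack 'node <node> is ready at time <t>' into 12 bytes."""
    return struct.pack(READY_FMT, node, t)


def decode_ready(data):
    """Unpack a Ready message into (node, t)."""
    return struct.unpack(READY_FMT, data)


def open_bus_socket(group=GROUP, port=PORT):
    """Open one UDP socket bound to the port and joined to the group."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("", port))
    # struct ip_mreq: group address followed by local interface (any).
    mreq = socket.inet_aton(group) + socket.inet_aton("0.0.0.0")
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP, mreq)
    return sock


def send_ready(sock, node, t, group=GROUP, port=PORT):
    """Send our Ready message to every node on the bus."""
    sock.sendto(encode_ready(node, t), (group, port))
```

The same socket is used for sending and receiving, so each host has
exactly one endpoint on the group.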
cheers
Stuart
_______________________________________________
zeromq-dev mailing list
[email protected]
http://lists.zeromq.org/mailman/listinfo/zeromq-dev