Re: [zeromq-dev] EPGM unrecoverable loss detection

Pieter Hintjens Sun, 31 Jan 2016 06:22:11 -0800

In general you should test and solve each failure case that you care
about. You simulate it, detect it, and then recover.

You've got two possible failures here. One is server problems (e.g.
crashed, blocked, thrashing). You can simulate that easily by adding
long sleeps to your publisher. The way to detect it is to add heartbeats,
which all clients subscribe to. When heartbeats stop, you know the
server is having trouble. A monitor process can use this to switch to
another server.
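
As a rough sketch of the detection side (not taken from any real
deployment), the monitor only needs to remember when it last saw a
heartbeat and compare that against a timeout. The class name, the
interval and liveness parameters, and the injectable clock are all
assumptions made for illustration. In practice beat() would be called
whenever a heartbeat message arrives on the subscriber socket.

```python
import time


class HeartbeatMonitor:
    """Tracks publisher heartbeats (hypothetical helper, illustration only)."""

    def __init__(self, interval=1.0, liveness=3, clock=time.monotonic):
        # Declare the server in trouble after `liveness` missed intervals.
        self.timeout = interval * liveness
        self.clock = clock
        self.last_seen = clock()

    def beat(self):
        # Call this each time a heartbeat message is received.
        self.last_seen = self.clock()

    def server_alive(self):
        # True while the last heartbeat is more recent than the timeout.
        return self.clock() - self.last_seen < self.timeout
```

A monitor process could poll server_alive() in its receive loop and
start the switch to the standby server once it returns False.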

The second problem is network congestion. This is the reason you would
get irrecoverable message loss. You cannot detect lost messages in normal
pub/sub cases. However, you can do things like add timestamps to
messages and raise a red flag if the latency spikes. You can't solve
network congestion by switching to a different server. It needs
external intervention.
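
To make the timestamp idea concrete, here is a minimal sketch. The
8-byte header layout, the function names, and the 0.5 second threshold
are assumptions for illustration, not part of ZeroMQ. It also assumes
the publisher and subscriber clocks are synchronized (e.g. via NTP),
otherwise the measured latency includes the clock offset.

```python
import struct
import time

# Assumed wire format: 8-byte big-endian float (send time in seconds),
# followed by the application payload.
HEADER = struct.Struct("!d")


def stamp(payload, clock=time.time):
    # Publisher side: prefix the payload with the current time.
    return HEADER.pack(clock()) + payload


def check(message, threshold=0.5, clock=time.time):
    # Subscriber side: strip the timestamp, compute the latency, and
    # report whether it exceeds the threshold.
    (sent_at,) = HEADER.unpack_from(message)
    latency = clock() - sent_at
    return message[HEADER.size:], latency, latency > threshold
```

The publisher would send stamp(data) and the subscriber would call
check() on each received message, raising the red flag when the last
element of the result is True.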

-Pieter

On Fri, Jan 29, 2016 at 4:00 AM, Simon Wollwage <[email protected]> wrote:
> Hi,
>
> as the title already says: is there a way to detect unrecoverable loss when
> using epgm transports in zmq? We need that detection to switch over to a
> standby server.
>
> Any hints/tips appreciated
>
> _______________________________________________
> zeromq-dev mailing list
> [email protected]
> http://lists.zeromq.org/mailman/listinfo/zeromq-dev
>