In general you should test and solve each failure case that you care about. You simulate, detect, and then recover.
You've got two possible failures here. One is server problems (e.g. crashed, blocked, thrashing). You can simulate that easily by adding long sleeps to your publisher. The way to detect it is to add heartbeats, which all clients subscribe to. When heartbeats stop, you know the server is having trouble. A monitor process can use this to switch to another server.

The second problem is network congestion. This is the reason you would get irrecoverable message loss. You cannot detect lost messages in normal pub/sub cases. However, you can do things like add timestamps to messages and raise a red flag if the latency spikes. You can't solve network congestion by switching to a different server. It needs external intervention.

-Pieter

On Fri, Jan 29, 2016 at 4:00 AM, Simon Wollwage <[email protected]> wrote:
> Hi,
>
> as the title already says: is there a way to detect unrecoverable loss when
> using epgm transports in zmq? We need that detection to switch over to a
> standby server.
>
> Any hints/tips appreciated
>
> _______________________________________________
> zeromq-dev mailing list
> [email protected]
> http://lists.zeromq.org/mailman/listinfo/zeromq-dev
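
To make the two detection ideas from the reply concrete (heartbeats that stop, and timestamps that reveal latency spikes), here is a rough sketch of the bookkeeping only. It uses just the Python standard library and leaves out the ZeroMQ sockets entirely. The class names `HeartbeatMonitor` and `LatencyWatch`, the method names, and the thresholds are all hypothetical, chosen for illustration. The latency check also assumes the publisher and subscriber clocks are reasonably synchronized, which is not guaranteed.

```python
import time


class HeartbeatMonitor:
    """Hypothetical helper: remembers when the last heartbeat arrived."""

    def __init__(self, timeout_s, now=None):
        # timeout_s is an assumed tuning value, e.g. a few heartbeat intervals.
        self.timeout_s = timeout_s
        self.last_seen = time.monotonic() if now is None else now

    def beat(self, now=None):
        # Call this each time a heartbeat message is received.
        self.last_seen = time.monotonic() if now is None else now

    def server_in_trouble(self, now=None):
        # True once heartbeats have been silent for longer than the timeout.
        now = time.monotonic() if now is None else now
        return (now - self.last_seen) > self.timeout_s


class LatencyWatch:
    """Hypothetical helper: flags messages that took too long to arrive."""

    def __init__(self, max_latency_s):
        self.max_latency_s = max_latency_s

    def is_spike(self, sent_at, received_at):
        # sent_at is the timestamp the publisher put in the message;
        # received_at is the subscriber's clock at receipt (clocks assumed in sync).
        return (received_at - sent_at) > self.max_latency_s
```

In a real subscriber you would presumably call `beat()` whenever a heartbeat arrives on the SUB socket and check `server_in_trouble()` whenever a receive or poll times out. A monitor process could use that result to trigger the switch to the standby server, while `is_spike()` would only raise a flag for someone to look into, since switching servers does not cure congestion.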
