Thanks in advance for assistance.

I am using the ZeroMQ C# bindings (4.1.0.21) with libzmq (4.1.5.0) on Windows 
(7, Server 2008 R2, and Server 2012).

For request/response functionality I am using a single Router socket (tcp) in 
each of several interconnected processes.

Each process has an identifier (GUID) that is known by all other processes, 
which is used as the Identity on the socket.

Each process performs a Bind on a specific port (also known by all other 
processes), and also performs a Connect to each of the other Router sockets in 
the other processes.

For example, if 5 processes are used, visualize a fully connected mesh in which 
every process is linked to each of the other 4.
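
To make the topology concrete, here is a minimal sketch that enumerates the 
Connect calls this scheme implies. The MeshTopology class and its method names 
are hypothetical and exist only for illustration; they are not part of my 
actual code. It assumes only what is described above: every process binds once 
and connects to every other process.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper (illustration only): lists the Connect calls implied by
// "every process binds once and connects to every other process".
public static class MeshTopology
{
    // Returns one (From, To) entry per Connect call made in the mesh.
    public static List<(int From, int To)> ConnectCalls(int processCount)
    {
        var calls = new List<(int From, int To)>();
        for (int from = 0; from < processCount; from++)
        {
            for (int to = 0; to < processCount; to++)
            {
                // A process never connects to its own bound endpoint.
                if (from != to)
                {
                    calls.Add((from, to));
                }
            }
        }
        return calls;
    }

    // Counts the connections that link one unordered pair of processes.
    public static int ConnectionsBetween(
        List<(int From, int To)> calls, int a, int b)
    {
        return calls.Count(c =>
            (c.From == a && c.To == b) || (c.From == b && c.To == a));
    }
}
```

With 5 processes, ConnectCalls(5) yields 20 entries, and ConnectionsBetween 
reports 2 for any pair: each pair of processes is joined by two TCP 
connections, one initiated from each side. By contrast, a process that only 
connects and never binds has a single connection to each peer.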

When all 5 processes are initially started, communication works great.  I can 
send and receive messages between any of the 5.

The problem comes if for some reason I must restart one of the 5.  Once that 
process is restarted, messages between it and the other 4 appear to go 
nowhere, in both the incoming and outgoing directions.

I have monitors on all of the sockets, which seem to show that everything is 
reconnecting.

I also have RouterMandatory set to Report.  Sends from the newly restarted 
process do not Report when attempting to Send to the other processes, and the 
other processes do not Report when attempting to Send to the restarted process.

If I stop one of the other remaining processes, then attempts to Send to that 
process Report as expected.

I also have pub/sub socket connections, with each process publishing a 
heartbeat, and those sockets all successfully reconnect.

I am testing all of this from (in this example) a 6th process that simply 
connects to the other 5 processes (also using a Router socket, no Bind 
involved). It successfully reconnects to the restarted process without any 
problems and can send messages to (and receive the subsequent responses from) 
any of the other 5, including any processes that are restarted.  This 6th 
process does not publish to the other 5, but does subscribe to the heartbeats.  
I am convinced that the fact that the 6th process successfully reconnects is a 
clue to what I am doing wrong, but it has been insufficient for me to find out 
what the root problem really is.

The only other thing to note is that Linger is set to zero prior to 
Bind/Connect.

The code for all processes for setting up the Router socket is:

            using (_requestSocket = new ZSocket(_context, ZSocketType.ROUTER))
            {
                try
                {
                    _requestSocket.Monitor(Endpoint_RequestMonitor, out error);
                    Debug.Assert(error == ZError.None);

                    _requestSocket.Identity = Encoding.UTF8.GetBytes(_selfID);
                    _requestSocket.Linger = TimeSpan.Zero;
                    _requestSocket.RouterMandatory = RouterMandatory.Report;

                    _requestSocket.Bind(Endpoint_RequestInproc, out error);
                    Debug.Assert(error == ZError.None);

                    endpoint = string.Format("tcp://*:{0}", _requestPort);
                    _requestSocket.Bind(endpoint, out error);
                    Debug.Assert(error == ZError.None);
                }
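                catch (Exception ex)
                {
                    // Record the failure before exiting so the cause is not lost.
                    Console.Error.WriteLine(ex);
                    Environment.Exit(-1);
                }

                // ... (Connect calls to the other processes and the message
                // loop are not shown)
            }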

Thanks for any assistance or advice you can provide,

Aaron
