I took your advice and printed the thread ID for each call, and indeed there were two threads calling the same socket. I wrapped the socket calls in a critical section and we are testing it now.
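For anyone who finds this thread later, here is a minimal sketch of the two steps (note which threads touch the socket, then serialize access). It is not the code from our application: `FakeSocket`, `GuardedSocket` and `run_senders` are hypothetical names made up for illustration, and `std::mutex` (C++11) stands in for the Windows critical section we actually used.

```cpp
#include <cassert>
#include <cstddef>
#include <iostream>
#include <mutex>
#include <set>
#include <string>
#include <thread>
#include <vector>

// Hypothetical stand-in for a socket object that is NOT thread safe.
struct FakeSocket {
    std::vector<std::string> sent;
    void send(const std::string &msg) { sent.push_back(msg); }
};

// Hypothetical wrapper: a single mutex serializes every call on the socket,
// and the first call from each thread is reported on stderr.
class GuardedSocket {
public:
    void send(const std::string &msg) {
        std::lock_guard<std::mutex> lock(mutex_);
        const std::thread::id id = std::this_thread::get_id();
        if (callers_.insert(id).second)
            std::cerr << "socket used by thread " << id << "\n";
        socket_.send(msg);
    }
    std::size_t sent_count() {
        std::lock_guard<std::mutex> lock(mutex_);
        return socket_.sent.size();
    }
    std::size_t caller_count() {
        std::lock_guard<std::mutex> lock(mutex_);
        return callers_.size();
    }

private:
    std::mutex mutex_;
    FakeSocket socket_;
    std::set<std::thread::id> callers_;
};

// Starts `threads` workers that each send `per_thread` messages through the
// same guarded socket, then waits for all of them to finish.
void run_senders(GuardedSocket &sock, int threads, int per_thread) {
    assert(threads > 0 && per_thread > 0);
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t) {
        pool.emplace_back([&sock, per_thread] {
            for (int i = 0; i < per_thread; ++i)
                sock.send("ping");
        });
    }
    for (auto &worker : pool)
        worker.join();
}
```

Calling `run_senders(sock, 2, 1000)` on a `GuardedSocket sock` should leave 2000 messages recorded, with stderr showing every distinct thread ID that used the socket. In a real application the body of `send` would call the actual socket, and this only helps if every call on that socket goes through the wrapper.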
Thanks

Josh

> On Apr 20, 2016, at 5:29 PM, josh knox <[email protected]> wrote:
>
> When I was having issues (using a gnarly pre-existing code base), I looked at
> where each socket call was coming from and verified that it was only being
> used on one thread. Basically just wrote a message to the console for each
> socket call and printed the thread ID, then analyzed the output.
>
> In cases where refactoring to limit one per thread would be problematic, I
> was able to use a mutex to allow exclusive access. This worked for me since
> there were no performance implications.
>
> If you're not certain, it might be worth confirming that things are indeed
> thread safe.
>
> FWIW, here's the thread where I muddled through this stuff previously:
>
> http://lists.zeromq.org/pipermail/zeromq-dev/2015-December/029445.html
>
> My crashes were also happening in encoder.hpp.
>
> Josh
>
> On Wed, Apr 20, 2016 at 4:58 PM, Joshua Strickon <[email protected]> wrote:
> It's mostly single threaded but there could be multiple threads for different
> modules and dlls that it uses. It is a bit of a mess and I don't think the
> original developer fully tested it in the production environment. I was
> hoping it would be something that upgrading to a later version of zmq
> addresses without having to dig into the application code.
>
> Thanks
>
> Josh
>
>> On Apr 20, 2016, at 4:52 PM, josh knox <[email protected]> wrote:
>>
>> Hi Josh,
>>
>> Is your app multi-threaded? Could there be more than one thread hitting the
>> socket?
>>
>> The times that I've had random memory errors with zmq were due to multiple
>> threads using a socket.
>>
>> In my case, either isolating 1 thread per socket, or using other thread
>> synchronization to prevent concurrent socket use has solved those issues for
>> me.
>>
>> Josh
>>
>> On Wed, Apr 20, 2016 at 4:34 PM, Joshua Strickon <[email protected]> wrote:
>> I know this is old. I am working on getting an old project up and running
>> for a client who built it on 2.0.2 and we are seeing these same errors.
>> We are getting access violation errors and the app is crashing randomly.
>> The windows dump files are pointing to these same lines of code as
>> described below. What was the resolution on this issue?
>>
>> thanks
>>
>> Josh
>>
>> From: Martin Sustrik <sustrik <at> 250bpm.com>
>> Subject: Re: frequent ZeroMQ crashes - how to diagnose?
>> <http://news.gmane.org/find-root.php?message_id=4C1C6BF7.6080006%40250bpm.com>
>> Newsgroups: gmane.network.zeromq.devel
>> <http://news.gmane.org/gmane.network.zeromq.devel>
>> Date: 2010-06-19 07:04:23 GMT (5 years, 43 weeks, 5 days, 7 hours and 28
>> minutes ago)
>>
>> Nick,
>>
>> > ZeroMQ crashed today.
>> >
>> > This is a Win32 build of both ZMQ and myApp.
>> > myApp was running fine with several thousand messages, when the memcpy
>> > code line below threw the following exception.
>> >
>> > "Unhandled exception at 0x6404edd6 (msvcr90d.dll) in myApp.exe:
>> > 0xC0000005: Access violation reading location 0xfeeefeee."
>> >
>> > debugging shows the following values:
>> > - buffer     0x00d9b570 "%"        unsigned char *
>> >   pos        2                     unsigned int
>> > + write_pos  0xfeeefeee <Bad Ptr>  unsigned char *
>> >   to_copy    8190                  unsigned int
>> >
>> > looks like a bad pointer.
>> >
>> > encoder.hpp
>> >
>> >     // If there are no data in the buffer yet and we are able to
>> >     // fill whole buffer in a single go, let's use zero-copy.
>> >     // There's no disadvantage to it as we cannot stuck multiple
>> >     // messages into the buffer anyway. Note that subsequent
>> >     // write(s) are non-blocking, thus each single write writes
>> >     // at most SO_SNDBUF bytes at once not depending on how large
>> >     // is the chunk returned from here.
>> >     // As a consequence, large messages being sent won't block
>> >     // other engines running in the same I/O thread for excessive
>> >     // amounts of time.
>> >     if (!pos && !*data_ && to_write >= buffersize) {
>> >         *data_ = write_pos;
>> >         *size_ = to_write;
>> >         write_pos = NULL;
>> >         to_write = 0;
>> >         return;
>> >     }
>> >
>> >     // Copy data to the buffer. If the buffer is full, return.
>> >     size_t to_copy = std::min (to_write, buffersize - pos);
>> > =======> memcpy (buffer + pos, write_pos, to_copy);
>> >     pos += to_copy;
>> >     write_pos += to_copy;
>> >     to_write -= to_copy;
>> >     if (pos == buffersize) {
>> >         *data_ = buffer;
>> >         *size_ = pos;
>> >         return;
>> >     }
>>
>> It looks like a memory overwrite either in 0MQ or the application. Do
>> you have a test program to reproduce the problem?
>>
>> > Let me know what the error was so that I can fix it in the trunk.
>>
>> Have you managed to find out what the error code is?
>>
>> Martin
>>
>> _______________________________________________
>> zeromq-dev mailing list
>> [email protected]
>> http://lists.zeromq.org/mailman/listinfo/zeromq-dev
