assert in xrep.cpp when sending to terminating pipe
Description
Environment
Built and run on a 64-bit CentOS 5.5 server with g++ 4.4.0
Attachments
1
- 27 Sep 2011, 02:25 PM
Activity
PieterP September 28, 2011 at 1:26 PM
OK, backported to 3-0.
Martin Sustrik September 28, 2011 at 5:55 AM
Yes, please. I'm going to send the patch to the mailing list in the meantime.
PieterP September 27, 2011 at 7:31 PM
Martin, do you want me to apply this patch to 3-0 master?
Ben Gray September 27, 2011 at 5:04 PM
So far it stays up under my spam test, which previously always triggered the assert within a few seconds.
Consider it fixed.
Martin Sustrik September 27, 2011 at 2:25 PM
Please check whether this patch helps.
I have a repeatable assert that comes up under load in a service. It results in the following call stack under gdb:
#0 0x0000003bd2830265 in raise () from /lib64/libc.so.6
#1 0x0000003bd2831d10 in abort () from /lib64/libc.so.6
#2 0x00002aaaaab22409 in zmq::xrep_t::xsend (this=0x7cc5e0, msg_=0x42802510, flags_=0) at xrep.cpp:168
#3 0x00002aaaaab0f355 in zmq::rep_t::xsend (this=0x7cc5e0, msg_=0x42802510, flags_=0) at rep.cpp:48
#4 0x00002aaaaab1a492 in zmq::socket_base_t::send (this=0x7cc5e0, msg_=0x42802510, flags_=0) at socket_base.cpp:521
#5 0x00002aaaaab254e7 in zmq_sendmsg (s_=0x7cc5e0, msg_=0x42802510, flags_=0) at zmq.cpp:266
#6 0x00002aaaaab2522e in zmq_send (s_=0x7cc5e0, buf_=0x848328, len_=134, flags_=0) at zmq.cpp:219
The service is a bound REP socket, and clients connect with REQ over TCP. The clients join and part with high frequency, and although they should always wait for replies, there is no guarantee of this.
The relevant code in xrep.cpp is towards the end of xsend:
    if (current_out) {
        bool ok = current_out->write (msg_);
        zmq_assert (ok);
The pipe write is returning false because current_out->state is 'terminating'.
Presumably this state was set as the client disconnects.
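To illustrate the failure mode described above, here is a minimal sketch using a mock pipe. It is not the libzmq code and not the patch discussed in this thread: mock_pipe, pipe_state, and send_tolerant are hypothetical names invented for the example. The only behaviour taken from the report is that write returns false once the pipe is terminating. The sketch assumes that dropping the reply is an acceptable response in that case, since the peer is already going away.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for a pipe; NOT the real libzmq pipe class.
enum class pipe_state { active, terminating };

struct mock_pipe {
    pipe_state state = pipe_state::active;
    std::vector<std::string> delivered;

    // Mirrors the reported behaviour: writing to a pipe that is
    // terminating fails and returns false.
    bool write (const std::string &msg_) {
        if (state != pipe_state::active)
            return false;
        delivered.push_back (msg_);
        return true;
    }
};

// Tolerant send (an assumption, not the actual fix): a failed write is
// treated as a dropped reply instead of a fatal assertion.
// Returns true if the message was delivered, false if it was dropped.
bool send_tolerant (mock_pipe *&current_out, const std::string &msg_) {
    if (!current_out)
        return false;            // no peer to reply to
    bool ok = current_out->write (msg_);
    if (!ok)
        current_out = nullptr;   // peer is disconnecting; forget the pipe
    return ok;
}
```

With this shape, a reply sent while the client is disconnecting is silently discarded and the service keeps running, whereas asserting on the write result aborts the whole process.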
Although the assert is repeatable, I have yet to get a minimal test case to work. If and when I do, I will attach it as well.
In the meantime I am happy to run gdb commands against the crash here to provide data people might need.