Call to zmq_init with >0 thread count causes core dump


ZMQ Version : 2.1.0
OS : AIX 5.3
Compiler : IBM XLC/C++ v10.1

When zmq_init() is called with an argument greater than 0, the program dumps core (the trace below is from a call with io_threads_ = 1). It does not dump core when the argument is 0.

Here is the stack trace from dbx:

> dbx ./HelloWorldServer
Type 'help' for help.
[using memory image in core]
reading symbolic information ...

IOT/Abort trap in pthread_kill at 0xd005f734 ($t1)
0xd005f734 (pthread_kill+0x88) 80410014 lwz r2,0x14(r1)
(dbx) where
pthread_kill(??, ??) at 0xd005f734
_p_raise(??) at 0xd005f1a4
raise.raise(??) at 0xd02a38f0
abort() at 0xd0307778
myabort()() at 0xd0244aac
terminate()() at 0xd0242df0
terminate()() at 0xd024424c
__DoThrowV6() at 0xd02468f0
std::vector<zmq::poll_t::fd_entry_t,std::allocator<zmq::poll_t::fd_entry_t> >::_Xlen() const(this = 0x200091d0), line 317 in "vector"
std::vector<zmq::poll_t::fd_entry_t,std::allocator<zmq::poll_t::fd_entry_t> >::insert(std::_Ptrit<zmq::poll_t::fd_entry_t,long,zmq::poll_t::fd_entry_t*,zmq::poll_t::fd_entry_t&,zmq::poll_t::fd_entry_t*,zmq::poll_t::fd_entry_t&>,unsigned long,const zmq::poll_t::fd_entry_t&)(this = 0x200091d0, _P = &(...), _M = 2147483647, _X = &(...)), line 63 in "vector.t"
poll.std::vector<zmq::poll_t::fd_entry_t,std::allocator<zmq::poll_t::fd_entry_t> >::resize(unsigned long,zmq::poll_t::fd_entry_t)(this = 0x200091d0, _N = 2147483647, _X = (...)), line 193 in "vector"
poll.std::vector<zmq::poll_t::fd_entry_t,std::allocator<zmq::poll_t::fd_entry_t> >::resize(unsigned long)(this = 0x200091d0, _N = 2147483647), line 190 in "vector"
poll.std::vector<zmq::poll_t::fd_entry_t,std::allocator<zmq::poll_t::fd_entry_t> >::resize(unsigned long,zmq::poll_t::fd_entry_t)(this = 0x2ff21f80, _N = 4048429592, _X = (...)), line 48 in "poll.cpp"
zmq::io_thread_t::in_event()(this = 0xd68568c0, 0x20008650, 0x0), line 32 in "io_thread.cpp"
zmq::ctx_t::~ctx_t()(this = 0x2ff22080, __dtorFlags = -246531096), line 59 in "ctx.cpp"
zmq.zmq_init(io_threads_ = 1), line 243 in "zmq.cpp"
main(), line 18 in "HelloWorldServer.c"

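After some digging around, I found that in poll.cpp the call to getrlimit() for the RLIMIT_NOFILE resource returns 2147483647 when the ulimit for the number of open files per process is set to 'unlimited'. The next statement then tries to resize fd_table to that many entries. Judging by the trace (_Xlen, __DoThrowV6, terminate, abort), the vector throws a length error that nothing catches, so the program aborts and dumps core.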
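To check whether a given environment is affected, the limit can be queried directly. The following is only a minimal sketch: the helper names (is_unlimited, soft_nofile_limit, report_nofile_limit) are made up for illustration and are not part of 0MQ. It assumes that an 'unlimited' ulimit shows up as RLIM_INFINITY, which on my system appears to be the 2147483647 seen in the trace.

```cpp
#include <sys/resource.h>
#include <cassert>
#include <cstdio>

// Hypothetical helper (not part of 0MQ): true when a limit value is the
// platform's "unlimited" marker.
bool is_unlimited (rlim_t value)
{
    return value == RLIM_INFINITY;
}

// Hypothetical helper: reads the soft RLIMIT_NOFILE limit of this process.
rlim_t soft_nofile_limit ()
{
    struct rlimit rl;
    int rc = getrlimit (RLIMIT_NOFILE, &rl);
    assert (rc == 0);
    (void) rc;  // avoid an unused-variable warning when NDEBUG is defined
    return rl.rlim_cur;
}

// Prints the soft limit and flags it when it is "unlimited".
void report_nofile_limit ()
{
    rlim_t soft = soft_nofile_limit ();
    std::printf ("RLIMIT_NOFILE soft limit: %lu%s\n",
                 static_cast<unsigned long> (soft),
                 is_unlimited (soft) ? " (unlimited)" : "");
}
```

Calling report_nofile_limit () at the start of main () should show whether the process would ask for an fd_table of that size.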

As a workaround, I have set the ulimit for open files to a reasonable number (256 for now) in my environment. However, I think the code should handle this condition better, either by falling back to a sane default value or by failing cleanly with an error code or message.
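One possible shape for the 'sane default' idea is to clamp the reported limit before it is used as a table size. This is only a sketch under my own assumptions, not a patch against poll.cpp: the names clamp_fd_table_size and safe_fd_table_size and the cap of 1024 are hypothetical, and the right cap would need to be decided by the maintainers.

```cpp
#include <sys/resource.h>
#include <cassert>
#include <cstddef>

// Hypothetical cap, chosen for illustration only.
static const std::size_t fallback_fd_table_size = 1024;

// Returns the soft limit as a table size, or 'cap' when the limit is
// "unlimited" or larger than 'cap'.
std::size_t clamp_fd_table_size (rlim_t soft_limit, std::size_t cap)
{
    assert (cap > 0);
    if (soft_limit == RLIM_INFINITY || soft_limit > cap)
        return cap;
    return static_cast<std::size_t> (soft_limit);
}

// Queries RLIMIT_NOFILE and returns a bounded size. Falls back to the cap
// if the query itself fails.
std::size_t safe_fd_table_size ()
{
    struct rlimit rl;
    if (getrlimit (RLIMIT_NOFILE, &rl) != 0)
        return fallback_fd_table_size;
    return clamp_fd_table_size (rl.rlim_cur, fallback_fd_table_size);
}
```

With something like this, the resize in poll.cpp would receive safe_fd_table_size () instead of the raw rlim_cur value. A fixed cap does mean that descriptors numbered above it would need separate handling, so growing the table on demand might be the better long-term fix.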