libzmq / LIBZMQ-496

Crash on heavy socket opening/closing: Device or resource busy (mutex.hpp:90)

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 3.2.2
    • Fix Version/s: 3.2.3
    • Component/s: core
    • Labels:
    • Environment:

      CentOS release 6.3 (Final Santiago)

      Description

      When SUB sockets are opened and closed at a high rate, an assertion fails when pthread_mutex_destroy is called in mutex.hpp.

      Attached is a piece of code which demonstrates this.
      Compiled with:

      gcc -O zmqpub.c -o zmqpub -lzmq -lpthread
      gcc -O zmqsub.c -o zmqsub -lzmq -lpthread

      Executed with:

      ./zmqpub &
      ./zmqsub

      The zmqsub process eventually crashes, but the crash is very rare.
      It took a few days to reproduce the issue in my environment.

      The test needs many open file descriptors, so raise the limits first:

      ulimit -n 8192
      ulimit -c unlimited
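
The descriptor limit can also be raised from inside a test program rather than from the shell. The following is only a sketch, not part of the attached test code: `raise_nofile_limit` is a hypothetical helper name, and it covers only the `ulimit -n` part (the core-file limit would use `RLIMIT_CORE` in the same way).

```c
#include <assert.h>
#include <sys/resource.h>

/* Hypothetical helper: raise the soft RLIMIT_NOFILE to at least `wanted`,
 * capped at the hard limit. Returns the resulting soft limit, or 0 on error. */
rlim_t raise_nofile_limit(rlim_t wanted)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_NOFILE, &rl) != 0)
        return 0;

    if (rl.rlim_cur < wanted) {
        /* An unprivileged process may raise the soft limit only up to
         * the hard limit, so clamp the request. */
        rl.rlim_cur = wanted < rl.rlim_max ? wanted : rl.rlim_max;
        if (setrlimit(RLIMIT_NOFILE, &rl) != 0)
            return 0;
    }
    return rl.rlim_cur;
}
```

Calling `raise_nofile_limit(8192)` early in `main` would mirror `ulimit -n 8192`, with the difference that it silently settles for the hard limit when 8192 is not allowed.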

      I got a core file and printed the stack traces of all threads in the process (see bt.txt).
      However, I could not find another thread that holds the mutex.

      This looks like the same issue as LIBZMQ-281,
      but the version is different, so I created a new issue for it.
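
For context on the error text in the title: "Device or resource busy" is the Linux message for EBUSY, which glibc typically returns from pthread_mutex_destroy when the mutex is still locked. POSIX leaves this case undefined, so other C libraries may behave differently. The sketch below is an illustration only and is not taken from libzmq; `destroy_result` and `busy_message` are hypothetical helper names.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <string.h>

/* Hypothetical helper: initialise a mutex, optionally lock it, and return
 * whatever pthread_mutex_destroy() reports for it. */
int destroy_result(int hold_lock)
{
    pthread_mutex_t m;
    int rc;

    rc = pthread_mutex_init(&m, NULL);
    assert(rc == 0);
    if (hold_lock)
        pthread_mutex_lock(&m);

    /* Destroying a locked mutex is undefined in POSIX; glibc usually
     * refuses and returns EBUSY, which an assertion on the return
     * value would then trip over. */
    rc = pthread_mutex_destroy(&m);

    if (rc != 0) {
        /* The destroy was refused: release the lock and destroy properly. */
        pthread_mutex_unlock(&m);
        pthread_mutex_destroy(&m);
    }
    return rc;
}

/* Text form of EBUSY as reported by the C library. */
const char *busy_message(void)
{
    return strerror(EBUSY);
}
```

With an unlocked mutex, `destroy_result(0)` returns 0. With a locked one, `destroy_result(1)` is expected to return EBUSY on glibc, which would be consistent with another thread still holding the mutex at the moment of destruction.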

      1. bt.txt
        238 kB
      2. bt2.txt
        131 kB
      3. zmqpub.c
        0.8 kB
      4. zmqsub.c
        1 kB

        Activity

        kanekotky Takayuki Kaneko added a comment -

        I reproduced the crash and got another core file (bt2.txt).

        In bt.txt, Thread 52 was in epoll_wait(). In bt2.txt, Thread 48 was in write() at signaler.cpp:119.

        Is this a timing issue between the epoll thread and the thread that closes the socket?

        mika.fischer Mika Fischer added a comment -

        Hi,

        could you please try to reproduce it after applying this patch / using this branch:
        https://github.com/mika-fischer/zeromq3-x/commit/1a17eb392e353a0c7606b127ac3100075427e424

        I suspect it's exactly the same issue as LIBZMQ-281, just harder to trigger in zeromq3-x than it was in zeromq2-x. I wasn't able to trigger it using the test case in LIBZMQ-281, but we ran into it on one of our production systems with ZeroMQ 3.2.2.
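
The thread does not spell out what the linked patch changes. As a generic illustration of the race discussed here, the sketch below shows one common way to avoid destroying a mutex that another thread still holds: the owner locks and unlocks the mutex once before destroying it. All names (`struct guarded`, `sender`, `teardown_after_drain`) are hypothetical, and the pipe merely stands in for a wake-up signal written while the lock is held.

```c
#include <assert.h>
#include <pthread.h>
#include <unistd.h>

/* Hypothetical object whose mutex a second thread may still hold. */
struct guarded {
    pthread_mutex_t sync;
    int wake[2];              /* pipe used as a wake-up signal */
};

static void *sender(void *arg)
{
    struct guarded *g = arg;
    ssize_t n;

    pthread_mutex_lock(&g->sync);
    n = write(g->wake[1], "x", 1);   /* owner may start tearing down now */
    (void)n;
    usleep(10000);                   /* widen the race window on purpose */
    pthread_mutex_unlock(&g->sync);
    return NULL;
}

/* Returns what pthread_mutex_destroy() reports after draining the lock. */
int teardown_after_drain(void)
{
    struct guarded g;
    pthread_t t;
    char byte;
    int rc;

    rc = pthread_mutex_init(&g.sync, NULL);
    assert(rc == 0);
    rc = pipe(g.wake);
    assert(rc == 0);
    rc = pthread_create(&t, NULL, sender, &g);
    assert(rc == 0);

    /* Woken up while the sender is most likely still holding the lock. */
    while (read(g.wake[0], &byte, 1) != 1)
        ;

    /* Drain: block until the sender has left its critical section. */
    pthread_mutex_lock(&g.sync);
    pthread_mutex_unlock(&g.sync);
    rc = pthread_mutex_destroy(&g.sync);

    pthread_join(t, NULL);
    close(g.wake[0]);
    close(g.wake[1]);
    return rc;
}
```

Without the lock/unlock pair, the destroy call could run while the sender is still between its write and its unlock. That would match a backtrace showing a thread inside write() at the moment of the crash.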

        kanekotky Takayuki Kaneko added a comment -

        Hi Mika,

        I ran the same test after applying your patch
        and could not reproduce the crash at all.

        I know this patch is a workaround, as you said, but it is very useful!
        I hope it will be merged into the 3-x branch.

        mika.fischer Mika Fischer added a comment -

        Thanks for testing! I opened a pull request for the fix: https://github.com/zeromq/zeromq3-x/pull/79

        mika.fischer Mika Fischer added a comment -

        This has been merged into https://github.com/zeromq/zeromq3-x.

        Takayuki, could you check that it's fixed for you with the latest version from the repository above, and close this issue if that's the case?

        Thanks!

        kanekotky Takayuki Kaneko added a comment -

        Mika,

        Thanks for your quick response! I closed this issue.

        Regards,


          People

          • Assignee: Unassigned
          • Reporter: kanekotky Takayuki Kaneko
          • Votes: 0
          • Watchers: 2
