libzmq
LIBZMQ-270

A SUB socket with a message in queue should always have ZMQ_POLLIN set

    Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 3.0.0
    • Fix Version/s: None
    • Component/s: core
    • Labels:
    • Environment:

      OSX, Linux

      Description

      When checking for events on a SUB socket with zmq_getsockopt(sock, ZMQ_EVENTS, &events, &size), the events variable should be set to ZMQ_POLLIN. In 3.0, it is set to 0.

      This only occurs when the SUB socket binds and the PUB connects. If you swap the bind/connect on the sockets, it works.

      This same code works in 2.1.x regardless of the order of bind/connect.

      (issue has been committed to the issues repository.)

        Activity

        Chuck Remes created issue -
        Martin Sustrik added a comment -

        This problem is caused by storing subscriptions in the SUB's application thread. SUB sends a subscription, then PUB connects. SUB can't send the subscription as its application thread is busy doing something else. Consequently, the subscription doesn't get to the PUB socket and the socket filters out the message.

        What's needed is storing subscriptions in I/O thread(s).

        The pre-requisite for that is patch f78d9b6bfca13e298c29fadabbbc870b37a0a573. I'll ask Pieter to backport it to 3.0.

        Martin Sustrik added a comment -

        It seems there are some problems backporting the patch.

        Chuck Remes added a comment -

        This bug still exists in the 3.1.0 beta release.

        Martin Sustrik added a comment -

        Ack. Solving this problem requires moving some of the subscription forwarding functionality to the I/O thread and is not a trivial fix. I'll try to solve this problem later on.

        Chuck Remes made changes -
        Field       Original Value    New Value
        Priority    Minor [ 4 ]       Critical [ 2 ]
        Martin Hurton added a comment -

        It's not that zmq_getsockopt fails to indicate the message is available but that the receiver socket fails to receive the message.
        The problem is that after the sleep, the application immediately sends the message and the receiver is not yet subscribed.
        The library needs your application to call some operation (e.g. send/receive/getsockopt ...) so that it can subscribe the socket to the topic.
        To fix this, we need to make some internal changes first, which, as Martin Sustrik indicated, are not trivial.
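
The sequence described in the comments above can be pictured with a toy model. This is only a sketch under stated assumptions: `model_t`, `pub_connects`, `sub_api_call` and `pub_send` are hypothetical names invented for illustration and do not correspond to libzmq internals. It assumes, as the comments state, that the stored subscription is forwarded only when the application thread calls an operation on the SUB socket, and that the PUB side drops messages for which it has no subscription.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: hypothetical types and functions, not libzmq internals. */
typedef struct {
    bool peer_connected;  /* a PUB has connected to the bound SUB        */
    bool sub_forwarded;   /* the subscription has reached the PUB        */
    int  queued;          /* messages that reached the SUB's queue       */
} model_t;

/* The PUB connects while the SUB's application thread is busy elsewhere. */
static void pub_connects(model_t *m)
{
    assert(m != NULL);
    m->peer_connected = true;
}

/* Any operation on the SUB (send/receive/getsockopt/poll) gives the
 * application thread a chance to forward the stored subscription. */
static void sub_api_call(model_t *m)
{
    assert(m != NULL);
    if (m->peer_connected)
        m->sub_forwarded = true;
}

/* The PUB filters on its side: with no subscription the message is dropped.
 * Returns true if the message was delivered to the SUB's queue. */
static bool pub_send(model_t *m)
{
    assert(m != NULL);
    if (!m->peer_connected || !m->sub_forwarded)
        return false;
    m->queued++;
    return true;
}
```

In this model, calling `pub_connects` and then `pub_send` delivers nothing, while inserting a `sub_api_call` between the two lets the message through. That matches the reported symptom: nothing is queued on the SUB side, so there is nothing for ZMQ_EVENTS to report.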

        Chuck Remes added a comment -

        Changed priority from "critical" to "major" since it will require significant modification to the internals.

        Chuck Remes made changes -
        Priority    Critical [ 2 ]    Major [ 3 ]
        Min RK added a comment -

        To be specific, SUB sockets that bind will not receive any message until they first ask for one. If SUB binds and PUB connects, PUB can send as many messages as slowly as it likes, and none will ever arrive before SUB's first poll/recv call. This means that many SUB-binding cases are totally unusable. Would it not make sense for the PUB connection handshake to include subscriptions?

        Pieter Hintjens added a comment -

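        There is a workaround (taken from LIBZMQ-559) that I've tested:

        /* Poll the bound SUB socket once with a short timeout so that the
           library gets a chance to forward the subscription to the PUB. */
        zmq_pollitem_t pollitems [] = {
            { sub, 0, ZMQ_POLLIN, 0 }
        };
        zmq_poll (pollitems, 1, 1);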

        See https://gist.github.com/hintjens/7344533 for a test case.

        Pieter Hintjens made changes -
        Comment [ Here is a workaround that works:

        - use XSUB instead of SUB
        - after XSUB connects allow 20 msec for connection to establish
        - send subscription manually, as message starting with 0x01
        - allow 20 msec for publisher to receive subscription (if you need to synch it)
        ]
        Pieter Hintjens made changes -
        Comment [ Sadly the workaround doesn't work consistently... :-/ ]
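
The removed XSUB workaround above sends the subscription by hand as a message starting with 0x01. A minimal sketch of building such a frame follows, assuming the topic is a NUL-terminated string. `build_sub_frame` is a hypothetical helper, not part of libzmq, and as the follow-up note says, the workaround itself was reported not to work consistently.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Build an XSUB subscription frame: one command byte followed by the
 * topic prefix. 0x01 subscribes, 0x00 unsubscribes.
 * Returns the frame length, or 0 if the buffer is too small.
 * Hypothetical helper for illustration; not a libzmq function. */
static size_t build_sub_frame(unsigned char *buf, size_t cap,
                              const char *topic, int subscribe)
{
    assert(buf != NULL && topic != NULL);
    size_t topic_len = strlen(topic);
    if (cap < topic_len + 1)
        return 0;
    buf[0] = subscribe ? 0x01 : 0x00;
    memcpy(buf + 1, topic, topic_len);
    return topic_len + 1;
}
```

The resulting bytes would then be sent on the XSUB socket with the library's normal send call (for example zmq_send), after the delays listed in the workaround. An empty topic yields a one-byte frame, which subscribes to all messages.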

          People

          • Assignee:
            Martin Sustrik
            Reporter:
            Chuck Remes
          • Votes:
            3
            Watchers:
            8

            Dates

            • Created:
              Updated: