This repository has been archived by the owner on Jul 8, 2022. It is now read-only.

Unsubscribe in push_event leads to API_EventTimeout #686

Closed
reszelaz opened this issue Feb 15, 2020 · 8 comments · Fixed by #817

Comments

@reszelaz

We encountered this issue when investigating tango-controls/pytango#292.

The architecture is as follows. We have one DS called DeviceServer which
exports in total two devices, each of a different class. These two classes are:

  • Device1 with one attribute attr1
  • Device2 with two attributes attr1 and attr2

What we will do:

  1. Subscribe to attr1 configuration events of the Device1 device.
  2. Subscribe to attr1 change events of the Device2 device.
  3. Subscribe to attr2 change events of the Device2 device
    with a callback which will unsubscribe from the previously subscribed
    attr1 configuration events of the Device1 device.

This leads to a situation where the client stops receiving events and just gets API_EventTimeout.

Thanks to the example provided by @schooft we could also reproduce it with a C++ client.

More details and the example code can be found in https://github.com/reszelaz/test-tango-unsub.

@andygotz
Collaborator

Hi @reszelaz and everyone who worked on isolating this bug and reproducing it! I have added the bug report #686 to the list of ToDo issues. It will be worked on as soon as we have time.

@reszelaz
Author

Hi,
I just wanted to give an update from our side.

I don't know if I have mentioned this already but I think that this bug is a regression since Tango 9.1 - see this forum thread. Maybe with this information it will be possible to track which change introduced this regression?

I have also evaluated some possibilities to work around this issue in taurus:

  1. WIP: Delegate unsubscribe from events to Tango C++ (PyTango#292) taurus-org/taurus#1091
  2. WIP: Usubscribe from events in worker thread (PyTango#292) taurus-org/taurus#1093
  3. WIP: Usubscribe from events on the next subscription (PyTango#292) taurus-org/taurus#1095

Surprisingly, the second option did not solve the issue, which makes me wonder whether there are more bugs: there the unsubscribes are executed from another thread and not from the event consumer.

Only the third option does not fail, but in this case we are leaving subscriptions active so it is not really optimal.
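The bookkeeping behind deferring unsubscribes, so that they never run inside an event callback, could look roughly like the sketch below. It is only an illustration and not the code from any of the taurus pull requests listed above: `DeferredUnsubscriber`, `request` and `flush` are hypothetical names, and the `unsubscribe` function passed to `flush` stands in for a real call such as `dev->unsubscribe_event(id)`.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <mutex>
#include <vector>

// Hypothetical helper (not part of Tango or taurus): collects subscription
// ids whose unsubscription must not happen inside an event callback.
class DeferredUnsubscriber {
public:
    // Safe to call from an event callback: it only records the id.
    void request(int event_id) {
        std::lock_guard<std::mutex> lock(mutex_);
        pending_.push_back(event_id);
    }

    // Call from application code, outside any callback, for example right
    // before the next subscription. `unsubscribe` is an assumption: it
    // stands in for the real unsubscribe call. Returns the number of ids
    // that were processed.
    std::size_t flush(const std::function<void(int)>& unsubscribe) {
        assert(static_cast<bool>(unsubscribe));
        std::vector<int> batch;
        {
            std::lock_guard<std::mutex> lock(mutex_);
            batch.swap(pending_);
        }
        // The lock is released before calling out, so a callback triggered
        // by `unsubscribe` may call request() again without deadlocking.
        for (int id : batch) {
            unsubscribe(id);
        }
        return batch.size();
    }

    // Number of ids still waiting to be unsubscribed.
    std::size_t pending() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return pending_.size();
    }

private:
    mutable std::mutex mutex_;
    std::vector<int> pending_;
};
```

A callback would call `request(id)` instead of unsubscribing directly, and the application would call `flush(...)` before its next subscription. As noted above, simply moving the unsubscribe to a worker thread was not enough, so where `flush` is called matters. This sketch only shows the queueing and makes no claim about avoiding the bug.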

Please let us know if we can help somehow.
Thanks!

bourtemb added a commit to bourtemb/cppTango that referenced this issue Mar 25, 2020
…ango-controls#686)

Subscribing or unsubscribing events in an event callback when this callback
was called during a subscribe_event() phase was leading to events and
heartbeat events no longer received by the event client.
A bug was fixed when several ZMQ_DELAY_EVENT tasks were received consecutively
by the ZMQ control task.
bourtemb added a commit to bourtemb/cppTango that referenced this issue Mar 25, 2020
…ango-controls#686)

Subscribing or unsubscribing events in an event callback when this callback
was called during a subscribe_event() phase was leading to events and
heartbeat events no longer received by the event client.
A bug was fixed when several ZMQ_DELAY_EVENT commands were received
consecutively by the ZMQ control socket.

@bourtemb
Member

bourtemb commented Mar 25, 2020

Hi @reszelaz, thank you very much for creating the issue and for providing a reproducible example showing the issue.
Thanks to @schooft who provided a C++ client showing the problem. It was very useful.

There was indeed a bug in this very specific use case.
It looks like this bug was already there before. I think it only appears in the specific case where you unsubscribe (and maybe also subscribe) from a push_event() callback, and only when this callback is called during a subscription phase (that is, called with the result of the synchronous read_attribute call which happens during a subscription).
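To make "callback called during a subscription phase" concrete, here is a toy model of the call ordering. It does not use Tango and does not reproduce the bug. `ToyEventClient` and `run_scenario` are made-up names, and the only behaviour taken from the description above is that subscribing delivers a first event to the callback synchronously.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>
#include <vector>

// Toy stand-in for an event client. None of these names are Tango API.
class ToyEventClient {
public:
    using Callback = std::function<void(ToyEventClient&)>;

    // Registers the callback and, like the subscription phase described
    // above, invokes it once synchronously before returning the new id.
    int subscribe(Callback cb) {
        const int id = next_id_++;
        subscriptions_[id] = cb;
        log_.push_back("subscribe " + std::to_string(id) + ": synchronous callback");
        ++subscribe_depth_;
        cb(*this);  // runs on the caller's stack, inside subscribe()
        --subscribe_depth_;
        return id;
    }

    // Removes a subscription and records whether this happened while a
    // subscribe() call was still in progress.
    void unsubscribe(int id) {
        assert(subscriptions_.count(id) == 1 && "unknown subscription id");
        subscriptions_.erase(id);
        log_.push_back("unsubscribe " + std::to_string(id) +
                       (subscribe_depth_ > 0 ? " (inside subscribe phase)"
                                             : " (outside subscribe phase)"));
    }

    const std::vector<std::string>& log() const { return log_; }

private:
    int next_id_ = 1;
    int subscribe_depth_ = 0;
    std::map<int, Callback> subscriptions_;
    std::vector<std::string> log_;
};

// Mirrors the three steps from the report: the callback of the third
// subscription unsubscribes the first subscription.
std::vector<std::string> run_scenario() {
    ToyEventClient client;
    const int id1 = client.subscribe([](ToyEventClient&) {});
    client.subscribe([](ToyEventClient&) {});
    client.subscribe([id1](ToyEventClient& c) { c.unsubscribe(id1); });
    return client.log();
}
```

In this model the last log entry produced by `run_scenario()` shows the unsubscribe happening while the third `subscribe()` call has not yet returned, which is the situation singled out above.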

I made a test with the C++ client from https://github.com/reszelaz/test-tango-unsub, slightly modified so that it does not unsubscribe at the first call, and it seems to work fine in this use case:

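void push_event(Tango::EventData* event) {
    // Count callbacks so that the first one, which is fired during the
    // subscription phase, can be skipped.
    static int nb_events_received = 0;
    nb_events_received++;
    if (event->err) {
        Tango::Except::print_error_stack(event->errors);
    } else {
        try {
            // Unsubscribe only from the second event onwards.
            // dev and id are defined outside this snippet.
            if (nb_events_received > 1) {
                dev->unsubscribe_event(id);
                std::cout << "unsubscribed " << id << std::endl;
            }
        } catch (Tango::DevFailed &e) {
            Tango::Except::print_exception(e);
        }
    }
}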

I think I found a way to fix the bug.
The fix is available in #699

Could you please test it?

@reszelaz
Author

Great news, Reynald, that you have already found a solution for that!
I see on the PR that there is some discussion on how to improve it. Do you want us to try it already, or would you still like to make some changes?

@bourtemb
Member

> Great news, Reynald, that you have already found a solution for that!
> I see on the PR that there is some discussion on how to improve it. Do you want us to try it already, or would you still like to make some changes?

The origin of the problem is understood. Now we have to find the best solution to fix it. The fix I proposed might have some side effects we need to evaluate.

bourtemb added a commit to bourtemb/cppTango that referenced this issue Apr 8, 2020
…ango-controls#686)

Subscribing or unsubscribing events in an event callback when this callback
was called during a subscribe_event() phase was leading to events and
heartbeat events no longer received by the event client.
A bug was fixed when several ZMQ_DELAY_EVENT commands were received
consecutively by the ZMQ control socket.

@t-b
Collaborator

t-b commented Apr 23, 2020

Create forward port of #699.

@reszelaz
Author

reszelaz commented May 7, 2020

From the reporter's point of view it can be closed :) It is in 9.3.4rc5, right?
Unless you are waiting for it to be in the official release?
Thanks again!

@bourtemb
Member

bourtemb commented May 7, 2020

Thanks @reszelaz. It is in 9.3.4rc5 indeed.
It's just that we need to do the same in our tango-9-lts branch (future 9.4).
We will close this issue once the fix is available in the tango-9-lts branch too.

t-b pushed a commit to t-b/cppTango that referenced this issue Nov 26, 2020
…ango-controls#686)

Subscribing or unsubscribing events in an event callback when this callback
was called during a subscribe_event() phase was leading to events and
heartbeat events no longer received by the event client.
A bug was fixed when several ZMQ_DELAY_EVENT commands were received
consecutively by the ZMQ control socket.

t-b closed this as completed in #817 Dec 1, 2020