-
Notifications
You must be signed in to change notification settings - Fork 94
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix task event handler simul multi warnings #2807
Fix task event handler simul multi warnings #2807
Conversation
The bug: on receiving multiple simultaneous warning messages from a task job, only 1 warning event handler gets triggered. This fix ensures that multiple warning event handlers are triggered.
170f3d2
to
8fa52c5
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good, tests as working - thanks - just one question.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code looks good to me, but alas I don't know much about event handlers in Cylc to give a more useful review here (yet!!!). Feel free to request others to review it too if you need anybody else besides @hjoliver reviewing it 👍
(@matthewrmshin - we're planning to remedy this situation tomorrow 💥 ... maybe no need to reassign) |
@matthewrmshin had a session on event handlers and on this pull request with @hjoliver (thanks!!!). Feeling like I know a bit more about how that works in Cylc and that I can try to help with this review again (bear with me a little longer). Successfully reproduced the problem on master (left; note that it printed But then decided to log fish twice... maybe because I had two files with the same name, or just wanted to raise that message twice. And it was logged once. @matthewrmshin , from what I understood, the solution in this pull request is now using the message as an identifier of the event? If that's correct, then I think this is the desired output. But would there be cases where users still want to submit the same message multiple times? Will review the code with more care from home either tonight or during this weekend (a break for when the weather is not so nice, or am too sunburned 🌤 ). Cheers |
@matthewrmshin - actually I think @kinow raises a good point. I think I can imagine use cases where the same WARNING event, for instance, happens multiple times within the same job, potentially with the same message (e.g. "database offline", or whatever). So maybe we do need to handle every single message as a unique event. |
(@kinow - don't feel obliged to work on the weekend! 🌞) |
In which case, I'll try to use a counter to number the non-unique events. |
If event is non-unique.
Tested, and works as expected @matthewrmshin ! Thanks for the quick fix! Here's my output kinow@ranma:/tmp/foo$ cylc cat-log foo | grep WARNING
2018-10-20T09:16:55+13 WARNING - [foo.1] -(current:running)> cat at 2018-10-20T09:16:54+13
2018-10-20T09:16:55+13 WARNING - [foo.1] -(current:running)> dog at 2018-10-20T09:16:54+13
2018-10-20T09:16:55+13 WARNING - [foo.1] -(current:running)> fish at 2018-10-20T09:16:55+13
2018-10-20T09:16:55+13 WARNING - [foo.1] -(current:running)> fish at 2018-10-20T09:16:55+13
kinow@ranma:/tmp/foo$ cylc cat-log -f a foo foo.1
[jobs-submit ret_code] 0
[jobs-submit out] 2018-10-20T09:15:59+13|1/foo/01|0|6273
2018-10-20T09:15:59+13 [STDOUT] 6273
[(('event-handler-00', 'warning-1'), 1) cmd] echo "HANDLED cat"
[(('event-handler-00', 'warning-1'), 1) ret_code] 0
[(('event-handler-00', 'warning-1'), 1) out] HANDLED cat
[(('event-handler-00', 'warning-3'), 1) cmd] echo "HANDLED fish"
[(('event-handler-00', 'warning-3'), 1) ret_code] 0
[(('event-handler-00', 'warning-3'), 1) out] HANDLED fish
[(('event-handler-00', 'warning-2'), 1) cmd] echo "HANDLED dog"
[(('event-handler-00', 'warning-2'), 1) ret_code] 0
[(('event-handler-00', 'warning-2'), 1) out] HANDLED dog
[jobs-submit ret_code] 0
[jobs-submit out] 2018-10-20T09:16:54+13|1/foo/01|0|6524
2018-10-20T09:16:54+13 [STDOUT] 6524
[(('event-handler-00', 'warning-1'), 1) cmd] echo "HANDLED cat"
[(('event-handler-00', 'warning-1'), 1) ret_code] 0
[(('event-handler-00', 'warning-1'), 1) out] HANDLED cat
[(('event-handler-00', 'warning-3'), 1) cmd] echo "HANDLED fish"
[(('event-handler-00', 'warning-3'), 1) ret_code] 0
[(('event-handler-00', 'warning-3'), 1) out] HANDLED fish
[(('event-handler-00', 'warning-2'), 1) cmd] echo "HANDLED dog"
[(('event-handler-00', 'warning-2'), 1) ret_code] 0
[(('event-handler-00', 'warning-2'), 1) out] HANDLED dog
[(('event-handler-00', 'warning-4'), 1) cmd] echo "HANDLED fish"
[(('event-handler-00', 'warning-4'), 1) ret_code] 0
[(('event-handler-00', 'warning-4'), 1) out] HANDLED fish Minor note, is that in the docs I think we would have to mention that the message would happen twice, but in no guaranteed order. if a user needed messages in order, we would have to include the time in the key? But there is a workaround of asking users to generate a message with the timestamp and sort out the ordering in their side, which could be a messaging system or log handler system. So LGTM +1 🎉 |
@kinow - regarding ordering, I think this is fine as-is. Normally (in real jobs) there would be a sufficient delay between messages that they will be processed in order by virtue of the fact that they are received one at a time by the suite daemon. But if messages are emitted near-simultaneously, users should probably not rely on the order. Plus I can't think of a situation where that would matter. |
Sure thing @hjoliver , only mentioned because a co-worker in a previous company was telling me about an interesting case he had to use Kafka and keep the order of messages (can't recall what for though). I think it is one of the guarantees in Kafka. But agree it may not be worth worrying about it now 👍 what's more important is that this fix appears to be working OK 🎉 |
(Note: event handlers are currently invoked using a parallel pool of subprocesses - so order cannot be guaranteed unless we impose a workflow on them.) |
Merging 🎆 |
@@ -18,6 +18,7 @@ | |||
|
|||
"""Provide a class to represent a task proxy in a running suite.""" | |||
|
|||
from collections import Counter |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
collections.Counter
is a Python2.7 feature!
(Retroactively approving this - I meant to/thought I had before merging ... @oliver-sanders' comment on Counter notwithstanding). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
approved (I guess the official approval switch does not appear, post merge!)
Good spotting (very). Can we just require 2.7 at this point 🐑-😁 |
Fix #2806.
(Plus new functionality: task output based events).