External triggering plugins? #2339
Comments
We could also allow clock offsets on the external triggers, to say "don't bother trying to satisfy this trigger until after this time" (for efficiency). |
To run the trigger plugins asynchronously, we'd need to ensure that we don't start the next check before the previous one has returned, and perhaps also allow a configurable interval between checks rather than checking each time through the scheduling loop. All this is easy enough to achieve, I think. (Regarding the external trigger alternative to suite-state polling tasks: the plugin would presumably just do a one-off check, with "polling" now achieved by successive invocations of the plugin function.) [but see next comment] |
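The two constraints above (no overlapping checks, and a minimum interval between checks) can be sketched roughly as follows. This is only an illustration, not Cylc code: `TriggerCheckGate` and its method names are hypothetical, and the clock is injectable purely so the logic can be exercised without sleeping.

```python
import time


class TriggerCheckGate:
    """Hypothetical helper: decides whether a trigger plugin may be invoked again."""

    def __init__(self, interval, clock=time.monotonic):
        self.interval = interval    # minimum seconds between the start of checks
        self.clock = clock          # injectable time source (an assumption, for testing)
        self.in_flight = False      # True while the previous check has not returned
        self.last_started = None    # start time of the most recent check

    def may_start(self):
        """True if no check is running and the interval has elapsed."""
        if self.in_flight:
            return False
        if self.last_started is None:
            return True
        return self.clock() - self.last_started >= self.interval

    def start(self):
        """Record that a check has been submitted."""
        self.in_flight = True
        self.last_started = self.clock()

    def finish(self):
        """Record that the check has returned (satisfied or not)."""
        self.in_flight = False
```

The scheduling loop would call `may_start()` on each pass and only submit the plugin when it returns True, so the loop itself can still run as often as it likes.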
Actually, this change, with use of a message broker, will nicely decouple inter-dependent suites and make most suite-state polling entirely unnecessary! E.g. a suite that generates an important file used by other suites can just post a message to report that, with the file location in the message. Consumer suites can all look at the message broker (no need to know details of the producer suite). The external trigger plugin can extract the file location from the message, and use it as the "external trigger ID" that is broadcast to downstream tasks in the same cycle point. |
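As a rough sketch of the consumer side of that idea: the plugin scans broker messages for the one it cares about and returns the file location as the trigger ID. Everything here is an assumption made for illustration. The `messages` iterable stands in for a real broker client, and the JSON layout with `topic` and `path` fields is hypothetical, not a format the producer suite is known to use.

```python
import json


def file_ready_trigger(messages, topic):
    """Return (satisfied, trigger_id) for the first message on `topic`.

    `messages` is a stand-in for whatever a real broker client would yield.
    Each item is assumed to be a JSON string with hypothetical "topic" and
    "path" fields; the path becomes the external trigger ID.
    """
    for raw in messages:
        msg = json.loads(raw)
        if msg.get("topic") == topic:
            return True, msg["path"]
    return False, None
```

Note that the consumer needs only the topic name, not the name or location of the producer suite.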
For completeness, we also need to consider sending messages. This is easier in that it could already be done within a task job or an event handler. However, it may be worth considering a similar plugin system for use as a built-in event handler, e.g. to post task event messages to Kafka. |
Does this supersede #1364? |
I think 1364 - defining external triggers and clock triggers in the graph - is a separate issue. |
Sorry, I meant to comment on #2333. |
No, again, this (like the existing mechanism for satisfying clock and external triggers) can be done initially without exposing them in the graph. That would be a nice follow-up though. |
@cylc/core - can I take it from the lack of objections that you agree it is a good idea? I'd like to push ahead with this soon if possible. |
Hi @hjoliver I have had a chat with @dpmatthews and we appear to be reading different things from the description of this issue! Do you think you can elaborate on this, especially how you expect users to configure their suites to use this new functionality? (Or perhaps we can discuss this in a video con session?) |
Not cylc/core, but this sounds awesome. I really like the idea of allowing brokers to eliminate the need for inter-suite polling (and it means that the suite controllers don't need to see each other - they only both need to be able to see the broker, which leads to network partitioning opportunities) |
@trwhitcomb - thanks for the feedback. That's exactly the idea. |
@matthewrmshin and @dpmatthews - I've probably obscured the suite configuration side of this by mentioning possible future extensions to allow exposure/configuration of external triggers via the graph. Initially it's a really minor change to current usage. Example: Here's a current suite with one task foo that has an external trigger:
Currently the external download system has to call Here's the corresponding example under this proposal:
Now, the suite daemon will load a plugin module according to the message prefix (called something like "ext_trig_file_exists" in this case) which knows how to determine if the file
[Note this example is for "file event" triggering (possibly the associated plugin could even use inotify). If the file generator was another suite, it could post a message that includes the file location to a message broker - as I commented above - so in that case my suite would not even need to know the location of the file (or equally, it would not have to use suite state polling, which requires knowing the location and name of the upstream suite)]
[Also, the plugin selector probably shouldn't be part of the message string. This might be better:
[Finally, happy to do a VC if this still isn't clear...] |
I'm happy in principle but the details aren't clear to me. It's fine for task proxies to be checking the external message queue each time through the loop. However, for checking file systems (or message queues) we presumably need something cleverer. You've already mentioned the need for a configurable polling interval (configured where?). Might there also be need for other configuration (e.g. authentication)? Also, do we need to ensure that actions from multiple task proxies are combined? |
@dpmatthews - I think I've already addressed your concerns, if you read all my comments above. I suggested these plugin modules should run asynchronously (probably via the current process pool), and on a configurable interval rather than every iteration of the scheduling loop. We already do this sort of thing for regular polling, so it's not really new or difficult. However, keep in mind that this is only for external triggering - there should not be much of this in any given suite (and we would not recommend using this for filesystem triggering instead of task inter-dependence within a suite!) - so I think a synchronous implementation/POC would be fine as a first step (with a warning that plugins must return very quickly, if we actually release that before going asynchronous). |
More thoughts:
|
That's up to the plugin writer. Cylc can just call the plugin function with the arguments provided in the trigger definition (e.g. kafka URL and message to look for). If the plugin function returns True, the trigger is satisfied.
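A minimal sketch of that contract, assuming the trigger definition has already been parsed into a callable and a list of arguments. The names `file_exists` and `check_trigger` are hypothetical and not part of any Cylc API.

```python
import os


def file_exists(path):
    """Hypothetical plugin function: satisfied once the file is present."""
    return os.path.exists(path)


def check_trigger(plugin, args):
    """Call the plugin with the arguments from the trigger definition.

    A True return value means the trigger is satisfied; anything falsy
    means it should be checked again later.
    """
    return bool(plugin(*args))
```

Authentication, connection handling and the like would live inside the plugin function, which is why they need no special support on the calling side.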
Yes, fair enough. I've done some work on this and am close to putting up an initial PR for review. |
(Note that the PR #2423 arising from this issue has a better configuration format for the new triggers - a literal representation of the function call rather than just a string as above - and it does in fact expose the new triggers via the graph). |
This idea arises from discussions with BoM staff, who want to achieve inter-suite triggering from existing SMS suites to new Cylc suites, and vice versa, via the Kafka message broker (this is quite nice - neither suite has to know explicitly about the other).
However: Kafka collects messages and makes them available but it can't act on them. We could make "middle-man" message consumer daemons that watch Kafka for relevant messages and then execute
cylc ext-trigger
to update the target suite. BUT that's obviously a kludge. Instead we could support a suite daemon plugin mechanism that allows external triggers to be satisfied by various means.
A message string prefix could select plugins by name, e.g.:
- kafka:file X ready from SMS suite Y - look for a message on Kafka
- file-exists:/path/to/file - file events - close #1469 (Support filesystem event triggering?)?
- file-get:source-url, target-url - check for and retrieve a file
- clock:PT1H - unify with clock triggers!
- the quick brown fox - no prefix, requires current CLI use
- poll-suite:nwp1 foo:succeed - replace suite-state polling tasks?!
This could work nicely with #2333 too.
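Prefix-based selection along these lines could be sketched as below. This is an assumption-laden illustration: `split_trigger` and `plugin_module_name` are hypothetical helpers, and the `ext_trig_` module naming simply follows the "something like ext_trig_file_exists" suggestion made elsewhere in this thread.

```python
def split_trigger(message, known_prefixes):
    """Split "prefix:rest" into (prefix, rest).

    Returns (None, message) when there is no known prefix, which
    corresponds to the existing CLI-satisfied external triggers.
    """
    prefix, sep, rest = message.partition(":")
    if sep and prefix in known_prefixes:
        return prefix, rest
    return None, message


def plugin_module_name(prefix):
    """Map a prefix such as "file-exists" to a hypothetical module name."""
    return "ext_trig_" + prefix.replace("-", "_")
```

Only the first colon is treated as the separator, so a plugin argument such as `nwp1 foo:succeed` keeps its own colons intact.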
These plugins would probably need to execute asynchronously in the process pool, with a call-back to satisfy the trigger in the task proxy (possibly just by appending the message to the existing external trigger queue). Although, as a first cut we could do it synchronously with the proviso that the plugins must be fast.
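The asynchronous variant with a call-back might look roughly like this. It is a sketch only: a `ThreadPoolExecutor` stands in for the suite daemon's process pool, a plain `Queue` stands in for the external trigger queue, and `submit_check` is a hypothetical name.

```python
from concurrent.futures import ThreadPoolExecutor
from queue import Queue


def submit_check(pool, plugin, args, message, ext_trigger_queue):
    """Run plugin(*args) in the pool; queue `message` if it returns True."""

    def on_done(future):
        # Call-back: runs once the plugin has returned. A plugin that raised
        # is treated as "not satisfied" here (an assumption of this sketch).
        if future.exception() is None and future.result():
            ext_trigger_queue.put(message)

    future = pool.submit(plugin, *args)
    future.add_done_callback(on_done)
    return future
```

The scheduling loop never blocks on the plugin; it just drains the queue on its next pass, exactly as it would for a message delivered by the CLI.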
@cylc/core - what do you think?