Post-SoD refactor: remove pflag, runahead pool #4103
Conversation
Force-pushed from d0f978e to e45eacc.
(Tests passing, after some rebase nightmares; I'll update the description and un-Draft this tomorrow/Wednesday.)
Force-pushed from 5781f3d to bb366ef.
Force-pushed from bb366ef to f234234.
(Rebased.)
Force-pushed from f234234 to a4e7b04.
Temporarily converted to Draft, pending a fix for post-rebase test failures.
Looks good 👍
Will test after the rebase, so I'll review my approval after a play.
(Feedback addressed, except for one thing to follow up on, and conflicts.)
(There are no conflicts on this branch right now so long as you don't select "Rebase and merge".)
9/29 files viewed. There seem to be a couple of unresolved points from earlier.
There are some quick changes that will reduce the cost of task iteration; I've dumped them in a PR:
```python
if released:
    LOG.debug(
        "Queue released:\n"
        + '\n'.join(f"* {r.identity}" for r in released)
    )
```
Was logging to "debug" as a bullet-point list, now logging each task individually to "info"?
This involves less iteration, since we're already iterating over the list in the preceding lines. Do you have a strong preference either way on debug vs info for this? (It's debatable, IMO.)
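For illustration, the two styles side by side; a minimal sketch assuming `released` is a list of task proxies with an `.identity` attribute (as in the quoted hunk above):

```python
import logging

LOG = logging.getLogger(__name__)


def log_released(released):
    """Sketch of the two logging styles under discussion."""
    # Style A: one bulk "debug" message as a bullet-point list
    # (the quoted hunk above):
    if released:
        LOG.debug(
            "Queue released:\n"
            + '\n'.join(f"* {r.identity}" for r in released)
        )
    # Style B: one "info" line per task, folded into a loop that
    # already walks the released tasks, so no extra iteration:
    for itask in released:
        LOG.info(f"Queue released: {itask.identity}")
```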
""" | ||
self.release_runahead_tasks() |
Unexpected side-effect?
Sorry, I don't follow?
This is here in case there are runahead-limited tasks that could be released (because they are now below the runahead limit) at the time of the stall check, in which case we're not stalled. Unfortunately I don't recall, at this point, why I thought that was necessary... I'll go back and see if I can work out why.
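To make that rationale concrete, a hypothetical sketch; only `release_runahead_tasks` comes from the hunk above, and the pool API and stall criterion are illustrative stand-ins, not cylc's actual logic:

```python
def is_stalled(pool) -> bool:
    """Sketch: a stall check that first tries to release runahead tasks.

    Assumes `release_runahead_tasks()` returns the tasks it released
    (illustrative; the real method and criteria may differ).
    """
    # If any runahead-limited tasks have dropped below the runahead
    # limit, release them now: a pool that can still release work is,
    # by definition, not stalled.
    if pool.release_runahead_tasks():
        return False
    # Stand-in for the real stall criteria.
    return not any(itask.state.is_queued for itask in pool.get_tasks())
```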
cylc/flow/scheduler.py (outdated):

```python
if x.state(TASK_STATUS_WAITING)
and not x.state.is_queued
```
These two filters are applied across three iterations of the task pool, so there is scope to reduce the amount of iteration here without making structural changes to the pool.
I haven't put this in my PR in case there's some reason not to (tasks changing state between iterations? A high memory watermark?).
No reason that I can think of. There shouldn't be state changes between each iteration. I've extensively refactored this bit along the lines you've suggested. See what you think.
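Roughly the shape of that consolidation; a sketch with made-up names (`pool_tasks`, `step_a/b/c`) rather than the actual scheduler code:

```python
from cylc.flow.task_state import TASK_STATUS_WAITING


def process_pool(pool_tasks, step_a, step_b, step_c):
    """Sketch: filter once instead of re-filtering on every pass."""
    # Before: the same filter was evaluated on three separate walks
    # of the task pool, one per step.
    # After: one filtered list, reused for all three steps. Safe
    # given that (per the discussion above) task states do not
    # change between the passes.
    unqueued_waiting = [
        x for x in pool_tasks
        if x.state(TASK_STATUS_WAITING) and not x.state.is_queued
    ]
    for x in unqueued_waiting:
        step_a(x)
        step_b(x)
        step_c(x)
```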
task iteration improvements
```diff
@@ -672,8 +666,7 @@ def _load_pool_from_point(self):
     parent_points = tdef.get_parent_points(point)
     if not parent_points or all(
             x < self.config.start_point for x in parent_points):
         initial_tasks.append(TaskID.get(tdef.name, point))
```
(Not used anywhere).
Force-pushed from 0f42c2b to 8281420.
Apologies, reviewers! In addition to my several embarrassing crimes against good iterative form (hopefully I would have found the worst of those myself if I had reviewed my own code properly!)... there were some
Hopefully good to go now 🤞 It seems we must not have tests for force-triggering with queues; punting that to #4234 as I've run out of time today.
cylc/flow/task_pool.py (outdated):

```python
for itask in (
    itask
    for cycle in self.main_pool.values()
    for itask in cycle.values()
    if itask.state.is_runahead
    if itask.state(
        TASK_STATUS_FAILED,
        TASK_STATUS_SUCCEEDED,
        TASK_STATUS_EXPIRED
    )
    or itask.is_manual_submit
):
```
This is not very readable, in my opinion. I don't know if you want to address it now or in a later follow-up.
I used to agree, although I've pretty much got my eye in for these constructs now. In any case, you can blame @oliver-sanders' side-PR for this one! Oliver, why exactly do you prefer this kind of thing over nested loops with conditional blocks in them? I doubt that comprehensions are any more efficient unless they only involve basic types.
(However, the variable `cycle` in this one is badly named; I'll fix that.)
> I used to agree

Ditto; however, once I got used to comprehensions I changed my mind (a short stint of Erlang is an effective conversion: there is no alternative!). I think they are more readable, and the filtering is much clearer than with the `if <cond>: continue` pattern (negative logic, etc.).
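For reference, the two equivalent shapes being debated; the generator expression mirrors the hunk above (with `cycle` renamed to `point_tasks` as promised), and the nested-loop version is reconstructed for comparison:

```python
from cylc.flow.task_state import (
    TASK_STATUS_EXPIRED, TASK_STATUS_FAILED, TASK_STATUS_SUCCEEDED)


def finished_runahead(main_pool):
    """Generator-expression form: stacked `if` clauses are ANDed."""
    return (
        itask
        for point_tasks in main_pool.values()
        for itask in point_tasks.values()
        if itask.state.is_runahead
        if itask.state(
            TASK_STATUS_FAILED, TASK_STATUS_SUCCEEDED, TASK_STATUS_EXPIRED
        ) or itask.is_manual_submit
    )


def finished_runahead_loops(main_pool):
    """Equivalent nested loops, filtering via `if <cond>: continue`."""
    for point_tasks in main_pool.values():
        for itask in point_tasks.values():
            if not itask.state.is_runahead:
                continue  # negative logic: skip non-runahead tasks
            if not (
                itask.state(
                    TASK_STATUS_FAILED,
                    TASK_STATUS_SUCCEEDED,
                    TASK_STATUS_EXPIRED,
                ) or itask.is_manual_submit
            ):
                continue  # skip unfinished, non-manually-submitted tasks
            yield itask
```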
Three approvals should do it!
This is a small SOD follow-up change (no specific Issue, but it has been discussed variously; e.g. it partially addresses #3874):

- Make runahead limiting consistent with other forms of limiting (see above).
- New `is_runahead` task indicator (c.f. `is_held`, `is_queued`); visible in `cylc tui` on this branch.
- A new "hidden pool" is used to hide partially-satisfied tasks (in order to get spawn-on-demand from spawn-on-outputs).
- Remove an annoying vestige of the old SOS dependency matching process: `task_events_mgr.pflag` ("processing required flag") was set whenever a task event occurred that required a new round of dependency matching; then, if the flag was set, we would iterate through the task pool and do dependency matching and other things.
- [UPDATE] Restores correct force-triggering behaviour in the presence of queues (broken since the spawn-on-demand implementation): triggering an unqueued task queues it; triggering a queued task causes it to run (sketched below).
`cylc tui` showing queue- and runahead-limited tasks: (screenshot)

Requirements check-list
- I have read CONTRIBUTING.md and added my name as a Code Contributor.
- (mention visible runahead tasks, somewhere...)