Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Validation error with start-up and cycling tasks in same conditional #434

Closed
dpmatthews opened this issue Apr 21, 2013 · 11 comments
Closed
Assignees
Milestone

Comments

@dpmatthews
Copy link
Contributor

Consider the following suite:

[scheduling]
    initial cycle time = 2013010100
    [[special tasks]]
        start-up = prep
        cold-start = cold
    [[dependencies]]
        [[[0]]]
            graph = "prep & (bar[T-24] | cold) => bar"

Validation fails with:
(start-up or async) and (cycling) tasks in same conditional

However, this equivalent graph is fine:

            graph = """prep => bar
                       bar[T-24] | cold => bar"""

So is this, even though it mixes start-up & cycling tasks:

            graph = "prep & foo => bar"

Mixing start-up & cycling tasks in complex expressions involving OR (e.g. prep | foo => bar) probably shouldn't be allowed (or we should at least define what this means). However the validation check should be made consistent and the error message made clearer.

@hjoliver
Copy link
Member

This is related to the fact that cylc literally ignores start-up tasks after the first cycle, and it is a little more complex than it seems:

Cylc does not treat prep & foo => bar as a "conditional trigger" as such because it can be reduced to separate simple triggers prep => bar and foo => bar the first of which can just be ignored after the first cycle (i.e. we do not give that trigger to bar[T] at all beyond T=0). Any expression containing ORs cannot be reduced to just simple triggers though, and you can't just "ignore" part of a complex conditional expression after the first cycle ... and triggers like prep | foo => bar are probably nonsense in any case (if prep does not disappear/get-ignored after the first cycle, then all future instances of bar would just trigger immediately off prep). That's why I decided to disallow any use of start-up and cycling tasks in the same conditional trigger. Your example, however, fails because cylc does not know it can be picked apart to get a simple start-up trigger prep => bar and a genuine conditional one bar[T-24] | cold => bar (in general, with ORs and parentheses involved, we can't simply split the expression on '&'s to figure this out).

Currently cylc does not need to understand (parse) conditional trigger expressions - instead it transforms them into Python expressions and has the Python interpreter evaluate the result (an idea that I must say I was quite pleased with at the time :-) Changing this in order to distinguish bad use of start-up tasks in conditionals sounds potentially very difficult.

I think the solution might be to stop ignoring dependence on start-up tasks after the first cycle. i.e.

  • (a) prep & foo => bar will mean every bar[T] depends on foo[T] AND prep[0]
  • (b) prep | foo => bar will mean every bar[T] depends on foo[T] OR prep[0]

Case (a) is fine - the first instance of foo won't run if prep fails, so it does not matter that subsequent instances won't either. If anyone is silly enough to use case (b) - currently disallowed - they may not get what they really want, but at least they will get exactly what the expressions say. Unlike the status quo there will no longer be anything special about start-up, except that start-up tasks are non-cycling.

@dpmatthews - what do you think? Once the run-db is used to satisfy dependence (so that we don't need to keep finished task proxies in the suite) there will be no special implementation required to achieve this; in the short term I can just replace dependence on start-up tasks, in conditional triggers, with "True" after the first cycle (if they succeed).

@hjoliver
Copy link
Member

Regarding my suggestion above, I must say I don't really like the idea of every instance of bar[T] depending explicitly on prep[0]. Even if it works perfectly well it's not exactly aesthetically pleasing. The original idea that initial start-up tasks are ignored after start-up seems more sensible to me. But as explained above that is not compatible with use of start-up tasks in conditional triggers, and it may be (?) prohibitively difficult to distinguish valid cases such as your example.

Another option we could consider is a traditional pre-cycling suite section, with its dependencies disconnected from the cycling graph, that has to complete entirely before the cycling section starts. This isn't a a very "cylcy" way of doing things, and could lead to less efficient scheduling at start-up, but perhaps it would be simpler and more intuitive for users. Easy to validate: no start-up tasks would be allowed in a cycling graph section, and vice versa (and there would be no need for a special start-up task type anymore - pre-cycling tasks would just be identified by their graph section).

@hjoliver
Copy link
Member

Ah, I think I've got it. The graph is supposed to define trigger validity by section heading (e.g. triggers defined under [[[0,12]]] are valid for any 00z or 12z cycle). But we currently violate this simple rule by shoe-horning one-off start-up triggers into cycling graph sections. Consequently (1) a [special tasks] list is required to identify start-up tasks because they don't match the section heading; (2) the same graph string must be valid for both the start-up and general cycling case, and (as your example shows) if the general case is conditional (with OR) we need to be able to untangle the start-up trigger (in order to ignore it after start-up) and that is (very?) difficult to do; and (3) this makes graph parsing internals more complicated because I can't rely on the graph section heading to determine the trigger type. Here's the proper way to do it:

Start-up triggers should only be defined in an initial graph section, and cycling triggers only in cycling graph sections. For a particular task at T, all triggers defined in all graph sections valid at T will be applied with AND. (just as separate triggers from multiple cycling graph sections valid for the same T are already applied!).

This will work equally well for the proposed shutdown tasks (and cycl-range spinup tasks too):

[[dependencies]]
      [[[initial]]]
             graph = "prep => bar"
      [[[HoursOfTheDay(0)]]]
             graph = "bar[T-24] | cold => bar"
      [[[final]]]
             graph = "bar => baz"

Interpretation:

  • if T is the initial cycle time graph = "prep & ( bar[T-24] | cold ) => bar"
  • if T is the final cycle time graph = "bar[T-24] | cold => bar => baz
  • otherwise graph = bar[T-24] | cold => bar

Note that:

  • the graph section alone defines the validity of every trigger in the suite
  • start-up tasks can only appear under [initial], so no need for [special tasks]start-up anymore.
  • cycling sections contain only cycling tasks (and triggers for every cycle).
  • cycling tasks can appear in [initial] because it is the trigger, not the task, that the heading applies to.

It seems to me this is completely consistent (unlike the status quo), simpler, it covers all the bases, and it avoids the problem at the top of this page because the start-up trigger cannot be combined with the general cycling one.

@dpmatthews - what do you think?

@hjoliver
Copy link
Member

Note that the similarity with #404 (comment) is superficial - there I used the same new initial and final section headings but only to label task types - I had not moved the one-off start-up triggers out of the cycling sections.

@hjoliver
Copy link
Member

Taking this proposal to its logical conclusion, we could move the initial and final cycle time config items into the graph like this:

[[[start:2010080800]]]         # initial cycle
     graph = "prep => foo"
[[[0,12]]]                           # general cycling case
     graph = "foo => bar"
[[[2010081012]]]                # mid-stream one-off!
     graph = "bar => baz" 
[[[stop:2010090800]]         # final cycle
     graph = "bar => cleanup"

@hjoliver
Copy link
Member

As just discussed with @dpmatthews via email, this proposal leaves "cold-start" as the only start-up-related special task type that needs to be declared. And (in my opinion, but possibly still subject to discussion!) that is still necessary because:

  1. a cold-start task is by definition an alternative means of starting a normally warm-cycled model, and although the usual use-case for this is suite start-up, it can in principle be done in any cycle (if a problem disrupts warm cycling) and the graph construct that reflects this cleanly and intuitively is this:
[[[0,12]]]
      graph = "cold_foo[T]  |  foo[T-12] => foo[T]"

i.e. in any cycle T, foo could in principle trigger off either cold_foo[T] or foo[T-12].

  1. furthermore, because cold-start tasks have to appear in cycling sections like this, and not initial sections (because the trigger is valid in all cycles) they have to be declared so that cylc knows (a) they are not cycling tasks, and (b) which tasks not to run if the suite is warm-started (in which case cylc inserts them in the 'succeeded' state to stand in for the assumed previous cycle - which is arguably just an implementation detail).

@hjoliver
Copy link
Member

Regarding cold-start tasks, here's another possibility:

[[[start:2010080800]]]
      graph = "cold_foo => foo"
[[[0,12]]]
      graph = "foo[T-12] => foo"

and have cylc automatically assume that any dependence on T < Tstart is satisfied.

This would probably be considered easier to understand by most users, but it doesn't allow cold-starting foo mid-stream by simply inserting cold_foo[Tn] if foo[Tn-1] failed and could not be re-run successfully. The operator would have to insert cold_foo, wait for it to finish, and then manually trigger foo. So in my opinion this is an inferior way of handling cold-starts. But I guess could we could support it as an option ("ignore dependence on tasks prior to Tstart = True")??

@hjoliver
Copy link
Member

Also! if we allow an optional start T in cycling sections we could easily support on-sequence cold-starts (which were looking very difficult - ref @trwhitcomb and #149):

cold-start tasks = cold_foo
[[[start:2010080800]]]
    graph = "prep => cold_foo"
[[[HoursOfTheDay(0,12) from 2010080812]]]
    graph = "cold_foo[T-12] | foo[T-12] => foo => bar

@hjoliver
Copy link
Member

Which suggests, even for suites with no special behavior in the first or final cycles there's no need for distinct "initial/final cycle time = ..." config items:

[[[ HoursOfTheDay(0,12) from 2010080800 until 2010090800 ]]]
   graph = "foo => bar"

@hjoliver
Copy link
Member

Actually, the "start:" and "stop:" flags could also be dropped, leaving just triggers for specific single cycles, if all cycling sections set an explicit range:

[[[2010080800]]]                   # initial one-off
    graph = "prep => foo"
[[[HoursOfTheDay(0,12) from 2010080800 until 2010090800]]]
    graph = "foo => bar"
[[[2010080112]]]                   # internal one-off
    graph = "bar => surprise"
[[[2010090800]]]                   # final one-off
    graph = "bar => cleanup"

@benfitzpatrick
Copy link
Contributor

Start-up tasks are being superseded by #119.

@benfitzpatrick benfitzpatrick modified the milestones: cylc-6, later Jun 24, 2014
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants