Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cylc play #4040

Merged
merged 22 commits into from
Feb 15, 2021
Merged

cylc play #4040

merged 22 commits into from
Feb 15, 2021

Conversation

MetRonnie
Copy link
Member

@MetRonnie MetRonnie commented Jan 20, 2021

These changes partially address #3897

  • Replace cylc run and cylc restart with cylc play.

  • Remove the START_POINT optional positional argument.

  • Tidy up the aliases for cycle points options, so only these are used:

    --initial-cycle-point, --icp
    --final-cycle-point, --fcp
    --start-cycle-point, --stopcp
    --stop-cycle-point, --startcp
    

Note: Running cylc play on a workflow that has previously been run now has the effect of cylc restart. In order to do a cold start, you will have to either run cylc clean or remove the private & public databases before running cylc play (a function for the latter has been added to test_header for the functional tests)

Requirements check-list

  • I have read CONTRIBUTING.md and added my name as a Code Contributor.
  • Contains logically grouped changes (else tidy your branch by rebase).
  • Does not contain off-topic changes (use other PRs for other changes).
  • Appropriate tests included?
  • Appropriate change log entry included.
  • (master branch) Will open a documentation PR after cylc pause is implemented
  • No dependency changes.

@MetRonnie MetRonnie added this to the cylc-8.0b0 milestone Jan 20, 2021
@MetRonnie MetRonnie self-assigned this Jan 20, 2021
@MetRonnie MetRonnie force-pushed the cylc-play branch 2 times, most recently from a4e9d7f to 999769d Compare January 21, 2021 12:34
@MetRonnie MetRonnie marked this pull request as draft January 21, 2021 13:39
@MetRonnie MetRonnie force-pushed the cylc-play branch 2 times, most recently from 9b79233 to 838edcb Compare January 21, 2021 21:24
@MetRonnie MetRonnie force-pushed the cylc-play branch 2 times, most recently from 4a8517c to bcaf267 Compare January 29, 2021 11:32
@MetRonnie
Copy link
Member Author

https://github.com/cylc/cylc-flow/pull/4040/files#annotation_899603301

cylc/flow/scheduler.py#L1358
Added line #L1358 was not covered by tests

This is executed as part of tests/functional/restart/34-auto-restart-basic.t, but doesn't run on Actions as it requires shared filesystem

@MetRonnie MetRonnie marked this pull request as ready for review February 1, 2021 16:17
@MetRonnie MetRonnie marked this pull request as draft February 1, 2021 17:30
@MetRonnie MetRonnie mentioned this pull request Feb 1, 2021
7 tasks
@hjoliver
Copy link
Member

hjoliver commented Feb 1, 2021

Brilliant to have this ready to go. Should it be un-Drafted now @MetRonnie ?

@MetRonnie
Copy link
Member Author

Still some tests that I should run/add to this

@MetRonnie MetRonnie mentioned this pull request Feb 2, 2021
@MetRonnie MetRonnie marked this pull request as ready for review February 2, 2021 18:48
@MetRonnie
Copy link
Member Author

@hjoliver Should be ready now

Also remove the part on --icp=ignore, --fcp=ignore etc. Will need to 
revisit this later
Copy link
Member

@oliver-sanders oliver-sanders left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me, should save the remaining initial/final/start/stop questions for #4062 to avoid blocking this PR.

A few small comments...

Comment on lines 59 to 60
# Install $PWD/flow.cylc as $(basename $PWD) and start it.
$ cylc play
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this example should bite the dust now:

Suggested change
# Install $PWD/flow.cylc as $(basename $PWD) and start it.
$ cylc play

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So I should make the [REG] arg non-optional?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes please, it no longer makes sense since the change from registration to installation.

This feature will now only work if you are in the run directory for a suite which has either been written there or cylc installed and will fail in unhelpful ways otherwise.

@MetRonnie
Copy link
Member Author

Tests re-run timed out. but I ran them locally and they all pass

@hjoliver
Copy link
Member

hjoliver commented Feb 9, 2021

The same test batch has suddenly started timing out on #4035, after rebase. There might be a test broken by some recent change on master (if this branch is up to date too?)

@MetRonnie
Copy link
Member Author

MetRonnie commented Feb 10, 2021

I think it's just going to need a longer timeout on GH Actions, that's all. It's running the whole test battery in that job, including the flakyfunctional, and possibly the poll tests take longer than tcp.

Having said that, I've looked through the timed out test runs, and it seems there's more often than not a strange delay at the last couple of tests. Not sure why

[17:33:28] tests/f/events/11-cycle-task-event-job-logs-retrieve.t ..... ok   314944 ms ( 0.02 usr  0.03 sys + 161.12 cusr 18.52 csys = 179.69 CPU)
[17:39:09] tests/f/remote/03-polled-task-started.t ..... ok  1632815 ms ( 0.30 usr  1.48 sys + 876.69 cusr 105.05 csys = 983.52 CPU)

By the way, the time in ms seems to be 10 times too high? No, it really is taking that long!

@oliver-sanders
Copy link
Member

Yeah that's not a good sign, if a such a simple test takes so long then something is likely going badly wrong. Sadly there isn't enough evidence to tell what.

I've been stabilising a few tests and added a default workflow timeout in a branch I'm working on at the moment which might potentially help but I'm not sure.

I've seen to outrageously long test runs on polling systems though not as bad as that!

Comment on lines 67 to 68
A "restart" continues the workflow from the earliest cycle point that was
incomplete when shutdown occurred. Tasks recorded as submitted or running are
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
A "restart" continues the workflow from the earliest cycle point that was
incomplete when shutdown occurred. Tasks recorded as submitted or running are
A "restart" continues on from the most recent recorded state of the workflow.
Tasks recorded as submitted or running are

I know what you mean, but "continues from the [blah] cycle point ..." could be confusing. We've traditionally described warm starts that way. A restart often involves tasks at multiple cycle points.

Copy link
Member

@hjoliver hjoliver left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, tests as working 🎉

Still one @oliver-sanders comment to respond to? (on the SuiteServiceFileError exception)

@MetRonnie
Copy link
Member Author

Both should be sorted now

@@ -172,7 +172,6 @@ class Scheduler:
contact_data: Optional[dict] = None

# run options
is_restart: bool = False
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The idea of specifying everything up here is so that we can see all of the Scheduler attributes in one place. Can always default to None until set later.

Copy link
Member

@oliver-sanders oliver-sanders left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another Cylc8 brick in the wall, thanks @MetRonnie!

(Small comments above about an unused code block and removed dataclass attr definition).

Ooh, just spotted, a reference to cylc run has snuck in probs with rebase in cylc.flow.scripts.install#24.

And fix references to cylc run added by other PRs
@MetRonnie
Copy link
Member Author

Have amended last commit to address those

Copy link
Member

@hjoliver hjoliver left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @MetRonnie, good job 🎉

@hjoliver hjoliver merged commit 9d65b9a into cylc:master Feb 15, 2021
@MetRonnie MetRonnie deleted the cylc-play branch February 16, 2021 11:17
@MetRonnie MetRonnie linked an issue Feb 17, 2021 that may be closed by this pull request
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

cli: play|pause|stop
3 participants