Double-reload if command in job script #5865

Open
hjoliver opened this issue Dec 6, 2023 · 4 comments
Labels
bug? Not sure if this is a bug or not

@hjoliver
Member

hjoliver commented Dec 6, 2023

Bug introduced by this change: #5592

[scheduling]
    [[graph]]
        R1 = foo
[runtime]
    [[foo]]
        script = cylc reload $CYLC_WORKFLOW_ID

This results in two calls to the reload mutation, instead of one, and repeats the task status messages as well.

$ cylc log bug | grep received
2023-12-06T14:26:19+13:00 DEBUG - [1/foo submitted job:01 flows:1] (received)started
2023-12-06T14:26:20+13:00 DEBUG - [1/foo running job:01 flows:1] (received)succeeded
2023-12-06T14:26:20+13:00 DEBUG - [1/foo succeeded job:01 flows:1] (received)started
2023-12-06T14:26:20+13:00 DEBUG - [1/foo succeeded job:01 flows:1] (received)succeeded

$ cylc log bug | grep reload
2023-12-06T14:26:19+13:00 INFO - [command] reload_workflow
2023-12-06T14:26:20+13:00 INFO - [command] reload_workflow
2023-12-06T14:26:20+13:00 INFO - [1/foo running job:01 flows:1] reloaded task definition
2023-12-06T14:26:20+13:00 WARNING - [1/foo running job:01 flows:1] active with pre-reload settings
2023-12-06T14:26:20+13:00 INFO - Command actioned: reload_workflow()
2023-12-06T14:26:20+13:00 INFO - [1/foo running job:01 flows:1] reloaded task definition
2023-12-06T14:26:20+13:00 WARNING - [1/foo running job:01 flows:1] active with pre-reload settings
2023-12-06T14:26:20+13:00 INFO - Command actioned: reload_workflow()
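A quick way to confirm the duplication from a saved log is to count the repeated entries. The following is a minimal sketch that assumes the log line layout shown above; the function names and the sample constants are illustrative and are not part of Cylc.

```python
import re
from collections import Counter

# Assumed line layout, taken from the log excerpt in this issue, e.g.
# ... DEBUG - [1/foo submitted job:01 flows:1] (received)started
RECEIVED = re.compile(r"\[(\S+) \S+ job:(\d+) [^\]]*\] \(received\)(\w+)")

# Sample lines copied from the issue (hypothetical constant names).
RECEIVED_LOG = """\
2023-12-06T14:26:19+13:00 DEBUG - [1/foo submitted job:01 flows:1] (received)started
2023-12-06T14:26:20+13:00 DEBUG - [1/foo running job:01 flows:1] (received)succeeded
2023-12-06T14:26:20+13:00 DEBUG - [1/foo succeeded job:01 flows:1] (received)started
2023-12-06T14:26:20+13:00 DEBUG - [1/foo succeeded job:01 flows:1] (received)succeeded
"""

RELOAD_LOG = """\
2023-12-06T14:26:19+13:00 INFO - [command] reload_workflow
2023-12-06T14:26:20+13:00 INFO - [command] reload_workflow
2023-12-06T14:26:20+13:00 INFO - Command actioned: reload_workflow()
"""


def repeated_messages(log_text):
    """Return {(task, job, message): count} for messages received more than once."""
    counts = Counter(match.groups() for match in RECEIVED.finditer(log_text))
    return {key: n for key, n in counts.items() if n > 1}


def count_reload_commands(log_text):
    """Count the log lines recording that a reload command was queued."""
    return sum(
        1
        for line in log_text.splitlines()
        if line.rstrip().endswith("[command] reload_workflow")
    )


if __name__ == "__main__":
    print(repeated_messages(RECEIVED_LOG))
    print(count_reload_commands(RELOAD_LOG))
```

Any key returned by `repeated_messages`, or a reload count above the number of reloads actually requested, points at the behaviour reported here.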
@hjoliver hjoliver added the bug Something is wrong :( label Dec 6, 2023
@hjoliver hjoliver added this to the cylc-8.2.4 milestone Dec 6, 2023
@hjoliver
Member Author

hjoliver commented Dec 6, 2023

This new block is the cause, but I haven't figured out exactly what's going on:

# flush out preparing tasks before attempting reload
self.reload_pending = 'waiting for pending tasks to submit'
while self.release_queued_tasks():
    # Run the subset of main-loop functionality required to push
    # preparing through the submission pipeline and keep the workflow
    # responsive (e.g. to the `cylc stop` command).

    # NOTE: this reload method was called by process_command_queue
    # which is called synchronously in the main loop so this call is
    # blocking to other main loop functions

    # subproc pool - for issuing/tracking remote-init commands
    self.proc_pool.process()
    # task messages - for tracking task status changes
    self.process_queued_task_messages()
    # command queue - keeps the scheduler responsive
    await self.process_command_queue()
    # allows the scheduler to shutdown --now
    await self.workflow_shutdown()
    # keep the data store up to date with what's going on
    await self.update_data_structure()
    self.update_data_store()
    # give commands time to complete
    sleep(1)  # give any remote-init's time to complete

@hjoliver
Member Author

hjoliver commented Dec 6, 2023

Ah, some progress. It's because the same job gets submitted twice! (Not double-handling of queued commands during reload as I initially thought).

It only happens if the task state hasn't updated yet, after job submission. So not a very serious bug after all, but still a bug.

Workaround:

script = """
    cylc__job__wait_cylc_message_started
    cylc reload $CYLC_WORKFLOW_ID
"""
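To picture how a stale task state could lead to a second submission, and why waiting for the state to update would avoid it, here is a toy model. It is not Cylc's code and the exact mechanism has not been confirmed in this thread: the class, the function names and the "resubmit anything still preparing" rule are all assumptions made for illustration, with only the state names borrowed from the discussion.

```python
from dataclasses import dataclass


@dataclass
class ToyTask:
    """Hypothetical stand-in for a task; not a Cylc class."""
    state: str = "preparing"
    submissions: int = 0


def submit_if_preparing(task):
    """Assumed rule: any task that still looks unsubmitted gets submitted."""
    if task.state == "preparing":
        task.submissions += 1


def run(state_updated_between_passes):
    """Run two submission passes and return how many jobs were submitted."""
    task = ToyTask()
    submit_if_preparing(task)  # first pass submits the job
    if state_updated_between_passes:
        # The state change is recorded before the next pass.
        task.state = "submitted"
    submit_if_preparing(task)  # second pass, e.g. triggered during a reload
    return task.submissions


if __name__ == "__main__":
    print(run(state_updated_between_passes=True))
    print(run(state_updated_between_passes=False))
```

In this model the job is submitted once when the state change lands before the second pass, and twice when it does not. The workaround presumably helps in a similar way, by giving the scheduler time to register the task's progress before the reload is requested.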

@oliver-sanders
Member

> It only happens if the task state hasn't updated yet

hmm, fishy. What state was the task in at this point? preparing or submitted?

@oliver-sanders oliver-sanders modified the milestones: cylc-8.2.4, 8.2.5 Jan 8, 2024
@MetRonnie MetRonnie self-assigned this Jan 23, 2024
@MetRonnie
Member

I can't reproduce this with your original example:

INFO - [1/foo preparing job:01 flows:1] => submitted
INFO - [command] reload_workflow
INFO - PAUSING the workflow now: Reloading workflow
INFO - Reloading the workflow definition.
INFO - LOADING workflow parameters
INFO - + workflow UUID = d5194cff-9f4d-4400-975c-3b36de1023e3
INFO - + UTC mode = False
INFO - + cycle point time zone = Z
INFO - + paused = True
INFO - Reloading task definitions.
INFO - [1/foo submitted job:01 flows:1] reloaded task definition
WARNING - [1/foo submitted job:01 flows:1] active with pre-reload settings
INFO - LOADING job data
INFO - Reload completed.
INFO - RESUMING the workflow now
INFO - Command actioned: reload_workflow()
INFO - [1/foo submitted job:01 flows:1] => running
INFO - [1/foo running job:01 flows:1] => succeeded
INFO - Workflow shutting down - AUTOMATIC

@MetRonnie MetRonnie removed their assignment Feb 15, 2024
@MetRonnie MetRonnie added bug? Not sure if this is a bug or not and removed bug Something is wrong :( labels Feb 15, 2024
@oliver-sanders oliver-sanders modified the milestones: cylc-8.2.5, 8.2.x, cylc-8.x Feb 20, 2024
3 participants