Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[tune/internal] Move signal handling into separate method #31004

Merged
merged 4 commits into from
Dec 12, 2022

Conversation

krfricke
Copy link
Contributor

Signed-off-by: Kai Fricke kai@anyscale.com

Why are these changes needed?

This PR improves the tune.py code structure:

  • Signal handling is set up in a separate method
  • Signal handling now uses a threading event instead of a global dict
  • The GPU check is moved to before a trial runner is instantiated (and thus restored) in order to fail faster if fail conditions are met
  • Trial executor creation is moved closer to where the trial executor is passed

Related issue number

Checks

  • I've signed off every commit(by using the -s flag, i.e., git commit -s) in this PR.
  • I've run scripts/format.sh to lint the changes in this PR.
  • I've included any doc changes needed for https://docs.ray.io/en/master/.
  • I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
  • Testing Strategy
    • Unit tests
    • Release tests
    • This PR is not tested :(

Signed-off-by: Kai Fricke <kai@anyscale.com>
@@ -127,6 +127,39 @@ def _report_progress(
reporter.report(trials, done, sched_debug_str, executor_debug_str)


def _setup_signal_catching() -> threading.Event:
original_handler = signal.getsignal(signal.SIGINT)
stop_event = threading.Event()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

more specific name? tune_loop_interrupted_event or something alike?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you're ok with it I'd like to keep this, as we set up general signal catching here and not just the event. Let me know if you disagree!
Meanwhile I'll kick of tune cloud release tests

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh I mean the event name can be more specific? The method name _setup_signal_catching sounds fine to me!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah I see! Yes, thanks, will updated!

Copy link
Contributor

@xwjiang2010 xwjiang2010 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice! Thanks!

Kai Fricke added 2 commits December 9, 2022 17:28
Signed-off-by: Kai Fricke <kai@anyscale.com>
@krfricke krfricke merged commit 392d137 into ray-project:master Dec 12, 2022
@krfricke krfricke deleted the tune/refactor-signal-handling branch December 12, 2022 20:43
WeichenXu123 pushed a commit to WeichenXu123/ray that referenced this pull request Dec 19, 2022
…t#31004)

This PR improves the tune.py code structure:

- Signal handling is set up in a separate method
- Signal handling now uses a threading event instead of a global dict
- The GPU check is moved to before a trial runner is instantiated (and thus restored) in order to fail faster if fail conditions are met
- Trial executor creation is moved closer to where the trial executor is passed

Signed-off-by: Kai Fricke <kai@anyscale.com>
Signed-off-by: Weichen Xu <weichen.xu@databricks.com>
AmeerHajAli pushed a commit that referenced this pull request Jan 12, 2023
This PR improves the tune.py code structure:

- Signal handling is set up in a separate method
- Signal handling now uses a threading event instead of a global dict
- The GPU check is moved to before a trial runner is instantiated (and thus restored) in order to fail faster if fail conditions are met
- Trial executor creation is moved closer to where the trial executor is passed

Signed-off-by: Kai Fricke <kai@anyscale.com>
tamohannes pushed a commit to ju2ez/ray that referenced this pull request Jan 25, 2023
…t#31004)

This PR improves the tune.py code structure:

- Signal handling is set up in a separate method
- Signal handling now uses a threading event instead of a global dict
- The GPU check is moved to before a trial runner is instantiated (and thus restored) in order to fail faster if fail conditions are met
- Trial executor creation is moved closer to where the trial executor is passed

Signed-off-by: Kai Fricke <kai@anyscale.com>
Signed-off-by: tmynn <hovhannes.tamoyan@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants