This repository has been archived by the owner on Sep 18, 2024. It is now read-only.

Fix the bug that tuner init error does not cause ERROR state #4953

Merged: 2 commits into v2.8 from fix-tuner-init on Jun 22, 2022

Conversation

liuzhe-lz
Contributor

Description

The tuner class is initialized before the dispatcher connects to the NNI manager. If initialization fails, the dispatcher never connects to the NNI manager.
Therefore, the NNI manager cannot detect such a failure with PING. This PR adds a timeout on waiting for the connection, so that this kind of error is detected.

This is a quick fix. In the future, setupTuner() should monitor the process's status.
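
To illustrate the idea of a connection timeout, here is a minimal Python sketch. It is not the code from this PR: the names `wait_for_dispatcher` and `simulate`, the status strings, and the timeout values are all hypothetical, and the dispatcher is simulated by an asyncio task rather than a real process.

```python
import asyncio


async def wait_for_dispatcher(connected: asyncio.Event, timeout: float) -> str:
    """Hypothetical manager-side wait: report ERROR if no connection arrives in time."""
    try:
        # Without a timeout this wait would hang forever when the tuner fails to initialize.
        await asyncio.wait_for(connected.wait(), timeout)
    except asyncio.TimeoutError:
        return "ERROR"
    return "RUNNING"


async def simulate(tuner_init_ok: bool, timeout: float = 0.2) -> str:
    """Simulate a dispatcher that initializes the tuner before connecting."""
    connected = asyncio.Event()

    async def dispatcher() -> None:
        # The tuner is initialized first; if that fails, the dispatcher
        # exits without ever connecting to the manager.
        if not tuner_init_ok:
            return
        connected.set()  # stands in for "dispatcher connected to the manager"

    task = asyncio.create_task(dispatcher())
    status = await wait_for_dispatcher(connected, timeout)
    await task
    return status
```

With this sketch, `asyncio.run(simulate(True))` returns `"RUNNING"`, while `asyncio.run(simulate(False))` returns `"ERROR"` once the timeout expires. A timeout only detects the failure after the full waiting period, which is consistent with the description calling this a quick fix.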

Test Options

  • fast test
  • full test - HPO
  • full test - NAS
  • full test - compression

Checklist

  • test case
  • doc

How to test

@liuzhe-lz liuzhe-lz changed the base branch from master to v2.8 June 21, 2022 21:13
@QuanluZhang QuanluZhang merged commit 8b40044 into v2.8 Jun 22, 2022
@liuzhe-lz liuzhe-lz deleted the fix-tuner-init branch May 15, 2023 09:49
4 participants