[nas] fix issue introduced by the trial recovery feature #5109
Conversation
pass
previous_max_param_id = self.recover_parameter_id(data)
self.parameters_count = previous_max_param_id
self._advisor_initialized = True
Is it possible that `handle_add_customized_trial` is never called and `_advisor_initialized` is never set to true?
Good point! `handle_add_customized_trial` is called when an experiment is resumed (even if no trial needs to be recovered), but it is not called when an experiment is created. This is a bug I introduced...
I moved this flag to `handle_request_trial_jobs`, which means that `send_trial` is blocked until a trial has been requested. And "request trial" will always be sent by nnimanager.
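A minimal sketch of why moving the flag helps, under stated assumptions: `SketchAdvisor`, `simulate`, and the shape of `data` are hypothetical stand-ins for illustration, not the NNI implementation. It only models the claim above that `handle_add_customized_trial` runs on resume while `handle_request_trial_jobs` runs for both new and resumed experiments.

```python
class SketchAdvisor:
    """Hypothetical stand-in for the advisor; not the NNI implementation."""

    def __init__(self):
        self._advisor_initialized = False
        self.parameters_count = 0

    @property
    def initialized(self) -> bool:
        return self._advisor_initialized

    def recover_parameter_id(self, data) -> int:
        # Assumed behavior: largest previously used parameter id, or 0 if none.
        return max((trial["parameter_id"] for trial in data), default=0)

    def handle_add_customized_trial(self, data) -> None:
        # Only called when an experiment is resumed, so the flag must not live here.
        self.parameters_count = self.recover_parameter_id(data)

    def handle_request_trial_jobs(self, num_trials: int) -> None:
        # Called for both new and resumed experiments, so the flag is set here.
        self._advisor_initialized = True


def simulate(resumed: bool) -> SketchAdvisor:
    """Replay the assumed command order for a new or a resumed experiment."""
    advisor = SketchAdvisor()
    if resumed:
        advisor.handle_add_customized_trial(
            [{"parameter_id": 3}, {"parameter_id": 7}]
        )
    advisor.handle_request_trial_jobs(1)
    return advisor
```

With the flag in `handle_request_trial_jobs`, both `simulate(resumed=False)` and `simulate(resumed=True)` end up initialized; with the flag in `handle_add_customized_trial`, the first case would never be.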
@@ -41,7 +45,11 @@ def send_trial(parameters: dict, placement_constraint=None) -> int:
Send a new trial. Executed on tuner end.
Return a ID that is the unique identifier for this trial.
"""
return get_advisor().send_trial(parameters, placement_constraint)
advisor = get_advisor()
while not advisor.initialized:
Suggest putting this into `RetiariiAdvisor.send_trial`.
updated
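For readers following the thread, the suggested shape might look roughly like the sketch below. It is an illustration only: `BlockingAdvisor`, the polling interval, the integer id counter, and `demo` are assumptions, not the code that was merged. The wait loop lives inside the advisor's `send_trial`, so the module-level `send_trial` stays a thin wrapper.

```python
import threading
import time


class BlockingAdvisor:
    """Hypothetical advisor; only the waiting logic is of interest here."""

    def __init__(self, poll_interval: float = 0.01):
        self._advisor_initialized = False
        self._poll_interval = poll_interval  # assumed value, not from the PR
        self._next_id = 0

    @property
    def initialized(self) -> bool:
        return self._advisor_initialized

    def handle_request_trial_jobs(self, num_trials: int) -> None:
        self._advisor_initialized = True

    def send_trial(self, parameters: dict, placement_constraint=None) -> int:
        # Wait inside the method until a trial has been requested.
        while not self.initialized:
            time.sleep(self._poll_interval)
        self._next_id += 1
        return self._next_id


_advisor = BlockingAdvisor()


def get_advisor() -> BlockingAdvisor:
    return _advisor


def send_trial(parameters: dict, placement_constraint=None) -> int:
    # Thin wrapper: all waiting happens in the advisor.
    return get_advisor().send_trial(parameters, placement_constraint)


def demo() -> int:
    # Simulate the manager requesting a trial shortly after send_trial starts waiting.
    timer = threading.Timer(0.05, get_advisor().handle_request_trial_jobs, args=(1,))
    timer.start()
    trial_id = send_trial({"lr": 0.1})
    timer.join()
    return trial_id
```

Calling `demo()` blocks briefly in `send_trial` and returns once the simulated request arrives; later calls return immediately because the flag stays set.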
/azp run integration test - local - linux
No pipelines are associated with this pull request.
/azp run integration test - remote - linux to linux
Azure Pipelines could not run because the pipeline triggers exclude this branch/path.
/azp run integration test - local - linux
No pipelines are associated with this pull request.
fix bugs in #4931