Optimize the implementation of uri & Fix async log bug #1364
@ChiahungTai You are welcome to review this PR. |
logger = get_module_logger("workflow")


class ExpManager:
    """
    This is the `ExpManager` class for managing experiments. The API is designed similar to mlflow.
    (The link: https://mlflow.org/docs/latest/python_api/mlflow.html)
    This is the `ExpManager` class for managing experiments. The API is designed similar to mlflow.
@ChiahungTai You can start from these docs.
"we can have multiple Experiments with different uri" => IIUC, does this wording imply that ExpManager is designed so that the user uses a different uri for each experiment?
From my understanding of MLflow, the uri is like a storage backend (SQL, file, ...). ExpManager is a single entry point for looking up and manipulating the experiments under that uri.
So the user can switch to a different uri to retrieve experiments on a different topic.
Yes.
ExpManager is a singleton, but we can change its uri to get experiments from a different uri.
I added some comments just now.
Please check whether they are helpful.
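To make the behaviour described in this reply concrete, here is a minimal sketch of a single manager object whose effective uri can be switched per experiment. `SketchExpManager`, `start_exp`, `end_exp`, and the example uris are hypothetical names chosen for illustration; only `default_uri` and `_current_uri` appear in the PR, and this is not Qlib's actual implementation.

```python
from typing import Optional


class SketchExpManager:
    """Hypothetical stand-in for ExpManager; not Qlib's actual code."""

    def __init__(self, default_uri: str):
        self.default_uri = default_uri
        # Assumption: a per-experiment uri that is set only while an experiment is active.
        self._current_uri: Optional[str] = None

    @property
    def uri(self) -> str:
        # Fall back to the default uri when no experiment has pinned one.
        return self._current_uri or self.default_uri

    def start_exp(self, uri: Optional[str] = None) -> None:
        self._current_uri = uri

    def end_exp(self) -> None:
        # Release the per-experiment uri once the experiment ends.
        self._current_uri = None


manager = SketchExpManager(default_uri="file:./mlruns")
manager.start_exp(uri="file:./other_runs")
uri_during = manager.uri  # the experiment-specific uri
manager.end_exp()
uri_after = manager.uri   # back to the default uri
```

The same manager instance serves both uris, which is the sense in which a singleton can still reach experiments stored in different backends.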
Your new comment is clearer than before.
The test in test_all_pipeline.py fails.
It looks like the exception is the same as in test case #1363.
@@ -370,11 +369,11 @@ def uri_context(self, uri: Text):
            the temporal uri
        """
        prev_uri = self.exp_manager.default_uri
        C.exp_manager["kwargs"]["uri"] = uri
        self.exp_manager.default_uri = uri
I think setting default_uri is not enough. self.exp_manager does not change the client's uri if you only change the default uri.
It is fixed now.
The _client will not be cached. No further maintenance is required for it.
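One way to read "the _client will not be cached" is sketched below: the client is rebuilt on every access, so a temporary change of `default_uri` is always reflected. `FakeClient` and `SketchManager` are hypothetical placeholders (the real code uses `mlflow.tracking.MlflowClient`), and the `try`/`finally` restore is an assumption about how such a context manager would usually be written, not a copy of the PR.

```python
from contextlib import contextmanager


class FakeClient:
    """Hypothetical placeholder for mlflow.tracking.MlflowClient."""

    def __init__(self, tracking_uri: str):
        self.tracking_uri = tracking_uri


class SketchManager:
    """Hypothetical manager; only default_uri mirrors a name from the PR."""

    def __init__(self, default_uri: str):
        self.default_uri = default_uri

    @property
    def client(self) -> FakeClient:
        # Built on every access, so it can never lag behind default_uri.
        return FakeClient(tracking_uri=self.default_uri)


@contextmanager
def uri_context(manager: SketchManager, uri: str):
    prev_uri = manager.default_uri
    manager.default_uri = uri
    try:
        yield
    finally:
        # Restore the previous uri even if the body raises.
        manager.default_uri = prev_uri


manager = SketchManager("file:./mlruns")
with uri_context(manager, "file:./tmp_runs"):
    inside = manager.client.tracking_uri
outside = manager.client.tracking_uri
```

The trade-off in this sketch is a new client object per access in exchange for not having to keep a cached client in sync with the uri.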
qlib/workflow/expm.py
Outdated
        if self.active_experiment is not None:
            self.active_experiment.end(recorder_status)
            self.active_experiment = None
        # When an experiment end, we will release the current uri.
        self._current_uri = None
        self._set_client_uri()
I bumped into an exception here: no argument is passed.
Fixed now.
qlib/workflow/expm.py
Outdated
@@ -322,18 +333,17 @@ def client(self):
            self._client = mlflow.tracking.MlflowClient(tracking_uri=self.uri)
        return self._client
I think you might need to sync the returned self._client with the updated default_uri.
Otherwise you have to set the uri in every function that accesses the client.
Fixed now.
qlib/workflow/recorder.py
Outdated
@@ -329,7 +329,7 @@ def get_local_dir(self):
    def start_run(self):
        # set the tracking uri
        mlflow.set_tracking_uri(self.uri)
        # start the run
        # start the RuntimeError
What does "# start the RuntimeError" mean?
Fixed now
        _ = mlflow.tracking.MlflowClient(tracking_uri=str(self.TMP_PATH))
        end = time.time()
        elasped = end - start
        self.assertGreater(1e-2, elasped)  # it can be done in less than 10ms
Should this be assertGreater or assertLess?
I have changed the direction, and it is more readable than before.
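For reference, one way to express the "less than 10ms" check with the measured value first is sketched below using `unittest`'s `assertLess`. `CheapClient` is a hypothetical stand-in so the example runs without mlflow installed; the actual test in the PR constructs `mlflow.tracking.MlflowClient`, and the exact form the author settled on is not shown in this conversation.

```python
import time
import unittest


class CheapClient:
    """Hypothetical stand-in for a client whose construction should be fast."""

    def __init__(self, tracking_uri: str):
        self.tracking_uri = tracking_uri


class TestClientCreation(unittest.TestCase):
    def test_creation_is_fast(self):
        start = time.perf_counter()
        _ = CheapClient(tracking_uri="file:./tmp")
        elapsed = time.perf_counter() - start
        # assertLess(a, b) checks a < b, so the measured value comes first.
        self.assertLess(elapsed, 1e-2)


# Run the single test case programmatically and keep the result object.
result = unittest.TextTestRunner().run(
    unittest.defaultTestLoader.loadTestsFromTestCase(TestClientCreation)
)
```

Putting the measured value on the left reads as "elapsed is less than the threshold", which avoids the reversed-argument ambiguity raised above.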
@ChiahungTai I checked out your test, but I didn't get what you want to assert in it. The exception seems reasonable.
The test code passes after assigning the experiment_name and recorder_name.
Your new patch works fine in my own project.
The patch LGTM. You can merge it after fixing the pylint error.
* Optimize the implementation of uri
* remove redundant func
* Set the right order of _set_client_uri
* Update qlib/workflow/expm.py
* Simplify client & add test. Add docs; Fix async bug
* Fix comments & pylint
* Improve README
Description
Motivation and Context
How Has This Been Tested?
pytest qlib/tests/test_all_pipeline.py under the upper directory of qlib.
Screenshots of Test Results (if appropriate):
Types of changes