Optimize the implementation of uri & Fix async log bug #1364

you-n-g · 2022-11-17T13:47:19Z

Description

Motivation and Context

How Has This Been Tested?

Pass the test by running: pytest qlib/tests/test_all_pipeline.py from the parent directory of qlib.
If you are adding a new feature, test on your own test scripts.

Screenshots of Test Results (if appropriate):

Pipeline test:
Your own tests:

Types of changes

Fix bugs
Add new feature
Update documentation

you-n-g · 2022-11-17T14:02:33Z

@ChiahungTai You are welcome to review this PR.

you-n-g · 2022-11-17T14:03:17Z

qlib/workflow/expm.py

 logger = get_module_logger("workflow")


 class ExpManager:
    """
-    This is the `ExpManager` class for managing experiments. The API is designed similar to mlflow.
-    (The link: https://mlflow.org/docs/latest/python_api/mlflow.html)
+        This is the `ExpManager` class for managing experiments. The API is designed similar to mlflow.


@ChiahungTai You can start from these docs.

"we can have multiple Experiments with different uri" => IIUC, does this wording imply that ExpManager is designed so that users use a different uri for each experiment?

From my understanding of MLflow, the uri is like a storage backend (SQL, file, ...). The ExpManager is a single entry point to look up and manipulate the experiments under that uri.

So the user can switch to a different uri to retrieve experiments on a different topic.

Yes.
ExpManager is a singleton, but we can change its uri to get experiments from a different uri.
I added some comments just now.
Please check whether they are helpful.

Your new comment is more clear than before.
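
The "single entry point, switchable uri" idea discussed above can be sketched as follows. This is a toy illustration, not qlib's ExpManager: `ToyExpManager`, `FAKE_BACKENDS`, and the uris are hypothetical names invented for the example, and a plain dict stands in for the storage backends.

```python
from typing import Dict, List

# Hypothetical stand-in for storage backends: one list of experiment names per uri.
FAKE_BACKENDS: Dict[str, List[str]] = {
    "file:///tmp/topic_a": ["exp_a1", "exp_a2"],
    "file:///tmp/topic_b": ["exp_b1"],
}


class ToyExpManager:
    """Toy single-entry manager; NOT the real qlib ExpManager."""

    def __init__(self, default_uri: str):
        self.default_uri = default_uri

    def list_experiments(self) -> List[str]:
        # Every lookup goes through whichever uri is current.
        return list(FAKE_BACKENDS.get(self.default_uri, []))


manager = ToyExpManager("file:///tmp/topic_a")
topic_a = manager.list_experiments()
manager.default_uri = "file:///tmp/topic_b"  # same manager, different backend
topic_b = manager.list_experiments()
```

The manager object stays the same throughout; only the uri it points at changes, which is what selects the set of experiments.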

ChiahungTai · 2022-11-17T15:24:37Z

The test in test_all_pipeline.py fails.
I have added a test case for you in #1363.

ChiahungTai · 2022-11-17T22:39:01Z

Looks like the exception is the same as the test case #1363.

        except MlflowException as e:
            raise ValueError(
                "No valid experiment has been found, please make sure the input experiment name is correct."
            ) from e

ChiahungTai · 2022-11-17T23:06:20Z

qlib/workflow/__init__.py

@@ -370,11 +369,11 @@ def uri_context(self, uri: Text):
            the temporal uri
        """
        prev_uri = self.exp_manager.default_uri
-        C.exp_manager["kwargs"]["uri"] = uri
+        self.exp_manager.default_uri = uri


I think setting default_uri is not enough. If you only change the default uri, self.exp_manager does not change the client's uri.

It is fixed now.
The _client will not be cached. No further maintenance is required for it.
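
A minimal sketch of the behaviour described in this thread, assuming a client that is rebuilt on every access and a context manager that swaps and restores `default_uri`. `ToyClient` and `ToyExpManager` are hypothetical stand-ins; in qlib the `uri_context` method lives in qlib/workflow/__init__.py rather than on the manager, so this is a simplification.

```python
from contextlib import contextmanager


class ToyClient:
    """Hypothetical stand-in for a tracking client bound to one uri."""

    def __init__(self, tracking_uri: str):
        self.tracking_uri = tracking_uri


class ToyExpManager:
    """Toy manager; NOT the real qlib ExpManager."""

    def __init__(self, default_uri: str):
        self.default_uri = default_uri

    @property
    def client(self) -> ToyClient:
        # Built on every access, so it always reflects the current default_uri.
        return ToyClient(tracking_uri=self.default_uri)

    @contextmanager
    def uri_context(self, uri: str):
        prev_uri = self.default_uri
        self.default_uri = uri
        try:
            yield
        finally:
            # Restore the previous uri even if the body raises.
            self.default_uri = prev_uri


manager = ToyExpManager("file:///tmp/main")
with manager.uri_context("file:///tmp/scratch"):
    inside = manager.client.tracking_uri
outside = manager.client.tracking_uri
```

Because nothing is cached, there is no second copy of the uri that could fall out of sync with `default_uri`.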

ChiahungTai · 2022-11-17T23:07:15Z

qlib/workflow/expm.py

        if self.active_experiment is not None:
            self.active_experiment.end(recorder_status)
            self.active_experiment = None
-        # When an experiment end, we will release the current uri.
-        self._current_uri = None
+        self._set_client_uri()


I bumped into an exception here: no argument is passed.

ChiahungTai · 2022-11-17T23:09:21Z

qlib/workflow/expm.py

@@ -322,18 +333,17 @@ def client(self):
            self._client = mlflow.tracking.MlflowClient(tracking_uri=self.uri)
        return self._client


I think you might need to sync the returned self._client with the updated default_uri.
Otherwise you have to set the uri in every function that accesses the client.
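
The first option raised here, keeping a cached client in sync with the uri, might look like the sketch below. It is only an illustration of that suggestion, not the fix that was merged; `ToyClient` and `CachingManager` are hypothetical names.

```python
from typing import Optional


class ToyClient:
    """Hypothetical stand-in for a tracking client bound to one uri."""

    def __init__(self, tracking_uri: str):
        self.tracking_uri = tracking_uri


class CachingManager:
    """Toy manager that caches its client but rebuilds it when the uri changes."""

    def __init__(self, default_uri: str):
        self.default_uri = default_uri
        self._client: Optional[ToyClient] = None

    @property
    def client(self) -> ToyClient:
        # Rebuild the cached client whenever it no longer matches the current uri.
        if self._client is None or self._client.tracking_uri != self.default_uri:
            self._client = ToyClient(tracking_uri=self.default_uri)
        return self._client


manager = CachingManager("file:///tmp/main")
first = manager.client
same = manager.client  # uri unchanged, cached instance is reused
manager.default_uri = "file:///tmp/other"
rebuilt = manager.client  # uri changed, a new client is created
```

Checking the uri in one place (the property) avoids having to set it in every function that touches the client.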

ChiahungTai · 2022-11-18T02:51:19Z

qlib/workflow/recorder.py

@@ -329,7 +329,7 @@ def get_local_dir(self):
    def start_run(self):
        # set the tracking uri
        mlflow.set_tracking_uri(self.uri)
-        # start the run
+        # start the RuntimeError


What does "start the RuntimeError" mean?

ChiahungTai · 2022-11-18T02:53:46Z

tests/dependency_tests/test_mlflow.py

+            _ = mlflow.tracking.MlflowClient(tracking_uri=str(self.TMP_PATH))
+        end = time.time()
+        elasped = end - start
+        self.assertGreater(1e-2, elasped)  # it can be done in less than 10ms


Should this be assertGreater or assertLess?

I have changed the direction. And it is more readable than before
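
For illustration, a timing assertion written in the more readable direction could look like this. It is a sketch under stated assumptions: `ToyClient` is a hypothetical stand-in for the real client, and the threshold is deliberately much looser than the 10ms in the diff so the example does not become flaky on a slow machine.

```python
import time
import unittest


class ToyClient:
    """Hypothetical stand-in for a client whose construction should be cheap."""

    def __init__(self, tracking_uri: str):
        self.tracking_uri = tracking_uri


class ClientCreationSpeedTest(unittest.TestCase):
    # Assumption for this sketch: a loose bound, not the 10ms used in the PR.
    THRESHOLD_SECONDS = 0.5

    def test_creation_is_fast(self):
        start = time.perf_counter()
        for _ in range(10):
            _ = ToyClient(tracking_uri="file:///tmp/toy")
        elapsed = time.perf_counter() - start
        # Reads naturally as "elapsed is less than the threshold".
        self.assertLess(elapsed, self.THRESHOLD_SECONDS)


suite = unittest.defaultTestLoader.loadTestsFromTestCase(ClientCreationSpeedTest)
result = unittest.TestResult()
suite.run(result)
```

`assertGreater(threshold, elapsed)` checks the same condition, but `assertLess(elapsed, threshold)` puts the measured value first, which matches how the requirement is usually stated.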

ChiahungTai · 2022-11-18T02:55:01Z

qlib/workflow/expm.py

 logger = get_module_logger("workflow")


 class ExpManager:
    """
-    This is the `ExpManager` class for managing experiments. The API is designed similar to mlflow.
-    (The link: https://mlflow.org/docs/latest/python_api/mlflow.html)
+        This is the `ExpManager` class for managing experiments. The API is designed similar to mlflow.


Your new comment is more clear than before.

you-n-g · 2022-11-18T03:44:16Z

@ChiahungTai
Hi, I have addressed all the comments and am waiting for the CI.

I checked out your test.

I didn't get what you want to assert in the test.

The exception looks reasonable.
When we want to get a recorder, Qlib tries to get an experiment first. The experiment name or id is not given,
so the default exp name is used.
There is no experiment that is active or named with the default exp name,
so a "No valid experiment has been found" exception is raised.
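
The lookup order described above can be sketched roughly as follows. This is a toy model, not qlib's code: `DEFAULT_EXP_NAME`, `BackendLookupError`, `lookup`, and `get_experiment` are hypothetical names, and a dict stands in for the experiment store.

```python
from typing import Dict, Optional

DEFAULT_EXP_NAME = "Experiment"  # hypothetical default name for this sketch


class BackendLookupError(Exception):
    """Hypothetical stand-in for the backend's own exception type."""


def lookup(store: Dict[str, str], name: str) -> str:
    """Fetch an experiment by name, raising the backend error if it is missing."""
    try:
        return store[name]
    except KeyError as e:
        raise BackendLookupError(name) from e


def get_experiment(
    store: Dict[str, str],
    active: Optional[str] = None,
    experiment_name: Optional[str] = None,
) -> str:
    if experiment_name is None:
        if active is not None:
            # No name given, but an experiment is active: use it.
            return active
        # No name and nothing active: fall back to the default name.
        experiment_name = DEFAULT_EXP_NAME
    try:
        return lookup(store, experiment_name)
    except BackendLookupError as e:
        raise ValueError(
            "No valid experiment has been found, please make sure the input experiment name is correct."
        ) from e


store = {"my_exp": "exp-object"}
found = get_experiment(store, experiment_name="my_exp")
try:
    get_experiment(store)  # no name, nothing active, no default experiment
    raised = False
except ValueError:
    raised = True
```

With no name, no active experiment, and no experiment stored under the default name, the final lookup fails and the `ValueError` is the expected outcome.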

ChiahungTai · 2022-11-18T04:00:38Z

The test code passes after assigning the experiment_name and recorder_name.
But in my own project, I do provide the experiment_name and recorder_name,
and that code is too long to turn into a simple test case.

    with R.uri_context(uri=uri_path):
        recorder = R.get_recorder(experiment_name=EXP_NAME, recorder_name="original")

I am working on another version of cached_pipeline in my own project. I will test the new commit locally first.
I will also upload a new test if I can reduce it to a simple test case.

ChiahungTai · 2022-11-18T04:10:16Z

Your new patch works fine in my own project.

ChiahungTai · 2022-11-18T04:17:13Z

The patch LGTM. You can merge it after fixing the pylint error.

* Optimize the implementation of uri
* remove redundant func
* Set the right order of _set_client_uri
* Update qlib/workflow/expm.py
* Simplify client & add test.Add docs; Fix async bug
* Fix comments & pylint
* Improve README

you-n-g added 2 commits November 17, 2022 21:46

Optimize the implementation of uri

4c7d3d7

remove redundant func

ce7934a

you-n-g force-pushed the r_optm branch from b6c119d to ce7934a Compare November 17, 2022 13:53

Set the right order of _set_client_uri

8cde41b

you-n-g mentioned this pull request Nov 17, 2022

Sometimes the set of C.exp_manager["kwargs"]["uri"] will not take effect #1360

Closed

5 tasks

you-n-g commented Nov 17, 2022

Update qlib/workflow/expm.py

09d9203

ChiahungTai reviewed Nov 17, 2022

you-n-g added 2 commits November 18, 2022 10:30

Simplify client & add test.Add docs; Fix async bug

ecea697

Merge remote-tracking branch 'me/r_optm' into r_optm

508cdd7

ChiahungTai reviewed Nov 18, 2022

Fix comments & pylint

5d8fbb1

you-n-g requested a review from ChiahungTai November 18, 2022 03:44

you-n-g mentioned this pull request Nov 18, 2022

Test case. #1363

Closed

5 tasks

you-n-g changed the title ~~Optimize the implementation of uri~~ Optimize the implementation of uri & Fix async bug Nov 18, 2022

you-n-g changed the title ~~Optimize the implementation of uri & Fix async bug~~ Optimize the implementation of uri & Fix async log bug Nov 18, 2022

Improve README

d57bde8

you-n-g merged commit 994f893 into microsoft:main Nov 18, 2022

you-n-g deleted the r_optm branch November 18, 2022 05:11

you-n-g added the enhancement New feature or request label Dec 9, 2022
