Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

(1) Fix /0 bug in double_ensemble, (2) remove _default_uri for R/expm, (3) support model size in pytorch models #314

Merged
merged 17 commits into from
Mar 11, 2021

Conversation

D-X-Y
Copy link
Contributor

@D-X-Y D-X-Y commented Mar 8, 2021

Four changes:

  • Fix /0 bug in double_ensemble
  • simplify the logic of default_uri for R/expm
  • support model size in pytorch models

Description

  • In double_ensemble models, it might potentially divide zero and raise an error.
  • Currently, _default_uri for R maybe inconsistent with that in C.
  • In pytorch-based contrib codes, there are some useless kwargs, such as verbose.
  • In pytorch-based contrib codes, add a function to count the number of parameters.

Motivation and Context

Make it more friendly for researchers.

How Has This Been Tested?

  • Pass the test by running: pytest qlib/tests/test_all_pipeline.py under upper directory of qlib.

Screenshots of Test Results (if appropriate):

I run pytest -W ignore::DeprecationWarning qlib/tests/test_all_pipeline.py, below is the screenshot:
image

Types of changes

  • Fix bugs
  • Add new feature

@Derek-Wds Derek-Wds added bug Something isn't working enhancement New feature or request labels Mar 8, 2021
@@ -33,6 +33,10 @@ def __repr__(self):
name=self.__class__.__name__, duri=self._default_uri, curi=self._current_uri
)

def reset_default_uri(self, uri: Text):
self._default_uri = uri
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is any reason that you want to add this method? From my perspective, default_uri should be initialized and fixed when users call qlib.init. It seems there is no need to have such method here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A simple case is:
after we run many different experiments (with different uri), we may want to compare their performance. To query the metrics of each of these experiments, we need to let R know their uri.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@D-X-Y We have uri in C, default_uri and current_uri here.
I think this will make users confusing.
Do you think combining uri in C and default_uri is a good idea?

return self._artifact_uri

@property
def root_uri(self):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is this method for? Can you give us more explanation about it?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is to return the path of a specific record. For example, you can call R.get_recorder().root_uri and it will return: ml-runs/2/32792162f0f44951b1285c0b2e489252.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please update the related docs.

@@ -282,6 +285,12 @@ def get_uri(self):
"""
return self.exp_manager.uri

def reset_default_uri(self, uri: Text):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as the one in expm.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeap, to provide an api to reset default uri. Use to query experiments with different uri.

qlib/log.py Outdated
@@ -112,6 +108,27 @@ def set_log_with_config(log_config: dict):
logging_config.dictConfig(log_config)


def set_log_basic_config(filename: Optional[Text] = None, format: Optional[Text] = None, level: Optional[int] = None):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Most of the loggers used in Qlib are initialized through the function get_module_logger. If you want to do some enhancement about setting the logger, you may want to make some modifications upon that function instead of creating a new one. This function seems to be a little bit redundant.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The motivation is that get_module_logger is called in different separate Python files, and it is hard for users to control the detailed behavior of get_module_logger function in each file.
This function provides a global control for logging, a case is that before each experiment runs, we can use this function to save the following logs into a file under the recorder directory (without caring about the kwargs of get_module_logger in each file).

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We think config logging using Config will make it easier for user to manage all the configurations.
https://github.com/microsoft/qlib/blob/main/qlib/config.py#L266

@D-X-Y
You can set it in the C instead of create a new function

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We think the original Python logging system is not user-friendly.
Qlib may use loguru in the future #7

import torch.nn as nn


def count_parameters(models_or_parameters, unit="mb"):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do you mean by mb?
If it is a storage size unit. the byte length of each number should be considered.
Otherwise, please give explanation of the b

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If it is a storage size unit, please use 2**10 instead of 1e3

qlib/log.py Outdated
@@ -112,6 +108,27 @@ def set_log_with_config(log_config: dict):
logging_config.dictConfig(log_config)


def set_log_basic_config(filename: Optional[Text] = None, format: Optional[Text] = None, level: Optional[int] = None):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We think config logging using Config will make it easier for user to manage all the configurations.
https://github.com/microsoft/qlib/blob/main/qlib/config.py#L266

@D-X-Y
You can set it in the C instead of create a new function

qlib/log.py Outdated
@@ -112,6 +108,27 @@ def set_log_with_config(log_config: dict):
logging_config.dictConfig(log_config)


def set_log_basic_config(filename: Optional[Text] = None, format: Optional[Text] = None, level: Optional[int] = None):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We think the original Python logging system is not user-friendly.
Qlib may use loguru in the future #7

@@ -33,6 +33,10 @@ def __repr__(self):
name=self.__class__.__name__, duri=self._default_uri, curi=self._current_uri
)

def reset_default_uri(self, uri: Text):
self._default_uri = uri
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@D-X-Y We have uri in C, default_uri and current_uri here.
I think this will make users confusing.
Do you think combining uri in C and default_uri is a good idea?

def root_uri(self):
start_str = "file:"
if self.artifact_uri is not None:
xpath = self.artifact_uri.strip(start_str)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lstrip will be more accurate

return self._artifact_uri

@property
def root_uri(self):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the funtion aims to get the path in a filesystem, it should raise error(e.g. NotImplementError) when the uri is not file system.

Besides, the root_uri is not a good name for user.

@D-X-Y
Copy link
Contributor Author

D-X-Y commented Mar 9, 2021

@you-n-g do you have any suggested name for root_uri?

@D-X-Y D-X-Y changed the title (1) Fix /0 bug in double_ensemble, (2) add reset_default_uri func for R/expm, (3) support model size in pytorch models, (4) support re-direct log into files. (1) Fix /0 bug in double_ensemble, (2) remove _default_uri for R/expm, (3) support model size in pytorch models Mar 11, 2021
@D-X-Y D-X-Y requested review from you-n-g and Derek-Wds March 11, 2021 03:10
@D-X-Y
Copy link
Contributor Author

D-X-Y commented Mar 11, 2021

Following our offline discussion, I have:

  • temporarily removed set_log functions
  • simplify the logic of default uri in expm
  • rename root_uri to get_local_dir with more robust logic and doc
  • update count_parameters and add doc

Please let me know if you have any other concerns about the current version.

@D-X-Y D-X-Y requested a review from Derek-Wds March 11, 2021 04:54
Copy link
Contributor

@Derek-Wds Derek-Wds left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Thanks~

@D-X-Y
Copy link
Contributor Author

D-X-Y commented Mar 11, 2021

@Derek-Wds Thanks for your prompt response.

@you-n-g you-n-g merged commit e8beaa5 into microsoft:main Mar 11, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants