Can't publish MlflowMetricsHistoryDataset to Remote tracking server #582

Closed
cariveroco opened this issue Aug 21, 2024 · 5 comments · Fixed by #591
Labels
bug Something isn't working

Comments

@cariveroco

Description

Kedro pipeline run can't publish objects of type kedro_mlflow.io.metrics.MlflowMetricsHistoryDataset to a remote Mlflow tracking server.

Context

A Kedro pipeline that successfully publishes a kedro_mlflow.io.metrics.MlflowMetricsHistoryDataset to a local Mlflow tracking server throws an error when it tries to publish to a remote server. Previously, with kedro-mlflow v0.11.10, the same pipeline successfully published the metrics to both local and remote servers while they were still configured as type kedro_mlflow.io.metrics.MlflowMetricsDataSet.

Based on the errors thrown, this may be related to this bug, where the suspected cause is that the get_all_metrics method is implemented for FileStore (local tracking server) but not for RestStore (remote tracking server).

Steps to Reproduce

  1. Create a kedro project with a pipeline that produces a metrics object that is configured to be of type kedro_mlflow.io.metrics.MlflowMetricsHistoryDataset.
  2. Point the mlflow.yml file to a remote Mlflow tracking server.
  3. Run the pipeline.

Expected Result

The pipeline execution is completed successfully, and objects configured to be of type kedro_mlflow.io.metrics.MlflowMetricsHistoryDataset are successfully published to the remote Mlflow tracking server.

Actual Result

The pipeline execution completes successfully, but the run then raises an error and fails to publish the kedro_mlflow.io.metrics.MlflowMetricsHistoryDataset to the remote Mlflow tracking server. The error does not occur when running the same code (in exactly the same environment) with a local tracking server.


[08/20/24 08:36:43] INFO     Completed 42 out of 42 tasks                                 sequential_runner.py:90
                    INFO     Pipeline execution completed successfully.                             runner.py:119
Traceback (most recent call last):
  File "/layers/dap-buildpacks_pip-install/site-packages/virtual-env/lib/python3.10/site-packages/kedro/io/core.py", line 291, in exists
    return self._exists()
  File "/layers/dap-buildpacks_pip-install/site-packages/virtual-env/lib/python3.10/site-packages/kedro_mlflow/io/metrics/mlflow_metrics_history_dataset.py", line 122, in _exists
    all_metrics = client._tracking_client.store.get_all_metrics(
AttributeError: 'RestStore' object has no attribute 'get_all_metrics'
 
The above exception was the direct cause of the following exception:
 
Traceback (most recent call last):
  File "/layers/dap-buildpacks_pip-install/site-packages/virtual-env/bin/kedro", line 8, in <module>
    sys.exit(main())
  File "/layers/dap-buildpacks_pip-install/site-packages/virtual-env/lib/python3.10/site-packages/kedro/framework/cli/cli.py", line 233, in main
    cli_collection()
  File "/layers/dap-buildpacks_pip-install/site-packages/virtual-env/lib/python3.10/site-packages/click/core.py", line 1157, in __call__
    return self.main(*args, **kwargs)
  File "/layers/dap-buildpacks_pip-install/site-packages/virtual-env/lib/python3.10/site-packages/kedro/framework/cli/cli.py", line 130, in main
    super().main(
  File "/layers/dap-buildpacks_pip-install/site-packages/virtual-env/lib/python3.10/site-packages/click/core.py", line 1078, in main
    rv = self.invoke(ctx)
  File "/layers/dap-buildpacks_pip-install/site-packages/virtual-env/lib/python3.10/site-packages/click/core.py", line 1688, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/layers/dap-buildpacks_pip-install/site-packages/virtual-env/lib/python3.10/site-packages/click/core.py", line 1434, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/layers/dap-buildpacks_pip-install/site-packages/virtual-env/lib/python3.10/site-packages/click/core.py", line 783, in invoke
    return __callback(*args, **kwargs)
  File "/layers/dap-buildpacks_pip-install/site-packages/virtual-env/lib/python3.10/site-packages/kedro/framework/cli/project.py", line 225, in run
    session.run(
  File "/layers/dap-buildpacks_pip-install/site-packages/virtual-env/lib/python3.10/site-packages/kedro/framework/session/session.py", line 408, in run
    hook_manager.hook.after_pipeline_run(
  File "/layers/dap-buildpacks_pip-install/site-packages/virtual-env/lib/python3.10/site-packages/pluggy/_hooks.py", line 513, in __call__
    return self._hookexec(self.name, self._hookimpls.copy(), kwargs, firstresult)
  File "/layers/dap-buildpacks_pip-install/site-packages/virtual-env/lib/python3.10/site-packages/pluggy/_manager.py", line 120, in _hookexec
    return self._inner_hookexec(hook_name, methods, kwargs, firstresult)
  File "/layers/dap-buildpacks_pip-install/site-packages/virtual-env/lib/python3.10/site-packages/pluggy/_manager.py", line 480, in traced_hookexec
    return outcome.get_result()
  File "/layers/dap-buildpacks_pip-install/site-packages/virtual-env/lib/python3.10/site-packages/pluggy/_result.py", line 100, in get_result
    raise exc.with_traceback(exc.__traceback__)
  File "/layers/dap-buildpacks_pip-install/site-packages/virtual-env/lib/python3.10/site-packages/pluggy/_result.py", line 62, in from_call
    result = func()
  File "/layers/dap-buildpacks_pip-install/site-packages/virtual-env/lib/python3.10/site-packages/pluggy/_manager.py", line 477, in <lambda>
    lambda: oldcall(hook_name, hook_impls, caller_kwargs, firstresult)
  File "/layers/dap-buildpacks_pip-install/site-packages/virtual-env/lib/python3.10/site-packages/pluggy/_callers.py", line 139, in _multicall
    raise exception.with_traceback(exception.__traceback__)
  File "/layers/dap-buildpacks_pip-install/site-packages/virtual-env/lib/python3.10/site-packages/pluggy/_callers.py", line 103, in _multicall
    res = hook_impl.function(*args)
  File "/layers/dap-buildpacks_pip-install/site-packages/virtual-env/lib/python3.10/site-packages/kedro_mlflow/framework/hooks/mlflow_hook.py", line 365, in after_pipeline_run
    catalog.exists(dataset)
  File "/layers/dap-buildpacks_pip-install/site-packages/virtual-env/lib/python3.10/site-packages/kedro/io/data_catalog.py", line 575, in exists
    return dataset.exists()
  File "/layers/dap-buildpacks_pip-install/site-packages/virtual-env/lib/python3.10/site-packages/kedro/io/core.py", line 296, in exists
    raise DatasetError(message) from exc
kedro.io.core.DatasetError: Failed during exists check for data set MlflowMetricsHistoryDataset(prefix=).
'RestStore' object has no attribute 'get_all_metrics'

Your Environment

  • kedro and kedro-mlflow version used (pip show kedro and pip show kedro-mlflow): kedro 0.19.6 and kedro-mlflow 0.12.2 (mlflow 2.12.1)
  • Python version used (python -V): 3.10.14
  • Operating system and version: Linux-4.18.0-553.8.1.el8_10.x86_64-x86_64-with-glibc2.35

Does the bug also happen with the last version on master?

The bug did not occur with the following earlier setup:

  • kedro-mlflow 0.11.10 (using kedro_mlflow.io.metrics.MlflowMetricsDataSet)
  • kedro 0.18.4
  • mlflow 2.6.0
  • python 3.10.12
@cariveroco changed the title from "Can't publish MlflowMetricsHistoryDataset objects to Remote Mlflow tracking server" to "Can't publish MlflowMetricsHistoryDataset to Remote tracking server" on Aug 21, 2024
@Galileo-Galilei added the bug (Something isn't working) label on Aug 21, 2024
@Galileo-Galilei
Owner

Indeed, thanks for raising this issue. I guess the right way is to use get_metric_history, which seems to be implemented in all stores: https://github.com/search?q=repo%3Amlflow%2Fmlflow+get_metric_history&type=code
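
For readers following along, here is a minimal sketch of what an existence check built on get_metric_history could look like. It is not the code from the eventual fix. The helper name metric_exists and the SupportsMetricHistory protocol are hypothetical, and the sketch assumes the store returns an empty history for a key that was never logged.

```python
from typing import Any, Protocol, Sequence


class SupportsMetricHistory(Protocol):
    """Hypothetical protocol: any client exposing get_metric_history(run_id, key),
    such as mlflow's MlflowClient."""

    def get_metric_history(self, run_id: str, key: str) -> Sequence[Any]: ...


def metric_exists(client: SupportsMetricHistory, run_id: str, key: str) -> bool:
    """Return True if at least one value was logged for `key` in run `run_id`."""
    # Only the public client method is used, so nothing here depends on a
    # store-specific helper such as get_all_metrics.
    # Assumption: an unknown key yields an empty history rather than an error.
    history = client.get_metric_history(run_id, key)
    return len(history) > 0
```

With a real server, the client would presumably be an mlflow.tracking.MlflowClient created with the remote tracking_uri, and run_id would be the ID of the active run.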

@mck-star-yar

I just hit the same issue. Pinning mlflow to an earlier version doesn't work because of compatibility problems with Python 3.10.

@Galileo-Galilei
Owner

I'll try to take a look at it next week. This is a bug and it should be corrected quickly. Thank you for your patience.

@Galileo-Galilei
Owner

Hi, @cariveroco @mck-star-yar can you test pip install git+https://github.com/Galileo-Galilei/kedro-mlflow.git@582-metrics_history-dataset-to-server and tell me if it fixes the issue?

@cariveroco
Author

Hi @Galileo-Galilei, it's working now on my end. Thank you very much!

Galileo-Galilei added a commit that referenced this issue Sep 24, 2024
…ible with all Mlflow stores (#582) (#591)

* 🐛 Replace get_all_metrics in MLflowMetricsHistoryDataset to be compatible with all Mlflow stores (#582)

* changelog

* bump changelog
Projects
Status: ✅ Done
3 participants