Confusing error message when trying to override `run_id` #549

astrojuanlu · 2024-05-07T13:33:16Z

Description

I read in the docs that

Consequently it will never be possible to save in a specific mlflow run_id if you launch a pipeline with the kedro run command because the MlflowHook creates a new run before each pipeline run.

...but I'm stubborn so I still wanted to try.

For that, I specified my run_id both in the configuration and in the dataset:

# catalog.yml
regressor:
  type: kedro_mlflow.io.models.MlflowModelTrackingDataset
  flavor: mlflow.sklearn
  run_id: ${runtime_params:mlflow_run_id}

# mlflow.yml
tracking:
  run:
    id: ${runtime_params:mlflow_run_id}

And when launching kedro run --params mlflow_run_id=xxx I get the following error:

kedro.io.core.DatasetError: 'run_id' cannot be specified if there is an mlflow active run.Run_id mismatch: 
 - 'run_id'=6ffe573f95124a65a04dcece1589fbdc
 - active_run id=6ffe573f95124a65a04dcece1589fbdc

Expected Result

If the active_run and the specified run_id are the same, proceed.

Actual Result

(The error above)

Your Environment

kedro and kedro-mlflow version used (pip show kedro and pip show kedro-mlflow): 0.19.5 and 0.12.2 respectively
Python version used (python -V): 3.11.5
Operating system and version: macOS Sonoma 14.4.1

Does the bug also happen with the last version on master?

Yes, it still does.

The text was updated successfully, but these errors were encountered:

Galileo-Galilei · 2024-05-15T21:47:37Z

This is definitely a feature, not a bug 😉 even if I acknowledge the error message could be better (but I did not imagine someone would ever do that!)

Essentially what happens is that :

you've run a first time your pipeline which has created a "regressor" in a new mlflow run 6ffe573f95124a65a04dcece1589fbdc
your run it a second time but instead of creating a new run, you log everything in the mlflow_run_id run because of

# mlflow.yml
tracking:
  run:
    id: ${runtime_params:mlflow_run_id}

This is very unlikely that you want to modify an existing run, but say you want to complete it

you're now trying to save a new regressor in the same run, which will corrupt the old run and definietly override the first regressor you trained. This is exactly the opposite of what you ar trying to achieve in experiment tracking: if you want to train another regressor, just create another mlflow run so you can compare the two and select the best one

However, I assume you want to load the regressor you've just trained. This is perfectly possible:

# catalog.yml
regressor:
  type: kedro_mlflow.io.models.MlflowModelTrackingDataset
  flavor: mlflow.sklearn
  run_id: ${runtime_params:mlflow_run_id}

# note that you don't specify anything in mlflow yml regarding the run id, you will either create a new mlflow run (or don't create a run if you added it to the blacklist)

and now you can run:

kedro run --pipeline inference --params mlflow_run_id=xxx (note that this is not the training pipeline, but a one that consumes the model)

(On a side note, it is likely a bad idea to deal specify the run id manually on each run, just specify a registered_model_name and let the model registry handle it for you)

Does it make sense?

astrojuanlu · 2024-05-21T14:30:21Z

My understanding from the beginning was that this is a feature 😄 The point was more around the error message.

In any case, thanks a bunch for the explanation 🙏🏼 it makes a ton of sense.

I leave it to you whether you can to keep this issue open for improving the error message, or just close it as won't fix.

(On a side note, it is likely a bad idea to deal specify the run id manually on each run, just specify a registered_model_name and let the model registry handle it for you)

You mean https://kedro-mlflow.readthedocs.io/en/stable/source/07_python_objects/01_DataSets.html#mlflowmodelregistrydataset ?

…n id is already active (#549)

astrojuanlu mentioned this issue May 7, 2024

Describe integration with MLflow kedro-org/kedro#3856

Merged

7 tasks

Galileo-Galilei added documentation Improvements or additions to documentation need-design-decision Several ways of implementation are possible and one must be chosen labels May 15, 2024

Galileo-Galilei mentioned this issue Aug 26, 2024

🔊 Use MlflowHook internal logger and log the active run (#567) #584

Merged

6 tasks

Galileo-Galilei added a commit that referenced this issue Aug 26, 2024

🔊 Improve error message for MlflowModelTrackingDataset.save when a ru…

cefe506

…n id is already active (#549)

Galileo-Galilei added a commit that referenced this issue Aug 26, 2024

🔊 Improve error message for MlflowModelTrackingDataset.save when a ru…

80ee273

…n id is already active (#549)

Galileo-Galilei closed this as completed in #584 Aug 26, 2024

Galileo-Galilei added a commit that referenced this issue Aug 26, 2024

🔊 Improve error message for MlflowModelTrackingDataset.save when a ru…

61864f0

…n id is already active (#549)

Galileo-Galilei moved this to ✅ Done in kedro-mlflow roadmap Oct 29, 2024

Galileo-Galilei added this to kedro-mlflow roadmap Oct 29, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Confusing error message when trying to override `run_id` #549

Confusing error message when trying to override `run_id` #549

astrojuanlu commented May 7, 2024

Galileo-Galilei commented May 15, 2024

astrojuanlu commented May 21, 2024

Confusing error message when trying to override run_id #549

Confusing error message when trying to override run_id #549

Comments

astrojuanlu commented May 7, 2024

Description

Expected Result

Actual Result

Your Environment

Does the bug also happen with the last version on master?

Galileo-Galilei commented May 15, 2024

astrojuanlu commented May 21, 2024

Confusing error message when trying to override `run_id` #549

Confusing error message when trying to override `run_id` #549