
[Test/Sample test] Fix the imparity between experiment name setting of notebook samples and python samples #2362

Conversation


@numerology numerology commented Oct 10, 2019

Fix #2148



@Ark-kun (Contributor)

Ark-kun commented Oct 10, 2019

Maybe it would be easier to just fix #2292
I'll need a bit more time to debug it.

@@ -129,7 +129,9 @@ def _compile(self):
         self._run_pipeline = raw_args['run_pipeline']

         if self._run_pipeline:
-            nb_params['experiment_name'] = self._test_name + '-test'
+            if not raw_args['notebook_params']['experiment_name']:

Could you move all the YAML validation code into one function, including the yamale validation?

@@ -129,7 +129,9 @@ def _compile(self):
         self._run_pipeline = raw_args['run_pipeline']

Could you add some comments in the schema YAML about run_pipeline:

  1. For Python samples: setting it to true means submitting the pipeline.
  2. For notebook samples: setting it to true tells the test infra that the sample will submit a pipeline run, and the test infra will inject the experiment_name.
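The requested schema comments might look like the following sketch. The field name and the yamale type syntax here are assumptions for illustration, not the project's actual schema file:

```yaml
# run_pipeline:
#   - For Python samples: true means the test infra submits the pipeline run.
#   - For notebook samples: true tells the test infra that the notebook itself
#     will submit a pipeline run, so the infra injects experiment_name and
#     checks the run results.
run_pipeline: bool(required=False)
```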


In other words: run_pipeline means submitting the run for a Python sample, and checking the results for a notebook sample.

-            nb_params['experiment_name'] = self._test_name + '-test'
+            if not raw_args['notebook_params']['experiment_name']:
+                raise ValueError('Experiment name is required if pipeline run is expected.')
+            nb_params['experiment_name'] = raw_args['notebook_params']['experiment_name']

The fact that the experiment_name value in the config YAML is not used (but generated by the test infra) should be noted in the schema YAML file, similar to the output value not being used. WDYT?
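The injection logic under discussion can be sketched roughly as follows. The function and argument names are assumptions mirroring the diff, not the exact test-infra code:

```python
def resolve_experiment_name(run_pipeline, notebook_params):
    """Mimic the diff above: when a sample is expected to submit a
    pipeline run, an explicit experiment_name must be in the config."""
    if not run_pipeline:
        # No run expected, so no experiment name is injected.
        return None
    if not notebook_params.get('experiment_name'):
        raise ValueError('Experiment name is required if pipeline run is expected.')
    return notebook_params['experiment_name']
```

This makes the config file, rather than the test infra, the source of truth for the experiment name, which is the point raised in the comment above.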

@numerology (Author)

/hold further work needed.

@gaoning777 (Contributor)

> /hold further work needed.

Could you specify what further work is required? Maybe put a link here?

@numerology (Author)

> /hold further work needed.
>
> Could you specify what further work is required? Maybe put a link here?

Sure. Since #2292 is done, I think it will be beneficial to incorporate the newly introduced mechanism here, which will reduce the need for injecting experiment names.

@gaoning777 (Contributor)

> /hold further work needed.
>
> Could you specify what further work is required? Maybe put a link here?
>
> Sure. Since #2292 is done, I think it will be beneficial to incorporate the newly introduced mechanism here, which will reduce the need of injecting experiment names.

I think it is beneficial to explicitly add the experiment_name in the notebook examples, though. Otherwise, what is the default experiment_name when the env var is not configured? WDYT?

@numerology (Author)

> /hold further work needed.
>
> Could you specify what further work is required? Maybe put a link here?
>
> Sure. Since #2292 is done, I think it will be beneficial to incorporate the newly introduced mechanism here, which will reduce the need of injecting experiment names.

Per my reading it should be 'Default'. Sure, we can specify the experiment name in the notebook, but IMO changing it through an env var will allow us to get rid of some ad hoc logic in the current sample test infra. For example, we would not need to decide whether to inject experiment_name based on the run_pipeline flag; instead, we would just set the env var when executing the pipeline.

@gaoning777 (Contributor)

> /hold further work needed.
>
> Could you specify what further work is required? Maybe put a link here?
>
> Sure. Since #2292 is done, I think it will be beneficial to incorporate the newly introduced mechanism here, which will reduce the need of injecting experiment names.
>
> Per my reading it should be 'Default'. Sure, we can specify the experiment name in the notebook, but IMO changing it through env var will allow us to get rid of some ad hoc logic in current sample test infra. For example, we do not need to decide whether to inject experiment_name or not based on run_pipeline flag, instead we just set the env var when executing the pipeline.

I'm all for supporting the env var. My point is for the experiment_name to be explicit in the samples. As for the ad hoc logic, isn't injecting the environment variable no different from adding the parameter via papermill?

@numerology (Author)

> /hold further work needed.
>
> Could you specify what further work is required? Maybe put a link here?
>
> Sure. Since #2292 is done, I think it will be beneficial to incorporate the newly introduced mechanism here, which will reduce the need of injecting experiment names.
>
> Per my reading it should be 'Default'. Sure, we can specify the experiment name in the notebook, but IMO changing it through env var will allow us to get rid of some ad hoc logic in current sample test infra. For example, we do not need to decide whether to inject experiment_name or not based on run_pipeline flag, instead we just set the env var when executing the pipeline.
>
> I'm all up for supporting the env-var. My point is for the experiment_name to be explicit in the samples. In terms of the ad hoc logic, isn't injecting the environment variable no different from adding the parameter to the papermill?

One difference, for example, is that we can set the env var no matter what, but we can only inject experiment_name into the notebook when it's actually needed.

@gaoning777 (Contributor)

I don't think papermill would fail if there is no such parameter. What it does is inject a cell below the cell with the 'parameters' tag. Therefore, we can inject the experiment_name no matter what.
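The behavior described above can be illustrated with a simplified, stdlib-only sketch of papermill-style parameter injection on a notebook dict. Papermill's real implementation and API differ; this only shows the "inject a cell below the 'parameters' cell" idea:

```python
def inject_parameters(notebook, params):
    """Insert a code cell assigning `params` right after the cell tagged
    'parameters' (appending at the end if no such cell exists), mirroring
    how papermill injects notebook parameters without ever failing."""
    lines = [f"{name} = {value!r}" for name, value in params.items()]
    new_cell = {
        'cell_type': 'code',
        'metadata': {'tags': ['injected-parameters']},
        'source': '\n'.join(lines),
    }
    cells = list(notebook['cells'])
    insert_at = len(cells)  # Default: append when no 'parameters' tag found.
    for i, cell in enumerate(cells):
        if 'parameters' in cell.get('metadata', {}).get('tags', []):
            insert_at = i + 1
            break
    cells.insert(insert_at, new_cell)
    return {**notebook, 'cells': cells}
```

Because the injected cell runs after the tagged cell, its assignments override the notebook defaults, and a missing 'parameters' cell simply means the injected values stand alone.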

@numerology (Author)

> I don't think papermill would fail if there is no such parameters. What it does is to inject a cell below the cell with the 'parameters' tag. Therefore, we can inject the experiment_name no matter what.

You're right... Somehow I got the wrong impression that it would raise errors. Now that I know it does not, I think I can further simplify the code a little bit.

@Ark-kun (Contributor)

Ark-kun commented Oct 14, 2019

> I think it is beneficial to explicitly add the experiment_name in the notebook examples, though.

Let's brainstorm on the pros and cons of having experiment_name in the notebooks.
Some of the cons that come to mind:

  • Extra concept for users to learn (the "Experiment" concept).
  • If the experiment names are different then we're going to "spam" the "Experiments" section of the user's cluster.
  • Currently there is no way to delete experiments in the UX.
  • Extra line/configuration, which is not required for running the sample. The samples should be as small and focused as possible.

@gaoning777 (Contributor)

> I think it is beneficial to explicitly add the experiment_name in the notebook examples, though.
>
> Let's brainstorm on the pros and cons of having experiment_name in the notebooks.
> Some of the cons that come to mind:
>
>   • Extra concept for users to learn (the "Experiment" concept).
>   • If the experiment names are different, then we're going to "spam" the "Experiments" section of the user's cluster.
>   • Currently there is no way to delete experiments in the UX.
>   • Extra line/configuration, which is not required for running the sample. The samples should be as small and focused as possible.

I think it is fine to remove the experiment_name and use the default experiment in the samples. And, #2292 is also merged.

@gaoning777 (Contributor)

@numerology, could you resolve the comments and we can merge this PR?

…ix-sample-test-injection-imparity

# Conflicts:
#	test/sample-test/run_sample_test.py
#	test/sample-test/sample_test_launcher.py
@k8s-ci-robot (Contributor)

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
To complete the pull request process, please assign gaoning777
You can assign the PR to them by writing /assign @gaoning777 in a comment when ready.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@numerology (Author)

> @numerology, could you resolve the comments and we can merge this PR?

After merging #2292 it seems there is not much left to do in this PR. WDYT?
After #2292, for both Python and notebook samples, the experiment name is set via env var no matter what.
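The env-var mechanism referenced here might look roughly like this sketch. The variable name KF_PIPELINES_DEFAULT_EXPERIMENT_NAME and the 'Default' fallback are assumptions based on the discussion above, not the verified #2292 implementation:

```python
import os

def get_experiment_name(explicit_name=None):
    """Resolve the experiment name the way the discussion describes:
    an explicit argument wins, then the env var, then 'Default'.
    The env var name here is an assumption for illustration."""
    if explicit_name:
        return explicit_name
    return os.environ.get('KF_PIPELINES_DEFAULT_EXPERIMENT_NAME', 'Default')
```

With this scheme the test infra can set the env var unconditionally before executing any sample, removing the run_pipeline-dependent injection logic this PR originally targeted.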

@gaoning777 (Contributor)

SGTM

@gaoning777 (Contributor)

Feel free to close this.

@numerology (Author)

/close
Per discussion

@k8s-ci-robot (Contributor)

@numerology: Closed this PR.

In response to this:

> /close
> Per discussion

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@numerology numerology deleted the fix-sample-test-injection-imparity branch October 15, 2019 15:04
magdalenakuhn17 pushed a commit to magdalenakuhn17/pipelines that referenced this pull request Oct 22, 2023
Signed-off-by: Dan Sun <dsun20@bloomberg.net>
Successfully merging this pull request may close these issues.

Sample test infra: inconsistent behavior of dynamically injected arguments for output and experiment_name