Improve notebook check automation #2040
Conversation
/test kubeflow-pipeline-sample-test
/test all
Since the sample names change, could you make sure that the links to these samples are updated in this repo as well as in kubeflow/website?
Sure. Will do that shortly.
else:
  subprocess.call(['dsl-compile', '--py', '%s.py' % self._test_name,
                   '--output', '%s.yaml' % self._test_name])

def run_test(self):
  self._compile_sample()
We probably need to rename the functions, since compile_sample for notebooks actually does the execution, and the check_result function then executes the sample. Should we separate the steps as follows:
compile(): compile DSL to pipeline, convert .ipynb to .py
execute(): submit the pipeline or run the notebook .py
check()
WDYT?
That requires further refactoring in check_notebook_results.py and run_sample_test.py, in order to separate execute and check. Will let you know when I'm done.
BTW, there is now a nice function that compiles the pipeline, gets the experiment and submits the run, all in a single line:
kfp.run_pipeline_func_on_cluster(automl_pipeline, arguments={})
Factored into the following 3 steps:
compile(): .py to pipeline, or notebook to .py (after papermill preparation)
execute(): for a .py sample, retrieve the config and submit a pipeline run if needed; for a notebook, run it (which usually contains the pipeline run)
check()
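The factoring above can be sketched as follows. This is a hypothetical illustration of the compile/execute/check split being discussed, not the actual kubeflow/pipelines sample-test code; the class and method names are made up for clarity.

```python
# Hypothetical sketch of the three-step structure discussed above.
# Names are illustrative, not the actual sample-test implementation.
class SampleTest:
    def __init__(self, test_name, is_notebook=False):
        self.test_name = test_name
        self.is_notebook = is_notebook
        self.steps_run = []  # records the order in which steps executed

    def compile(self):
        # .py sample -> compiled pipeline; .ipynb -> .py (after papermill prep)
        self.steps_run.append('compile')

    def execute(self):
        # submit the compiled pipeline, or run the converted notebook .py
        self.steps_run.append('execute')

    def check(self):
        # verify the run finished and produced the expected results
        self.steps_run.append('check')

    def run_test(self):
        self.compile()
        self.execute()
        self.check()
```

Keeping the three steps as separate methods lets check_notebook_results.py and run_sample_test.py share the same driver while swapping in notebook- or pipeline-specific behavior per step.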
BTW, there is now a nice function that compiles the pipeline, gets the experiment and submits the run, all in a single line:
kfp.run_pipeline_func_on_cluster(automl_pipeline, arguments={})
Thanks for the info! Will do that in a following PR.
# limitations under the License.

test_name: lightweight_component
notebook_params:
I think we can have the same way of configuring arguments for both notebooks and pipelines.
BTW, we can also pass data to notebooks by using environment variables (e.g. EXPERIMENT_NAME = os.environ['EXPERIMENT_NAME']). This way the user can set the environment variable once and then run samples without modifications.
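A minimal sketch of that environment-variable pattern (the EXPERIMENT_NAME variable comes from the comment above; the default value and helper function are illustrative assumptions, not existing sample code):

```python
import os

# Sketch of the suggested pattern: the notebook reads shared settings from
# the environment, so the user sets them once and runs samples unmodified.
# The fallback default here is illustrative, not from the actual samples.
def get_experiment_name():
    return os.environ.get('EXPERIMENT_NAME', 'sample-test-experiment')
```

With `os.environ.get` and a default, the notebook still runs when the variable is unset, which keeps the sample usable outside the test infra.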
The reason for the current implementation is that these two sets of parameters are consumed by different things (one by papermill and the other by the kfp pipeline), and it's actually possible to have duplicate names across the two sets. Also, their functions are a bit different.
Just want to make sure I understand your second point correctly. Does it mean that we will encourage users to assign values in their notebooks this way? This config file will be used by the sample test infra only.
these two sets of parameters are consumed by different things (one is papermill and another is kfp pipeline)
Does papermill consume them itself or does it pass them to the notebook?
Just want to make sure I understand your second point correctly.
The second point was about different ways the user or tests can set the required variables in a notebook. At this moment the user must manually insert variable values in a notebook and tests use papermill arguments. Maybe in future we can:
- Reduce the number of variables that the user needs to specify to run the sample notebook (ideally to 0)
- Maybe we can use environment variables for the rest. Papermill way of passing variables requires the sample notebooks to have specific structure which is not always the most readable (for example, previously all the component images had to be taken out of their components and specified in the first cell). This scheme would also be compatible with notebooks converted to python files.
I'm not asking you to implement this. This is more like design thoughts that you might find relevant.
Does papermill consume them itself or does it pass them to the notebook?
papermill passes them to the notebook to substitute some constants defined in the notebook.
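A rough pure-Python model of that substitution may help: papermill injects a new cell right after the cell tagged `parameters`, so the injected assignments shadow the defaults defined in the notebook. The cell representation below is a simplification for illustration, not papermill's actual API.

```python
# Simplified model of papermill's parameter passing: a new cell is injected
# after the cell tagged 'parameters', overriding the constants defined there.
# Cells are modeled as (tags, source) pairs purely for illustration.
def inject_parameters(cells, params):
    out = []
    for tags, source in cells:
        out.append((tags, source))
        if 'parameters' in tags:
            lines = ['%s = %r' % (k, v) for k, v in sorted(params.items())]
            out.append((['injected-parameters'], '\n'.join(lines)))
    return out
```

This is why papermill requires the sample notebooks to have a specific structure: the parameter constants must live in one dedicated, tagged cell near the top.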
Reduce the number of variables that the user needs to specify to run the sample notebook (ideally to 0)
Agree this would be ideal.
Maybe we can use environment variables for the rest.
Does this require that something needs to be assigned using os.environ inside the notebook?
Does this require that something needs to be assigned using os.environ inside the notebook?
Yes. But this is something for another PR.
I think the explicit way of putting the arguments at the top is in fact a good practice. Depending on environment variables for input is implicit and might be easily overlooked.
/test kubeflow-pipeline-e2e-test
…mprove-notebook-check-automation
/lgtm
/approve
/lgtm
/approve
[APPROVALNOTIFIER] This PR is APPROVED.
This pull-request has been approved by: gaoning777, numerology.
The full list of commands accepted by this bot can be found here. The pull request process is described here.
Needs approval from an approver in each of these files.
Approvers can indicate their approval by writing
Part of #1750