Update xgboost_synthetic to 0.7 #655

Closed
jlewi opened this issue Oct 9, 2019 · 5 comments

Comments

@jlewi
Contributor

jlewi commented Oct 9, 2019

In 0.7 we will use workload identity.

As such, notebooks should no longer need to use or set GOOGLE_APPLICATION_CREDENTIALS.

The notebook
https://github.com/kubeflow/examples/blob/master/xgboost_synthetic/build-train-deploy.ipynb
is currently checking GOOGLE_APPLICATION_CREDENTIALS. We will need to update that code to work with workload identity.

P0 because this is part of our demo script for 0.7.
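
A minimal sketch of what a 0.7-compatible credentials check could look like, assuming the notebook only needs to know whether a key file is configured or whether it should rely on ambient credentials such as workload identity. The function name `detect_gcp_auth_mode` and the return values are hypothetical and are not taken from the notebook.

```python
import os


def detect_gcp_auth_mode(env=None):
    """Report how the notebook would authenticate to GCP.

    Returns "key-file" when GOOGLE_APPLICATION_CREDENTIALS points at an
    existing file, and "ambient" when the variable is unset (the case
    expected with workload identity). Hypothetical helper, not notebook code.
    """
    env = os.environ if env is None else env
    key_path = env.get("GOOGLE_APPLICATION_CREDENTIALS")
    if not key_path:
        # Nothing configured: do not fail, rely on ambient credentials.
        return "ambient"
    if not os.path.isfile(key_path):
        # A configured but missing key file is a real misconfiguration.
        raise FileNotFoundError(
            "GOOGLE_APPLICATION_CREDENTIALS is set to %r but no such file exists"
            % key_path
        )
    return "key-file"
```

The point of the sketch is that an unset variable is treated as a valid state rather than an error, so the same cell works on clusters with and without workload identity.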

@issue-label-bot

Issue-Label Bot is automatically applying the label kind/feature to this issue, with a confidence of 0.92. Please mark this comment with 👍 or 👎 to give our bot feedback!

@kunmingg
Contributor

kunmingg commented Oct 17, 2019

Saw the following error with image: gcr.io/kubeflow-images-public/tensorflow-1.14.0-notebook-cpu:v0.7.0
It seems some dependencies are missing?


NameError Traceback (most recent call last)
in
----> 1 model = ModelServe(model_file="mockup-model.dat")
2 model.train()

in __init__(self, model_file)
17 self.model = None
18 self._workspace = None
---> 19 self.exec = self.create_execution()
20
21 def train(self):

in create_execution(self)
88
89 def create_execution(self):
---> 90 r = metadata.Run(
91 workspace=self.workspace,
92 name="xgboost-synthetic-faring-run" + datetime.utcnow().isoformat("T"),

NameError: name 'metadata' is not defined
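
The NameError above means the name `metadata` was never bound in the notebook kernel, which usually points to an import cell that did not run or an install that failed. A small sketch of a pre-flight check that lists which modules cannot be imported, so the failure shows up before the model class is built. The helper name `find_missing_modules` is hypothetical, and the thread does not say which package provides `metadata`, so the module names passed in are up to the caller.

```python
import importlib


def find_missing_modules(module_names):
    """Return the subset of module_names that cannot be imported."""
    missing = []
    for name in module_names:
        try:
            importlib.import_module(name)
        except ImportError:
            # ModuleNotFoundError is a subclass of ImportError.
            missing.append(name)
    return missing
```

Calling it with the modules the notebook depends on, and raising if the result is non-empty, would turn a late NameError into an early and more descriptive failure.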

jlewi pushed a commit to jlewi/examples that referenced this issue Oct 24, 2019
* Related to kubeflow#655 update xgboost_synthetic to use workload identity

* Related to kubeflow#665 no signal about xgboost_synthetic

* We need to update the xgboost_synthetic example to work with 0.7.0;
  e.g. workload identity

* This PR focuses on updating the test infra and some preliminary
  updates to the notebook

* More fixes to the test and the notebook are probably needed in order
  to get it to actually pass

* Update job spec for 0.7; remove the secret and set the default service
  account.

  * This is to make it work with workload identity

* Instead of using kustomize to define the job to run the notebook we can just modify the YAML spec using python.
* Use the python API for K8s to create the job rather than shelling out.

* Notebook should do a 0.7 compatible check for credentials

  * We don't want to assume GOOGLE_APPLICATION_CREDENTIALS is set
    because we will be using workload identity.

* Take in repos as an argument akin to what checkout_repos.sh requires

* Convert xgboost_test.py to a pytest.

  * This allows us to mark it as expected to fail so we can start to get
    signal without blocking

  * We also need to emit junit files to show up in test grid.

* Convert the jsonnet workflow for the E2E test to a python function to
  define the workflow.

  * Remove the old jsonnet workflow.
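
The job spec changes listed above (drop the secret, set the default service account, edit the spec in Python, create the job through the K8s Python API) can be sketched roughly as follows. This is an illustration under stated assumptions, not the code from the PR: the function names are hypothetical, and the service account and secret names are passed in by the caller because the commit message does not name them.

```python
import copy

CREDENTIALS_ENV = "GOOGLE_APPLICATION_CREDENTIALS"


def prepare_job_for_workload_identity(job, service_account, secret_name):
    """Return a copy of a Job manifest (as a dict) without the key-file secret.

    Hypothetical helper: sets the pod's service account, removes volumes
    backed by secret_name, their mounts, and the credentials env var.
    """
    job = copy.deepcopy(job)
    pod_spec = job["spec"]["template"]["spec"]
    pod_spec["serviceAccountName"] = service_account

    volumes = pod_spec.get("volumes", [])
    removed = {
        v["name"]
        for v in volumes
        if (v.get("secret") or {}).get("secretName") == secret_name
    }
    pod_spec["volumes"] = [v for v in volumes if v["name"] not in removed]

    for container in pod_spec.get("containers", []):
        container["env"] = [
            e for e in container.get("env", []) if e.get("name") != CREDENTIALS_ENV
        ]
        container["volumeMounts"] = [
            m for m in container.get("volumeMounts", []) if m.get("name") not in removed
        ]
    return job


def create_job(job, namespace):
    """Submit the manifest with the Kubernetes Python client instead of shelling out.

    Assumes the `kubernetes` package is installed and the code runs in-cluster.
    """
    from kubernetes import client, config

    config.load_incluster_config()
    return client.BatchV1Api().create_namespaced_job(namespace=namespace, body=job)
```

Keeping the manifest edit as a pure function on a dict makes it easy to unit test without a cluster; only `create_job` talks to the API server.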
@krishnadurai

krishnadurai commented Oct 24, 2019

Just another point to note:
I was testing with Istio 1.3.1 and ran into an error when installing the Python package retrying (Step 2): the user does not have permission to install it.

@jlewi jlewi changed the title Update xgboost_synthetic to use workload identiy Update xgboost_synthetic to 0.7 Oct 25, 2019
k8s-ci-robot pushed a commit that referenced this issue Oct 25, 2019
… 0.7.0 (#666)

* Update xgboost_synthetic test infra to use pytest and pyfunc.

* Address comments.

* Fix issues with the notebook
* Install pip packages in user space
  * 0.7.0 images are based on TF images and they have different permissions
* Install a newer version of fairing sdk that works with workload identity

* Split pip installing dependencies out of util.py and into notebook_setup.py

  * That's because util.py could depend on the packages being installed by
    notebook_setup.py

* After pip installing the modules into user space, we need to add the local
  path for pip packages to the Python path; otherwise we get import
  errors.
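
A rough sketch of the user-space install and path fix described in the last bullets, assuming the standard `pip install --user` layout. The helper names are hypothetical and this is not the contents of notebook_setup.py. The path step matters because Python only adds the user site directory to `sys.path` at startup if it already exists, so a kernel started before the first `--user` install will not see the new packages.

```python
import site
import sys


def pip_install_user_command(packages):
    """Build the command for installing packages into the user site directory."""
    return [sys.executable, "-m", "pip", "install", "--user", *packages]


def ensure_user_site_on_path():
    """Make packages installed with --user importable in the running kernel."""
    user_site = site.getusersitepackages()
    if user_site not in sys.path:
        sys.path.append(user_site)
    return user_site
```

In a notebook, the command list would typically be passed to `subprocess.check_call`, followed by a call to `ensure_user_site_on_path()` before importing the freshly installed modules.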
@jlewi
Contributor Author

jlewi commented Nov 8, 2019

#676 should hopefully fix the test.

There is, however, one more issue with model deployment: #673

jlewi pushed a commit to jlewi/examples that referenced this issue Nov 25, 2019
* install newer version of fairing
* modify preprocessor to use custom dockerfile
* use newer 0.7 base image.
* Fix endpoint.

Related to:

kubeflow#673 model doesn't deploy; it's crash looping
Related to kubeflow#655 update example to work with 0.7
k8s-ci-robot pushed a commit that referenced this issue Nov 25, 2019
#682)

* Fix issues with the xgboost_synthetic example and deploying the model.

* Add some comments to the notebook.
@stale

stale bot commented Feb 6, 2020

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the lifecycle/stale label Feb 6, 2020
@stale stale bot closed this as completed Feb 13, 2020