Deployment documentation is broken #129

Closed
tejarora opened this issue Mar 11, 2022 · 6 comments

tejarora commented Mar 11, 2022

What is the URL of the document?
Issues pertain to 3 documents (3 different branches)

  1. https://github.com/awslabs/kubeflow-manifests/tree/main/docs/deployment/vanilla
  2. https://github.com/awslabs/kubeflow-manifests/tree/release-v1.3.1-aws-b1.0.0/distributions/aws/examples/vanilla
  3. https://github.com/awslabs/kubeflow-manifests/tree/v1.3-branch/distributions/aws/examples/vanilla

Which section(s) is the issue in?
See below

What needs fixing and describe the solution you'd like?
(a)
All 3 URLs above have a vague and incomplete step
"AWS IAM permissions to create roles and attach policies to roles."

(b)
The v1.3 branches have no configuration for kubeflow version, but the main branch does. In fact the main branch talks about cloning the kubeflow repo, but the branch documents do not.
How does one set the Kubeflow version using the v1.3* branches? Which branch is prescribed for deployment?
This is a mess

(c) The document for the branch release-v1.3.1-aws-b* has a step that asks you to use a different branch, v1.3-branch.

(d) The documents for the v1.3 branches have their deployment step as:
while ! kustomize build example
Is this deployment not for serious use? Is it just for experimentation and not ready for production??

(e) The v1.3 documents talk about changing a password, but no file on those branches contains the password. There is a reference to a dex/base/config-map.yaml, but this file does not exist in the branches. It actually exists in the kubeflow repo, which the branch documentation does not mention at all.

Additional context
I would like to use kubeflow on a production EKS cluster, but the only documentation I found to help with that is well short of expectations.

@tejarora tejarora added the documentation Improvements or additions to documentation label Mar 11, 2022
@goswamig
Member

@tejarora thanks for reporting this.

a) This is a general prerequisite: the IAM user or role that deploys Kubeflow must have permission to create roles and attach policies to them. That identity is whichever one runs the deployment: your own IAM user if you are deploying to your EKS cluster from your machine (for example, a Mac), or the role of the EC2 instance if you are deploying from there.
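
One way to check this prerequisite before deploying is to simulate the two actions with the AWS CLI. The following is a minimal sketch, not part of the project's instructions: the function name `check_iam_role_permissions` is hypothetical, the principal ARN must be an IAM user or role ARN that you supply, and the simulation only evaluates the policies it can see, so a passing result does not guarantee that the full deployment has every permission it needs.

```shell
# check_iam_role_permissions PRINCIPAL_ARN   (hypothetical helper name)
# Simulates iam:CreateRole and iam:AttachRolePolicy for the given IAM
# user or role ARN and fails unless both are reported as "allowed".
check_iam_role_permissions() {
  local principal_arn=$1
  local results
  results=$(aws iam simulate-principal-policy \
    --policy-source-arn "$principal_arn" \
    --action-names iam:CreateRole iam:AttachRolePolicy \
    --query 'EvaluationResults[].[EvalActionName,EvalDecision]' \
    --output text) || return 1
  # An empty result means nothing was evaluated; treat that as a failure.
  [ -n "$results" ] || return 1
  printf '%s\n' "$results"
  # Any decision other than "allowed" (implicitDeny, explicitDeny) fails.
  printf '%s\n' "$results" | awk '$2 != "allowed" { bad = 1 } END { exit bad }'
}
```

For example, `check_iam_role_permissions arn:aws:iam::123456789012:user/example` (a placeholder ARN) prints one line per action with its decision and returns non-zero if either action is denied.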

b/c) The git clone instructions include a step to check out the branch:

 git checkout v1.3-branch 

d) This step is borrowed from the official guide; it simply retries the deployment if it fails. You can always opt for a multi-step deployment instead.
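
For readers who prefer a bounded retry over an open-ended `while !` loop, the idea can be sketched as below. This is an illustration only: `retry` and `apply_example` are hypothetical names, and the pipeline inside `apply_example` assumes that the full step pipes the build output to `kubectl apply -f -`, which the step as quoted in this issue does not show.

```shell
# retry MAX_ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]   (hypothetical helper)
# Runs COMMAND until it succeeds, giving up after MAX_ATTEMPTS tries.
retry() {
  local max_attempts=$1
  local delay_seconds=$2
  shift 2
  local attempt=1
  until "$@"; do
    if [ "$attempt" -ge "$max_attempts" ]; then
      echo "giving up after $attempt attempts: $*" >&2
      return 1
    fi
    echo "attempt $attempt failed; retrying in ${delay_seconds}s" >&2
    attempt=$((attempt + 1))
    sleep "$delay_seconds"
  done
}

# Assumed deployment pipeline, wrapped in a function because a pipeline
# cannot be passed to retry as plain arguments. Without `set -o pipefail`,
# its exit status is that of kubectl, the last command in the pipeline.
apply_example() {
  kustomize build example | kubectl apply -f -
}
```

A call such as `retry 10 10 apply_example` would then try the deployment up to ten times, ten seconds apart, and return a non-zero status if it never succeeds, which is easier to detect in automation than a loop that never ends.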

e) You're right, this repo does not contain that file, because this repo only hosts manifests. The manifests pull in different files and containers during deployment. I agree that the instruction here is a bit misleading.

We will work on improving the documentation.

@Harikantipudi

Harikantipudi commented Mar 12, 2022

@goswamig

Also, regarding V1.4-branch-stale: does that mean all future releases will now come from main, with a checkout of the respective branch?

Given the instructions quoted below, what should the AWS_MANIFESTS_BUILD value be for 1.4/1.5? I currently see only the following branches; am I missing something?

v1.3-branch, v1.3.1-aws-b1.0.0, v1.4-branch-stale

Clone the awslabs/kubeflow-manifests repo and the kubeflow/manifests repo, and check out the desired branches. Substitute the values for KUBEFLOW_RELEASE_VERSION (e.g. v1.4.1) and AWS_MANIFESTS_BUILD (e.g. v1.4.1-b1.0.0) with the branch or tag you want to use.

export KUBEFLOW_RELEASE_VERSION=<>
export AWS_MANIFESTS_BUILD=<>
git clone https://github.com/awslabs/kubeflow-manifests.git
cd kubeflow-manifests
git checkout ${AWS_MANIFESTS_BUILD}
git clone --branch ${KUBEFLOW_RELEASE_VERSION} https://github.com/kubeflow/manifests.git upstream
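
Because the two variables are left empty in the quoted commands, a wrapper that refuses to run until both are set can avoid a confusing `git checkout` failure. The following is a minimal sketch under stated assumptions: `clone_kubeflow_manifests`, `run`, and `DRY_RUN` are hypothetical names, and the example values v1.4.1 and v1.4.1-b1.0.0 are taken from the quoted text and may not correspond to published tags.

```shell
# Print the command when DRY_RUN=1; otherwise execute it. (Hypothetical helper.)
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Clone awslabs/kubeflow-manifests at AWS_MANIFESTS_BUILD, then clone
# kubeflow/manifests at KUBEFLOW_RELEASE_VERSION into its upstream/ directory.
clone_kubeflow_manifests() {
  if [ -z "${KUBEFLOW_RELEASE_VERSION:-}" ] || [ -z "${AWS_MANIFESTS_BUILD:-}" ]; then
    echo "error: set KUBEFLOW_RELEASE_VERSION and AWS_MANIFESTS_BUILD first" >&2
    return 1
  fi
  run git clone https://github.com/awslabs/kubeflow-manifests.git || return 1
  # `git -C DIR` is used instead of `cd DIR` so the caller's directory is unchanged.
  run git -C kubeflow-manifests checkout "${AWS_MANIFESTS_BUILD}" || return 1
  run git clone --branch "${KUBEFLOW_RELEASE_VERSION}" \
    https://github.com/kubeflow/manifests.git kubeflow-manifests/upstream
}
```

Running `DRY_RUN=1 KUBEFLOW_RELEASE_VERSION=v1.4.1 AWS_MANIFESTS_BUILD=v1.4.1-b1.0.0 clone_kubeflow_manifests` in bash prints the three git commands without contacting GitHub, which makes it easy to confirm the chosen branch or tag names before cloning.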

@ryansteakley
Contributor

@Harikantipudi Hello, the AWS_MANIFESTS_BUILD values for 1.4/1.5 are not out yet, as we have not released official support for those versions. As for your question about V1.4-branch-stale, all work for v1.4 is now occurring in the main branch. We are planning to release v1.4 support soon; you can track our progress in #27.

surajkota added a commit that referenced this issue Mar 18, 2022
**Which issue is resolved by this Pull Request:**
Resolves #129

**Testing:**
- Cloning the branch works

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
surajkota added a commit that referenced this issue Mar 18, 2022
**Which issue is resolved by this Pull Request:**
Resolves #76, few questions from #129

**Description of your changes:**
- Add releases and versioning documentation and clarification in prerequisites
- Removes IAM instruction which does not add value. If we do need it, it needs to be many more policies and specific. Out of scope for this PR
- added dex configmap path relative to root of repo like other instructions

**Testing:**
- git clone works

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
@surajkota
Contributor

surajkota commented Mar 18, 2022

Hi @tejarora, @Harikantipudi

initial question and (b): little bit of background here:

(a) Removed this instruction since it was not adding value (#141, #142). Various AWS resources such as EKS clusters, IAM roles, RDS, and S3 are created depending on the deployment option, and each service already has its own documentation on the required policies.

(c) Thanks for pointing out the issue. The instructions do advise checking out a release tag, but the code is out of sync. This was a miss on our side, but since the tag is already cut, we won't be able to modify it (we are in the process of migrating docs from the Kubeflow website). I have updated the branch for now, which should help with future releases.

(e) Addressed in #142 and #146, and I created a PR upstream, where the error originally came from: kubeflow/manifests#2179

Closing this issue. Please reopen or create a new one in case you still face issues

@surajkota
Contributor

I have also added a badge at the top of the main README in the main branch, which reflects which Kubeflow version is being developed: https://github.com/awslabs/kubeflow-manifests/blob/main/README.md

@surajkota surajkota reopened this Mar 21, 2022
@surajkota surajkota self-assigned this Mar 24, 2022
judyheflin pushed a commit to judyheflin/kubeflow-manifests that referenced this issue Apr 20, 2022
@surajkota
Contributor

All the feedback has been addressed and documentation is now available on the new website which should streamline the experience: https://awslabs.github.io/kubeflow-manifests/
