
[BUG] - Installing nebari locally (local with kind) fails. #1703

Closed
twaclaw opened this issue Apr 8, 2023 · 3 comments
Labels
type: bug 🐛 Something isn't working

Comments

@twaclaw

twaclaw commented Apr 8, 2023

Describe the bug

Deploying nebari locally fails.

The health check for https://domain/argo/ fails.
Other endpoints work fine, and Argo Workflows itself seems to be installed properly.

log3.8.txt

Expected behavior

Installation should go through.

OS and architecture in which you are running Nebari

Archlinux: 6.2.8-arch1-1 #1 SMP PREEMPT_DYNAMIC Wed, 22 Mar 2023 22:52:35 +0000 x86_64 GNU/Linux. I tried with both Python 3.8 and 3.11.

How to Reproduce the problem?

Follow the steps in:
https://www.nebari.dev/docs/how-tos/nebari-local

Command output

nebari init local  --project projectname  --domain domain  --auth-provider password  --terraform-state=local

nebari deploy -c nebari-config.yaml --disable-prompt |tee log3.8.txt


[terraform]:   "keycloak" = {
[terraform]:     "health_url" = "https://domain/auth/realms/master"
[terraform]:     "url" = "https://domain/auth/"
[terraform]:   }
[terraform]:   "monitoring" = {
[terraform]:     "health_url" = "https://domain/monitoring/api/health"
[terraform]:     "url" = "https://domain/monitoring/"
[terraform]:   }
[terraform]: }
Attempt 1 health check failed for url=https://domain/argo/
Attempt 2 health check failed for url=https://domain/argo/
Attempt 3 health check failed for url=https://domain/argo/
Attempt 4 health check failed for url=https://domain/argo/
Attempt 5 health check failed for url=https://domain/argo/
Attempt 6 health check failed for url=https://domain/argo/
Attempt 7 health check failed for url=https://domain/argo/
Attempt 8 health check failed for url=https://domain/argo/
Attempt 9 health check failed for url=https://domain/argo/
Attempt 10 health check failed for url=https://domain/argo/
ERROR: Service argo-workflows DOWN when checking url=https://domain/argo/
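For context, the output above is what a simple retry-based health check produces when an endpoint never comes up. A minimal sketch of that pattern (illustrative only, not Nebari's actual code; the function name and defaults are assumptions):

```python
import urllib.request
import urllib.error


def health_check(url, attempts=10, timeout=5):
    """Return True as soon as the URL answers with HTTP 2xx, else False.

    Prints one failure line per attempt, mirroring the log output above.
    """
    for attempt in range(1, attempts + 1):
        try:
            with urllib.request.urlopen(url, timeout=timeout) as resp:
                if 200 <= resp.status < 300:
                    return True
        except (urllib.error.URLError, OSError):
            # Connection refused, timeout, or DNS failure all count as
            # a failed attempt.
            pass
        print(f"Attempt {attempt} health check failed for url={url}")
    return False
```

If all attempts fail, the deploy reports the service as DOWN, as seen in the log.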

Versions and dependencies used.

kind: 0.18.0

kubectl:
Client Version: v1.26.3
Kustomize Version: v4.5.7
Server Version: v1.21.10

nebari: 2023.1.1

Compute environment

kind

Integrations

Argo

Anything else?

No response

@twaclaw twaclaw added needs: triage 🚦 Someone needs to have a look at this issue and triage type: bug 🐛 Something isn't working labels Apr 8, 2023
@twaclaw twaclaw changed the title [BUG] - <title> [BUG] - Installing nebari locally (local with kind) fails. Apr 8, 2023
@dharhas
Member

dharhas commented Apr 10, 2023

@pmeier is this the same issue you faced recently?

@pmeier
Member

pmeier commented Apr 11, 2023

Most likely. Just to confirm @twaclaw: if you create a new user and try to log in, are you also seeing an HTTP 500 error?

This is happening because we only patch /etc/hosts on the host machine, so the browser there knows how to resolve the domain. That change does not propagate to the pods inside the cluster, so they fail to resolve the URL. We don't see this in CI because we have a permanent DNS entry for

sudo echo "172.18.1.100 github-actions.nebari.dev" | sudo tee -a /etc/hosts

As discussed in our last sync, we need to eliminate this requirement, since other users might not have access to a domain and, more importantly, shouldn't need one to deploy Nebari locally.
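To see the asymmetry described above, you can compare name resolution on the host versus inside a pod. A small sketch (hypothetical helper, not part of Nebari) that checks whether the local resolver can resolve a hostname; run on the host it sees /etc/hosts entries, run inside a pod it uses the pod's own resolver and they are invisible:

```python
import socket


def resolves(hostname):
    """Check whether this machine's resolver can resolve a hostname.

    Inside a Kubernetes pod this consults the pod's resolver (kube-dns
    plus the pod's own /etc/hosts), so entries added only to the host's
    /etc/hosts are invisible here.
    """
    try:
        socket.getaddrinfo(hostname, None)
        return True
    except socket.gaierror:
        return False
```

Running this from the host after patching /etc/hosts would return True for the configured domain, while the same call from inside a pod would return False, which is exactly why the in-cluster health check fails.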

@twaclaw As a quick workaround, destroy the cluster with nebari destroy -c nebari-config.yaml and rerun nebari init local with --domain 172.18.1.100. This should work here, but won't in 100% of the cases (see #1707).

If you drop the --disable-prompt flag from nebari deploy, you should see this message roughly halfway through the deploy:

Take IP Address 172.18.1.100 and update DNS to point to "172.18.1.100" [Press Enter when Complete]

If the IPs match, just confirm and your cluster should start up without issues. You can also remove the entry from /etc/hosts now and access the web UI through the IP directly.

@twaclaw
Author

twaclaw commented Apr 11, 2023

@pmeier, thanks for the workaround. I confirm that passing --domain IP solves the issue.

This was referenced Apr 14, 2023
@Adam-D-Lewis Adam-D-Lewis removed the needs: triage 🚦 Someone needs to have a look at this issue and triage label Sep 3, 2024