Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Prometheus grafana & add cirun.io #733

Merged
merged 35 commits into from
Aug 13, 2021
Merged

Prometheus grafana & add cirun.io #733

merged 35 commits into from
Aug 13, 2021

Conversation

Adam-D-Lewis
Copy link
Member

@Adam-D-Lewis Adam-D-Lewis commented Jul 14, 2021

Initial integration of prometheus/grafana into qhub. I tested it locally and it seems to be working. The kubernetes tests appear to be broken for an unrelated reason.

@Adam-D-Lewis Adam-D-Lewis requested a review from aktech July 14, 2021 15:04
@Adam-D-Lewis Adam-D-Lewis requested a review from costrouc July 14, 2021 15:27
@Adam-D-Lewis Adam-D-Lewis linked an issue Jul 14, 2021 that may be closed by this pull request
@Adam-D-Lewis Adam-D-Lewis changed the title Prometheus grafana Prometheus grafana [WIP] Jul 14, 2021
@Adam-D-Lewis Adam-D-Lewis changed the title Prometheus grafana [WIP] Prometheus grafana Jul 14, 2021
@costrouc
Copy link
Member

costrouc commented Jul 14, 2021

I'd prefer if we could pull in the helm charts to minimize the number of files in qhub (also making updates easier) similar to https://github.com/Quansight/qhub/blob/main/qhub/template/%7B%7B%20cookiecutter.repo_directory%20%7D%7D/infrastructure/modules/kubernetes/services/jupyterhub/main.tf#L6. This still pins the chart to a particular version. Is there a reason that we are copying the entire chart locally?

@Adam-D-Lewis
Copy link
Member Author

Adam-D-Lewis commented Jul 14, 2021

@costrouc I was following the example of what was done for the clearml integration (https://github.com/Quansight/qhub/tree/main/qhub/template/%7B%7B%20cookiecutter.repo_directory%20%7D%7D/infrastructure/modules/kubernetes/services/clearml), but I don't think there is anything preventing us from importing the chart. @aktech any thoughts for or against putting the entire helm chart in the repo?

@aktech
Copy link
Member

aktech commented Jul 15, 2021

@Adam-D-Lewis I agree with @costrouc we shouldn't put helm charts in the repository. ClearML was a special case, as we had to modify the chart. Also, since the last update on ClearML, I'll see if we can remove the ClearML charts as well.

@Adam-D-Lewis
Copy link
Member Author

It seems like the following files can be removed:

values-monitoring-qhub.yaml
values-original.yaml
Also the diff between the original and overridden files is just following:

❯ diff values-original.yaml values-monitoring-qhub.yaml
628a629,637

grafana.ini:
server:
# The full public facing url you use in browser, used for redirects and emails
domain: github-actions.qhub.dev
root_url: "%(protocol)s://%(domain)s/monitoring"
server_from_sub_path: true
router_logging: true
enable_gzip: true
732c741
<


Which means you can set the particular values via terraform's helm set attribute.

I made the requested changes.

@aktech
Copy link
Member

aktech commented Aug 3, 2021

Waiting for the tests to pass here: #748 to see if the failed K8s tests in here are relevant or not.

@aktech
Copy link
Member

aktech commented Aug 3, 2021

Seems like the tests pass on main, can you check why k8s tests fail? I can rerun again to confirm,

@Adam-D-Lewis
Copy link
Member Author

Cyprus says there is insufficient cpu on the node with the additional pods in this PR. Maybe I can reduce the used cpu or maybe we can use CIrun

@Adam-D-Lewis
Copy link
Member Author

Blocked by #756

@aktech aktech force-pushed the prometheus_grafana branch from b58c3ba to 7cb1247 Compare August 12, 2021 22:41
aktech and others added 3 commits August 13, 2021 01:32
- move yaml to root
- fix syntax
- install python from miniconda
- install node
- install cypress dependencies
Copy link
Member

@costrouc costrouc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for all your hard work on this @Adam-D-Lewis. This looks great to me. Looks perfect to me. Also excited to have a bigger CI for testing the Kubernetes cluster. This should allow out tests to be more comprehensive.

@costrouc
Copy link
Member

Could you add an entry to the changelog?

@costrouc costrouc merged commit 265d6c5 into main Aug 13, 2021
@costrouc costrouc deleted the prometheus_grafana branch August 13, 2021 15:30
@tylerpotts tylerpotts mentioned this pull request Aug 19, 2021
3 tasks
@Adam-D-Lewis Adam-D-Lewis linked an issue Aug 31, 2021 that may be closed by this pull request
@aktech aktech changed the title Prometheus grafana Prometheus grafana & add cirun.io May 13, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Monitoring of cluster stats via grafana and prometheus
4 participants