Add more Prometheus metrics #3682

Open
yuvipanda opened this issue Jun 13, 2018 · 27 comments

Comments

@yuvipanda
Contributor

Now that #3490 has been merged, we should add more prometheus metrics to the notebook server!

Some ideas for metrics to add...

  1. Number of kernels running (labeled by type)
  2. Number of sessions open (not sure if this is useful?)
  3. Number of terminals open
  4. Mirror of the activity tracking metrics
  5. Kernel start / stop latency metrics
  6. 0mq metrics

Am sure there's more that I don't know of!
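
As a rough sketch of idea 1, assuming the server can list the kernel spec name of every running kernel, the per-type count could be exposed as a gauge. The metric name `notebook_running_kernels` and the helper below are made up for illustration, and a real implementation would more likely go through a Prometheus client library than format the text by hand:

```python
from collections import Counter


def render_kernel_gauge(kernel_names, metric="notebook_running_kernels"):
    """Render a per-kernel-type gauge in the Prometheus text exposition format.

    `kernel_names` holds one kernel spec name per running kernel.
    The metric name is hypothetical. Label values are not escaped here,
    so names containing quotes or backslashes would need extra handling.
    """
    counts = Counter(kernel_names)
    lines = [
        f"# HELP {metric} Number of kernels currently running, by type.",
        f"# TYPE {metric} gauge",
    ]
    for name in sorted(counts):
        lines.append(f'{metric}{{kernel_name="{name}"}} {counts[name]}')
    return "\n".join(lines) + "\n"
```

Calling `render_kernel_gauge(["python3", "python3", "ir"])` yields one sample line per kernel type, each labeled with `kernel_name`.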

@yuvipanda
Contributor Author

Can someone with rights mark this as 'good first issue'? :)

@dhirschfeld
Contributor

I'm interested in RAM - per kernel and total. Not sure if that's already available though?

@rgbkrk
Member

rgbkrk commented Jun 13, 2018

There's a light amount of collaboration around RAM at the kernel level in jupyter/jupyter#264, though it's at the spec level (especially since the actual kernel may be several child processes deep). @ivanov would probably enjoy having a collaborator on the Python side -- I'm looking forward to the UI portion of using it. The notebook server can use it as well since it has access to the messages as it transports them from ZeroMQ to WebSocket.

@yuvipanda
Contributor Author

@dhirschfeld @rgbkrk we could possibly also do it from the default Kernel Manager, since it is just spawning local processes and knows how to collect metrics for them (and their children). This lets other kernel managers report their own metrics as they wish, and works across all kernels without any extra work. It would be complementary to jupyter/jupyter#264.
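
A sketch of the "kernel process and its children" part, assuming a snapshot of the process table is available from some source (a library such as psutil can provide one, not shown here). The function name and the input shape are hypothetical:

```python
def tree_rss(root_pid, processes):
    """Sum resident memory for `root_pid` and all of its descendants.

    `processes` maps pid -> (parent_pid, rss_bytes). This input shape is an
    assumption for illustration, not an existing Jupyter API.
    """
    # Index children by parent so the tree can be walked from the root.
    children = {}
    for pid, (ppid, _rss) in processes.items():
        children.setdefault(ppid, []).append(pid)

    total = 0
    seen = set()
    stack = [root_pid]
    while stack:
        pid = stack.pop()
        if pid in seen or pid not in processes:
            continue  # skip unknown pids and guard against cycles
        seen.add(pid)
        total += processes[pid][1]
        stack.extend(children.get(pid, []))
    return total
```

Summing resident set sizes counts shared pages once per process, so the total is an upper bound rather than an exact figure.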

@GoelJatin

Hi @yuvipanda ,

If no one is working on this, then I would like to take it up.

I would like to start with the very first one for now.

But have a few doubts.

Do we want the number of kernels running at the moment the API is called, or should we keep collecting it over the whole period since the notebook server was started?
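
The two readings in this question map onto Prometheus's two basic metric types: a gauge holds the value at scrape time and can go down, while a counter only accumulates since the process started. A minimal sketch of the difference, with hypothetical names and no client library:

```python
class KernelStats:
    """Tracks both readings of "number of kernels" side by side."""

    def __init__(self):
        self.running = 0        # gauge-like: rises and falls with live kernels
        self.started_total = 0  # counter-like: only ever increases

    def kernel_started(self):
        self.running += 1
        self.started_total += 1

    def kernel_stopped(self):
        self.running -= 1


stats = KernelStats()
for _ in range(3):
    stats.kernel_started()
stats.kernel_stopped()
stats.kernel_stopped()
```

Because Prometheus stores every scraped sample as a time series, a gauge of running kernels still gives the history over time without the server keeping it.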

Please let me know accordingly.

CC @Madhu94

@GoelJatin

Guys,

Any update on this?

@manuhortet

Hey @GoelJatin, I'm up for working on this with you.

> I'm interested in RAM - per kernel and total. Not sure if that's already available though?

As I think this will be the most useful metric to add, may I take it? :)

Will try to do it from the default Kernel Manager, as mentioned by @yuvipanda

@GoelJatin

Hey @manuhortet , sure go ahead.

No concerns from my end. :)

@LiryChen

LiryChen commented Sep 6, 2018

Hey, I am a first-timer looking for tasks to do too. I found this issue pretty interesting. Anything I can help with?

@konnermacias

@manuhortet Have you been able to make any progress on adding that metric? I'm a first-timer as well and would love to help out!

@manuhortet

manuhortet commented Sep 11, 2018

Hey! Honestly, I've been putting off some open-source contributions to make time for personal projects. I'm sorry I held you two up by doing that! You can take this issue if you want to, and feel free to ask me if you run into any problems. Good luck! @konnermacias @LiryChen

@LiryChen

Alright, thanks! I'll look into the issue and may ask you a few questions to understand the problem!

@Hyaxia
Contributor

Hyaxia commented Sep 19, 2018

@manuhortet Hey, I would be glad to try and help too.
Not sure how all of this is done, but can I just choose one of the options above and start working on it?
Is there something left to work on?

@manuhortet

@Hyaxia of course you can. Choose a metric you feel is relevant from the first comment on this issue and go for it.
I guess there are already people working on the "RAM - per kernel and total" one, so I'd try to avoid that one!

@LiryChen

LiryChen commented Oct 2, 2018

Sorry, I don't think I will continue to work on this due to the limited time I have besides school :( I would like to pick something up in the future once I have more free time!

@Hyaxia
Contributor

Hyaxia commented Oct 2, 2018

A few questions.
First, I wanted to clarify something about number 4.
Is the tracking being discussed about recording the very last thing the user did and its timestamp?

Second, is anyone still working on the RAM per kernel?

Third, what does number 6 mean?

Thanks.

@manuhortet

For number 4, the last thing done and its timestamp would be the logical choice IMO.
I don't think there's anyone working on the RAM per kernel metric, maybe @konnermacias?

Can't really help on the explanation for number 6. Some help here @yuvipanda ?

@konnermacias

@manuhortet I apologize, school has picked up and I was planning on working on it in a week or two when everything dies down. @Hyaxia feel free to go for it!

@Hyaxia
Contributor

Hyaxia commented Oct 4, 2018

Ok then, I will start working on the RAM per kernel metric in a few days.

@vinaycalastry

Hello. This issue seems to be open and there has been no status change since October. Can we get the current status on this, please?

@santosh2702

Anything I can help with?

@Hyaxia
Contributor

Hyaxia commented Oct 1, 2019

So I guess I'm the last one who was working on it.
When I opened the PR #4075, @minrk pointed out that it should go into the jupyter_client project.
Then I opened a PR in jupyter_client, jupyter/jupyter_client#407, and you can read for yourselves up to the point where we stopped.

TL;DR: go to jupyter/jupyter_client#407. You should implement some kind of generic way to expose different kernel statistics.
I started to work on it in the PR, but just didn't have the time to finish it.
As the last comments there state, check how to do it using entry points.
For further information just read the comments in the PR itself, all of the information I had at that time is there.
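
For anyone unfamiliar with entry points, a minimal sketch of how plugin-provided statistics could be discovered with the standard library. The group name `jupyter_client.kernel_stats` is purely hypothetical and is not what the PR settled on:

```python
from importlib.metadata import entry_points

# Hypothetical group name, used only for this illustration.
GROUP = "jupyter_client.kernel_stats"


def load_stat_providers(group=GROUP):
    """Return {name: loaded object} for every entry point registered in `group`."""
    eps = entry_points()
    if hasattr(eps, "select"):
        # Python 3.10+: entry points are filtered with select().
        selected = eps.select(group=group)
    else:
        # Python 3.8 / 3.9: entry_points() returns a dict of group -> list.
        selected = eps.get(group, [])
    return {ep.name: ep.load() for ep in selected}
```

A package would register a provider under that group in its packaging metadata, and the loader returns an empty dict when nothing is installed for the group.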

GL.

@Franky12

Hi, can I try this? I'm looking for some beginner-friendly issues.

@sudo-k-runner

Hey! Not sure if anyone is still looking into this, but I would like to work on it.

@kevin-bates
Member

Hi @sudo-k-runner - thank you for your interest. In light of the fact that the primary notebook server will eventually be based on the jupyter server project, you will likely find better traction on metrics gathering via the jupyter telemetry project - which the jupyter server plans on utilizing.

@dhivyasreedhar

Hi, can I work on this? I'm new so can someone guide me?

@kevin-bates
Member

Hi @dhivyasreedhar - please see the previous comment. This repository is currently focused on bug fixes and security issues.
