Kernel ram metric #4075

Hyaxia · 2018-10-07T23:47:21Z

I have added a metric for the RAM used per kernel.
Each kernel is reported separately, identified by its id and type.
I chose to implement the monitoring as a background task using the tornado framework.
As soon as you start the notebook, it will monitor all the running kernels.
Every 0.2 seconds it sets the RAM usage metric of each kernel to what that kernel uses at that moment.
Additionally, I check the RAM of each process with the psutil package.
I found this to be the easiest way; if you guys have something else in mind, please

Relevant issue: #3682

update fork

Introducing the 'psutil' package as a new dependency. Added another 'Gauge' to the 'metrics.py' module. Added to the __init__ of 'MappingKernelManager' a call to the function that runs the metric update as a background task. (NOT SURE IF THE WAY THAT THE BACKGROUND TASK IS IMPLEMENTED IS THE RIGHT ONE, BUT THAT IS WHAT I HAVE FOUND)

…p.py

kevin-bates

I really like the idea behind this feature - thank you for working on it! As I mentioned, this will break remote kernel applications like Jupyter Enterprise Gateway, but we could get to a good spot if there's a way to override the capture.

Since this is potentially destabilizing, we should probably have it disabled by default.

Thanks.

kevin-bates · 2018-10-08T23:10:02Z

notebook/services/kernels/kernelmanager.py

+    @gen.coroutine
+    def kernel_ram_monitoring(self):
+        # Arbitrary time for waiting till next metric collection
+        yield gen.sleep(0.2)


Five passes per second seems a little excessive to me. I suspect that on highly loaded servers (with lots of kernels running), a pass over the set of kernels may not complete before the next iteration is due. It would be nice if this interval could be configured. A default of 1 second seems frequent enough for the normal case (a couple of kernels).
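A minimal sketch of what a configurable interval could look like, assuming plain asyncio rather than the tornado coroutine used in this PR. The names `poll_kernels`, `collect`, and `passes` are hypothetical and not part of the notebook code base. The sleep only starts once a pass has finished, so a slow pass delays the next one instead of overlapping with it.

```python
import asyncio


async def poll_kernels(collect, interval=1.0, passes=None):
    """Run `collect()` repeatedly, sleeping `interval` seconds between passes.

    `collect` is a hypothetical coroutine function that updates the metric
    for every running kernel. `passes=None` means "run forever"; a number
    is handy for testing.
    """
    done = 0
    while passes is None or done < passes:
        # Finish the whole pass before scheduling the next one.
        await collect()
        done += 1
        await asyncio.sleep(interval)
    return done
```

The 1.0 default mirrors the 1 second suggested above; a real implementation would presumably expose it as a configurable option on the server.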

kevin-bates · 2018-10-08T23:26:12Z

notebook/services/kernels/kernelmanager.py

+        # Update the relevant kernel's ram usage metric
+        def update_ram_metric():
+            pid = kernel.kernel.pid
+            kernel_process = psutil.Process(pid)


This assumes that all kernels are local to the server, which is not the case for Enterprise Gateway, which derives its kernel management from these classes. In this case, EG will break since the launching process is no longer running.

I would recommend moving the code that gets the memory usage to a method on the individual kernel manager class - this way child classes can use their own means of capturing memory usage. Unfortunately, Notebook doesn't define a kernel manager class. If one were added, it would derive from jupyter_client's IOLoopKernelManager - then we'd need to fix up the hierarchy usage in Jupyter Kernel and Enterprise Gateway. I'd be happy to help with this if others agree. The other option is to abstract the kernel class that represents the process. We do this in Enterprise Gateway with our process proxy classes, but that's a bit more work. We'd then have a 'hook' to place capture code for our remote kernels.

Any way we can make the capture of per-kernel info more abstract would be great.

In the meantime, we should probably make this feature optional and, preferably, off by default so as to not break applications.
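As a rough sketch of the overridable hook described above: the class names and the `get_memory_usage` method below are hypothetical and do not correspond to existing classes in notebook, jupyter_client, or Enterprise Gateway. The only real API used is `psutil.Process(pid).memory_info().rss`. The local variant returns None when psutil is missing or the process cannot be inspected, and a remote subclass supplies its own capture.

```python
class KernelManagerSketch:
    """Hypothetical stand-in for a per-kernel manager class."""

    def __init__(self, pid=None):
        self.pid = pid

    def get_memory_usage(self):
        """Return resident memory in bytes, or None if it cannot be read."""
        try:
            import psutil  # optional dependency in this sketch
        except ImportError:
            return None
        if self.pid is None:
            return None
        try:
            return psutil.Process(self.pid).memory_info().rss
        except psutil.Error:
            # Process is gone or not accessible; report "unknown".
            return None


class RemoteKernelManagerSketch(KernelManagerSketch):
    """Hypothetical subclass for kernels that are not local processes."""

    def __init__(self, fetch_remote_usage):
        super().__init__(pid=None)
        # Assumed callable that asks the remote host for the usage.
        self._fetch_remote_usage = fetch_remote_usage

    def get_memory_usage(self):
        return self._fetch_remote_usage()
```

With this shape, the monitoring loop would only call `get_memory_usage()` and skip kernels that return None, so it would not need to know whether a kernel is local.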

minrk · 2018-10-10T13:17:54Z

This should probably be implemented at the lower, KernelManager level in jupyter_client, so that when an implementation is swapped out, it can override the appropriate method.

Hyaxia · 2018-10-15T07:19:07Z

Thank you both, @minrk & @kevin-bates.
I will check the jupyter client.

Hyaxia and others added 3 commits October 3, 2018 00:10

Merge pull request #1 from jupyter/master

1bc43f1

update fork

Added the 'psutil' package to the dependencies of the project in setu…

f8fb421

…p.py

kevin-bates reviewed Oct 8, 2018

Hyaxia closed this Oct 26, 2018

Hyaxia mentioned this pull request Nov 27, 2018

[WIP]Add method for kernel manager to retrieve statistics jupyter/jupyter_client#407

Closed

Hyaxia mentioned this pull request Oct 1, 2019

Add more Prometheus metrics #3682

Open

kevin-bates mentioned this pull request Jul 7, 2020

Jupyter Resource Usage Roadmap jupyter-server/jupyter-resource-usage#58

Open

13 tasks

github-actions bot added the status:resolved-locked label Mar 30, 2021

github-actions bot locked as resolved and limited conversation to collaborators Mar 30, 2021
