


Memory leak detected (in Kubernetes) #26785

Closed
balazsorban44 opened this issue Jun 30, 2021 · 1 comment
Labels
bug Issue was opened via the bug report template.

Comments

@balazsorban44
Member

balazsorban44 commented Jun 30, 2021

What version of Next.js are you using?

10.2.3

What version of Node.js are you using?

14.6.1

What browser are you using?

Chrome (probably irrelevant)

What operating system are you using?

Ubuntu 18.04 (see description for more info)

How are you deploying your application?

Azure Kubernetes Service (AKS)

Describe the Bug

We recently went live with our new Next.js site, which gets ~200-300 users/hour at peak times (8:00-16:00) and almost none (probably fewer than 10) for the rest of the day.

Right after deployment, the K8S pod started consuming more and more memory. See the image below:

The interruptions are points where the pod either restarted on its own or where we pushed changes trying to fix the problem (see below). Memory usage drops back to a low value after each restart, but then starts creeping up again. The growth happens outside our peak hours as well, although it is most obvious during higher traffic.

I searched through issues and PRs; here are some potentially relevant ones:

We made sure to be on a version that includes those fixes (10.2.3), and I replaced all our next/image components with plain img tags. The problem did not disappear.

I can confirm that the memory is held by next itself: if I SSH into the pod and run pmap, I can see that ~80-90% of the pod's RAM is allocated to the process running next.
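
A complementary check from inside the process would be to log process.memoryUsage() periodically and compare it with the pmap numbers; a minimal sketch (the interval and log format are arbitrary):

```js
// Minimal sketch: periodically log the Node.js process's own memory figures
// (process.memoryUsage() is part of the Node.js core API). The one-minute
// interval and the log format are arbitrary choices.
const toMb = (bytes) => Math.round(bytes / 1024 / 1024);

setInterval(() => {
  const { rss, heapTotal, heapUsed, external } = process.memoryUsage();
  console.log(
    `[mem] rss=${toMb(rss)}MB heapTotal=${toMb(heapTotal)}MB ` +
      `heapUsed=${toMb(heapUsed)}MB external=${toMb(external)}MB`
  );
}, 60 * 1000);
```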

If anyone has any tips on how to debug this further, that would be very helpful!

Expected Behavior

I expect RAM usage not to grow continuously, especially outside peak hours when there are very few users.

To Reproduce

We are using the AKSUbuntu-1804containerd-2021.06.02 node image, whose contents are listed here:
https://github.com/Azure/AKS/blob/2021-06-03/vhd-notes/aks-ubuntu/AKSUbuntu-1804/2021.06.02.txt

I would like to provide more information, but our codebase is private. I could try to set up a session/meeting if someone is willing to help.

@balazsorban44 balazsorban44 added the bug Issue was opened via the bug report template. label Jun 30, 2021
@balazsorban44 balazsorban44 changed the title Memory leak detected in Kubernetes Memory leak detected (in Kubernetes) Jun 30, 2021
@jamsinclair
Contributor

jamsinclair commented Jul 1, 2021

@balazsorban44 the best next step is to start capturing heap dumps/snapshots to see what is being retained in memory and contributing to the growth.

You're using Node.js v14, so you should be able to use the native v8.getHeapSnapshot() API; see https://dev.to/bengl/node-js-heap-dumps-in-2021-5akm. You'll also want to read up on how to analyze leaks from the snapshot data.
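
For concreteness, a minimal sketch of dumping snapshots to disk with the built-in v8 module (the file path and schedule are arbitrary; generating a snapshot pauses the process and needs extra memory while it runs, so be careful on a busy pod):

```js
// Minimal sketch (not production-hardened): dump a V8 heap snapshot to disk.
// v8.getHeapSnapshot() is available in Node.js >= 11.13, so it works on v14.
const v8 = require('v8');
const fs = require('fs');
const path = require('path');

function captureHeapSnapshot(dir = '/tmp') {
  const file = path.join(dir, `next-${Date.now()}.heapsnapshot`);
  // getHeapSnapshot() returns a Readable stream with the snapshot contents.
  v8.getHeapSnapshot()
    .pipe(fs.createWriteStream(file))
    .on('finish', () => console.log(`heap snapshot written to ${file}`));
  return file;
}

// Example schedule: one snapshot shortly after boot and one every 30 minutes,
// so later snapshots can be diffed against the first one in Chrome DevTools.
setTimeout(captureHeapSnapshot, 60 * 1000);
setInterval(captureHeapSnapshot, 30 * 60 * 1000);
```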

If you can't replicate the leak locally, you could try the following:

  1. Create a private prod-like deployment
  2. Run load tests against it that resemble your production traffic; I can recommend the tool k6 (see the example script after this list)
  3. Periodically capture heap snapshots from your app code
  4. Download the snapshots and analyze the objects in memory
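
For step 2, a bare-bones k6 script (run with `k6 run loadtest.js`) might look like the sketch below; the URL, virtual-user count, and duration are placeholders to tune towards real traffic patterns:

```js
// loadtest.js: bare-bones k6 sketch. URL, VUs and duration are placeholders.
import http from 'k6/http';
import { sleep } from 'k6';

export const options = {
  vus: 50,         // concurrent virtual users
  duration: '2h',  // long enough for a slow leak to become visible
};

export default function () {
  http.get('https://your-prod-like-deployment.example.com/');
  sleep(1); // think time between page loads
}
```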

The key thing to diagnose is whether the leaking memory comes from your app code or from Next.js itself. Best of luck! 🤞 (The next process is running your code, so the problem isn't necessarily Next 😉)

@vercel vercel locked and limited conversation to collaborators Jul 1, 2021

This issue was moved to a discussion.

You can continue the conversation there. Go to discussion →
