Dump heap from a fork #90

Merged
merged 1 commit into from
Feb 1, 2023

casperisfine

Dumping the heap can lock the VM for a very long time. If you attempt to do this on a webserver process with a timeout, you will very likely hit it, and the process might get killed before the dump completes.

Worse, if the process was handling requests or jobs, they might time out because of it.

Using fork, we can take a snapshot of the heap in a very short time (relative to the dump itself) and safely dump it without disrupting the original process.

If you are familiar with how Redis does snapshotting, it's very similar.
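
For illustration, a minimal sketch of the fork-then-dump idea. This is not the code merged in this PR: the helper name `dump_heap_from_fork` is hypothetical, and it assumes a platform where `fork` is available.

```ruby
require "objspace"

# Hypothetical helper (not the PR's implementation): dump the heap
# from a forked child so the parent is only paused for the fork itself.
def dump_heap_from_fork(path)
  fork do
    # The child starts with a copy-on-write snapshot of the parent's
    # heap, so the slow dump runs here while the parent keeps working.
    File.open(path, "w") do |file|
      ObjectSpace.dump_all(output: file)
    end
    # Exit without running at_exit handlers inherited from the parent.
    exit!(0)
  end
end
```

The method returns the child's pid, which the caller still has to reap, for example with `Process.wait(pid)`.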

@SamSaffron what do you think?

@SamSaffron
Collaborator

Yeah, I think this is worth a shot; it certainly makes it a lot less intrusive.

Shouldn't the CLI block though until the dump is ready?

@casperisfine
Author

> Shouldn't the CLI block though until the dump is ready?

It doesn't right now, but that wouldn't be hard to do. I could call Process.waitpid in the parent process to block without holding the GVL. That's not a bad idea.

@casperisfine
Author

Hmm, actually, the signal handler calls Thread.new { eval(code) }.join, so I think if we were to block, we'd block one application thread? That's what I'd want to avoid.

That said, I think it makes sense to spawn a thread so that we cleanly call Process.wait to reap the child.
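
A rough sketch of reaping the child from a dedicated thread, assuming the work runs in a forked child. The helper name `fork_and_reap` is hypothetical and this is not the code from the PR.

```ruby
# Hypothetical helper: run a block in a forked child and reap it from
# a background thread, so the calling thread is never blocked.
def fork_and_reap(&work)
  pid = fork do
    work.call
    # Exit without running at_exit handlers inherited from the parent.
    exit!(0)
  end

  # Process.wait releases the GVL while it waits, so only this
  # dedicated thread sleeps until the child exits.
  Thread.new { Process.wait(pid) }
end
```

Ruby's built-in `Process.detach(pid)` does much the same thing: it starts a thread whose only job is to reap the given child.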

@casperisfine
Author

Ok, I updated the implementation, let me know what you think.

@casperisfine
Author

Hmm, I added a little bit more code, because it's true that it's a bit annoying to figure out when it's done dumping.

So now it writes to filename.tmp and, once done, moves it to filename. This way you don't have to monitor the file size to know when it's done.
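
The write-then-rename pattern can be sketched as follows. The helper name `write_then_publish` is hypothetical, and the sketch assumes the temporary file and the final file live on the same filesystem.

```ruby
# Hypothetical helper: write to "<path>.tmp", then rename into place.
def write_then_publish(path)
  tmp_path = "#{path}.tmp"

  # All of the slow writing happens under the temporary name.
  File.open(tmp_path, "w") { |file| yield file }

  # A rename within one filesystem is atomic, so the final name
  # only ever points at a complete file.
  File.rename(tmp_path, path)
  path
end
```

Anything waiting on the dump can then check `File.exist?(path)` instead of watching the file size grow.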

@SamSaffron
Collaborator

Yeah I like this, nice one.

@SamSaffron SamSaffron merged commit aa3e716 into tmm1:master Feb 1, 2023