Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: Avoid serializing minidumps over procspawn #508

Merged
merged 6 commits into from
Aug 2, 2021

Conversation

Swatinem
Copy link
Member

Instead of passing (and serializing) a byte buffer to procspawn, this will rather create a tempfile and pass that path via procspawn.

This seems to be a small win when processing a 20M minidump, though note that I was timing cargo run -p process-event so timings include overhead from both cargo and uploading of the minidump.

before:
Time (mean ± σ): 1.611 s ± 0.041 s [User: 274.8 ms, System: 102.0 ms]
Range (min … max): 1.548 s … 1.697 s 10 runs
after:
Time (mean ± σ): 1.526 s ± 0.024 s [User: 275.7 ms, System: 101.5 ms]
Range (min … max): 1.496 s … 1.573 s 10 runs

Due to some error handling and borrowcheck shenanigans, the minidumps are now persisted in the diagnostics cache for all errors during processing, while previously timeouts were ignored.

@Swatinem Swatinem requested a review from a team July 30, 2021 09:08
Comment on lines 1932 to 1934
sentry::protocol::Value::String(
path.to_string_lossy().to_string(),
),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ugh, we should make the SDK nicer to use ;)

Copy link
Member

@jan-auer jan-auer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. Rather than passing in a named temp file, pass around the File, if possible. The OS should still keep it open.
  2. Stream into a file directly in the endpoint, then pass around the file. @flub we've had various versions of this in the past already, do you remember where this was?

@flub
Copy link
Contributor

flub commented Jul 30, 2021

  1. Stream into a file directly in the endpoint, then pass around the file. @flub we've had various versions of this in the past already, do you remember where this was?

hmm, no i don't. sorry

@Swatinem
Copy link
Member Author

now streaming directly to a file (and updating rustc), I get these timings, though I noticed they have high variability:

before:
Time (mean ± σ): 1.694 s ± 0.028 s [User: 288.2 ms, System: 106.9 ms]
Range (min … max): 1.656 s … 1.749 s 10 runs
after:
Time (mean ± σ): 1.621 s ± 0.030 s [User: 294.3 ms, System: 110.5 ms]
Range (min … max): 1.585 s … 1.676 s 10 runs

// constructor on `ProcessResult`. Passing in an empty buffer should result in
// the same error though.
let minidump =
ByteView::open(minidump_path).unwrap_or_else(|_| ByteView::from_slice(b""));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

An empty buffer will now result in a generic "failed to read miniump" error with little information for debugging.

Is there no way you can wrap the error somehow?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think "minidump not found" is the appropriate error here. Also, not sure what to wrap this with, as also everything needs to go through serde, which std::io::Error does not.

crates/symbolicator/src/endpoints/minidump.rs Outdated Show resolved Hide resolved
Copy link
Contributor

@flub flub left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is some elegance to writing it to disk straight away... but I'm not sure I really like this more than the solution before which didn't have this filesystem problem because it could directly save in the diagnostics cache. Though thinking of it, this made it impossible to detect what files in the diagnostics cache were actually failed and which ones were life, so maybe that was also not a good solution.

Sorry, only creating problems here it seems, not providing solutions 😕

Comment on lines +1913 to +1916
if let Some(dir) = self.diagnostics_cache.cache_dir() {
if let Some(file_name) = minidump_file.file_name() {
let path = dir.join(file_name);
match minidump_file.persist(&path) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note: Temporary files cannot be persisted across filesystems.

Are we sure this is never going to be the case? It often is the case that /tmp is a different filesystem. And afaik our cache directory is a persistent volume so very likely to be a different filesystem to wherever tempfile defaults.

/cc @beezz: can you confirm whether this would work or not?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I changed this to save the files in $CACHE/tmp which is also used for other tmp files already, so that won’t run into this particular problem.

@Swatinem Swatinem merged commit 11d5ce7 into master Aug 2, 2021
@Swatinem Swatinem deleted the fix/avoid-minidump-serde branch August 2, 2021 11:26
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants