-
Notifications
You must be signed in to change notification settings - Fork 1.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[fix] - resource leak #3402
[fix] - resource leak #3402
Conversation
@@ -285,14 +285,9 @@ func HandleFile( | |||
} | |||
defer func() { | |||
// Ensure all data is read to prevent broken pipe. | |||
_, copyErr := io.Copy(io.Discard, rdr) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is only necessary when the underlying reader is a pipe. If the reader isn't fully consumed, cmd.Wait()
hangs indefinitely. Since HandleFile
is used by other parts of the codebase with different readers, the logic to consume the reader (if it's a pipe) has been moved to the caller.
eef8f1d
to
da46890
Compare
pkg/sources/git/git.go
Outdated
func combineErrors(err1, err2 error) error { | ||
switch { | ||
case err1 != nil && err2 != nil: | ||
return fmt.Errorf("%w; %w", err1, err2) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wait, you can do this? Which error is returned when you call Unwrap()
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought you could do this—I was wrong. I can just go with errors.Join
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, nvm 🤣 Just read In particular Unwrap does not unwrap errors returned by [Join](https://pkg.go.dev/errors#Join).
I guess it depends on if we plan on calling Unwrap
or using errors.Is
.
err = combineErrors(err, fmt.Errorf("command execution error: %w", waitErr)) | ||
}() | ||
|
||
return handlers.HandleFile(catFileCtx, stdout, chunkSkel, reporter, handlers.WithSkipArchives(s.skipArchives)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does HandleFile
call rdr.Close()
(and in this case rdr
is stdout
)? If yes, is it an issue to read from stdout
after Close()
has been called on it (in the defer
block above)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It does, but the HandleFile
function does not directly close stdout
. Instead, it wraps stdout
in a BufferedReadSeeker
, and when HandleFile
calls rdr.Close()
, it only closes the BufferedReadSeeker
, not stdout
. The Close()
method of BufferedReadSeeker
manages internal resources like returning buffers to a pool or closing temporary files without affecting the underlying stdout
. here
So stdout
remains open until cmd.Wait()
completes in handleBinary
. The deferred function in handleBinary
ensures proper cleanup by reading any remaining data from stdout
and waiting for the command to finish, which effectively closes stdout
.
I feel like this approach effectively separates ownership: HandleFile
manages the custom reader, while handleBinary
is responsible for handling and closing stdout
via cmd.Wait()
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, so HandleFile
takes in an io.Reader
, not io.ReadCloser
. Closing is up to the caller.
Yeah, this is a great approach!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It used to be an io.ReadCloser
, but I confused myself a few times by mistakenly closing the passed-in reader and then trying to close it again at the call site. That's when I decided to change it to an io.Reader
and have the callers handle closing. I think that's maybe why you also thought rdr.Close()
closed the underlying reader.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great! CommandContext is useful, we should update all usage to that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am liking the clearer API boundaries, this is really nice!
Description:
This PR fixes an issue that can cause resource exhaustion when scanning binaries in Git. It ensures the
stdoutPipe
reader passed toHandleFile
is always drained. If the reader isn't fully consumed, thecmd.Wait()
method may hang, failing to release file descriptors.This PR adds a context timeout to the
git cat-file
command, with a brief wait to allow the process time to complete before the process is killed.Error msg:
scanner error: error running git cat-file: pipe2: too many open files
Checklist:
make test-community
)?make lint
this requires golangci-lint)?