archive: report error from input stream #2012

Merged
9 changes: 8 additions & 1 deletion pkg/archive/filter.go
@@ -46,7 +46,14 @@ func tryProcFilter(args []string, input io.Reader, cleanup func()) (io.ReadClose
go func() {

Contributor:

This is a bit of a tangent but I am not sure why we're not using
https://pkg.go.dev/os/exec#Cmd.StdinPipe above?

Member Author:

I've not used StdoutPipe because I want to report the error explicitly with CloseWithError. I'll need to check whether StdoutPipe gives the same behavior.
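For illustration, a minimal sketch of the pattern this refers to (the runFilter name and wiring are assumptions, not the actual tryProcFilter code): routing stdout through an io.Pipe lets the goroutine hand a specific error to the consumer via CloseWithError, whereas exec.Cmd.StdoutPipe only exposes a plain ReadCloser that ends in a bare EOF when the command fails.

```go
package filter

import (
	"io"
	"os/exec"
)

// runFilter is a simplified sketch: the command reads from input, and its
// stdout is exposed through an io.Pipe so that the goroutine can report a
// specific error to the pipe's reader with CloseWithError. With
// cmd.StdoutPipe(), a failing command would only surface as a bare EOF.
func runFilter(cmd *exec.Cmd, input io.Reader) io.ReadCloser {
	pr, pw := io.Pipe()
	cmd.Stdin = input
	cmd.Stdout = pw
	go func() {
		err := cmd.Run()
		pw.CloseWithError(err) // CloseWithError(nil) behaves like Close()
	}()
	return pr
}
```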

	err := cmd.Run()
	if err != nil && stderrBuf.Len() > 0 {
-		err = fmt.Errorf("%s: %w", strings.TrimRight(stderrBuf.String(), "\n"), err)
+		b := make([]byte, 1)
+		// if there is an error reading from input, prefer to return that error
+		_, errRead := input.Read(b)

Contributor:

Anyways here...is it generally defined behavior in Go to read from a reader after it's returned EOF? It would seem unusual to cache errors right?

Actually I am now a bit more confused as to the flow here - how did this break the ENOSPC in the first place? And how does this fix fix it? The input here is the http request reader, right? It wouldn't be the thing giving us ENOSPC...

Member Author:

I think it is just a race condition. The decompressor is faster now and detects the error earlier, while before we were relying on catching the ENOSPC before the decompressor.

Contributor:

At a high level I don't understand: why in a test case that we're checking for ENOSPC are we hitting corruption in the compression stream? That doesn't make sense to me.


Different topic: My semi-understanding here is that the previous code was synchronously doing decompression and writes in a single goroutine. That will make the errors quite predictable indeed (in practice, they could only change if the decompressor happened to change to start reading larger chunks or something).

But here, we're suddenly doing decompression in a separate process (but logically, we could have done it in a separate goroutine before, sending decompressed chunks via a pipe or channel) - and it's that parallelization that makes things racy, not speed of decompression itself.

Collaborator:

> Anyways here...is it generally defined behavior in Go to read from a reader after it's returned EOF?

The API does not explicitly say, but I would expect a Reader to infinitely return EOF on end-of-file.

> It would seem unusual to cache errors right?

I wouldn’t generally expect a Read to continue returning a previously-reported error. Some implementations do that, but e.g. for a raw file access, I’d expect it to just issue a syscall again. So, yes, that might plausibly lose data, but we don’t care here — if we don’t see any error, we would report the one from the subprocess.
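As a small standalone illustration of that distinction (the oneShotErrReader type is purely hypothetical): io.EOF is conventionally repeated on every further Read, while a reader is free not to repeat a non-EOF error it has already reported.

```go
package main

import (
	"errors"
	"fmt"
	"io"
	"strings"
)

// oneShotErrReader is a toy Reader that reports an error once and then
// carries on, purely to show that repeating an error is up to each
// implementation, not part of the io.Reader contract.
type oneShotErrReader struct {
	inner io.Reader
	err   error
}

func (r *oneShotErrReader) Read(p []byte) (int, error) {
	if r.err != nil {
		err := r.err
		r.err = nil // forget the error after reporting it once
		return 0, err
	}
	return r.inner.Read(p)
}

func main() {
	buf := make([]byte, 4)

	// io.EOF is, by convention, returned again on every further Read.
	s := strings.NewReader("x")
	s.Read(buf)
	_, err1 := s.Read(buf)
	_, err2 := s.Read(buf)
	fmt.Println(err1, err2) // EOF EOF

	// A non-EOF error may simply not be seen again on the next Read.
	r := &oneShotErrReader{inner: strings.NewReader("y"), err: errors.New("transient failure")}
	_, errA := r.Read(buf)
	_, errB := r.Read(buf)
	fmt.Println(errA, errB) // transient failure <nil>
}
```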


> Actually I am now a bit more confused as to the flow here - how did this break the ENOSPC in the first place? And how does this fix fix it?

> The input here is the http request reader, right? It wouldn't be the thing giving us ENOSPC...

I think here the input is an io.TeeReader actually writing to the restricted-space directory. The storageImageDestination.putBlobToPendingFile pipeline is

HTTP -> … -> Tee -> digest compressed -> decompress -> digest uncompressed
              |
              -> count size -> store to file

and a read from the Tee causes a write to the side branch, which fails with ENOSPC, and is reported to the consumer of the Tee.
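A minimal standalone sketch of that behavior (failingWriter stands in for the side branch hitting ENOSPC; nothing here is the actual storage code): io.TeeReader performs the side-branch Write inside Read, and a write failure is returned to whoever is reading from the Tee.

```go
package main

import (
	"errors"
	"fmt"
	"io"
	"strings"
)

// failingWriter stands in for the "store to file" side branch hitting ENOSPC.
type failingWriter struct{}

func (failingWriter) Write(p []byte) (int, error) {
	return 0, errors.New("no space left on device")
}

func main() {
	src := strings.NewReader("compressed bytes")
	tee := io.TeeReader(src, failingWriter{})

	buf := make([]byte, 16)
	_, err := tee.Read(buf) // this Read performs the side-branch Write
	fmt.Println(err)        // the write error reaches the Tee's consumer
}
```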

(Yes, it would have been nice to have that explanation at the start of the PR, and to avoid the “let’s merge because we are in a rush, what do you mean we don’t understand whether this is actually a fix” controversy.)


> and it's that parallelization that makes things racy, not speed of decompression itself.

It’s not really a race at all, it’s that going across the external process loses the Go error information available for the input of the decompression. The external process is just going to see an EOF on a pipe (or maybe we could send a signal?), there is no way to stuff a “we encountered ENOSPC when providing you data” side channel into the pipe.

Collaborator:

> Anyways here...is it generally defined behavior in Go to read from a reader after it's returned EOF? It would seem unusual to cache errors right?

> It’s not really a race at all, it’s that going across the external process loses the Go error information available for the input of the decompression.

On second thought, I think this should be implemented in a different way, one that is clearly correct, reporting all errors and not introducing any risk of extra hangs:

Wrap the input stream in a new errorRecordingReader, which stores the first error value reported, if any. Pass that wrapped stream as cmd.Stdin. Then we can consult the reported error if cmd.Run() fails. (cmd.Run() ensures the goroutine consuming cmd.Stdin terminates before returning, so we don’t need to worry about concurrency.)

(It’s a bit ironic that cmd.Run() already has such an error-recording code in cmd.childStdin; but that error is only reported if cmd returns with exit status 0, the ExitError cause is preferred.)
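A rough sketch of that suggestion as described (errorRecordingReader is the name from the comment; runWithRecordedInputError and the wiring around it are assumed for illustration):

```go
package filter

import (
	"io"
	"os/exec"
)

// errorRecordingReader wraps an io.Reader and remembers the first non-EOF
// error it returned, so that error can be consulted after the command exits.
type errorRecordingReader struct {
	r   io.Reader
	err error
}

func (e *errorRecordingReader) Read(p []byte) (int, error) {
	n, err := e.r.Read(p)
	if e.err == nil && err != nil && err != io.EOF {
		e.err = err
	}
	return n, err
}

// runWithRecordedInputError shows the wiring: cmd.Run() only returns after
// the goroutine feeding cmd.Stdin has finished, so reading wrapped.err
// afterwards needs no extra synchronization.
func runWithRecordedInputError(cmd *exec.Cmd, input io.Reader) error {
	wrapped := &errorRecordingReader{r: input}
	cmd.Stdin = wrapped
	if err := cmd.Run(); err != nil {
		if wrapped.err != nil {
			return wrapped.err // prefer the error from the input stream
		}
		return err
	}
	return nil
}
```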

Collaborator:

> I think this should be implemented in a different way

I need to focus on something else; I have filed at least a placeholder #2022 for visibility.

Contributor:

> and a read from the Tee causes a write to the side branch, which fails with ENOSPC, and is reported to the consumer of the Tee.

Thank you! That makes sense to me now. I was unaware of the use of Tee here as I currently just have a superficial awareness of this code, but I'm here to learn. The "reads cause writes via side effect" is definitely not something I am used to seeing.

This is a tangent...and I am sure you guys know what's coming next 😄 but in Go, TeeReader is pretty popular whereas there's no such thing in the Rust stdlib, and while people have written it, it's really vanishingly rare comparatively. It's way more common IME to do this type of thing via a custom Write implementation that writes to two places; e.g. ocidir-rs (my code) and there's similar in coreos-installer I think.

I suspect one reason for this is the "ownership" of things and error handling is much clearer if there's a single "super writer" instead of a tee.
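For comparison, a Go sketch of that "single writer that writes to two places" shape (using io.MultiWriter; the destinations are placeholders, not anything from this repository): errors from either destination surface at the single Copy/Write call site rather than as a side effect of a Read.

```go
package main

import (
	"crypto/sha256"
	"fmt"
	"io"
	"os"
	"strings"
)

func main() {
	// One "super writer": every Write goes to both the digester and the
	// file, and an error from either destination (e.g. ENOSPC from the
	// file) is reported right at the Copy call below.
	digester := sha256.New()

	f, err := os.CreateTemp("", "blob-")
	if err != nil {
		panic(err)
	}
	defer os.Remove(f.Name())
	defer f.Close()

	dst := io.MultiWriter(digester, f)

	if _, err := io.Copy(dst, strings.NewReader("payload")); err != nil {
		fmt.Println("write failed:", err)
		return
	}
	fmt.Printf("sha256: %x\n", digester.Sum(nil))
}
```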

Collaborator (@mtrmac, Jul 12, 2024):

*shrug* STREAMS (…) are hard. Go has ended up with two separate Reader and Writer abstractions, which nicely compose individually, but connecting a codebase built around a Reader to a codebase built around a Writer requires an awkward API translation layer; that might be io.Copy or io.TeeReader (easy), or a separate goroutine and an io.Pipe (doable but annoying and comparatively risky).

Whether something interfaces using a Reader or a Writer is frequently, but not always, fairly arbitrary, but often out of control of the caller. Here, the decompression is built around Readers, and using a TeeReader to adapt in that direction is much easier than using pipes to adapt in the other direction.

[This is sort of isomorphic to the “XML streaming API design” dilemma, where, without concurrency, it’s easy to write an XML parser triggering hard-to-consume callbacks, vs. it’s harder to write an XML parser which returns an easy-to-consume stream of events; with XML it’s so much more visible because the API surface is much larger than just some byte arrays and errors.]
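To illustrate the two adaptation directions (consumeReader and the writers are placeholders): folding a Writer-based side branch into a Reader-based pipeline is a one-liner with io.TeeReader, while driving a Reader-based consumer from Writer-shaped code needs an io.Pipe, a second goroutine, and explicit error propagation on close.

```go
package main

import (
	"fmt"
	"io"
	"strings"
)

// consumeReader stands in for Reader-oriented code, e.g. a decompressor
// pulling from an io.Reader.
func consumeReader(r io.Reader) error {
	_, err := io.Copy(io.Discard, r)
	return err
}

func main() {
	// Easy direction: a Writer-based side branch (io.Discard here) folds
	// into the Reader-based pipeline via io.TeeReader, in one goroutine.
	if err := consumeReader(io.TeeReader(strings.NewReader("data"), io.Discard)); err != nil {
		fmt.Println(err)
	}

	// Awkward direction: feeding the Reader-based consumer from Writer-shaped
	// code needs an io.Pipe and a second goroutine, and the producer has to
	// remember to propagate its error with CloseWithError.
	pr, pw := io.Pipe()
	go func() {
		_, err := io.Copy(pw, strings.NewReader("data")) // stand-in for a Writer-driven producer
		pw.CloseWithError(err)
	}()
	if err := consumeReader(pr); err != nil {
		fmt.Println(err)
	}
}
```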

Member Author:

> Anyways here...is it generally defined behavior in Go to read from a reader after it's returned EOF? It would seem unusual to cache errors right?
>
> It’s not really a race at all, it’s that going across the external process loses the Go error information available for the input of the decompression.
>
> On second thought, I think this should be implemented in a different way, one that is clearly correct, reporting all errors and not introducing any risk of extra hangs:
>
> Wrap the input stream in a new errorRecordingReader, which stores the first error value reported, if any. Pass that wrapped stream as cmd.Stdin. Then we can consult the reported error if cmd.Run() fails. (cmd.Run() ensures the goroutine consuming cmd.Stdin terminates before returning, so we don’t need to worry about concurrency.)
>
> (It’s a bit ironic that cmd.Run() already has such an error-recording code in cmd.childStdin; but that error is only reported if cmd returns with exit status 0, the ExitError cause is preferred.)

thanks for the hint. Opened a PR: #2025

+		if errRead != nil && errRead != io.EOF {
+			err = errRead
+		} else {
+			err = fmt.Errorf("%s: %w", strings.TrimRight(stderrBuf.String(), "\n"), err)
+		}
	}
	w.CloseWithError(err) // CloseWithErr(nil) == Close()
	cleanup()