Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Incorrect error message being logged #336

Closed
guanzo opened this issue Jun 28, 2023 · 4 comments · Fixed by #338
Closed

Incorrect error message being logged #336

guanzo opened this issue Jun 28, 2023 · 4 comments · Fixed by #338
Assignees
Labels
saturn L1-node Related to filecoin-saturn L1-Nodes

Comments

@guanzo
Copy link
Contributor

guanzo commented Jun 28, 2023

This code assumes the err is ErrExtraneousBlock if it's not EOF, but other errors like context canceled will lead to this confusing log message: error closing car writer: extraneous block in CAR. It makes me think the upstream provider is returning invalid CARs which isn't the case.

_, err = cbr.Next()
if !errors.Is(err, io.EOF) {
return 0, 0, ErrExtraneousBlock
}

I'm able to repro with

lassie daemon -p 8989 --vv

and

curl "localhost:8989/ipfs/bafkreifyxdnkixhcqz4agfx6jekd27ib7kwxvhaddbvh4ycj4n22jj5d7y?dag-scope=entity&format=car" -v

The context canceled error inconsistently occurs if curl has no output option, if you provide an output (-o /dev/null) then it works consistently. I guess curl cancels the request if it can't write the output to somewhere?

@rvagg
Copy link
Member

rvagg commented Jun 29, 2023

Nicely diagnosed. It probably should be err != nil && !errors.Is(err, io.EOF). I'll get that sorted.

@rvagg rvagg self-assigned this Jun 29, 2023
@rvagg
Copy link
Member

rvagg commented Jun 29, 2023

This is wilder than I imagined .. I think ipld/go-ipld-prime#524 is involved in this, an error not propagating from the traversal system. Multiple ways to address this but certainly not as simple as I thought when I first started tackling this.

rvagg added a commit that referenced this issue Jun 29, 2023
Closes: #336

* don't treat context cancellation as an ErrExtraneousBlock in the CAR verifier
* capture and properly handle block load errors that are missed by the
  go-ipld-prime traverser, ref: ipld/go-ipld-prime#524
* fix flaky case(s) in verifiedcar test suite where multi-level sharded
  directory is assumed but only a single block dir is produced
rvagg added a commit that referenced this issue Jun 29, 2023
Closes: #336

* don't treat context cancellation as an ErrExtraneousBlock in the CAR verifier
* capture and properly handle block load errors that are missed by the
  go-ipld-prime traverser, ref: ipld/go-ipld-prime#524
* fix flaky case(s) in verifiedcar test suite where multi-level sharded
  directory is assumed but only a single block dir is produced
@rvagg
Copy link
Member

rvagg commented Jun 29, 2023

This should address the problem: #338, the main fix here is to check for context cancellation errors in that position. Unfortunately testing this specific error path is very hard so it's not included there but I've verified that the correct error comes up for the case you've outlined—that the writer is borked.

While in there I've fixed a few other tangentially related things.

@hannahhoward hannahhoward added the saturn L1-node Related to filecoin-saturn L1-Nodes label Jun 29, 2023
rvagg added a commit that referenced this issue Jul 3, 2023
Closes: #336

* don't treat context cancellation as an ErrExtraneousBlock in the CAR verifier
* capture and properly handle block load errors that are missed by the
  go-ipld-prime traverser, ref: ipld/go-ipld-prime#524
* fix flaky case(s) in verifiedcar test suite where multi-level sharded
  directory is assumed but only a single block dir is produced
@rvagg
Copy link
Member

rvagg commented Jul 3, 2023

#343 is something not resolved by #338 but arises from this issue and should be looked at. It's not urgent, but a goroutine leak could lead to resource exhaustion over time.

rvagg added a commit that referenced this issue Jul 4, 2023
Closes: #336

* don't treat context cancellation as an ErrExtraneousBlock in the CAR verifier
* capture and properly handle block load errors that are missed by the
  go-ipld-prime traverser, ref: ipld/go-ipld-prime#524
* fix flaky case(s) in verifiedcar test suite where multi-level sharded
  directory is assumed but only a single block dir is produced
@rvagg rvagg closed this as completed in #338 Jul 4, 2023
rvagg added a commit that referenced this issue Jul 4, 2023
Closes: #336

* don't treat context cancellation as an ErrExtraneousBlock in the CAR verifier
* capture and properly handle block load errors that are missed by the
  go-ipld-prime traverser, ref: ipld/go-ipld-prime#524
* fix flaky case(s) in verifiedcar test suite where multi-level sharded
  directory is assumed but only a single block dir is produced
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
saturn L1-node Related to filecoin-saturn L1-Nodes
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants