Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

codecs: error upon encountering an 'absent' value during encodes. #196

Closed
wants to merge 1 commit into from

Conversation

warpfork
Copy link
Collaborator

@warpfork warpfork commented Jun 15, 2021

See the changelog entry in the diff for some more explanation.

This is one of the low-hanging-fruit/low-risk steps to address the worst of the issues outlined in #191 . It should fix the most dire of scenarios where accidental misuse of an API could cause lossy data coersions silently.

@warpfork warpfork requested a review from mvdan June 15, 2021 23:27
@warpfork warpfork force-pushed the do-not-coerce-absent-to-null branch from ec4f629 to 678beda Compare June 15, 2021 23:30
@warpfork warpfork requested a review from willscott June 15, 2021 23:34
Copy link
Contributor

@mvdan mvdan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like this. It would have definitely caught our misuse in dealbot, even though it won't catch all misuses of non-repr nodes.

Out of curiosity, you mentioned that one might want to encode a non-repr node for the sake of debugging the non-repr node structure. Is that now kinda abandoned, given how we'd just error on absent fields etc?

@mvdan
Copy link
Contributor

mvdan commented Jun 15, 2021

Also, I just noticed after approving, but... Add a test?

@warpfork
Copy link
Collaborator Author

I like this. It would have definitely caught our misuse in dealbot, even though it won't catch all misuses of non-repr nodes.

Correct on both counts.

you mentioned that one might want to encode a non-repr node for the sake of debugging the non-repr node structure. Is that now kinda abandoned, given how we'd just error on absent fields etc?

Yes. I give up on / walk back on the idea of that "debug" story as a high priority goal. It was a bad idea. This coersion behavior has proven to be infinitely more dangerous and footgunny than the hypothetical utility of the debug behavior.

I'm still a bit sad to give up the idea of that debug story, but ditching it is looking like by far the lesser of two evils.

One can still kind of, even with this change, feed non-repr nodes to encoding, and get that sort of "debug" story... as long as they don't contain optional fields that are absent. But that's sufficiently likely to be a show-stopper for that story to be useful in practice that it's not really worth much; it's more just a coincidence, now.

Copy link
Member

@willscott willscott left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm unconvinced that the answer is to return an error in the common case when we generally know what the end user wants to do. If i am marshaling an instance of a node, having it fail in a data-dependent way because an optional field happens to be absent seems like it's still huge foot gun. My expectation as someone asking to marshal that node is that i get the marshaling of what we call the representation version.
I've become pretty strongly opinionated that we should make that behavior the default because it's what a user is going to expect.

case ipld.Kind_Null:
if n.IsAbsent() {
return fmt.Errorf("cannot encode a node that is absent")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i worry that this error message, which will be at runtime, will be fairly opaque. Consider adding a hint like "perhaps you are trying to encode the non-representation version of a node?"

@mvdan
Copy link
Contributor

mvdan commented Jun 16, 2021

If i am marshaling an instance of a node, having it fail in a data-dependent way because an optional field happens to be absent seems like it's still huge foot gun. My expectation as someone asking to marshal that node is that i get the marshaling of what we call the representation version.

I actually came to the same conclusion this morning, after I was thinking some more about my reply. If encoding a non-repr node is almost never what you want, and it can fail depending on the presence of an absent, and we're giving up on using codecs as debug helpers... Why not just make the codecs consistently grab a node's representation by default?

Here's how I propose to do it: We strongly recommend that any codec, when encoding a node or decoding into a prototype, should check if it's typed. If it is, it should use its representation.

In the very rare case that someone does want to encode or decode the non-repr version, we can offer Prototype and Node wrappers that simply hide the top-level Representation method. This should be as simple as:

type NodeHidingRepr struct {
    ipld.Node
}

node := someTypedNode(...)

dagjson.Encode(node) // uses the representation, as per the default

dagjson.Encode(NodeHidingRepr{node}) // doesn't see a TypedNode, uses the non-repr view

@rvagg
Copy link
Member

rvagg commented Apr 19, 2022

putting this on my list to review and possibly close or merge (tbh I don't even recall this discussion! rough 2021)

@rvagg rvagg self-requested a review April 19, 2022 22:49
@rvagg rvagg self-assigned this Apr 19, 2022
@rvagg
Copy link
Member

rvagg commented May 10, 2022

TODO: does #246 get covered by this?

@rvagg
Copy link
Member

rvagg commented May 23, 2022

Sooooo we've had #232 since this was opened, which kind of addresses this, although we haven't been consistent in using that API (at least I haven't! tbh I missed #232 and keep on defaulting to direct codec usage).

Looking back over #191 that this comes from (as option 1) and considering #232 got added, I don't really buy the case for this, it's exchanging hostility for a notion of correctness; at least it's saying "no, because I'm seeing this thing, you're probably using the API wrong, so have an error", with no room for bargaining.

I'm going to take executive action on this one and close it, for now, and open an issue to adjust documentation throughout go-ipld-prime to recommend usage of the top-level Encode(), Decode() and their *Streaming() partners. Maybe also hunting down some usage across our repos and replace them there to help point people in the right direction instead of copy/pasting our bad usage. I'm pretty guilty of this and can think of many instances of this in the last couple of months.

Dan's suggestion of how to encode the representation form even works for those new APIs so we could potentially document that too.. although as we've said in here and other places, the case for encoding the typed (or ADL) form is pretty minimal.

Of course, if Eric or someone else wants to come back and re-litigate the case for this then you're welcome to.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Status: 🎉 Done
Development

Successfully merging this pull request may close these issues.

4 participants