Disable conflicting encode options when marshaling cbor.Tag. #546

Merged on Jun 7, 2024 (1 commit)

Conversation

benluddy
Contributor

@benluddy benluddy commented May 31, 2024

Closes #545

Encode options, especially those that control the mapping from Go type to CBOR type, can result in output containing tag validity errors. For tag numbers that are built in, it's possible to "do the right thing" and override those options on a case-by-case basis. This can't and does not prevent tag validity errors for unrecognized tag numbers.
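
As a rough illustration of the problem and the fix described above, here is a toy, standard-library-only encoder. It is not the library's code: `toyOptions`, `bytesAsText`, `encodeTagNaive`, and `encodeTagSafe` are hypothetical names invented for this sketch, and only single-byte heads (arguments below 24) are supported. It assumes an option that changes how `[]byte` is mapped to a CBOR type, and shows how overriding that option for tag numbers 2 and 3 (bignums, whose content must be a byte string) keeps the output valid.

```go
package main

import (
	"encoding/hex"
	"fmt"
)

// CBOR major types used in this sketch (RFC 8949, section 3.1).
const (
	majorByteString = 2
	majorTextString = 3
	majorTag        = 6
)

// toyOptions is a hypothetical stand-in for encode options.
// It is NOT the library's EncOptions type.
type toyOptions struct {
	// bytesAsText makes []byte encode as a text string instead of a byte string.
	bytesAsText bool
}

// head packs a major type and an argument below 24 into one initial byte.
// Larger arguments are out of scope for this sketch.
func head(major byte, arg uint64) byte {
	if arg >= 24 {
		panic("sketch only supports arguments below 24")
	}
	return major<<5 | byte(arg)
}

// encodeBytes encodes b as a byte string or, if the option is set, a text string.
func encodeBytes(o toyOptions, b []byte) []byte {
	major := byte(majorByteString)
	if o.bytesAsText {
		major = majorTextString
	}
	return append([]byte{head(major, uint64(len(b)))}, b...)
}

// encodeTagNaive applies the caller's options to the tag content unchanged.
func encodeTagNaive(o toyOptions, num uint64, content []byte) []byte {
	return append([]byte{head(majorTag, num)}, encodeBytes(o, content)...)
}

// encodeTagSafe disables the conflicting option for tag numbers 2 and 3,
// whose content must be a byte string. Other tag numbers are left alone,
// so validity errors for unrecognized tags are still possible.
func encodeTagSafe(o toyOptions, num uint64, content []byte) []byte {
	switch num {
	case 2, 3:
		o.bytesAsText = false // o is a copy; the caller's options are untouched
	}
	return append([]byte{head(majorTag, num)}, encodeBytes(o, content)...)
}

// hexOf renders encoded bytes as lowercase hex for easy comparison.
func hexOf(b []byte) string { return hex.EncodeToString(b) }

func main() {
	o := toyOptions{bytesAsText: true}
	fmt.Println(hexOf(encodeTagNaive(o, 2, []byte{0x01}))) // c26101: tag 2 wrapping a text string (invalid)
	fmt.Println(hexOf(encodeTagSafe(o, 2, []byte{0x01})))  // c24101: tag 2 wrapping a byte string (valid)
}
```

The override works on a copy of the options, so the special case stays local to the tag content and does not change how the rest of the value is encoded.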

Description

PR Was Proposed and Welcomed in Currently Open Issue

Checklist (for code PR only, ignore for docs PR)

  • Include unit tests that cover the new code
  • Pass all unit tests
  • Pass all lint checks in CI (goimports, gosec, staticcheck, etc.)
  • Sign each commit with your real name and email.
    Last line of each commit message should be in this format:
    Signed-off-by: Firstname Lastname [email protected]
  • Certify the Developer's Certificate of Origin 1.1
    (see next section).

Certify the Developer's Certificate of Origin 1.1

  • By marking this item as completed, I certify
    the Developer Certificate of Origin 1.1.
Developer Certificate of Origin
Version 1.1

Copyright (C) 2004, 2006 The Linux Foundation and its contributors.
660 York Street, Suite 102,
San Francisco, CA 94110 USA

Everyone is permitted to copy and distribute verbatim copies of this
license document, but changing it is not allowed.

Developer's Certificate of Origin 1.1

By making a contribution to this project, I certify that:

(a) The contribution was created in whole or in part by me and I
    have the right to submit it under the open source license
    indicated in the file; or

(b) The contribution is based upon previous work that, to the best
    of my knowledge, is covered under an appropriate open source
    license and I have the right under that license to submit that
    work with modifications, whether created in whole or in part
    by me, under the same open source license (unless I am
    permitted to submit under a different license), as indicated
    in the file; or

(c) The contribution was provided directly to me by some other
    person who certified (a), (b) or (c) and I have not modified
    it.

(d) I understand and agree that this project and the contribution
    are public and that a record of the contribution (including all
    personal information I submit with it, including my sign-off) is
    maintained indefinitely and may be redistributed consistent with
    this project or the open source license(s) involved.

@benluddy benluddy marked this pull request as ready for review May 31, 2024 20:18
@benluddy
Contributor Author

@fxamacker PTAL. I wanted to get a proposed solution up before the end of the week in order to avoid delaying the next release. I'm happy to resolve this in a different way if you prefer.

Owner

@fxamacker fxamacker left a comment

@benluddy Thanks for opening this PR and great discussion in the issue comments.

For unrecognized tag numbers, the library does not have enough information to detect tag validity errors. The application ultimately has to be responsible for validity errors in any special Tag values it directly creates and marshals.

I agree!

To address the immediate issue, I propose special-casing this in encodeTag (and maintaining similar special cases as necessary to make Tag marshaling "do the right thing" for the built-in tags).

I like this approach. It is effective without breaking changes.

In addition to current built-in tags, I wonder if we should special case more tags defined in RFC 8949 section 3.4 table 5, specifically CBOR tags 24, 32, 33, 34, and 36.

It is very likely that the codec will have built-in support for more tags defined in RFC 8949. It would be safer to disable settings that might create tag validity issues for those tags.

Thoughts?
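
For reference, the content types that RFC 8949 section 3.4, table 5 requires for the tags listed above can be written down as a small lookup table. The sketch below is only an illustration of what special-casing those tags might need to know; `requiredContent` and `checkTagContent` are hypothetical names, not part of the library, and the check looks only at the major type of the first content byte.

```go
package main

import (
	"errors"
	"fmt"
)

// CBOR major types (RFC 8949, section 3.1).
const (
	majorByteString = 2
	majorTextString = 3
)

// requiredContent maps tag numbers to the major type their content must have,
// per RFC 8949 section 3.4, table 5. Hypothetical helper for this sketch.
var requiredContent = map[uint64]byte{
	24: majorByteString, // encoded CBOR data item
	32: majorTextString, // URI
	33: majorTextString, // base64url
	34: majorTextString, // base64
	36: majorTextString, // MIME message
}

// checkTagContent returns an error when the encoded content of a listed tag
// starts with the wrong major type. Unlisted tag numbers are not checked.
func checkTagContent(num uint64, encodedContent []byte) error {
	want, known := requiredContent[num]
	if !known {
		return nil
	}
	if len(encodedContent) == 0 {
		return errors.New("empty tag content")
	}
	if got := encodedContent[0] >> 5; got != want {
		return fmt.Errorf("tag %d: content has major type %d, want %d", num, got, want)
	}
	return nil
}

func main() {
	fmt.Println(checkTagContent(32, []byte{0x61, 0x61})) // <nil>
	fmt.Println(checkTagContent(32, []byte{0x41, 0x61})) // tag 32: content has major type 2, want 3
}
```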

@benluddy
Contributor Author

benluddy commented Jun 3, 2024

In addition to current built-in tags, I wonder if we should special case more tags defined in RFC 8949 section 3.4 table 5, specifically CBOR tags 24, 32, 33, 34, and 36.

It is very likely that the codec will have built-in support for more tags defined in RFC 8949. It would be safer to disable settings that might create tag validity issues for those tags.

Thoughts?

It's a good question. I'm wary by default about maintaining more special cases than necessary. The behavior of the default options hasn't changed, so at least existing programs that are processing those tags using Tag will continue to work. Special-casing the tags that are built-in and always validated strikes me as necessary because it prevents the library from marshaling data that it can't itself unmarshal. For other tags, my preference would be to help users migrate to an option that uses RawTag as the dynamic type of interface{} values when decoding unrecognized tags, which guarantees that roundtripping won't introduce any new tag validity errors for those tags. I'd also prefer to add warnings about this to Tag, or even deprecate it. In fact, I will add documentation to Tag as part of this change to mention that options can affect the marshaling and unmarshaling of its content field.
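
The roundtripping argument can be sketched with a toy type that keeps tag content as already-encoded bytes. This is not the library's RawTag; `rawTag`, `decodeRawTag`, and `encode` are hypothetical names, and the decoder assumes the input is exactly one tagged item with a tag number below 24. Because the content is never re-encoded, no encode option can change its CBOR type.

```go
package main

import (
	"encoding/hex"
	"errors"
	"fmt"
)

// rawTag is a hypothetical type for this sketch. Content holds the tag
// content exactly as it appeared on the wire.
type rawTag struct {
	Number  uint64
	Content []byte
}

// decodeRawTag splits a single tagged item into its tag number and raw content.
// It only handles tag numbers below 24 (one-byte heads).
func decodeRawTag(data []byte) (rawTag, error) {
	if len(data) < 2 || data[0]>>5 != 6 || data[0]&0x1f >= 24 {
		return rawTag{}, errors.New("expected a tag with a number below 24 followed by content")
	}
	// Copy the content so later changes to data do not affect the result.
	return rawTag{Number: uint64(data[0] & 0x1f), Content: append([]byte(nil), data[1:]...)}, nil
}

// encode writes the tag head and then the content bytes verbatim.
func (t rawTag) encode() []byte {
	return append([]byte{6<<5 | byte(t.Number)}, t.Content...)
}

// hexOf renders encoded bytes as lowercase hex.
func hexOf(b []byte) string { return hex.EncodeToString(b) }

func main() {
	in := []byte{0xc2, 0x41, 0x01} // tag 2 wrapping a one-byte byte string
	rt, err := decodeRawTag(in)
	if err != nil {
		panic(err)
	}
	fmt.Println(hexOf(rt.encode())) // c24101
}
```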

Eventually (v3 wish list?), I think it would be great to arrive at a place like:

  • registering custom tag marshaling, unmarshaling, and validation is flexible enough to handle all existing built-ins
  • built-in tags use the same mechanism as user-registered tags
  • generic/unrecognized tag marshaling, unmarshaling, and validation is wired to the registry of known tags such that even marshaling a directly-constructed tag and unmarshaling directly into a tag can return a validity error
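
One way to picture the wish list above is a single registry that both built-in and user-registered tags go through. The sketch below is purely illustrative and assumes nothing about any future library API: `tagRegistry`, `contentValidator`, `register`, `validate`, and `requireMajor` are all hypothetical names, and validation is reduced to checking the major type of the encoded content.

```go
package main

import (
	"errors"
	"fmt"
)

// contentValidator checks already-encoded tag content. Hypothetical type.
type contentValidator func(encodedContent []byte) error

// tagRegistry maps tag numbers to validators. Built-in and user-defined
// tags would be registered the same way.
type tagRegistry struct {
	validators map[uint64]contentValidator
}

func newTagRegistry() *tagRegistry {
	return &tagRegistry{validators: make(map[uint64]contentValidator)}
}

// register adds a validator for a tag number and rejects duplicates.
func (r *tagRegistry) register(num uint64, v contentValidator) error {
	if _, exists := r.validators[num]; exists {
		return fmt.Errorf("tag %d is already registered", num)
	}
	r.validators[num] = v
	return nil
}

// validate runs the registered validator, if any. Unregistered tag numbers
// pass, because nothing is known about their content.
func (r *tagRegistry) validate(num uint64, encodedContent []byte) error {
	v, ok := r.validators[num]
	if !ok {
		return nil
	}
	if err := v(encodedContent); err != nil {
		return fmt.Errorf("tag %d validity error: %w", num, err)
	}
	return nil
}

// requireMajor builds a validator that checks the content's CBOR major type.
func requireMajor(want byte) contentValidator {
	return func(encodedContent []byte) error {
		if len(encodedContent) == 0 {
			return errors.New("empty content")
		}
		if got := encodedContent[0] >> 5; got != want {
			return fmt.Errorf("content has major type %d, want %d", got, want)
		}
		return nil
	}
}

func main() {
	r := newTagRegistry()
	_ = r.register(2, requireMajor(2))  // stands in for a built-in tag (bignum, byte string content)
	_ = r.register(32, requireMajor(3)) // stands in for a user-registered tag (URI, text string content)

	fmt.Println(r.validate(2, []byte{0x61, 0x01})) // tag 2 validity error: content has major type 3, want 2
	fmt.Println(r.validate(2, []byte{0x41, 0x01})) // <nil>
}
```

With a registry like this, the generic tag path could call `validate` before writing or after reading content, so a directly constructed tag would get the same check as a built-in one.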

Encode options, especially those that control the mapping from Go type to CBOR type, can result in
output containing tag validity errors. For tag numbers that are built in, it's possible to "do the
right thing" and override those options on a case-by-case basis. This can't and does not prevent tag
validity errors for unrecognized tag numbers.

Signed-off-by: Ben Luddy <[email protected]>
Owner

@fxamacker fxamacker left a comment

Sounds good! Thanks for adding comments and updating tests.

@fxamacker fxamacker merged commit 0e2d14e into fxamacker:master Jun 7, 2024
17 checks passed
Development

Successfully merging this pull request may close these issues.

Using ByteSliceMode (unreleased new feature) can encode invalid CBOR data