Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

time: enforce strict parsing for RFC3339 for Time.Unmarshal #54580

Closed
dsnet opened this issue Aug 22, 2022 · 37 comments
Closed

time: enforce strict parsing for RFC3339 for Time.Unmarshal #54580

dsnet opened this issue Aug 22, 2022 · 37 comments
Labels
FrozenDueToAge NeedsFix The path to resolution is known, but the work has not been done. release-blocker
Milestone

Comments

@dsnet
Copy link
Member

dsnet commented Aug 22, 2022

When calling:

time.Parse(time.RFC3339, s)

one would expect that this strictly parses s according to RFC 3339 as the constant suggests.
This parses all valid RFC 3339 timestamps (except those with leap-seconds),
however it fails to reject certain inputs that are not valid RFC 3339.

Parse differs from the RFC 3339 syntax in the following ways:

  1. Parse permits the hour fields to only have one digit, while RFC 3339 requires that it always be two digits.

  2. Parse permits a comma separator between seconds and sub-seconds, while RFC 3339 only permits a period.

  3. Parse permits time zone offsets with any range of values, while RFC 3339 limits hours to be less than 24 and minutes to be less than 60.

    • For example, 0000-01-01T00:00:00+24:60 is considered valid, when it should not be.

In all situations, the time format template provides no way to strictly parse RFC 3339.

Since we can't change the text template meaning of "15" and ".",
nor can we change the constant value of RFC3339 or RFC3339Nano,
I propose that we special-case Parse to use strict parsing when handling a format template
identical to RFC3339 and RFC3339Nano.

Justification for this special-casing:

  • RFC 3339 is by far the most popular time representation.
  • RFC 3339 exists because ISO 8601 had too many alternate representations.
    Being loose in what Go considers "valid" RFC 3339 is going against the goal of RFC 3339.

\cc @robpike

@apparentlymart
Copy link

I like the goal of having a strict parser for this format, but it feels surprising to me for Parse to retroactively treat specific format strings as special.

I would find it more intuitive to have a separate function specifically for parsing RFC 3339 per its specification, without any implication that it's "just another format string". Perhaps that could be accompanied by a deprecation of the constant RFC3339 to discourage its use in favor of the function, so that the existing constant will be less of an attractive nuisance for those new to the package looking for an RFC3339 parser. 🤔

This would also avoid changing behavior for any existing users of the current API, allowing existing callers to adopt the new function when they are ready.

@dsnet
Copy link
Member Author

dsnet commented Sep 27, 2022

The UnmarshalText and UnmarshalJSON methods are already documented as operating according to RFC 3339, what do we want to do with those? We either:

  1. Say that the Unmarshal methods are stuck on old incorrect behavior, OR
  2. We accept the fact that we have to change existing behavior.

I'm sympathetic to not changing the Parse function since special-cases are not pretty. I feel more strongly that we change the behavior of the Unmarshal methods, otherwise the unmarshaling of time.Time will perpetually be non-compliant with RFC 3339.

As for whether we are allowed to make this change, the Go 1 compatibility document says:

Specification errors. If it becomes necessary to address an inconsistency or incompleteness in the specification, resolving the issue could affect the meaning or legality of existing programs. We reserve the right to address such issues, including updating the implementations. Except for security issues, no incompatible changes to the specification would be made.

Since functionality that claims to be RFC 3339 do not operate according to RFC 3339, this is an issue of "legality" and the document gives the package the "right to address such issues".

Also, this could be an interesting use of the GODEBUG flags discussed in #55090. You could imagine having a GODEBUG=timestrictrfc3339=0 override that opts into the older behavior.

@apparentlymart
Copy link

I don't typically like to get into the business of rules-lawyering based on exactly what the docstrings say, but this seems like a fiddly enough case that using the existing documentation as guidance might be relevant.

The time.Parse function is documented as parsing a timestamp in a general way based on a format string with a particular syntax. The specific thing that would "feel surprising" to me is for that to retroactively gain an exception like "...unless the format string is exactly what's in time.RFC3339, in which case it uses a different parser".

On the other hand, the UnmarshalText method is explicit in its documentation that it expects RFC 3339 format and says nothing about how it achieves that. The fact that it currently uses time.Parse with time.RFC3339 is an implementation detail rather than an explicit contract.

The time.RFC3339 constant has no direct documentation of its own because all of the time format constants are documented together in one large documentation comment. The name of the symbol certainly implies that it's intending to be a description of the RFC3339 format though.

Based purely on the contract described in the docs then, it seems reasonable to me that there be a new time.ParseRFC3339 function which implements exactly the grammar in the RFC, and then UnmarshalText's implementation changes to call ParseRFC3339 instead. The documented contract for both Parse and UnmarshalText is preserved, and anyone who wants to strictly accept RFC 3339 could either use MarshalText (to rely on that documented contract) or could call time.ParseRFC3339 directly.

Any existing callers using time.Parse will still get the behavior as originally documented, which seems to be the closest superset of RFC3339 that the documented format syntax is capable of describing.

I would agree that the existence of the time.RFC3339 constant itself seems to be a "specification error" in the sense that the format string syntax used by time.Parse is not actually capable of representing the RFC 3339 requirements. That seems like reasonable justification for deprecating it in favor of a function that does correctly implement RFC 3339, even though there is no possible format string (per the currently-documented syntax) that can accurately describe those requirements. (I suppose this strategy would also require adding a Time.FormatRFC3339 method, because deprecating the constant would also deprecate using time.Format(time.RFC3339) even though that format string is technically capable of producing valid RFC 3339 strings.)

While I do appreciate that it would be possible to use GODEBUG to temporarily dynamically recover the existing implementation, that doesn't feel quite right to me in this case because:

  • Different parts of a program are likely to have differing requirements.

    For example, HashiCorp Terraform has a public-facing function which is documented to accept RFC 3339 strings but uses time.Parse(time.RFC3339, ...) and so is accidentally currently accepting the superset described by that format string. Terraform is also subject to compatibility promises for its own functions and so Terraform may need to, at least for some time, retain the more liberal parser and potentially generate deprecation warnings for anyone relying on the old implementation. However, other parts of Terraform consume wire formats with different tradeoffs where stricter parsing would not be problematic in the same way and would be better to switch to stricter parsing earlier.

  • The time.Parse function is specified not as an RFC 3339 parser but as a general parser for various time formats that can be described by a special microsyntax. Therefore it feels a bit dishonest to say that its failure to handle parsing rules outside of the bounds of its documented syntax is a "specification error": it's behaving as specified when given the particular format string that is specified in time.RFC3339.

    I think the time.RFC3339 constant is reasonable to describe as a specification error, because its name implies a promise it's unable to keep. However, a constant doesn't have any inherent behavior of its own so a GODEBUG-based approach wouldn't work for that, whereas deprecation in favor of a new explicit function could potentially work.

@dsnet
Copy link
Member Author

dsnet commented Sep 27, 2022

It seems we're in agreement that the Unmarshal methods should change behavior. I'm not quite fond of adding new API like ParseRFC3339. If we change the Unmarshal methods, that may be sufficient. The way to call it will be differ:

  • ParseRFC3339 would return a time.Time, while UnmarshalText mutates the receiver,
  • ParseRFC3339 would accept a string, while UnmarshalText accepts a []byte.

Different parts of a program are likely to have differing requirements.

That's true of all the GODEBUG flags. It is a global change of behavior that may be inconsistent in what a particular library wants. The use of GODEBUG will not fix all cases of changed behavior in the standard library but it provides users a knob that is better than nothing.

@apparentlymart
Copy link

Indeed, I think it's defensible that the current implementation of UnmarshalText is buggy in that it claims to implement RFC 3339 but it doesn't. It may still have impact on existing users, but in a way that can be clearly justified in terms of its existing specification.

It is indeed true that all GODEBUG flags are global in nature, but that's a more appropriate solution for some situations than others. For example, using GODEBUG to choose between resolver implementations seems reasonable because looking up hostnames is typically (though certainly not always) a global concern that the entire program should agree on. I hesitate to consider it an appropriate solution for all possible changes to the standard library, since there are various situations where the decision is inherently more localized.

My sense is that GODEBUG has traditionally been a last resort for situations where the new behavior is "obviously better" but where some small number of programs may need to retain the old behavior. I guess I'm not convinced that a special case in Parse to handle something it's never claimed to support is "obviously better", and that some specialized API that directly addresses the evident use-case (strictly parsing RFC 3339 exactly as it's specified) is both better for backward compatibly and, by my estimation, easier to learn and understand than a special exception buried inside the Parse docstring. (This "easier to learn and understand" is assuming that time.RFC3339 would also be deprecated so that authors will see feedback if they notice it and try to use it. Otherwise it's likely to be an attractive nuisance.)

@gopherbot
Copy link
Contributor

Change https://go.dev/cl/444277 mentions this issue: time: implement strict RFC 3339 during marshal and unmarshal

@dsnet
Copy link
Member Author

dsnet commented Oct 19, 2022

I sent out https://go.dev/cl/444277 which only special-cases strict RFC 3339 for the Marshal and Unmarshal methods since they explicitly say that they operate according to RFC 3339.

There is also existing precedence for strict checking since the Marshal methods already check that the year is within the right range.

In my opinion, modifying just Marshal and Unmarshal is a justifiable bug fix that doesn't require a proposal anymore.

gopherbot pushed a commit that referenced this issue Oct 20, 2022
We add strict checking to marshal and unmarshal methods,
rather than Parse to maintain compatibility in Parse behavior.
Also, the Time.Format method has no ability to report errors.

The Time.Marshal{Text,JSON} and Time.Unmarshal{Time,JSON} methods
are already documented as complying with RFC 3339, but have
edge cases on both marshal and unmarshal where it is incorrect.
The Marshal methods already have at least one check to comply
with RFC 3339, so it seems sensible to expand this to cover
all known violations of the specification.
This commit fixes all known edge cases for full compliance.

Two optimizations are folded into this change:

	1. parseRFC3339 is made generic so that it can operate
	   directly on a []byte as well as string.
	   This avoids allocating or redundant copying
	   when converting from string to []byte.

	2. When marshaling, we verify for correctness based
	   on the serialized output, rather than calling
	   attribute methods on the Time type. For example,
	   it is faster to check that the 5th byte is '-'
	   rather than check that Time.Year is within [0,9999],
	   since Year is a relatively expensive operation.

Performance:

	name            old time/op  new time/op  delta
	MarshalJSON     109ns ± 2%    99ns ± 1%   -9.43%  (p=0.000 n=10+10)
	UnmarshalText   158ns ± 4%   143ns ± 1%   -9.17%  (p=0.000 n=9+9)

Updates #54580
Updates #54568
Updates #54571

Change-Id: I1222e45a7625d1ffd0629be5738670a84188d301
Reviewed-on: https://go-review.googlesource.com/c/go/+/444277
Reviewed-by: David Chase <[email protected]>
Reviewed-by: Ian Lance Taylor <[email protected]>
Run-TryBot: Joseph Tsai <[email protected]>
TryBot-Result: Gopher Robot <[email protected]>
@dsnet
Copy link
Member Author

dsnet commented Oct 20, 2022

I'm withdrawing this proposal. I think fixing the marshal/unmarshal methods is sufficient.

@dsnet dsnet closed this as completed Oct 20, 2022
romaindoumenc pushed a commit to TroutSoftware/go that referenced this issue Nov 3, 2022
We add strict checking to marshal and unmarshal methods,
rather than Parse to maintain compatibility in Parse behavior.
Also, the Time.Format method has no ability to report errors.

The Time.Marshal{Text,JSON} and Time.Unmarshal{Time,JSON} methods
are already documented as complying with RFC 3339, but have
edge cases on both marshal and unmarshal where it is incorrect.
The Marshal methods already have at least one check to comply
with RFC 3339, so it seems sensible to expand this to cover
all known violations of the specification.
This commit fixes all known edge cases for full compliance.

Two optimizations are folded into this change:

	1. parseRFC3339 is made generic so that it can operate
	   directly on a []byte as well as string.
	   This avoids allocating or redundant copying
	   when converting from string to []byte.

	2. When marshaling, we verify for correctness based
	   on the serialized output, rather than calling
	   attribute methods on the Time type. For example,
	   it is faster to check that the 5th byte is '-'
	   rather than check that Time.Year is within [0,9999],
	   since Year is a relatively expensive operation.

Performance:

	name            old time/op  new time/op  delta
	MarshalJSON     109ns ± 2%    99ns ± 1%   -9.43%  (p=0.000 n=10+10)
	UnmarshalText   158ns ± 4%   143ns ± 1%   -9.17%  (p=0.000 n=9+9)

Updates golang#54580
Updates golang#54568
Updates golang#54571

Change-Id: I1222e45a7625d1ffd0629be5738670a84188d301
Reviewed-on: https://go-review.googlesource.com/c/go/+/444277
Reviewed-by: David Chase <[email protected]>
Reviewed-by: Ian Lance Taylor <[email protected]>
Run-TryBot: Joseph Tsai <[email protected]>
TryBot-Result: Gopher Robot <[email protected]>
@liggitt
Copy link
Contributor

liggitt commented Jan 18, 2023

There is also existing precedence for strict checking since the Marshal methods already check that the year is within the right range.

In my opinion, modifying just Marshal and Unmarshal is a justifiable bug fix that doesn't require a proposal anymore.

Just got here via #57897

I'm not sure why wire-serialization changes wouldn't need compatibility consideration. We want people to be able to update go versions with very high confidence, and "ensure nothing calling your API written in go is hitting these RFC3339 edge cases" is a pretty big obstacle to doing that.

@bcmills
Copy link
Contributor

bcmills commented Jan 18, 2023

I'm not sure why wire-serialization changes wouldn't need compatibility consideration.

To be clear: @dsnet didn't say that it doesn't need compatibility consideration. He said that it doesn't require a proposal.

As we do with other bug-fixes that impact compatibility, I think it would be a good idea to add a GODEBUG setting to allow UnmarshalJSON to accept the invalid date strings that were previously accepted due to the bug.

@liggitt
Copy link
Contributor

liggitt commented Jan 18, 2023

I'm not sure why wire-serialization changes wouldn't need compatibility consideration.

To be clear: @dsnet didn't say that it doesn't need compatibility consideration. He said that it doesn't require a proposal.

Oh, I didn't realize that was used in a formal sense. I was just confused why a change to time.Time#MarshalJSON/UnmarshalJSON would require a different level of consideration than a change to time.Parse.

As we do with other bug-fixes that impact compatibility, I think it would be a good idea to add a GODEBUG setting to allow UnmarshalJSON to accept the invalid date strings that were previously accepted due to the bug.

that could be good, especially if paired with:

  • a clearer description in the go1.20 release notes of the specific data that would no longer be accepted (so people can detect whether they are impacted)
  • a description of the code changes required if someone needed to maintain the pre-1.20 behavior long term (maybe use time.Parse directly?)

@rsc
Copy link
Contributor

rsc commented Jan 18, 2023

Given that this is a "visible behavior change in existing functionality" (quoting https://go.dev/s/proposal), I think this change should have gone through the proposal process to allow a discussion about the pros and cons of making a breaking change like this among anyone interested. That's the entire role of the proposal process.

While we could patch it with a GODEBUG at this late stage, we could also revert for Go 1.20 and do the real proposal process for Go 1.21. Thoughts?

@liggitt
Copy link
Contributor

liggitt commented Jan 18, 2023

If there's not a compelling reason this must be in 1.20, reverting for 1.20 and working through a proposal for how to improve RFC compliance in 1.21 (ideally for Time#MarshalJSON/UnmarshalJSON and time.Parse) makes sense to me.

@dsnet
Copy link
Member Author

dsnet commented Jan 18, 2023

Given that this is a "visible behavior change in existing functionality"

Strictly speaking, under that definition, every bug fix is a an observable change and would therefore need to go through the proposal process. This issue originally proposed changing the behavior of Parse to have behavior different than what is specified, which would obviously need to go through the proposal process. In the end, I only changed the behavior of Time.UnmarshalText and Time.UnmarshalJSON to match the already specified behavior, which is to accept RFC 3339. In that situation, this falls under the category of a bug fix.

The purpose of specifications like RFC 3339 is so that different implementations (whether in Go, Rust, Python, or otherwise) can agree on what is a valid timestamp. Other protocols are built upon this and rely on the syntactic agreement of what is a valid timestamp. When one implementation accepts invalid input, it sets a bad precedence for other implementations to also accept invalid input, making specifications functionally unreliable (as is the case with HTTP). There ends up being the "what's specified" category and the "what's specified, but what's most empirically respected, but not really documented anywhere" category.

I stand by my original position that this was an okay change without going through the proposal process, but also with the pragmatic understanding that this change could break users. Barring any data of what would break, it still seems like the right move to have submitted the change and for the Go beta and RC to produce any reports of problems. Once those reports come in, we can evaluate again as we are doing right now. I feel strongly we should eventually fix this, but I'm okay with it being in Go 1.20 or in a later released. I agree that we should have a GODEBUG for this (the mechanism was not quite standardized at the time).

To be clear about the behavior change:

  • Timestamps serialized by Time.MarshalText and Time.MarshalJSON were never invalid. Thus, any data serialized (e.g., by "encoding/json") will be unmarshaled without problem. This would only be observable with inputs generated by other sources.
  • The list of now rejected syntax is concisely found here:
    // The parse template syntax cannot correctly validate RFC 3339.
    // Explicitly check for cases that Parse is unable to validate for.
    // See https://go.dev/issue/54580.
    num2 := func(b []byte) byte { return 10*(b[0]-'0') + (b[1] - '0') }
    switch {
    case b[len("2006-01-02T")+1] == ':': // hour must be two digits
    return Time{}, &ParseError{RFC3339, string(b), "15", string(b[len("2006-01-02T"):][:1]), ""}
    case b[len("2006-01-02T15:04:05")] == ',': // sub-second separator must be a period
    return Time{}, &ParseError{RFC3339, string(b), ".", ",", ""}
    case b[len(b)-1] != 'Z':
    switch {
    case num2(b[len(b)-len("07:00"):]) >= 24: // timezone hour must be in range
    return Time{}, &ParseError{RFC3339, string(b), "Z07:00", string(b[len(b)-len("Z07:00"):]), ": timezone hour out of range"}
    case num2(b[len(b)-len("00"):]) >= 60: // timezone minute must be in range
    return Time{}, &ParseError{RFC3339, string(b), "Z07:00", string(b[len(b)-len("Z07:00"):]), ": timezone minute out of range"}
    }
    default: // unknown error; should not occur
    return Time{}, &ParseError{RFC3339, string(b), RFC3339, string(b), ""}
    }

@dsnet dsnet changed the title proposal: time: special-case strict parsing for RFC3339 time: special-case strict parsing for RFC3339 Jan 18, 2023
@dsnet dsnet changed the title time: special-case strict parsing for RFC3339 time: enforce strict parsing for RFC3339 for Time.Unmarshal Jan 18, 2023
@dsnet dsnet modified the milestones: Proposal, Go1.20 Jan 18, 2023
@dsnet
Copy link
Member Author

dsnet commented Jan 18, 2023

Re-opening and marking as a "release-blocker".

@dsnet dsnet reopened this Jan 18, 2023
@aarzilli
Copy link
Contributor

Strictly speaking, under that definition, every bug fix is a an observable change and would therefore need to go through the proposal process

The RFC3339 constant is documented to be "2006-01-02T15:04:05Z07:00" and format strings are documented to do certain things, how is it a bug that it does what the documentation says it does and how would one discover that 15, in this particular string only, does a different thing?

@klauspost
Copy link
Contributor

I would prefer either...

A) Add this as a non-default option to json.Decoder. This will allow strict checks when needed without breaking existing content.
B) Revert for Go 1.20 and go through the proposal process for Go 1.21.

While I can accept it, if a majority decides so, having to set GODEBUG=strictrfc3339=0 on all future builds seems counterproductive.

I see there is a value in strict checking, but in my case that is by far outweighed by the potential problems this can bring. See #57897

@rsc
Copy link
Contributor

rsc commented Jan 18, 2023

Strictly speaking, under that definition, every bug fix is a an observable change and would therefore need to go through the proposal process.

That's true; we have to predict what will and won't be observed and significant. We do spend a lot of time on wire formats in proposal review, and not just because they often involve new methods. In retrospect, stricter RFC3339 parsing is a similar concern and would have been good to treat as a proposal, much as stricter IP address parsing was (#30999). What I take from this is that parsing and printing external wire formats is important for us to remember to flag in the future.

@rsc
Copy link
Contributor

rsc commented Jan 18, 2023

@klauspost, over on the other issue you wrote:

I have no idea whether there is 0 or a billion invalid time fields in the wild. This is a bad change with things potentially breaking just because of a Go upgrade.

That's true but we do have three release candidates that have all had this change and only got our first report today. So that's some evidence it's closer to the 0 end of the scale.

It would help to understand why this happens, which would let us estimate how likely it is. Where did the bad RFC3339 time stamps you found come from? Not Go, it seems. Same question to @liggitt.

@dsnet
Copy link
Member Author

dsnet commented Jan 18, 2023

That approach is concerning, since we know the cases that will break

We do? The fact that Parse accepted single digit hours was very esoteric. Evidently, this change went through all of Google's internal global testing for months and there were no reports that it broke anything.

Presently, we have evidence that it breaks things now, but I had no evidence of that when this change was made.

A) Add this as a non-default option to json.Decoder. This will allow strict checks when needed without breaking existing content.

I'm opposed to any options in downstream packages as this is not sustainable. If json.Decoder gets an option, what about the xml package? or any package that uses Time.Unmarshal by extension?

documentation says it does and how would one discover that 15, in this particular string only, does a different thing?

RFC 3339 is specified. A universal specification takes precedence over a particular implementation, especially if that implementation is non-compliant.

@aarzilli
Copy link
Contributor

RFC 3339 is specified. A universal specification takes precedence over a particular implementation, especially if that implementation is non-compliant.

The documentation of time does not reference the RFC3339 specification. It simply says that the RFC3339 constant is "2006-01-02T15:04:05Z07:00".

@klauspost
Copy link
Contributor

klauspost commented Jan 18, 2023

@rsc It is in a unit test written 5 years ago. I don't know where the author picked it up. I see the same date in AWS S3 documentation, but for XML data.

Edit: Here it is in an (unrelated) aws XML response: https://github.com/aws/aws-sdk-go/blob/main/service/s3/statusok_error_test.go#L30 - not sure if that is the source.

My worry is that there are clients out there that will pass these dates. There are hundreds of applications that connect through the S3 API. Similar, our S3 server provides serverside JSON queries, which can break. Our server is installed on-prem, so I cannot run any checks to see how many invalid timestamps exist. From our users perspective it will be a bug and we are pretty much forced to keep supporting it, even if it is technically invalid.

I would love some hard numbers, but we don't control or have access to the data of our users.

@dsnet
Copy link
Member Author

dsnet commented Jan 18, 2023

@aarzilli, the documentation of Time.MarshalText specifically mention "RFC 3339", not "RFC3339". The latter would imply it's referencing the locally defined constant. The former implies that it's referencing the wider specification.

It is in a unit test written 5 years ago. I don't know where the author picked it up. I see the same date in AWS S3 documentation, but for XML data.

@klauspost, you seem to be describing some magic timestamp that's being passed around in unit tests, what is the exact value?

@klauspost
Copy link
Contributor

@dsnet The value in #57897 - I added links above.

@dims
Copy link

dims commented Jan 18, 2023

github code search shows many more references : 2009-11-23T0:00:00Z

@cherrymui cherrymui added the NeedsDecision Feedback is required from experts, contributors, and/or the community before a change can be made. label Jan 18, 2023
@rsc
Copy link
Contributor

rsc commented Jan 18, 2023

Discussed at proposal review.

Given that we do not have #56986 landed in Go 1.20, creating a GODEBUG will not be sufficient to insulate users from getting the problem automatically when using a new toolchain (without go.mod edits).

This is an unintended breakage that affects at least one known, widely used package (aws), so we should roll it back for the release. That will probably mean roll back on tip, cherry-pick to 1.20 branch, roll forward on tip with GODEBUG and the right defaults for older releases (which will require waiting for CL 453605).

@gopherbot
Copy link
Contributor

Change https://go.dev/cl/462286 mentions this issue: time: revert strict parsing of RFC 3339

gopherbot pushed a commit that referenced this issue Jan 18, 2023
CL 444277 fixed Time.UnmarshalText and Time.UnmarshalJSON to properly
unmarshal timestamps according to RFC 3339 instead of according
to Go's bespoke time syntax that is a superset of RFC 3339.

However, this change seems to have broken an AWS S3 unit test
that relies on parsing timestamps with single digit hours.
It is unclear whether S3 emits these timestamps in production or
whether this is simply a testing artifact that has been cargo culted
across many code bases. Either way, disable strict parsing for now
and re-enable later with better GODEBUG support.

Updates #54580

Change-Id: Icced2c7f9a6b2fc06bbd9c7e90f90edce24c2306
Reviewed-on: https://go-review.googlesource.com/c/go/+/462286
Reviewed-by: Bryan Mills <[email protected]>
Run-TryBot: Joseph Tsai <[email protected]>
TryBot-Result: Gopher Robot <[email protected]>
Auto-Submit: Russ Cox <[email protected]>
Reviewed-by: Russ Cox <[email protected]>
Reviewed-by: Ian Lance Taylor <[email protected]>
@rsc
Copy link
Contributor

rsc commented Jan 18, 2023

To respond to some of the discussion above about the propriety of making the change at all, we simply don't have accurate data one way or the other. What we have is anecdotes: inside Google, nothing was affected; in open source, we know at least some tests in the AWS API implementations fail. We don't know what will be affected in practice, we don't know which systems generate these non-conforming time stamps, and so on. There are good security and parser alignment reasons to reject non-standard RFC3339 times, and there are good compatibility reasons not to.

Part of the motivation for #56986 (now accepted) is to give us a more graceful path forward in borderline cases like this.

I am grepping the latest versions of all modules on the proxy for almost-RFC3339 times and will report back.

@rsc
Copy link
Contributor

rsc commented Jan 18, 2023

Filed #57912 for proposing a roll-forward. Once the rollback is committed on the release branch I will close this issue.

@gopherbot
Copy link
Contributor

Change https://go.dev/cl/462675 mentions this issue: [release-branch.go1.20] time: revert strict parsing of RFC 3339

@dmitshur dmitshur added NeedsFix The path to resolution is known, but the work has not been done. and removed NeedsDecision Feedback is required from experts, contributors, and/or the community before a change can be made. labels Jan 18, 2023
gopherbot pushed a commit that referenced this issue Jan 18, 2023
CL 444277 fixed Time.UnmarshalText and Time.UnmarshalJSON to properly
unmarshal timestamps according to RFC 3339 instead of according
to Go's bespoke time syntax that is a superset of RFC 3339.

However, this change seems to have broken an AWS S3 unit test
that relies on parsing timestamps with single digit hours.
It is unclear whether S3 emits these timestamps in production or
whether this is simply a testing artifact that has been cargo culted
across many code bases. Either way, disable strict parsing for now
and re-enable later with better GODEBUG support.

Updates #54580

Change-Id: Icced2c7f9a6b2fc06bbd9c7e90f90edce24c2306
Reviewed-on: https://go-review.googlesource.com/c/go/+/462286
Reviewed-by: Bryan Mills <[email protected]>
Run-TryBot: Joseph Tsai <[email protected]>
TryBot-Result: Gopher Robot <[email protected]>
Auto-Submit: Russ Cox <[email protected]>
Reviewed-by: Russ Cox <[email protected]>
Reviewed-by: Ian Lance Taylor <[email protected]>
Reviewed-on: https://go-review.googlesource.com/c/go/+/462675
Reviewed-by: Joseph Tsai <[email protected]>
Run-TryBot: Ian Lance Taylor <[email protected]>
Auto-Submit: Ian Lance Taylor <[email protected]>
Run-TryBot: Russ Cox <[email protected]>
Reviewed-by: Ian Lance Taylor <[email protected]>
@ianlancetaylor
Copy link
Contributor

Rolled back on tip and release branch. Roll forward is #57912.

@mangalaman93
Copy link

Hi, I am trying to figure out why the change in MarshalJSON has not been reverted yet. We are considering upgrading our tool chain in Dgraph to 1.20 (here is the PR dgraph-io/dgraph#8423) and our test fails because time zone is outside of the range [0, 23]. Unfortunately though, when inserting data to our database, we use time.Parse function and parsing works fine with no error. While when querying the database, we use MarshalJson on time.Time object for returning JSON results that returns an error and causes unexpected behavior. Is there a plan to revert the change in the behavior of MarshalJson or any recommendation of function to use instead so that this feels more natural?

@bcmills
Copy link
Contributor

bcmills commented Feb 6, 2023

@mangalaman93, the revert was merged for UnmarshalJSON. MarshalJSON was not reverted, I assume because nobody brought up any instances of Go programs intentionally marshaling non-RFC-3339 timestamps using the MarshalJSON method.

To unblock your upgrade, you should be able to bypass the error using a custom type or custom MarshalJSON method (described in more detail in #57912 (comment)).

@dsnet
Copy link
Member Author

dsnet commented Feb 6, 2023

@mangalaman93 If you're using time.Parse to read the timestamps, then I highly recommend also using time.Time.Format to format the timestamps according to the same format string used for both calls.

@mangalaman93
Copy link

We are using format time.RFC3339 at the time of parsing and that goes through perfectly fine,https://github.com/dgraph-io/dgraph/blob/main/types/scalar_types.go#L211. Though, when we call MarshalJson it falls flat here https://github.com/dgraph-io/dgraph/blob/main/query/outputnode.go#L680. I understand the work around but this feels odd to me that everything works for a non-compliant timestamp except MarshalJson.

@golang golang locked and limited conversation to collaborators Feb 8, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
FrozenDueToAge NeedsFix The path to resolution is known, but the work has not been done. release-blocker
Projects
None yet
Development

No branches or pull requests