Parsing ISO 8601 / RFC 3339 datetime string? #44

Open
Boscop opened this issue Mar 2, 2018 · 8 comments

Comments

@Boscop

Boscop commented Mar 2, 2018

What's the correct way to parse an ISO 8601 / RFC 3339 datetime string?
This is very common in JSON communication.
On the server side we use Rust for our API, and DateTime::to_rfc3339() to convert datetimes to String for the JSON API. The same output can also be expressed with the format string "%+":
> %+: Same to %Y-%m-%dT%H:%M:%S%.f%:z, i.e. 0, 3, 6 or 9 fractional digits for seconds and colons in the time zone offset.

So it has a variable number of digits for the fractional seconds, depending on the timestamp in question.
If it falls on a second boundary, it has 0 fractional second digits, like "1970-01-01T00:00:00+00:00".
Also it has the timezone at the end.

How can I parse this ISO 8601 / RFC 3339 datetime string in my PureScript frontend?

@garyb
Member

garyb commented Mar 5, 2018

I think at the moment writing a parser using purescript-parsing or something like that is probably your best bet, as I guess the format language we have in here at the moment isn't expressive enough for that.

@safareli
Contributor

safareli commented Mar 15, 2018

@Boscop you could build multiple formats and use unformatParser like this:

myParser 
  =  try (unformatParser format1)
 <|> try (unformatParser format2)
 <|> unformatParser format3

parse str = runParser str myParser 

Note: you need try here. I first thought the formats could be ordered in such a way that it's not needed, but I don't think it can be avoided.

Also you can just use unformat and <|>:

parse str
  =  unformat format1 str
 <|> unformat format2 str
 <|> unformat format3 str
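
To make the alternatives above concrete, here is a minimal sketch, not a definitive implementation. It assumes `unformatDateTime` (which takes the format string directly) from `Data.Formatter.DateTime` and the `Alt` instance of `Either`, where `<|>` returns the first `Right` and otherwise the last `Left`. The module name and `parseUtcTimestamp` are hypothetical. It only covers 0 or 3 fractional digits and a literal "+00:00" offset.

```purescript
module Rfc3339Alternatives where

import Control.Alt ((<|>))
import Data.DateTime (DateTime)
import Data.Either (Either)
import Data.Formatter.DateTime (unformatDateTime)

-- Hypothetical helper: try the millisecond variant first, then fall
-- back to whole seconds. Both variants assume a literal "+00:00" offset.
parseUtcTimestamp :: String -> Either String DateTime
parseUtcTimestamp str =
  unformatDateTime "YYYY-MM-DDTHH:mm:ss.SSS+00:00" str
    <|> unformatDateTime "YYYY-MM-DDTHH:mm:ss+00:00" str
```

Because `unformatDateTime` takes a format string, each call parses that string again; building the formats once is cheaper if this runs often.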

@Boscop
Author

Boscop commented Jun 9, 2018

@safareli Thanks. But I also need support for microseconds like "2017-11-21T05:16:29.120116+00:00" and it doesn't support that (only milliseconds):
https://github.com/slamdata/purescript-formatters/blob/v3.0.0/src/Data/Formatter/DateTime.purs#L122
Would it be possible to add support for microseconds (6 digits) (and maybe nanoseconds (9 digits))? :)

Also, is there a way that I only have to parse the format string once at the first use, and then not on subsequent uses? With a lazy variable somehow?

@garyb
Member

garyb commented Jun 9, 2018

There'll be a bit of a problem there since the DateTime representation that is being parsed/formatted is only millisecond-precise.

You could just create the format string at the top level and re-use it, then the parse cost is at startup. Lazy might well be another option. But I'd suggest constructing the format commands directly rather than using the string parsing method as another option: #22 🙂

@Boscop
Author

Boscop commented Jun 10, 2018

@garyb But how can I make it re-use the evaluated value?
I currently do this:

fmt_rfc3339 = parseFormatString "YYYY-MM-DDTHH:mm:ss+00:00"
fmt_german = parseFormatString "DD.MM.YYYY, HH:mm"

humanTime s = either id id do
  decode <- fmt_rfc3339
  encode <- fmt_german
  datetime <- unformat decode s
  pure $ format encode datetime

Is that the most efficient way to do it?


> There'll be a bit of a problem there since the DateTime representation that is being parsed/formatted is only millisecond-precise.

That's ok, it can round to the nearest millisecond.. Or even just truncate/ignore them. It should still be able to parse it though.. :)
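
One way to get the "truncate/ignore" behaviour without changing the library is to cut the fraction down to three digits before the string reaches the formatter. This is a rough sketch under assumptions: the module name and `truncateToMillis` are hypothetical, and it relies on `replace` and `unsafeRegex` from purescript-strings. It truncates rather than rounds.

```purescript
module TruncateFraction where

import Data.String.Regex (Regex, replace)
import Data.String.Regex.Flags (noFlags)
import Data.String.Regex.Unsafe (unsafeRegex)

-- Matches a fractional-seconds part with more than three digits,
-- capturing the dot and the first three digits.
longFraction :: Regex
longFraction = unsafeRegex "(\\.\\d{3})\\d+" noFlags

-- Hypothetical helper: keep only millisecond precision (no rounding).
-- Strings with zero or three fractional digits pass through unchanged.
truncateToMillis :: String -> String
truncateToMillis = replace longFraction "$1"
```

With this, `truncateToMillis "2017-11-21T05:16:29.120116+00:00"` gives `"2017-11-21T05:16:29.120+00:00"`, which a millisecond format such as "YYYY-MM-DDTHH:mm:ss.SSS+00:00" should then be able to handle.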

@safareli
Contributor

safareli commented Jun 11, 2018

Yes, parseFormatString parses a format string into a Formatter value. If you declare the format at the top level, you can also do the following, so that an invalid format gives you an error at startup:

-- needs: import Partial.Unsafe (unsafeCrashWith)
fmt_rfc3339 :: Formatter
fmt_rfc3339 = case parseFormatString "YYYY-MM-DDTHH:mm:ss+00:00" of
  Left err -> unsafeCrashWith $ "format must have been valid " <> show err
  Right x -> x
fmt_german :: Formatter
fmt_german = case parseFormatString "DD.MM.YYYY, HH:mm" of
  Left err -> unsafeCrashWith $ "format must have been valid " <> show err
  Right x -> x

humanTime s = either id id do
  datetime <- unformat fmt_rfc3339 s
  pure $ format fmt_german datetime

Also, as @garyb noted, you can just build these formats directly as in #22, and then you wouldn't need parseFormatString at all.
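
For illustration, building a format directly might look like the sketch below. It assumes that `Formatter` is a `List FormatterCommand` and that the constructor names match the ones exported by `Data.Formatter.DateTime` in your installed version; the module name and `rfc3339Utc` are hypothetical. It mirrors the "YYYY-MM-DDTHH:mm:ss+00:00" format string used above.

```purescript
module DirectFormat where

import Data.Formatter.DateTime (Formatter, FormatterCommand(..))
import Data.List (fromFoldable)

-- Same shape as "YYYY-MM-DDTHH:mm:ss+00:00", but built from commands,
-- so there is no format string to parse (and nothing that can fail).
rfc3339Utc :: Formatter
rfc3339Utc = fromFoldable
  [ YearFull
  , Placeholder "-"
  , MonthTwoDigits
  , Placeholder "-"
  , DayOfMonthTwoDigits
  , Placeholder "T"
  , Hours24
  , Placeholder ":"
  , MinutesTwoDigits
  , Placeholder ":"
  , SecondsTwoDigits
  , Placeholder "+00:00"
  ]
```

The value can be passed to `unformat` and `format` as is, with no `Either` to unwrap and no crash path at startup.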

@safareli
Contributor

If your {nano,micro}seconds are at the end of the input string, and you are willing to play with parser combinators, you can use unformatParser to get the datetime and then discard the rest of the string. (runP, which is used to create the unformat function, adds an eof parser to unformatParser.)
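
A rough sketch of that idea, under assumptions: `runParser` is imported from `Text.Parsing.Parser` (newer purescript-parsing releases name the module `Parsing`), and the module name, `prefixFormat` and `parsePrefix` are hypothetical. The format stops after the milliseconds, and since no eof is required, whatever follows (extra digits, the offset) is left unconsumed. The offset is therefore ignored, which is only safe if the input is known to be UTC.

```purescript
module PrefixParse where

import Data.DateTime (DateTime)
import Data.Either (Either)
import Data.Formatter.DateTime (Formatter, FormatterCommand(..), unformatParser)
import Data.List (fromFoldable)
import Text.Parsing.Parser (ParseError, runParser)

-- Covers "YYYY-MM-DDTHH:mm:ss.SSS" only; nothing after the third
-- fractional digit is part of the format.
prefixFormat :: Formatter
prefixFormat = fromFoldable
  [ YearFull, Placeholder "-", MonthTwoDigits, Placeholder "-", DayOfMonthTwoDigits
  , Placeholder "T"
  , Hours24, Placeholder ":", MinutesTwoDigits, Placeholder ":", SecondsTwoDigits
  , Placeholder ".", Milliseconds
  ]

-- Runs the parser without an eof check, so trailing input is discarded.
parsePrefix :: String -> Either ParseError DateTime
parsePrefix str = runParser str (unformatParser prefixFormat)
```

A timestamp with no fractional digits would fail against this format, so it would still need a fallback format without the ".SSS" part.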

@vlatkoB

vlatkoB commented Oct 25, 2019

Would you accept a PR that adds formatters (UUU,MicrosecondsRounded) and (NNN,NanosecondsRounded)?
Currently, I can't parse this: "2019-08-07T10:16:58.055246Z"

EDIT: Sign/constructor change to better reflect that rounding takes place
