Further JSON processing #159

Open
annevk opened this issue Oct 4, 2017 · 14 comments

@annevk
Member

annevk commented Oct 4, 2017

Context:

Example of the complexity needed today:

Folks involved: @mgiuca @tobie @marcoscaceres @kenchris @domenic @zcorpan @dhausknecht

Not having to deal with JavaScript seems like a big step up and something we should cover. Basically convert the output to IDL/Infra types. Note that we'll have to keep the existing hook that exposes just JavaScript, as that's what Fetch and XMLHttpRequest need.

Then there's the question of whether we should cover more and what kind of model we want. I basically think what we want universally is to ignore a field if its value is of an unexpected type. And we don't want 123 to become "123" if the field is of type string (IDL's behavior for JavaScript).
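A minimal sketch of that "ignore on type mismatch, never coerce" model, in Python rather than spec prose. The `EXPECTED` table and the `pick_known_members` helper are hypothetical names made up for this illustration; nothing like them exists in Infra or Web IDL.

```python
import json

# Hypothetical expectations: member name -> exact type the consumer accepts.
EXPECTED = {"name": str, "size": int}


def pick_known_members(raw: str) -> dict:
    """Parse JSON text and keep only members whose value has the expected type."""
    parsed = json.loads(raw)
    if not isinstance(parsed, dict):
        return {}
    result = {}
    for key, expected_type in EXPECTED.items():
        if key not in parsed:
            continue
        value = parsed[key]
        # Exact type match, no conversion: 123 is never turned into "123".
        # (type(...) is used instead of isinstance because bool is a
        # subclass of int in Python, and true should not count as a number.)
        if type(value) is expected_type:
            result[key] = value
    return result


print(pick_known_members('{"name": 123, "size": 4, "extra": true}'))
# {'size': 4}
```

Here `name` is dropped because it is a number rather than a string, and `extra` is dropped because the consumer does not know about it.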

It seems like offering some kind of validation language could be useful, as otherwise everyone has to reinvent it. That opens up potential differences in processing, which is unlikely to be what web platform consumers want.

@annevk
Member Author

annevk commented Mar 19, 2018

> And we don't want 123 to become "123" if the field is of type string (IDL's behavior for JavaScript).

Note that if we do accept this, we could probably reuse most of the IDL dictionary infrastructure here.

cc @mnot @mikewest

@domenic
Member

domenic commented Mar 19, 2018

I'll want this infrastructure for https://github.com/domenic/package-name-maps and will probably give a shot at generalizing it while I work on that.

I don't think we want to allow type conversions.

@annevk
Member Author

annevk commented Mar 20, 2018

https://www.w3.org/TR/tracking-dnt/#status-representation is another example of something browsers might consume.

@domenic
Member

domenic commented Apr 19, 2018

I think there are potentially two issues here:

  1. How to write normative spec text that parses JSON into spec-level data structures (Infra ordered maps/lists, plus the string/number/null/boolean primitives; see #87, "Define numbers (waiting on Number / BigInt)")
  2. How to express the expected structure of your JSON, and validate that the parsed JSON conforms to that structure.

(1) I think is straightforward: we basically use the existing parse JSON operation which gives a JS object, then add some stuff in Infra that crawls the JS object and produces spec-level data structures from it. Then no other spec ever has to worry about JS objects; Infra just abstracts that away into a simple bytes/string -> spec-level data structures operation.
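As a rough analogy for (1), here is a sketch of a single bytes -> plain values operation with a recursive crawl over the parser's output. It is an illustration only: the function names are hypothetical, Python's `dict` and `list` merely stand in for Infra ordered maps and lists, and numbers are passed through as the parser produced them since the number question is still open.

```python
import json
from typing import Any


def parse_json_bytes_to_values(data: bytes) -> Any:
    """Decode bytes as UTF-8, parse them as JSON, then crawl the result."""
    return _convert(json.loads(data.decode("utf-8")))


def _convert(value: Any) -> Any:
    # Primitives pass through unchanged: null, booleans, numbers, strings.
    if value is None or isinstance(value, (bool, int, float, str)):
        return value
    if isinstance(value, list):
        return [_convert(item) for item in value]
    if isinstance(value, dict):
        # dict preserves insertion order, standing in for an ordered map.
        return {key: _convert(item) for key, item in value.items()}
    # Anything else would be a parser-specific object the caller
    # should never have to see.
    raise TypeError(f"unexpected value from JSON parser: {value!r}")
```

Callers only ever see the return value of `parse_json_bytes_to_values`; the intermediate parser output stays an implementation detail, which is the point of the abstraction described above.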

(2) is trickier, and perhaps belongs in a separate issue. For simple cases, you can just check in prose; this will probably be necessary anyway in most cases (e.g. checking that the contents of a string are in the expected format).

Currently some people use Web IDL for (2), but this has pretty bad semantics. (E.g. for a field that expects a string, you could give it { foo: "bar" } instead, and your algorithm would see the string "[object Object]" due to IDL's conversion.) JSON schema seems like it should be the answer, but it's impressively ugly and unreadable... or more charitably, it's very unlike what most readers of web specs are used to.

I'm curious about others' thoughts on (2), as this does continually keep coming up.

@annevk
Member Author

annevk commented Apr 19, 2018

You don't want to validate. If something is a number instead of a string, what you most likely want is to ignore that member and not have it in your returned data structure, as a kind of forward-compatible "error" handling.

And yeah, that would mean defining yet another schema/IDL of sorts.

@domenic
Member

domenic commented Apr 19, 2018

I suppose that's true, if the field is optional anyway. If the field is required and you're expecting a string, you want to error out.

@annevk
Member Author

annevk commented Apr 19, 2018

Yeah, at which point you kinda need a schema available while converting (or you have a schema for specification data structures so you can do post-conversion cleanup).
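A sketch of the "post-conversion cleanup" variant, combining the two rules from the last few comments: an optional member with the wrong type is dropped, while a required member that is missing or has the wrong type is an error. The `SCHEMA` table, its member names, and `MissingRequiredMember` are all hypothetical, chosen only for this example.

```python
class MissingRequiredMember(Exception):
    """Raised when a required member is absent or has the wrong type."""


# Hypothetical schema: member name -> (expected type, required?)
SCHEMA = {
    "url": (str, True),
    "label": (str, False),
}


def clean_up(parsed: dict) -> dict:
    """Filter an already-parsed map against SCHEMA without converting types."""
    result = {}
    for name, (expected_type, required) in SCHEMA.items():
        if name in parsed and type(parsed[name]) is expected_type:
            result[name] = parsed[name]
        elif required:
            # Required and unusable: error out.
            raise MissingRequiredMember(name)
        # Optional and missing or mistyped: silently left out.
    return result
```

With this schema, `clean_up({"url": "https://example.com/", "label": 5})` keeps only `url`, while `clean_up({"url": 5})` raises `MissingRequiredMember`.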

@tobie
Collaborator

tobie commented Apr 19, 2018

So I’ve used JSON schema for specifying and testing the data structures for Specref. I agree it’s not the best thing ever, but let’s be fair, neither is WebIDL. I got the hang of it pretty quickly, and it mostly does the job it’s designed for. I don’t know that there are many existing options that are better.

@annevk
Member Author

annevk commented Apr 19, 2018

@tobie does JSON schema do the kind of thing we need though? We need something that's IDL-like, not DTD/RelaxNG/etc.-like.

@tobie
Collaborator

tobie commented Apr 20, 2018

@annevk JSON schema doesn't do any conversion for you, iirc.

domenic added a commit that referenced this issue May 6, 2019
Part of #159. This provides the basic parsing framework, although no validation or type-checking.
domenic added a commit that referenced this issue May 10, 2019
Part of #159. This provides the basic parsing framework, although no validation or type-checking.
@domenic
Member

domenic commented May 10, 2019

For folks watching this popular thread, we've just added a parse JSON into Infra values algorithm. This gives you back Infra ordered maps/lists/booleans/numbers/strings/nulls, which your spec can operate on without having to (ab)use JavaScript abstract operations or Web IDL dictionaries.

You can see a not-quite-finished usage example of this in a parsing setting in WICG/import-maps#130 (preview), or hopefully in a day or two, in https://wicg.github.io/import-maps/.

I think there is perhaps room for a future JSON schema-based solution for validation, or standardized documentation format for JSON, but for the purpose of specifying processing models, I suspect this is going to be the best way for a while, and will closely match implementations.

@foolip
Member

foolip commented Sep 30, 2020

Does this issue also track adding a variant of https://infra.spec.whatwg.org/#serialize-json-to-bytes which would take Infra values as the input instead of JS values? We might need it in w3c/webdriver-bidi#56.
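For illustration, the requested variant might look roughly like the following in Python, taking plain maps/lists/primitives as input and producing UTF-8 bytes. This is a sketch, not the Infra algorithm: the function name is hypothetical, and the compact separators and the rejection of NaN/Infinity are assumptions of this example.

```python
import json


def serialize_values_to_json_bytes(value) -> bytes:
    """Serialize plain maps/lists/primitives to JSON, encoded as UTF-8."""
    text = json.dumps(
        value,
        ensure_ascii=False,     # keep non-ASCII characters as-is
        separators=(",", ":"),  # compact output, no extra whitespace
        allow_nan=False,        # NaN and Infinity are not valid JSON
    )
    return text.encode("utf-8")
```

Values that are not maps, lists, or JSON primitives make `json.dumps` raise `TypeError`, so the caller never gets bytes for something that could not be parsed back.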

@annevk
Member Author

annevk commented Sep 30, 2020

No, this is really about doing validation in a consistent manner.

@foolip
Member

foolip commented Oct 1, 2020

OK, I've filed #336 for that instead.
