Utilities for querying and updating the data as a set of files #6585

Closed
foolip opened this issue Aug 28, 2020 · 9 comments
Labels
infra 🏗️ Infrastructure issues (npm, GitHub Actions, releases) of this project

Comments

@foolip
Collaborator

foolip commented Aug 28, 2020

Background: BCD is organized as a set of JSON files split across many directories, which are combined into a single object with all data for data consumers.

Problem: The mapping from files to a single object is not easily reversible. Code in this repository (linter) and elsewhere (mdn-bcd-collector, mdn-confluence) sometimes wants to work with the data as a whole, but then report on or update the files themselves.

Concrete use cases / examples (please edit to add):

  • Updating the version at object path javascript.builtins.Intl.ListFormat.formatToParts and saving the change to file javascript/builtins/intl/ListFormat.json (note "Intl" vs. "intl") (Provide utils to save in-memory BCD to files #3617)
  • More generally, make any change, removal or addition to the object, call something.writeToFile() and have that work.
  • Ensuring that api.DOMTokenList is supported if api.Element.classList is and reporting it as an error on api/DOMTokenList.json if not. (Linter: check consistency of dependent features #6571)
  • Enumerate all features while applying path-based filters, which at least npm run mirror already does
  • Semantic BCD diffs. For example, load two different sets of BCD JSON files, and be able to enumerate the features which have changed between them (e.g., for generating human-readable summaries in PRs)
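One way to make the file-to-object mapping reversible is to record, while merging, which file contributed each feature. The following is a minimal sketch under stated assumptions, not code from this repository: `buildIndex` is a hypothetical name, and the files are assumed to be already parsed into a map from file name to JSON rather than read from disk.

```javascript
// Hypothetical sketch: remember which file defined each feature while merging.
// `files` maps a file path to its parsed JSON; real code would read from disk.
function buildIndex(files) {
  const data = {};
  const fileByPath = new Map();

  // Deep-merge `source` into `target`, recording the file for every
  // object that carries a __compat property (that is, every feature).
  function merge(target, source, prefix, file) {
    for (const [key, value] of Object.entries(source)) {
      if (key === "__compat") {
        target.__compat = value;
        fileByPath.set(prefix, file);
      } else {
        const path = prefix ? `${prefix}.${key}` : key;
        target[key] = target[key] || {};
        merge(target[key], value, path, file);
      }
    }
  }

  for (const [file, json] of Object.entries(files)) {
    merge(data, json, "", file);
  }
  return { data, fileByPath };
}

// Sample input mirroring the "Intl" vs. "intl" case from the list above.
const { data, fileByPath } = buildIndex({
  "javascript/builtins/intl/ListFormat.json": {
    javascript: { builtins: { Intl: { ListFormat: {
      __compat: { support: {} },
      formatToParts: { __compat: { support: {} } },
    } } } },
  },
});

console.log(fileByPath.get("javascript.builtins.Intl.ListFormat.formatToParts"));
```

With such an index, a linter could report an error against the right file, and a hypothetical `writeToFile()` could look up where a changed feature belongs instead of guessing the file name from the object path.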

Inspired by the discussion in #6571 and supersedes #3617.

@foolip foolip added the infra 🏗️ Infrastructure issues (npm, GitHub Actions, releases) of this project label Aug 28, 2020
@foolip
Collaborator Author

foolip commented Aug 28, 2020

Other nice-to-haves that could follow:

  • Iterating browser releases in the right order
  • Querying all features by their support status in different browsers, listing which files (and ideally lines) those are defined by
  • Iteration and querying with filters like removing anything behind a flag or in a future release, to make "has it shipped" an easy question to answer
  • Helpers to update compat data if needed, something like feature.support.set('firefox', '54', options) which would ensure that the data reflects that it's supported in version 54, but only overwrites if options.remove or something is set. Or feature.support.addNote(...) which doesn't clobber an existing note but instead appends.
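The update helpers in the last item could look roughly like the sketch below. Everything here is hypothetical: the standalone function names `setSupport` and `addNote`, the `overwrite` option (standing in for "options.remove or something"), and the choice to leave arrays of statements untouched.

```javascript
// Hypothetical helpers; names and option semantics are assumptions.

// Append a note without clobbering existing ones. BCD notes may be a
// single string or an array of strings.
function addNote(statement, note) {
  if (statement.notes === undefined) {
    statement.notes = note;
  } else if (Array.isArray(statement.notes)) {
    statement.notes.push(note);
  } else {
    statement.notes = [statement.notes, note];
  }
  return statement;
}

// Make the data say "supported since `version`", but only overwrite an
// existing, different value when options.overwrite is set.
// Returns true if the data now reflects `version`, false otherwise.
function setSupport(feature, browser, version, options = {}) {
  const support = feature.__compat.support;
  const current = support[browser];
  if (current === undefined) {
    support[browser] = { version_added: version };
    return true;
  }
  // This sketch does not try to update arrays of statements.
  if (Array.isArray(current)) return false;
  if (current.version_added === version) return true;
  if (!options.overwrite) return false;
  current.version_added = version;
  return true;
}

const feature = {
  __compat: {
    support: { firefox: { version_added: "55", notes: "First note." } },
  },
};

const kept = setSupport(feature, "firefox", "54"); // existing data is kept
const replaced = setSupport(feature, "firefox", "54", { overwrite: true });
addNote(feature.__compat.support.firefox, "Second note.");
```

Returning a boolean instead of throwing is a design choice of this sketch; it lets a bulk-update script count how many entries it declined to overwrite.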

@ddbeck
Collaborator

ddbeck commented Aug 28, 2020

I added the "Semantic BCD diffs" use case to the list.

In general, this suggests that it would be nice to have ways to query or traverse BCD data that map more closely to its semantics than to the literal objects. In other words, we could spare ourselves a lot of type checking when doing routine things such as:

  • Support iteration: for (const support of feature.support.get("firefox")) would always iterate, without having to check whether Firefox has a single support statement or an array
  • Summations: feature.support.introduced("firefox"), feature.support.removed("edge") for the first version in which a feature was added in a particular browser, or the version in which it was last removed
  • Assertions: supports("some.dotted.feature", "chrome", "62") could test whether Chrome 62 is in the range of supported values for that feature (I'm imagining the API I'd like to have for Linter: check consistency of dependent features #6571 here)
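The iteration and assertion ideas above can be sketched as follows. This is an illustration under assumptions, not an agreed API: the helpers take a feature object rather than a dotted path, only plain version strings are compared (values such as `true`, `null` or "≤37" are skipped), statements behind flags are ignored, and prefixed or renamed support counts as support.

```javascript
// Hypothetical sketches of the helpers described above.

// Always yields support statements, whether the browser has none,
// a single statement, or an array of statements.
function* iterSupport(feature, browser) {
  const statements = feature.__compat.support[browser];
  if (statements === undefined) return;
  yield* Array.isArray(statements) ? statements : [statements];
}

const isPlainVersion = (v) => typeof v === "string" && /^\d+(\.\d+)*$/.test(v);

// Compare dotted version strings numerically ("9" < "10", "3.6" < "4").
function compareVersions(a, b) {
  const as = a.split(".").map(Number);
  const bs = b.split(".").map(Number);
  for (let i = 0; i < Math.max(as.length, bs.length); i++) {
    const diff = (as[i] || 0) - (bs[i] || 0);
    if (diff !== 0) return diff;
  }
  return 0;
}

// True if `version` falls in any unflagged [version_added, version_removed) range.
function supports(feature, browser, version) {
  for (const s of iterSupport(feature, browser)) {
    if (s.flags || !isPlainVersion(s.version_added)) continue;
    if (compareVersions(version, s.version_added) < 0) continue;
    if (s.version_removed) {
      if (!isPlainVersion(s.version_removed)) continue; // removal version unknown
      if (compareVersions(version, s.version_removed) >= 0) continue;
    }
    return true;
  }
  return false;
}

const feature = {
  __compat: {
    support: {
      chrome: [
        { version_added: "60" },
        { version_added: "50", flags: [{ type: "preference", name: "x" }] },
      ],
      edge: { version_added: "12", version_removed: "79" },
    },
  },
};

console.log(supports(feature, "chrome", "62")); // true
console.log(supports(feature, "chrome", "55")); // false, only behind a flag
console.log(supports(feature, "edge", "79"));   // false, removed in 79
console.log(supports(feature, "safari", "14")); // false, no data for safari
```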

foolip added a commit to foolip/browser-compat-data that referenced this issue Jan 14, 2021
This is to not have any doubt about whether other packages can depend on
utilities we add for maintaining BCD. In particular, when experimenting
with scripts for querying/updating the data it would be nice to know
it's internal to BCD only.

See mdn#6585.
@foolip
Collaborator Author

foolip commented Feb 25, 2021

I've sent some work-in-progress utilities in #9257, but I'm bringing some of the discussion back here and looping in @Elchi3.

Here's what I've found useful in many different contexts and think we should turn into shared utilities first:

  • A fixed path like api.Element.classList and the ability to find the feature (object with a __compat property) at that path.
  • Path "selectors" with wildcard support like api.*, api.RTC* or api.Element.* which find zero or more features.
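To make the selector idea concrete, here is a minimal sketch that turns a selector into a regular expression and filters a list of feature paths. It is not the code from #9257. The function names are hypothetical, and the wildcard semantics are an assumption: `*` is taken to match any run of characters within a single path segment, so `api.*` finds `api.Element` but not `api.Element.classList`.

```javascript
// Hypothetical sketch: turn a selector such as "api.RTC*" into a RegExp.
// Assumption: "*" matches any run of characters within one path segment
// (it does not cross a "."); the discussion does not pin this down.
function selectorToRegExp(selector) {
  const pattern = selector
    .split("*")
    // Escape RegExp metacharacters in the literal parts of the selector.
    .map((part) => part.replace(/[.+?^${}()|[\]\\]/g, "\\$&"))
    .join("[^.]*");
  return new RegExp(`^${pattern}$`);
}

// Return the paths matched by the selector (zero or more).
function select(paths, selector) {
  const re = selectorToRegExp(selector);
  return paths.filter((path) => re.test(path));
}

const paths = [
  "api.Element",
  "api.Element.classList",
  "api.RTCPeerConnection",
  "css.properties.color",
];

console.log(select(paths, "api.RTC*"));      // [ 'api.RTCPeerConnection' ]
console.log(select(paths, "api.Element.*")); // [ 'api.Element.classList' ]
console.log(select(paths, "api.*"));         // [ 'api.Element', 'api.RTCPeerConnection' ]
```

A selector without a wildcard behaves like a fixed path here, which fits the suggestion of having only selectors and taking the first match where a single feature is expected.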

I think we should update npm run traverse, npm run stats and a bunch of other things to accept such selectors, and I'd also like to use them in https://github.com/foolip/mdn-bcd-collector. As a start, I think we could have only selectors and no fixed paths, and just use the first feature found in contexts where that's expected.

@ddbeck is that something you've already written utilities for?

@ddbeck
Collaborator

ddbeck commented Feb 26, 2021

I haven't messed with wildcards yet, but I've got utility functions for

  • query("some.dotted.path"): fetches a single subtree (not sure what the semantics would be for wildcards)
  • walk("some.dotted.path"): walks the tree recursively, as a generator (e.g., for (const { path, feature } of walk("some.dotted.path")))
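For readers who want a picture of these two functions, the sketch below shows one possible shape. It is not the code from ddbeck/bcd-utils. As assumptions, both functions take the data object as an explicit first argument, and `feature` is taken to be the contents of `__compat`.

```javascript
// Hypothetical sketches; the signatures are assumptions.

// Return the subtree at a dotted path, or undefined if it does not exist.
function query(data, path) {
  return path
    .split(".")
    .reduce((node, key) => (node == null ? undefined : node[key]), data);
}

// Recursively yield every feature found at or below `node`.
function* walkNode(node, path) {
  if (node.__compat) yield { path, feature: node.__compat };
  for (const [key, child] of Object.entries(node)) {
    if (key !== "__compat") yield* walkNode(child, `${path}.${key}`);
  }
}

// Walk the tree below a dotted path as a generator.
function* walk(data, path) {
  const start = query(data, path);
  if (start === undefined) return;
  yield* walkNode(start, path);
}

const data = {
  api: {
    Element: {
      __compat: { status: { standard_track: true } },
      classList: { __compat: { status: { standard_track: true } } },
    },
  },
};

for (const { path, feature } of walk(data, "api")) {
  console.log(path, feature.status.standard_track);
}
```

Because `walk` is a generator, callers can stop early with `break` without traversing the rest of the tree.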

I've also been playing with a couple of other utilities, though more incomplete:

  • visit(path, function testFn(path, feature) { /* … */ }, function visitorFn(path, feature) { /* … */ }) — This is probably an antecedent to wildcards. For example:

    Code

    visit(
      undefined,
      (path) => {
        return path.startsWith("api.RTC");
      },
      (path, feature) => {
        const std = feature.status.standard_track ? "✅" : "❗️";
        console.log(`${std} ${path}`);
      }
    );

    Output

    …
    ✅ api.RTCIceCandidatePairStats.retransmissionsSent
    ✅ api.RTCIceCandidatePairStats.state
    ✅ api.RTCIceCandidatePairStats.totalRoundTripTime
    ✅ api.RTCIceCandidatePairStats.transportId
    ❗️ api.RTCIceCandidatePairStats.writable
    ✅ api.RTCIceCandidateStats
    ✅ api.RTCIceCandidateStats.address
    ✅ api.RTCIceCandidateStats.candidateType
    ❗️ api.RTCIceCandidateStats.componentId
    ✅ api.RTCIceCandidateStats.deleted
    ❗️ api.RTCIceCandidateStats.networkType
    ✅ api.RTCIceCandidateStats.port
    ✅ api.RTCIceCandidateStats.priority
    ✅ api.RTCIceCandidateStats.protocol
    …
    

    The point of this one was to be able to control traversal, either by filtering (e.g., skipping objects with testFn) or by getting continue- and break-like control over recursion via visitorFn return values. Very much inspired by unist-util-visit.

  • iterSupport(feature, browser): a convenience function for iterating over support statements (e.g., for (const statement of iterSupport(feature, "firefox"))) that would always Just Work™, even if the browser is missing or there's only a single support statement instead of an array
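The visit() idea from the list above, with return values steering the recursion, could be sketched like this. It is an illustration only: the `SKIP` and `BREAK` signals are hypothetical names, the function takes the data object where the example passes a path, and `feature` is assumed to be the contents of `__compat` so that `feature.status` works as in the example.

```javascript
// Hypothetical sketch of visit(); SKIP and BREAK are assumed signal names.
const SKIP = Symbol("skip");   // do not descend into this feature's children
const BREAK = Symbol("break"); // stop the whole traversal

function visit(data, testFn, visitorFn) {
  // Returns BREAK if traversal should stop, otherwise undefined.
  function visitNode(node, path) {
    if (node.__compat && testFn(path, node.__compat)) {
      const signal = visitorFn(path, node.__compat);
      if (signal === BREAK) return BREAK;
      if (signal === SKIP) return undefined;
    }
    for (const [key, child] of Object.entries(node)) {
      if (key === "__compat") continue;
      const childPath = path ? `${path}.${key}` : key;
      if (visitNode(child, childPath) === BREAK) return BREAK;
    }
    return undefined;
  }
  visitNode(data, "");
}

const data = {
  api: {
    RTCIceCandidateStats: {
      __compat: { status: { standard_track: true } },
      address: { __compat: { status: { standard_track: true } } },
    },
    Element: { __compat: { status: { standard_track: true } } },
  },
};

const isRTC = (path) => path.startsWith("api.RTC");

// Visit every matching feature.
const seen = [];
visit(data, isRTC, (path) => {
  seen.push(path);
});

// Visit only top-level matches by skipping their children.
const top = [];
visit(data, isRTC, (path) => {
  top.push(path);
  return SKIP;
});
```

In this sketch, a feature rejected by `testFn` is not visited but its children are still traversed; whether `testFn` should prune whole subtrees instead is an open design question.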

I'll open a PR with actual code shortly.

@foolip
Collaborator Author

foolip commented Feb 26, 2021

Sounds great @ddbeck!

@ddbeck
Collaborator

ddbeck commented Mar 7, 2021

OK, "shortly" has taken a bit longer than I anticipated. That said, I did clean things up a little and you can catch a glimpse of what I've already written here: ddbeck/bcd-utils. In addition to the utilities in src, there are a few examples to demonstrate different uses.

One thing that I haven't confronted yet: I've got a certain vocabulary for traversing BCD. When I open a PR to add, for instance, query(), should I rename it to something like queryById() so that it can fit in with @foolip's query module?

@foolip
Collaborator Author

foolip commented Mar 12, 2021

@ddbeck I would be very happy to see all of that merged into BCD as-is, no need to rename in anticipation of future changes. If we exclude this from the NPM package we can change it quite freely.

@foolip
Collaborator Author

foolip commented Mar 12, 2021

How about we add it under utils/?

@queengooborg
Collaborator

I think #9441 resolves this so I'm going to go ahead and close this. It'd be nice to publish the utilities in the future, but that's another issue for another day!

4 participants
@ddbeck @foolip @queengooborg and others