Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Include more metadata in wptreport JSON #10511

Closed
foolip opened this issue Apr 17, 2018 · 22 comments
Closed

Include more metadata in wptreport JSON #10511

foolip opened this issue Apr 17, 2018 · 22 comments

Comments

@foolip
Copy link
Member

foolip commented Apr 17, 2018

This came up in Design Doc: WPT Results Receiver with @jgraham and @Hexcles.

The format is currently:

{"results": [
  {"test": "/foo/bar/baz.html",
   "status": "OK",
   "message": null,
   "subtests": [
     {"name": "...", "status": "PASS", "message": "..."},
     ...
   ]
  },
  ...
]}

Add something like a metadata field at the top level, another object with something like:

  • The wpt commit being tested
  • Full wpt run command line (which would include sharding commands)
  • OS, browser and webdriver version. Browser version also needs to include channel (stable, beta, dev) since the exact same version number might appear in two release channels.
  • Datetime when wpt run was invoked and datetime when run finished. (duration can be computed)
  • Name and revision of any runner environment being used, like https://github.com/web-platform-tests/results-collection

@jugglinmike, anything else you'd like to put in there?

@foolip foolip added the infra label Apr 17, 2018
@gsnedders
Copy link
Member

OS, browser and webdriver version. Browser version also needs to include channel (stable, beta, dev) since the exact same version number might appear in two release channels.

I'd like to see the UA string, too?

@gsnedders
Copy link
Member

See also #10026.

@foolip
Copy link
Member Author

foolip commented Apr 18, 2018

The UA string couldn't be trusted for the version number, as Safari's is frozen, but sure, if it's already known then including it seems fine.

@jugglinmike
Copy link
Contributor

OS, browser and webdriver version. Browser version also needs to include > channel (stable, beta, dev) since the exact same version number might appear > in two release channels.

What do you think about a vendor-defined build identifier?

Full wpt run command line (which would include sharding commands)

The arguments passed to the WPT CLI may be overly specific to our current needs. They may include a lot of irrelevant information (--yes, --log-tbpl, etc.). Other folks don't use that tool, so they would have to take extra steps to interpret that information. This could be inconvenient when comparing datasets across time because the CLI is part of WPT itself--the type and meaning of arguments may change with any revision.

What do you think about distilling the relevant information down into runner-agnostic properties? That would include environment variables and command-line arguments for both the browser process and the WebDriver server process, though there might be more.

This would omit the timeout multiplier. This is certainly relevant, so we could track that separately (if only to avoid asking folks to perse UNIX-style commands to get at the data). I'm wondering, though, if we want to support variation there. Might it be better to promote uniformity by requiring reporters to use the default timeout values?

The WPT server configuration is another source of variation between test executions. Unlike the WPT CLI, I'm not aware of anyone who attempts to maintain their own server. Then again, it's not clear if any of that configuration is relevant for reproducibility. It's possible that a given test might pass when served from port 8000 not port 8001, but that doesn't seem likely enough to warrant inclusion in the report.

@foolip
Copy link
Member Author

foolip commented Apr 18, 2018

We will need the channel information for presentation, so putting it a single vendor-defined build identifier would require parsing it out again.

Similarly, for the command line args I was thinking that's useful debug information, not for wpt.fyi to parse information from. Any information we know we can make use of makes more sense to extract out into separate fields.

Other than the packaging of the information, is there anything extra to include that's not been mentioned?

@jgraham, are there other more verbose output formats we could steal some ideas from?

@jugglinmike
Copy link
Contributor

Other than being potentially more stable over time, I agree that my suggested additions won't be of particular use to the wpt.fyi website. For this generic test results format, I'm interested in promoting reproducibility.

@foolip
Copy link
Member Author

foolip commented Apr 18, 2018

I guess the various config dicts might also make sense to include, if we're to maximize debugability and reproducibility.

@foolip
Copy link
Member Author

foolip commented Apr 18, 2018

We could include both the driver and browser paths, and maybe derive stable vs. experimental from that, but I suspect that won't be different on Windows Insider Preview.

@jgraham
Copy link
Contributor

jgraham commented Apr 18, 2018

So, in wptrunner implementation terms there are a few things to consider:

  • The most natural place to put this information in the structured log is in the suite_start message. We should always pass things though the structured logger and never add information directly in the formatter.
  • suite_start currently takes four parameters of interest. run_info is approximately information produced by mozinfo which for Firefox is about the build configuration. device_info is typically about the physical hardware and version_info is for extra information about the browser version. extra is basically a free-for-all.
  • We currently only provide run_info and use as input the expectation ini files when evaluating conditionals. For that reason we also added "product" to run_info. In practice I don't think much in the mozlog ecosystem cares about the details of what's in these fields.
  • We have users who aren't running with wpt run and users who aren't running from git. So if we start to make things that depend on either of those without fallback then it will break (notably in gecko).

More generally I think we should not try to stuff lots of debugging information into these fields. For the use case of debugging public runs, a fuller log should be available. For reproduction of private runs we have already lost if people are allowed to use whatever test runner they like. So lots of detail about wptrunner-specific configuration options and so on doesn't make much sense to me. I would suggest the following:

  • Start/end times (already recorded).
  • Chunk information
  • Git commit (if available)
  • Browser version inc. some channel information (the Firefox version number already gives you channel information, so I don't know this needs to be multiple fields).
  • OS information (this is already in run_info).

I think recording all the prefs passed in may be too much (it's at least hard to do in wptrunner).

For chunk information in particular I'm very happy to extend the mozlog API to explicitly support chunk/totalChunks/repetition fields on suite_start. For git commit information, I suggest we put it in run_info. We can probably put the browser version into version_info.

[1] https://searchfox.org/mozilla-central/source/testing/mozbase/mozlog/mozlog/structuredlog.py#286

@Hexcles
Copy link
Member

Hexcles commented May 2, 2018

I agree with @jgraham 's proposed list of extra fields, at least as starters.

In terms of the priorities, git commit, browser & OS versions are the most important. I'd like to have them in the report before the result receiver launches, since they will affect whether runners need to include extra fields in the requests when uploading new results. I'll take a stab at adding them to the JSON report. @foolip , WDYT?

Meanwhile, @jgraham could you look into the other fields when you have some time? They are not as urgent.

Hexcles added a commit that referenced this issue May 3, 2018
Hexcles added a commit that referenced this issue May 3, 2018
@Hexcles
Copy link
Member

Hexcles commented May 3, 2018

The browser information is a blocker that we need to address before relying on that in the results receiver of wpt.fyi.

I'm proposing:

  • browser: {chrome, edge, firefox, safari, etc.}
  • browser_version: "version string preferably from binary --version but without the product name"
  • experimental: {true, false}
  • os: {linux, windows, osx, android, ios}

browser & os are really enums

We should define a list of known browsers and os. I think the ones I gave above are a good start.

Why experimental needs to be a separate field?

Although browser versions usually have enough information to tell the channel (e.g. "Mozilla Firefox 61.0a1", "Google Chrome 68.0.3409.2 dev"), we need to prepare for the two potential changes in the future:

  • WPT CLI Feature proposal: --experimental #10452 : if we add an --experimental flag to wpt run to test cutting-edge browsers with some experimental features turned on, then this experimental field will capture the important information of whether that flag is used, which coincides with whether the dev/nightly channel is used.
  • Change of version schemas or mobile browsers. Relying on version schema is a bit fragile, especially if the schema changes at some point, we'd need two different patterns before and after that point. Besides, in wpt.fyi we always wish to have more browsers, especially mobile browsers. Having different rules for each browser isn't very scalable and some browsers' version strings might not even include the channel.

What about WebDriver/executor?

I'm thinking we can leave the question for a later day, as it's non-blocking.

@foolip
Copy link
Member Author

foolip commented May 3, 2018

@Hexcles, that set of information SGTM, I'll go ahead and assign this issue to you for the initial batch.

I do think that it would be a good idea to include the verbatim string from which some of which this information is derived, i.e., the unprocessed binary --version and the command line arguments used to start the browser. Possibly also uname -a or wherever it is that OS name comes from.

Nit: osx → macos

Hexcles added a commit that referenced this issue May 3, 2018
run_info includes OS and browser info.

This is the first stab at #10511 .
Hexcles added a commit that referenced this issue May 3, 2018
run_info includes OS and browser info.

This is the first stab at #10511 .
@foolip
Copy link
Member Author

foolip commented May 7, 2018

Awesome, with #10811 merged, https://taskcluster-artifacts.net/H11pXovsRiyOOgBcxbDHow/0/public/results/wpt_report.json now begins like this:

{
  "run_info": {
    "bits": 64,
    "debug": false,
    "processor": "x86_64",
    "os": "linux",
    "version": "Ubuntu 16.04",
    "product": "chrome",
    "os_version": "16.04",
    "linux_distro": "Ubuntu",
    "has_sandbox": true
  },
  "time_start": 1525648962173,
  "time_end": 1525650923418,
  "results": ...
}

Looks like we have most of the information mentioned in this issue already.

@Hexcles, was it deliberate that the timestamps are not inside run_info? And do you think it would be an improvement to format the dates as strings with timezone, to make it unambiguous to any reader, and harder to get it wrong for the producer?

I also wonder what debug applies to, if it's the product under test or wpt run. Also version seems like it should be prefixed with... something?

@gsnedders
Copy link
Member

@foolip Given it just uses the pre-existing run_info, from the mozlog docs:

An optional dictionary describing the properties of the build and test environment. This contains the information provided by mozinfo, plus a boolean debug field indicating whether the build under test is a debug build.

So, uh, it's whatever mozinfo does. And its docs don't seem to say what fields it provides and what they mean… :/

@Hexcles
Copy link
Member

Hexcles commented May 7, 2018

@foolip Yes, the timestamps were intentionally put outside of run_info. The rationales are: 1. jgraham and I prefer having the original run_info from mozlog in the report; 2. I don't want to add timestamps to run_info because they already exist as a top-level field in all mozlog entries. Regarding the format, this is milliseconds since UNIX epoch. Since the report is mostly consumed by machines, I'm personally in favour of timestamps rather than human-readable date strings, but I'm open to suggestions.

The version thing is automatically generated by mozlog using mozinfo and seems to be yet another (and somewhat redundant) representation of OS version. I don't want to go down the rabbit hole for now; in fact, I'll only require the os field in the results receiver. Besides, I'll add another field called browser_version into run_info.

@foolip
Copy link
Member Author

foolip commented May 7, 2018

@Hexcles if not all information will make sense in run_info, I wonder if we should put all of the metadata in a single top-level attribute like metadata or... something? Unfortunately run_info is a pretty good name, but it's taken :)

And will the run_info stuff be stable enough to use in the results receiver, or should we have code in wpt run that derives the information we need from it and includes both the input (run_info) and the output in the report?

@gsnedders
Copy link
Member

@foolip filed https://bugzilla.mozilla.org/show_bug.cgi?id=1459651 for the lack of documentation of any of the fields in run_info, FWIW.

@jugglinmike
Copy link
Contributor

@Hexcles I know that you would prefer to hold off on documenting the data structure for now, but given that run_info is itself undocumented and other system will depend on this data, it seems wise to at least assert the presence of the required contents. What do you think?

@Hexcles
Copy link
Member

Hexcles commented May 10, 2018

@jugglinmike yes I will.

Status update: I think the last piece of crucial information missing in wptreport is the revision (we don't have #10452 yet so there'll be no "experimental" field for now).

@jugglinmike
Copy link
Contributor

Thanks, Robert!

@foolip
Copy link
Member Author

foolip commented Jul 11, 2018

@Hexcles, is this issue finished now, given that #10971 is merged?

@Hexcles
Copy link
Member

Hexcles commented Jul 11, 2018

Yeah I think we've covered the basics. If there's anything else we'd like to add we can open new issues.

@Hexcles Hexcles closed this as completed Jul 11, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

6 participants