Include more metadata in wptreport JSON #10511

foolip · 2018-04-17T20:24:37Z

This came up in Design Doc: WPT Results Receiver with @jgraham and @Hexcles.

The format is currently:

{"results": [
  {"test": "/foo/bar/baz.html",
   "status": "OK",
   "message": null,
   "subtests": [
     {"name": "...", "status": "PASS", "message": "..."},
     ...
   ]
  },
  ...
]}

Add something like a metadata field at the top level, another object with something like:

The wpt commit being tested
Full wpt run command line (which would include sharding commands)
OS, browser and webdriver version. Browser version also needs to include channel (stable, beta, dev) since the exact same version number might appear in two release channels.
Datetime when wpt run was invoked and datetime when run finished. (duration can be computed)
Name and revision of any runner environment being used, like https://github.com/web-platform-tests/results-collection

@jugglinmike, anything else you'd like to put in there?

The text was updated successfully, but these errors were encountered:

gsnedders · 2018-04-17T21:34:25Z

OS, browser and webdriver version. Browser version also needs to include channel (stable, beta, dev) since the exact same version number might appear in two release channels.

I'd like to see the UA string, too?

gsnedders · 2018-04-17T21:55:51Z

See also #10026.

foolip · 2018-04-18T04:33:33Z

The UA string couldn't be trusted for the version number, as Safari's is frozen, but sure, if it's already known then including it seems fine.

jugglinmike · 2018-04-18T14:41:04Z

OS, browser and webdriver version. Browser version also needs to include > channel (stable, beta, dev) since the exact same version number might appear > in two release channels.

What do you think about a vendor-defined build identifier?

Full wpt run command line (which would include sharding commands)

The arguments passed to the WPT CLI may be overly specific to our current needs. They may include a lot of irrelevant information (--yes, --log-tbpl, etc.). Other folks don't use that tool, so they would have to take extra steps to interpret that information. This could be inconvenient when comparing datasets across time because the CLI is part of WPT itself--the type and meaning of arguments may change with any revision.

What do you think about distilling the relevant information down into runner-agnostic properties? That would include environment variables and command-line arguments for both the browser process and the WebDriver server process, though there might be more.

This would omit the timeout multiplier. This is certainly relevant, so we could track that separately (if only to avoid asking folks to perse UNIX-style commands to get at the data). I'm wondering, though, if we want to support variation there. Might it be better to promote uniformity by requiring reporters to use the default timeout values?

The WPT server configuration is another source of variation between test executions. Unlike the WPT CLI, I'm not aware of anyone who attempts to maintain their own server. Then again, it's not clear if any of that configuration is relevant for reproducibility. It's possible that a given test might pass when served from port 8000 not port 8001, but that doesn't seem likely enough to warrant inclusion in the report.

foolip · 2018-04-18T14:51:37Z

We will need the channel information for presentation, so putting it a single vendor-defined build identifier would require parsing it out again.

Similarly, for the command line args I was thinking that's useful debug information, not for wpt.fyi to parse information from. Any information we know we can make use of makes more sense to extract out into separate fields.

Other than the packaging of the information, is there anything extra to include that's not been mentioned?

@jgraham, are there other more verbose output formats we could steal some ideas from?

jugglinmike · 2018-04-18T14:58:49Z

Other than being potentially more stable over time, I agree that my suggested additions won't be of particular use to the wpt.fyi website. For this generic test results format, I'm interested in promoting reproducibility.

foolip · 2018-04-18T15:02:31Z

I guess the various config dicts might also make sense to include, if we're to maximize debugability and reproducibility.

foolip · 2018-04-18T16:48:33Z

We could include both the driver and browser paths, and maybe derive stable vs. experimental from that, but I suspect that won't be different on Windows Insider Preview.

jgraham · 2018-04-18T18:52:03Z

So, in wptrunner implementation terms there are a few things to consider:

The most natural place to put this information in the structured log is in the suite_start message. We should always pass things though the structured logger and never add information directly in the formatter.
suite_start currently takes four parameters of interest. run_info is approximately information produced by mozinfo which for Firefox is about the build configuration. device_info is typically about the physical hardware and version_info is for extra information about the browser version. extra is basically a free-for-all.
We currently only provide run_info and use as input the expectation ini files when evaluating conditionals. For that reason we also added "product" to run_info. In practice I don't think much in the mozlog ecosystem cares about the details of what's in these fields.
We have users who aren't running with wpt run and users who aren't running from git. So if we start to make things that depend on either of those without fallback then it will break (notably in gecko).

More generally I think we should not try to stuff lots of debugging information into these fields. For the use case of debugging public runs, a fuller log should be available. For reproduction of private runs we have already lost if people are allowed to use whatever test runner they like. So lots of detail about wptrunner-specific configuration options and so on doesn't make much sense to me. I would suggest the following:

Start/end times (already recorded).
Chunk information
Git commit (if available)
Browser version inc. some channel information (the Firefox version number already gives you channel information, so I don't know this needs to be multiple fields).
OS information (this is already in run_info).

I think recording all the prefs passed in may be too much (it's at least hard to do in wptrunner).

For chunk information in particular I'm very happy to extend the mozlog API to explicitly support chunk/totalChunks/repetition fields on suite_start. For git commit information, I suggest we put it in run_info. We can probably put the browser version into version_info.

[1] https://searchfox.org/mozilla-central/source/testing/mozbase/mozlog/mozlog/structuredlog.py#286

Hexcles · 2018-05-02T18:58:01Z

I agree with @jgraham 's proposed list of extra fields, at least as starters.

In terms of the priorities, git commit, browser & OS versions are the most important. I'd like to have them in the report before the result receiver launches, since they will affect whether runners need to include extra fields in the requests when uploading new results. I'll take a stab at adding them to the JSON report. @foolip , WDYT?

Meanwhile, @jgraham could you look into the other fields when you have some time? They are not as urgent.

The first stab at #10511

Hexcles · 2018-05-03T02:00:38Z

The browser information is a blocker that we need to address before relying on that in the results receiver of wpt.fyi.

I'm proposing:

browser: {chrome, edge, firefox, safari, etc.}
browser_version: "version string preferably from binary --version but without the product name"
experimental: {true, false}
os: {linux, windows, osx, android, ios}

`browser` & `os` are really enums

We should define a list of known browsers and os. I think the ones I gave above are a good start.

Why `experimental` needs to be a separate field?

Although browser versions usually have enough information to tell the channel (e.g. "Mozilla Firefox 61.0a1", "Google Chrome 68.0.3409.2 dev"), we need to prepare for the two potential changes in the future:

WPT CLI Feature proposal: --experimental #10452 : if we add an --experimental flag to wpt run to test cutting-edge browsers with some experimental features turned on, then this experimental field will capture the important information of whether that flag is used, which coincides with whether the dev/nightly channel is used.
Change of version schemas or mobile browsers. Relying on version schema is a bit fragile, especially if the schema changes at some point, we'd need two different patterns before and after that point. Besides, in wpt.fyi we always wish to have more browsers, especially mobile browsers. Having different rules for each browser isn't very scalable and some browsers' version strings might not even include the channel.

What about WebDriver/executor?

I'm thinking we can leave the question for a later day, as it's non-blocking.

foolip · 2018-05-03T08:21:58Z

@Hexcles, that set of information SGTM, I'll go ahead and assign this issue to you for the initial batch.

I do think that it would be a good idea to include the verbatim string from which some of which this information is derived, i.e., the unprocessed binary --version and the command line arguments used to start the browser. Possibly also uname -a or wherever it is that OS name comes from.

Nit: osx → macos

run_info includes OS and browser info. This is the first stab at #10511 .

foolip · 2018-05-07T09:25:21Z

Awesome, with #10811 merged, https://taskcluster-artifacts.net/H11pXovsRiyOOgBcxbDHow/0/public/results/wpt_report.json now begins like this:

{
  "run_info": {
    "bits": 64,
    "debug": false,
    "processor": "x86_64",
    "os": "linux",
    "version": "Ubuntu 16.04",
    "product": "chrome",
    "os_version": "16.04",
    "linux_distro": "Ubuntu",
    "has_sandbox": true
  },
  "time_start": 1525648962173,
  "time_end": 1525650923418,
  "results": ...
}

Looks like we have most of the information mentioned in this issue already.

@Hexcles, was it deliberate that the timestamps are not inside run_info? And do you think it would be an improvement to format the dates as strings with timezone, to make it unambiguous to any reader, and harder to get it wrong for the producer?

I also wonder what debug applies to, if it's the product under test or wpt run. Also version seems like it should be prefixed with... something?

gsnedders · 2018-05-07T15:46:06Z

@foolip Given it just uses the pre-existing run_info, from the mozlog docs:

An optional dictionary describing the properties of the build and test environment. This contains the information provided by mozinfo, plus a boolean debug field indicating whether the build under test is a debug build.

So, uh, it's whatever mozinfo does. And its docs don't seem to say what fields it provides and what they mean… :/

Hexcles · 2018-05-07T15:46:38Z

@foolip Yes, the timestamps were intentionally put outside of run_info. The rationales are: 1. jgraham and I prefer having the original run_info from mozlog in the report; 2. I don't want to add timestamps to run_info because they already exist as a top-level field in all mozlog entries. Regarding the format, this is milliseconds since UNIX epoch. Since the report is mostly consumed by machines, I'm personally in favour of timestamps rather than human-readable date strings, but I'm open to suggestions.

The version thing is automatically generated by mozlog using mozinfo and seems to be yet another (and somewhat redundant) representation of OS version. I don't want to go down the rabbit hole for now; in fact, I'll only require the os field in the results receiver. Besides, I'll add another field called browser_version into run_info.

foolip · 2018-05-07T15:54:14Z

@Hexcles if not all information will make sense in run_info, I wonder if we should put all of the metadata in a single top-level attribute like metadata or... something? Unfortunately run_info is a pretty good name, but it's taken :)

And will the run_info stuff be stable enough to use in the results receiver, or should we have code in wpt run that derives the information we need from it and includes both the input (run_info) and the output in the report?

gsnedders · 2018-05-07T15:58:50Z

@foolip filed https://bugzilla.mozilla.org/show_bug.cgi?id=1459651 for the lack of documentation of any of the fields in run_info, FWIW.

jugglinmike · 2018-05-07T16:42:20Z

@Hexcles I know that you would prefer to hold off on documenting the data structure for now, but given that run_info is itself undocumented and other system will depend on this data, it seems wise to at least assert the presence of the required contents. What do you think?

Hexcles · 2018-05-10T15:53:18Z

@jugglinmike yes I will.

Status update: I think the last piece of crucial information missing in wptreport is the revision (we don't have #10452 yet so there'll be no "experimental" field for now).

jugglinmike · 2018-05-10T18:36:55Z

Thanks, Robert!

foolip · 2018-07-11T15:34:09Z

@Hexcles, is this issue finished now, given that #10971 is merged?

Hexcles · 2018-07-11T22:39:11Z

Yeah I think we've covered the basics. If there's anything else we'd like to add we can open new issues.

foolip added the infra label Apr 17, 2018

alijuma added the priority:roadmap label Apr 20, 2018

Hexcles added a commit that referenced this issue May 3, 2018

Add OS and browser to wptreport

80f6752

The first stab at #10511

Hexcles mentioned this issue May 3, 2018

Add run_info and timestamps to wptreport #10811

Merged

Hexcles added a commit that referenced this issue May 3, 2018

Add OS, browser and timestamps to wptreport

689cc7f

The first stab at #10511

foolip assigned Hexcles May 3, 2018

Hexcles added a commit that referenced this issue May 3, 2018

Add run_info and timestamps to wptreport

45278a9

run_info includes OS and browser info. This is the first stab at #10511 .

Hexcles added a commit that referenced this issue May 3, 2018

Add run_info and timestamps to wptreport

b2b367b

run_info includes OS and browser info. This is the first stab at #10511 .

Hexcles mentioned this issue May 9, 2018

Add browser version into run_info (and hence wptreport) #10935

Merged

Hexcles mentioned this issue May 10, 2018

Results Receiver API for external runs web-platform-tests/wpt.fyi#55

Closed

Hexcles mentioned this issue May 12, 2018

Add git revision to run_info (and wptreport) #10971

Merged

Hexcles closed this as completed Jul 11, 2018

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Include more metadata in wptreport JSON #10511

Include more metadata in wptreport JSON #10511

foolip commented Apr 17, 2018

gsnedders commented Apr 17, 2018

gsnedders commented Apr 17, 2018

foolip commented Apr 18, 2018

jugglinmike commented Apr 18, 2018

foolip commented Apr 18, 2018

jugglinmike commented Apr 18, 2018

foolip commented Apr 18, 2018

foolip commented Apr 18, 2018

jgraham commented Apr 18, 2018

Hexcles commented May 2, 2018 •

edited

Loading

Hexcles commented May 3, 2018

foolip commented May 3, 2018

foolip commented May 7, 2018

gsnedders commented May 7, 2018

Hexcles commented May 7, 2018

foolip commented May 7, 2018

gsnedders commented May 7, 2018

jugglinmike commented May 7, 2018

Hexcles commented May 10, 2018

jugglinmike commented May 10, 2018

foolip commented Jul 11, 2018

Hexcles commented Jul 11, 2018

Include more metadata in wptreport JSON #10511

Include more metadata in wptreport JSON #10511

Comments

foolip commented Apr 17, 2018

gsnedders commented Apr 17, 2018

gsnedders commented Apr 17, 2018

foolip commented Apr 18, 2018

jugglinmike commented Apr 18, 2018

foolip commented Apr 18, 2018

jugglinmike commented Apr 18, 2018

foolip commented Apr 18, 2018

foolip commented Apr 18, 2018

jgraham commented Apr 18, 2018

Hexcles commented May 2, 2018 • edited Loading

Hexcles commented May 3, 2018

browser & os are really enums

Why experimental needs to be a separate field?

What about WebDriver/executor?

foolip commented May 3, 2018

foolip commented May 7, 2018

gsnedders commented May 7, 2018

Hexcles commented May 7, 2018

foolip commented May 7, 2018

gsnedders commented May 7, 2018

jugglinmike commented May 7, 2018

Hexcles commented May 10, 2018

jugglinmike commented May 10, 2018

foolip commented Jul 11, 2018

Hexcles commented Jul 11, 2018

Hexcles commented May 2, 2018 •

edited

Loading

`browser` & `os` are really enums

Why `experimental` needs to be a separate field?