Synchronize browsers used by WPT CI and results collector #535
I think it would be valuable to ensure that the configurations for Chrome and Firefox are as similar as possible, or even identical, on the infra that runs wpt.fyi and on Travis. That will help towards web-platform-tests/wpt#7475. So, what are our options for making this true, short of having Travis use the same infra behind the scenes and just waiting, or replacing Travis outright? Agree to punt on Edge and Safari until we have a CI solution for them. |
Our options depend on how widely you define the term "configuration". That might include:
The more similar, the better, but there's a trade-off between parity and the complexity needed to achieve it. If we limit ourselves to "process-level", we could address this today by manually maintaining duplication in the scripts for the two projects. We could maybe go a step further and define an "automation mode" for the WPT CLI which enables all the configuration we're interested in (perhaps also disallowing any additional flags). I'd need to run that by @jgraham before claiming that to be a workable solution, though. If we want more parity, then manually maintaining synchrony starts to seem onerous. (As requested, I'll ignore the possibility of relying on the same results collection service from both projects.) We could define a separate repository with system-level configuration scripts. A tool like Puppet or Ansible could be used during setup for CI jobs and for the results collector. We have a proof-of-concept for this in the system we're currently using to collect results. However, configuring a system "from scratch" may have an unacceptable impact on time-to-results for pull request validation. It's also not a viable option for managing closed-source browsers. For that, we may need to get into the business of maintaining images. We might be able to do this with Docker. WPT could consume Docker images through its existing TravisCI integration. I'm fuzzy on Docker support for Windows virtualization, though. Furthermore, given that we don't necessarily take TravisCI for granted, we shouldn't over-value solutions that integrate conveniently with that service. Full machine images may be necessary. With recent trends in "immutable infrastructure," the tooling around this kind of operations management is becoming quite sophisticated. There may be other alternatives, too. Does this give you any ideas? |
Given that we want What could a solution for this in the wpt CLI look like? Would it be such that upgrading the nightly browser version is a PR on wpt?
Still interested in what you think. I'm under the assumption that it's kind of an inevitable end point, but that it'll make more sense once we have solutions for Edge and Safari that we're happy with and are fast enough. |
I think we'd want to avoid maintaining our own definition of "nightly" within I wrote out some concrete ideas for how the feature would behave, but it
Beyond the challenges of managing those browsers, the responsibility of pull With all the talk about revision announcers and vendor-supplied test results, |
Thanks for filing those issues, will comment there!
Using Sauce, or BrowserStack, or anything that could put the browser far away from wptserve on the network seems unworkable in the long run: it will be a permanent, unfixable source of flakiness, and it can never run very fast. So step 1 is to figure out how we're going to get Edge and Safari results into wpt.fyi on a more sound setup. There are a few options we're exploring there, as you know. Before that it doesn't make too much sense, I think, to contemplate running these browsers on every PR. But... actually, we don't need to gate a PR solution for Chrome and Firefox on figuring out the wpt.fyi waterfall builds for Edge and Safari. This is what I'd like: For the Travis jobs that effectively run Those checks don't need to abide by any timeout, and we can just have one per browser configuration. If we ever want to have checks that depend on the results of more than one run ("fails in all? extra bad!") then that could just be a separate check that waits for the rest. Internally, those checks would offload to something very much like the infra to run all of wpt, but with the When would I like it? Dunno, when it seems like the worst problem on our hands, which isn't for a bit longer I think. @lukebjerring? @jugglinmike, if there isn't an existing issue which covers this and you think it's worth tracking, please go ahead and file an issue :) |
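To make the "separate check that waits for the rest" idea concrete, here is a rough sketch of how such a combined check might be computed. It is only an illustration: the function name `combined_state`, the check names, and the rule that the combined check fails only when every browser fails are assumptions, not anything decided in this thread. The state strings are borrowed from the GitHub status API.

```python
from typing import Mapping


def combined_state(per_browser: Mapping[str, str]) -> str:
    """Combine per-browser check states into one state.

    `per_browser` maps a (hypothetical) check name such as "chrome-dev"
    to "pending", "success", or "failure".
    """
    states = list(per_browser.values())
    # Nothing reported yet, or any run still in progress: keep waiting.
    if not states or "pending" in states:
        return "pending"
    # "Fails in all? Extra bad!": flag only when every browser failed.
    if all(state == "failure" for state in states):
        return "failure"
    return "success"
```

Each per-browser check would still report its own result; this combined check only adds the cross-browser signal on top.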
@domenic, you built the WHATWG status checks, is https://developer.github.com/v3/guides/building-a-ci-server/ the right documentation to start from? |
I think that's what I used, yeah. Indeed there are two main points of interaction with the GitHub API: a webhook to detect new PRs, and the status API endpoint to post new statuses to that PR's commits. You can browse the code in https://github.com/whatwg/participate.whatwg.org/tree/master/lib; pr-webhook.js is the main file. server-infra/validate-github-webhook.js is also somewhat important. |
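For anyone who prefers to see the two interaction points in code, here is a minimal sketch in Python (not the JavaScript used in the linked repository, and not a copy of it). It assumes the webhook is configured with a shared secret so that GitHub sends an `X-Hub-Signature-256` header, and it only builds the request for the commit status endpoint rather than sending it. The function names and the example context string are made up for illustration.

```python
import hashlib
import hmac
import json

# States accepted by the GitHub commit status endpoint.
VALID_STATES = {"error", "failure", "pending", "success"}


def verify_signature(secret: bytes, body: bytes, header_value: str) -> bool:
    """Check a webhook body against the X-Hub-Signature-256 header value."""
    expected = "sha256=" + hmac.new(secret, body, hashlib.sha256).hexdigest()
    # Constant-time comparison to avoid leaking how much of the value matched.
    return hmac.compare_digest(expected, header_value)


def status_request(owner, repo, sha, state, context, description=""):
    """Return the (path, JSON body) for POST /repos/{owner}/{repo}/statuses/{sha}."""
    if state not in VALID_STATES:
        raise ValueError(f"unknown state: {state}")
    path = f"/repos/{owner}/{repo}/statuses/{sha}"
    payload = {"state": state, "context": context, "description": description}
    return path, json.dumps(payload)
```

A server would call `verify_signature` on each incoming pull request event, then POST the result of `status_request` with an authenticated client, first with `"pending"` and later with the final state.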
If the proposal is to run the stability jobs on custom infrastructure, I reiterate that we can use TaskCluster, which already has built-in GitHub integration, allows access to substantial resources, can have long time limits, and has a significantly better architecture than Travis (or buildbot). Generally I haven't pushed too hard on this because I understand that people are wary of "nonstandard" solutions, and there are several requirements for wpt.fyi that might not integrate easily into TaskCluster, particularly around custom Windows versions and macOS. But if the choice for running stability jobs is a DIY setup or something that is already built and proven at a scale way beyond our needs, then I think we should take advantage of the existing solution. |
Can we also use TaskCluster for wpt.fyi, at least for Linux? Having two tech stacks is itself part of the problem/nuisance. If there's no path for running fresh Windows Insider Preview or Safari Technology Preview on TaskCluster it takes it out of the running for, well, both waterfall and PRs, but we're already moving towards having a diversity of runner infras for wpt.fyi so we should be open to partial solutions where all-encompassing ones seem to not exist. @jugglinmike, have you done any reading on TaskCluster, WDYT? |
Yes, we could use it for Linux. I have a PR [1] open to have it run every push on master in Chrome and Firefox nightly; if we merge that it could be running right away (the GitHub integration would have to be set up of course). Some integration with wpt.fyi would be needed to get the I understand the desire to avoid a plurality of infrastructure. But I have several countervailing concerns:
To the extent that custom hardware, or unusual operating system configuration, is required to get results, I think that outweighs my concerns for wpt.fyi. However I haven't heard those requirements expressed for the stability checking part. I also understand that there might be similar concerns about Taskcluster being tied to a specific organisation. However there are some differences:
|
On the governance, I share those exact concerns/biases, and my ideal is really that each browser vendor takes care of running its own browser and submitting results to wpt.fyi, but in a way that is in-principle-reproducible, i.e. if people use To get to that world, we think we need to bootstrap wpt.fyi into good shape with frequent runs for almost everything we care about, to make it useful and indispensable, to get the lock-in we need to make the ideal world sustainable or somehow self-reinforcing, where people want to join the party. If Mozilla wants to run Firefox right away, on TaskCluster or anything, we could set that up in a matter of weeks with what @Hexcles is working on. I think, though, that there's benefit in running Chrome and Firefox on similar infrastructure, because it must be possible, and should reduce total engineering cost. At this point, I think that @jgraham and @jugglinmike should talk to each other :) |
In our OKR document for Q1 2018, the following text was listed as a KR of the priority-2 objective titled, "web-platform-tests continuous integration is reliable and useful":
The intent behind this isn't clear, and the acceptance criteria are also somewhat vague. Here's where we stand and where we're headed on the results collection front:
Since no one is running experimental builds of Safari or Edge, I think it's safe to say those browsers are not relevant for this goal.
The WPT CL process has long run Chrome's "dev" channel, but it's only thanks to @Hexcles's recent efforts that failures are actually visible.
Once this project is collecting results from the experimental builds of Chrome and Firefox (see gh-388 and gh-521), it seems like we will have reached this goal.
@foolip can you weigh in on this?