chore(web): CI builds are currently extremely unreliable #10494

Closed
mcdurdin opened this issue Jan 24, 2024 · 11 comments
Labels: chore · ci (Issues relating to build infrastructure) · web/

@mcdurdin (Member)

Over the last month, we have had over 50 failed builds due to timeouts or similar, which seem to be related to BrowserStack calls.

This is adding a lot of noise to otherwise good pull requests. Recommendations:

  1. Move /common/web modules that are KeymanWeb-exclusive into /web
  2. Disable BrowserStack tests entirely unless /web files are touched (a sketch follows below).
  3. Try to find out what is going wrong -- add retries, robustness, etc.
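As a minimal sketch of recommendation 2 (not part of the existing build scripts), the suite could be gated on whether the PR actually touches /web; TARGET_BRANCH and the path check are illustrative names only:

TARGET_BRANCH="${TARGET_BRANCH:-origin/master}"
# Only enable the BrowserStack suite when the change set includes web/ files.
if git diff --name-only "$TARGET_BRANCH...HEAD" | grep -q '^web/'; then
  DO_BROWSER_TEST_SUITE=true
else
  DO_BROWSER_TEST_SUITE=false
fi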
@mcdurdin added this to the 18.0 milestone Jan 24, 2024
@mcdurdin added the ci label Jan 24, 2024
@mcdurdin (Member, Author)

Notes:

  • it seems that some BrowserStack sessions are getting a plain major.minor.patch version number and others are getting a full major.minor.patch-x-y-pr version number. Could this be related to the agent triggering it? Could multiple agents running web builds at the same time be related?
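Purely as an illustration of the two shapes being reported (not the actual Keyman versioning logic), the discrepancy could arise if one agent exports a PR-related variable and another does not; VERSION, VERSION_WITH_TAG, and PR_NUMBER are hypothetical names here:

VERSION="17.0.123"                          # plain major.minor.patch
if [[ -n "${PR_NUMBER:-}" ]]; then
  # full major.minor.patch-x-y-pr form, as seen in some sessions
  VERSION_WITH_TAG="$VERSION-alpha-test-$PR_NUMBER"
else
  VERSION_WITH_TAG="$VERSION"
fi
echo "Build label reported to BrowserStack: $VERSION_WITH_TAG"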

@mcdurdin (Member, Author) commented Jan 24, 2024

See also:

keyman/web/test.sh, lines 49 to 65 at commit c9fac1b:

if [[ $VERSION_ENVIRONMENT == test ]]; then
  # Implied: CONFIG=CI.conf.js because `-CI` parameter is passed.
  #
  # If we are running a TeamCity test build, for now, only run BrowserStack
  # tests when on a PR branch with a title including "(web)" or with the label
  # test-browserstack. This is because the BrowserStack tests are currently
  # unreliable, and the false positive failures are masking actual failures.
  #
  # We do not run BrowserStack tests on master, beta, or stable-x.y test
  # builds.
  DO_BROWSER_TEST_SUITE=false
  if builder_pull_get_details; then
    if [[ $builder_pull_title =~ \(web\) ]] || builder_pull_has_label test-browserstack; then
      DO_BROWSER_TEST_SUITE=true
    fi
  fi
fi

Why is this not working?
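One hedged way to narrow this down would be temporary tracing around the guard above; builder_pull_get_details and builder_pull_has_label are the functions already used in test.sh, and only the echo lines are new:

DO_BROWSER_TEST_SUITE=false
if builder_pull_get_details; then
  echo "PR title: '$builder_pull_title'"
  if [[ $builder_pull_title =~ \(web\) ]] || builder_pull_has_label test-browserstack; then
    echo "Enabling BrowserStack test suite"
    DO_BROWSER_TEST_SUITE=true
  fi
else
  echo "builder_pull_get_details failed: no PR metadata available on this agent?"
fi
echo "DO_BROWSER_TEST_SUITE=$DO_BROWSER_TEST_SUITE"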

@mcdurdin (Member, Author)

The BrowserStack tests with full version information are under @jahorton's username. The ones with only major.minor.patch are under the Keyman Server username (i.e. running from CI).

So ... something is different in how they are being triggered.

@jahorton (Contributor)

> The BrowserStack tests with full version information are under @jahorton's username. The ones with only major.minor.patch are under the Keyman Server username (i.e. running from CI).
>
> So ... something is different in how they are being triggered.

In case this is something we have to triage and get back to later... I haven't directly run BrowserStack tests from my machine in months.

I remember that early on, I think my log-on details were used to set up BrowserStack CI... and then we changed the details to use Keyman Server stuff.

@mcdurdin (Member, Author)

> The BrowserStack tests with full version information are under @jahorton's username

JH noticed that the LMLayer test configuration has a different username.
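A hedged way to confirm which test configurations carry their own credentials rather than reading them from the environment (paths and file patterns are illustrative; the real config file names may differ):

# List any BrowserStack username/accessKey values hard-coded in test configs.
grep -rn --include='*.conf.js' -iE 'username|accessKey' web/ common/web/ \
  | grep -vi 'process\.env'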

@jahorton (Contributor)

Toward the bottom of the logs BrowserStack had available, we were able to determine that in stable-16.0, both sources of unit tests - KMW and the LMLayer - passed full version strings through. (And my name, not keyman-server.)

We have not yet determined... where it's even getting the version string from.

@mcdurdin modified the milestones: 18.0, A18S1 on Apr 19, 2024
@jahorton (Contributor) commented Apr 26, 2024

Thought I'd put the comment here, but I can't find it, wherever it did land.

Useful reference: actions/runner-images#3737

A specific comment of note: actions/runner-images#3737 (comment)

There's a recent npm fix that may be notably related. Updating to NPM 10.5.2 or higher would give us that fix.
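A hedged sketch of a CI guard that ensures the agent is on npm 10.5.2 or later before installing (where such a step would live in the Keyman build scripts is an assumption):

required="10.5.2"
current="$(npm --version)"
# sort -V puts the smaller version first; if that isn't $required, npm is too old.
if [[ "$(printf '%s\n' "$required" "$current" | sort -V | head -n1)" != "$required" ]]; then
  npm install -g "npm@>=$required"
fi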

@mcdurdin (Member, Author) commented May 3, 2024

> Thought I'd put the comment here, but I can't find it, wherever it did land.

#10350.

@darcywong00 modified the milestones: A18S1, A18S2 on May 11, 2024
@mcdurdin modified the milestones: A18S2, A18S3, A18S4 on May 24, 2024
@jahorton (Contributor) commented Jun 10, 2024

With #11456 and #11451 in place, this issue is probably resolved for now. It's also believed that #8616 was related; @mcdurdin mentioned finding something about use of git:// entries in package.json sometimes contributing to npm instability. (Sadly, I can't find where he posted a reference about it, but it was certainly mentioned in our discussions in other team channels.)

I'll leave this up for a bit to make sure that everything is fully resolved, as the changes made by #8616 and #11451 may trigger new behavior that we haven't yet been able to test or validate fully.

Granted, there may be a minor flare-up when we get to #11501... but we should be able to limit the scope of those tests and keep things more stable than before, even then.

@jahorton (Contributor) commented Jun 17, 2024

At this point, the only instabilities I'm seeing on master-based PRs are from the test builds for our other products, rather than from testing Keyman Engine for Web itself. This doesn't fix the instability issues on the stable branches, but I still consider it enough of a win to close this issue.

@mcdurdin (Member, Author) commented Jul 2, 2024

> @mcdurdin mentioned finding something about use of git:// entries in package.json sometimes contributing to npm instability. (Sadly, I can't find where he posted a reference about it, but it was certainly mentioned in our discussions in other team channels.)

I wrote on Slack:

> So removing the two git-based dependencies from package.json reduces the npm i time from 4 minutes to 13 seconds on my machine. Seems like that is a very poorly supported mechanism.

Certainly I concur that the issues appear to be pretty much resolved, hurrah (except on stable, as you note, which would be a bit hard to patch).
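For reference, a hedged illustration of the kind of change described above: swapping a git-based dependency for a registry-published release and re-timing a clean install (the package name and version are hypothetical):

# Before: package.json pointed at a git:// URL for this dependency.
npm uninstall example-git-dependency
# After: pull the same package from the npm registry instead.
npm install "example-git-dependency@^2.1.0"
# Re-check the clean-install time.
rm -rf node_modules
time npm install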
