CI: Optimize the tests performance. #30709

Closed
7 tasks
raisedadead opened this issue Oct 27, 2018 · 21 comments
Labels
status: discussing (Under discussion threads. Closed as stale after 60 days of inactivity.)
type: showstopper (Issues that are urgent and on the critical path. These need immediate attention & shipping.)

Comments

@raisedadead
Member

raisedadead commented Oct 27, 2018

Challenge tests in CI are async and are hitting the limits, most of them exiting before completion:

https://travis-ci.org/freeCodeCamp/freeCodeCamp/builds/447041746#L821-L822

I was taking a look at this, and I think we can avoid it by splitting the describe blocks into their own test script "stages". This would let us "fail fast" in CI: we can test the most common error areas first, followed by the more complex ones.

Meaning, the stages should be ordered by where users are most likely to make errors while making changes.

These could be a possible order:

  • Platform tests (currently in Jest).
  • Check for valid Mongo IDs.
  • Check syntax of the testString (linting only).
  • Validate HTML (Certification 1) challenges
  • Validate CSS (Certification 2) challenges
  • Validate ... (Certification N) challenges
  • Others.

Each would be its own independent block, instead of the single humongous file we have right now.

We could then use Travis's Build Stages and optimize our builds.
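
A rough sketch of what one such split might look like, assuming a hypothetical SUPERBLOCK environment variable and a getChallengesForSuperblock helper (neither is the project's actual API), so each CI stage only loads and runs its own slice of the curriculum:

```js
// curriculum/test/superblock.test.js (illustrative layout, not the real file)
// Runs only the challenges belonging to one "stage", selected via an env var,
// e.g. SUPERBLOCK=responsive-web-design npm run test:curriculum
const { assert } = require('chai');

// Hypothetical loader; the real challenge loader lives elsewhere in the repo.
const getChallengesForSuperblock = require('./utils/get-challenges');

const superblock = process.env.SUPERBLOCK || 'responsive-web-design';

describe(`${superblock} challenges`, function () {
  // Challenge tests evaluate user code, so give them headroom.
  this.timeout(5000);

  let challenges;

  before(async function () {
    challenges = await getChallengesForSuperblock(superblock);
  });

  it('every challenge has a valid Mongo-style id', function () {
    for (const challenge of challenges) {
      assert.match(
        challenge.id,
        /^[0-9a-f]{24}$/,
        `bad id in "${challenge.title}"`
      );
    }
  });
});
```

Each CI stage would then be just `SUPERBLOCK=<name> npm run test:curriculum`, with the cheap stages ordered first.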

@raisedadead
Member Author

/cc @ValeraS @tchaffee @Bouncey @tbushman for inputs.

@freeCodeCamp freeCodeCamp locked and limited conversation to collaborators Oct 27, 2018
@raisedadead raisedadead added the scope: tests, status: discussing, and type: showstopper labels Oct 27, 2018
@raisedadead
Member Author

raisedadead commented Oct 27, 2018

Why we need this:

At the minute we are testing only English challenges and already have ~10,000 test items (not to be confused with the number of challenges). Every it in the test script is an item.

Soon we will need to multiply that by each language variant.

@ValeraS
Contributor

ValeraS commented Oct 27, 2018

Sorry, I don't understand what you mean by exiting before completion. If you are talking about the 880 pending tests, those are the challenges without tests, solutions, or seed, and they are marked as pending.

@Nirajn2311
Member

Nirajn2311 commented Oct 28, 2018

@ValeraS I think he is talking about #30462, as some of the tests need a longer timeout than 100ms.
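
For reference, Mocha allows the timeout to be raised per suite or per test; the values below are only illustrative, not what the project actually uses:

```js
// Mocha's timeout can be overridden for a whole suite or a single test.
describe('challenges that evaluate user code asynchronously', function () {
  this.timeout(10000); // illustrative; well above the ~100 ms mentioned above

  it('resolves the async testString eventually', async function () {
    // ...await the worker / VM evaluation of the challenge here (placeholder)...
  });
});
```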

@raisedadead
Member Author

... they are marked as pending

Ah, I see. I guess the log message threw me off. Nonetheless, I think it's worth having the tests "fail fast": as soon as we hit an error, we need not proceed to the other tests. This would improve memory and CPU utilization a lot and in turn improve the build time.

What do you think about the stages, and about splitting the humongous file into micro modules?
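
If we go that route, Mocha itself already has a fail-fast switch: --bail on the command line, or, in newer Mocha versions, a bail option in a .mocharc.js config. A minimal sketch (reporter and timeout values are just illustrative):

```js
// .mocharc.js: stop the run at the first failing test instead of churning
// through the remaining ~10,000 items. Older Mocha versions take --bail on
// the CLI (or in mocha.opts) instead of a config file.
module.exports = {
  bail: true,       // fail fast
  reporter: 'spec',
  timeout: 5000     // illustrative; see the timeout discussion above
};
```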

@raisedadead
Member Author

On a side note, I would like to explore https://ava.li for unit tests and Mocha for the browser in the future (not a priority).

@ValeraS
Contributor

ValeraS commented Oct 28, 2018

The first test:* script with a non-zero exit code breaks the other tests. For example, in https://travis-ci.org/freeCodeCamp/freeCodeCamp/jobs/447316254 test:curriculum did not finish its job.

About stages, I think we need two steps: one for common tests like test:client, and one for the per-language tests, which can run in parallel.
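
One way that second, parallel step could be parameterized is by locale, so each parallel CI job tests one language with the same suite. This is only a sketch; the LOCALE variable and the loader name are assumptions, not existing code:

```js
// curriculum/test/i18n-challenges.test.js (sketch; names are assumptions)
// Each parallel job would run e.g. LOCALE=spanish npm run test:curriculum
const { assert } = require('chai');
const getChallengesForLocale = require('./utils/get-challenges-for-locale'); // hypothetical

const locale = process.env.LOCALE || 'english';

describe(`${locale} curriculum`, function () {
  this.timeout(5000);

  it('has a non-empty title for every challenge', async function () {
    const challenges = await getChallengesForLocale(locale);
    for (const challenge of challenges) {
      assert.isNotEmpty(challenge.title, `missing title for ${challenge.id}`);
    }
  });
});
```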

@raisedadead
Member Author

Hmm, I see. And can we not have linting of the files as the first step? Or do you think that might be rather complex?

@ValeraS
Contributor

ValeraS commented Oct 28, 2018

How about running tests for modified parts only? For example, if we change only the guide files, we'll run tests only for the guide and will not run tests for the client, server and curriculum.
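
A minimal sketch of how that selection could work, assuming hypothetical path prefixes and npm script names (the real layout may differ): a small Node helper diffs the branch against master and picks the suites to run.

```js
// tools/run-affected-tests.js (sketch; paths and script names are assumptions)
const { execSync } = require('child_process');

// Files changed relative to the master branch.
const changed = execSync('git diff --name-only origin/master...HEAD', {
  encoding: 'utf8'
})
  .split('\n')
  .filter(Boolean);

const suites = new Set();
for (const file of changed) {
  if (file.startsWith('guide/')) {
    suites.add('lint:guide');
  } else if (file.startsWith('curriculum/')) {
    suites.add('test:curriculum');
  } else {
    // Anything else is treated as code and gets the heavier suites.
    suites.add('test:client');
    suites.add('test:server');
  }
}

for (const suite of suites) {
  execSync(`npm run ${suite}`, { stdio: 'inherit' });
}
```

Guide-only changes would then run just the lint suite; anything else would still trigger the heavier suites, which is the risk discussed in the next comment.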

@tchaffee
Contributor

@ValeraS That could be a good idea, but one risk is that something in one area could break something in another area. Someone needs to figure out what that risk is. Just for example, it's possible that a server-side or client-side change could make the curriculum no longer work. But maybe a change to the curriculum would never break the client or server? I don't really know; those are just examples.

I don't have any detailed insight, specific to this project, into what could go wrong with only testing what has changed. I'm just commenting at a very high level about how I've seen this cause problems with other projects in the past.

@raisedadead
Member Author

raisedadead commented Oct 29, 2018

How about running tests for modified parts only?

I am with Todd on this one. My experience says never to trust heuristic testing. That's what humans would do, and then they would guess wrong.

The CI/CD should always test everything. Always. That way you know your starting point when debugging.

But,

I can see, and have thought several times, that CI/CD is not really a setup for content editing. We are doing rather weird stuff by using git to version content. Anyway, the point being, for this to work in a sane, scalable manner, here is what we plan to do:

  1. Travis will run two types of tests.

    • Basic Unit Testing, or UT. Note: currently UT is curriculum/test/test-challenges.
    • Full User Acceptance Testing, or UAT. This is something that is non-existent in this project right now.
  2. Travis may NOT run on pull requests, but we will see about that after Hacktoberfest. We could potentially run the UT on them. I am undecided on this at the minute, pending more analysis of the data and contribution behaviour.

  3. Pull requests will run linting tests to check file formats, and we are building additional automation with a probot app to help us with heuristic testing, tagging the correct reviewer, etc.

  4. Travis builds will also be triggered every hour to sanity-check the PRs merged in that hour (the dev team is notified of the builds). These will ALWAYS be run FULLY (both UT and UAT) on the master or production-* branches. Broken builds mean we need to revert a few changes and notify the moderator who merged the PR.

  5. On production branches we build the entire app; this is where we move away from Travis. We use Azure Pipelines (more expensive but robust) to build the app from scratch and deploy it to a staging server. Here we perform the UAT (User Acceptance Testing), i.e. browser tests with Puppeteer or Cypress, etc. This could involve automated user testing to check flows, that the UI looks good, and so on (see the sketch after this list).
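
To make step 5 concrete, the browser-level UAT could start as small as a smoke test like the one below. This is only a sketch: the staging URL, the title check, and the file name are placeholders, and a Cypress version would look different.

```js
// uat/smoke.test.js: minimal Puppeteer smoke test, runnable with Mocha.
const puppeteer = require('puppeteer');
const { assert } = require('chai');

describe('staging smoke test', function () {
  this.timeout(30000); // browser start-up and page loads are slow
  let browser;
  let page;

  before(async function () {
    browser = await puppeteer.launch({ args: ['--no-sandbox'] });
    page = await browser.newPage();
  });

  after(async function () {
    await browser.close();
  });

  it('serves the landing page', async function () {
    // STAGING_URL would come from the pipeline; the fallback is a placeholder.
    await page.goto(process.env.STAGING_URL || 'https://staging.example.org', {
      waitUntil: 'networkidle0'
    });
    const title = await page.title();
    assert.include(title.toLowerCase(), 'freecodecamp');
  });
});
```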

Reasoning behind the approach, and potential enforcement of the PR workflow:

This is all with one simple criterion in mind: 99.9% of changes at the minute are to content, which needs linting.

Any change to code files (Gatsby apps, server code, tooling, packaging, anything that's not markdown) will need someone from the dev team to pull it, and only they can merge it. The probot app adds checks so that this can't be overridden.

Any change to the curriculum needs to pass the browser tests for UAT. This is something I am exploring at the minute, so that we can bring in some consistency. I am thinking we could potentially do an Azure build and deploy to a staging server every hour as well.

Until we have achieved all of this, we should not consider it safe to do heuristic testing on the CI.

@raisedadead
Member Author

Anyway, not to get away from the original issue at hand: we still need to vastly improve CI/CD performance. Every millisecond that can be saved counts a lot, considering we want to add more and more automation, and we want to bring in several languages as well. The testing needs to catch up with the growth.

@tchaffee
Contributor

I can see, and have thought several times, that CI/CD is not really a setup for content editing.

Is there any good reason we can't get away from this and use something like Wikimedia for the guide? It seems like using the wrong tool for the job is creating a lot of extra work for us, and clogging up Travis CI as well.

For the curriculum I could see still using github and PRs. The curriculum is used directly by the client and the server, and the curriculum has code in it in the form of tests.

@raisedadead
Member Author

Is there any good reason we can't get away from this and use something like Wikimedia for the guide?

We have proposed this internally within the team and it was turned down.

This was because @QuincyLarson was wary of us having to manage yet another platform. The desire was that we stick to a certain stack for deployments, which is within the budget for the non-profit.

For instance, if we were to use, say, WordPress or Wikimedia, which are good for content, we would need additional people on the team for DevOps, not to mention additional customization (both UI and access control) to unify the user experience.

We may consider rolling out our own on top of Netlify CMS, which fits neatly with GitHub and Gatsby, but that is again a future goal.

Our ideal goal is to move away from servers as far as we can, and at the same time have a collaborative, open editing environment.

@tchaffee
Contributor

This was because @QuincyLarson was wary of us having to manage yet another platform.

It's a worthy goal, so I understand the decision. Would now be a good time to reassess whether sticking with the same stack is really saving time?

Anyway... I'm happy to focus on getting what we have right now to work as well as possible.

I see two unrelated opportunities to improve what we have:

  1. Could we consider automatically approving and merging PRs for the guide? That's kind of how Wikimedia works. The review and approval process is a huge effort. We could just trust people to make good contributions, and then let folks find and fix things when that trust has been abused.
  2. Would it be possible to skip the Travis CI stuff altogether only when it's a guide PR?

@raisedadead
Member Author

Could we consider automatically approving and merging PRs for the guide?

I think we would still need human intervention, because I can see people making efforts to abuse the system. It's less rewarding in a system like Wikimedia, where contributions are not tied to your profile's contribution graph; the incentives here are rather high. So I would rather spend the time on getting the CMS to work, where contributions will be less incentivised and hence people making changes to the guide are genuinely making them. Anyway, that needs more debate, and I will have to get back with feedback on this particular question.

Would it be possible to skip the Travis CI stuff altogether only when it's a guide PR?

Absolutely yes. That's the intent. If the only changes are in the guide, then we should have the bot lint it. People editing directly on GitHub do not respect the formatting. Luckily, GitHub's API does give us a diff of the change, so I think it's possible to get away with this. I'll get back to you with the implementation details of this part shortly.
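
For illustration, the "list pull request files" endpoint of the REST API is one way the bot could detect a guide-only PR. The sketch below uses modern @octokit/rest naming (the probot context exposes an equivalent client), and the file name and helper are made up:

```js
// bot/guide-only.js (sketch): asks GitHub which files a PR touches and
// reports whether they are all under guide/, so CI (or the probot app)
// could skip the heavy suites for guide-only PRs.
const { Octokit } = require('@octokit/rest');

async function isGuideOnlyPr(owner, repo, pullNumber) {
  const octokit = new Octokit({ auth: process.env.GITHUB_TOKEN });

  // paginate() follows the Link headers, so PRs touching >100 files work too.
  const files = await octokit.paginate(octokit.rest.pulls.listFiles, {
    owner,
    repo,
    pull_number: pullNumber,
    per_page: 100
  });

  return files.every(file => file.filename.startsWith('guide/'));
}

// Illustrative usage:
// isGuideOnlyPr('freeCodeCamp', 'freeCodeCamp', 12345).then(guideOnly => {
//   if (guideOnly) console.log('lint only, skip the heavy CI suites');
// });
```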

@paulywill
Member

paulywill commented Oct 31, 2018

This was because @QuincyLarson was wary of us having to manage yet another platform. The desire was that we stick to a certain stack for deployments, which is within the budget for the non-profit.

Another takeaway from Hacktoberfest... although a [redacted] for the dev team and QA/moderators, I'm sure, the fact that all these users were submitting their first PR is huge. Just learning to navigate and contribute to a git repo is a useful skill set they're going to need.

@tchaffee
Contributor

@raisedadead Just another thought: the guide doesn't have much relationship to the three things that are highly related, namely the curriculum, the learn platform, and the api-server. Would it make sense to give the guide its own repo?

@paulywill
Member

@tchaffee Was that not what they just moved and archived?

https://github.com/freeCodeCamp/guide

@tchaffee
Contributor

tchaffee commented Oct 31, 2018

@paulywill Everything used to be in separate repos, which wasn't great; it made development more difficult than it should be. Using Lerna and putting related stuff in a mono-repo will make local development easier. But the guide has nothing to do with local development. That's why I'm suggesting that putting it back in its own repo might further improve the local dev experience.

@tchaffee
Contributor

tchaffee commented Oct 31, 2018

We should look at what other companies are doing, for guidance on best practices. Just for example, Facebook doesn't put everything React-related in its main React Lerna mono-repo.

https://github.com/facebook/react/tree/master/packages

create-react-app has its own Lerna mono-repo:

https://github.com/facebook/create-react-app/tree/master/packages

So the question is: when does it make sense to keep stuff in the same Lerna mono-repo and when do you use a new dedicated repo? I'd guess that one factor is ease of development, which points back to which packages depend on each other. You can't work on the learn front-end without the api-server running locally. You can work on the guide with nothing at all running.

@paulywill paulywill pinned this issue Jan 23, 2019
@raisedadead raisedadead unpinned this issue Jan 26, 2019