Improve Jest startup time and test runtime, particularly when running with coverage, by caching micromatch and avoiding recreating RegExp instances #10131

I was profiling some Jest runs at Airbnb and noticed that on my MacBook Pro, we can spend over 2 seconds at Jest startup in SearchSource getTestPaths. I believe this will grow as the size of the codebase increases. Looking at the call stacks, it calls micromatch repeatedly, which calls picomatch, which builds a regex out of the globs. The parsing and regex building also seem to trigger the garbage collector frequently. Upon testing, I noticed that the globs don't actually change between these calls, so we can save a bunch of work by building a micromatch matcher once and reusing that function for all of the paths. micromatch has some internal logic to handle lists of globs that may include negated globs. A naive approach of just checking whether a path matches any of the globs won't capture that, so I copied and simplified the logic from within micromatch (https://github.com/micromatch/micromatch/blob/fe4858b0/index.js#L27-L77). In my local profiling, this change brings the time of startRun down from about 2000ms to about 200ms.
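As a rough illustration of the idea (not the code in this PR), here is a minimal sketch of compiling a list of globs once and returning a reusable matcher that respects negated globs. `compileGlob` is a hypothetical, deliberately simplistic stand-in for picomatch that only understands `*`, `**`, and a leading `!`; the real change relies on micromatch/picomatch for glob parsing.

```typescript
type Matcher = (path: string) => boolean;

interface CompiledGlob {
  negated: boolean;
  regex: RegExp;
}

// Escape regex metacharacters, leaving `*` for the glob translation below.
const escapeRegex = (text: string): string =>
  text.replace(/[.+?^${}()|[\]\\]/g, '\\$&');

// Hypothetical stand-in for picomatch: supports only `*`, `**`, and a
// leading `!`. Real glob semantics are richer than this.
function compileGlob(glob: string): CompiledGlob {
  const negated = glob.charAt(0) === '!';
  const body = negated ? glob.slice(1) : glob;
  const source = body
    .split('**')
    .map(part => part.split('*').map(escapeRegex).join('[^/]*'))
    .join('.*');
  return {negated, regex: new RegExp('^' + source + '$')};
}

function globsToMatcher(globs: ReadonlyArray<string>): Matcher {
  // Parse every glob exactly once, when the matcher is created.
  const compiled = globs.map(glob => compileGlob(glob));

  return path => {
    let kept: boolean | undefined;
    let negatives = 0;
    for (const {negated, regex} of compiled) {
      const matched = regex.test(path);
      if (negated) {
        negatives++;
        if (matched) {
          kept = false; // excluded by a negated glob
        }
      } else if (matched) {
        kept = true; // included by a positive glob
      }
    }
    // If every glob is negated, keep any path that was not excluded.
    return negatives === compiled.length ? kept !== false : kept === true;
  };
}

// The matcher is built once and reused for every path.
const isMatch = globsToMatcher(['src/**/*.js', '!**/vendor/**']);
const matchedPaths = [
  'src/app/index.js',
  'src/vendor/lib.js',
  'src/app/index.ts',
].filter(path => isMatch(path));
```

In this sketch only `src/app/index.js` survives the filter: the vendor file is excluded by the negated glob, and the `.ts` file never matches a positive glob.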

After optimizing globsToMatcher, I noticed that there was still a lot of unnecessary overhead at Jest startup time spent recreating the same RegExp instances repeatedly. Thankfully, we can be a little smarter about this and create them all ahead of time and just reuse them. On top of my globsToMatcher optimization, this brings the speed of the ArrayMap in startRun down from about 160ms to about 7ms.
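The same "build once, reuse many times" approach can be sketched for plain regex patterns. The helper name `regexToMatcher` and the example patterns are assumptions made for illustration; this shows the general technique rather than the exact code in the commit.

```typescript
type PathMatcher = (path: string) => boolean;

function regexToMatcher(patterns: ReadonlyArray<string>): PathMatcher {
  // Build each RegExp once, when the matcher is created, instead of on
  // every call. None of them use the `g` flag, so `test` keeps no state
  // between calls.
  const regexes = patterns.map(pattern => new RegExp(pattern));
  return path => regexes.some(regex => regex.test(path));
}

// Hypothetical patterns for illustration.
const isTestPath = regexToMatcher(['\\.test\\.js$', '/__tests__/']);

const testPaths = [
  'src/a.test.js',
  'src/a.js',
  'src/__tests__/b.js',
].filter(path => isTestPath(path));
```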

I'd like to start using globsToMatcher in more places to improve performance, and jest-util seems like a better home for it, so I moved it there. Now that it is a standalone module, I wrote some unit tests for this function. In doing so, I uncovered a small difference between the behavior of this function and micromatch when overlapping glob patterns are used, which I also fixed.

While incorporating this function into more places, I discovered a discrepancy with how micromatch behaves when no globs are given. We can fix this by adding a fast path for when there are no globs at all.
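One plausible way such a discrepancy can arise, sketched under assumptions: if the combining logic treats "every glob is negated" as a special case, that check is vacuously true for an empty list, so every path would match. The `combine` helper and the choice to return `false` from the fast path are illustrative assumptions, not details confirmed by the commit message.

```typescript
type Matcher = (path: string) => boolean;
type CompiledGlob = {negated: boolean; isMatch: Matcher};

// Hypothetical helper: combines precompiled globs into a single matcher.
function combine(
  compiled: ReadonlyArray<CompiledGlob>,
  useFastPath: boolean,
): Matcher {
  if (useFastPath && compiled.length === 0) {
    // Assumption: with no globs at all, nothing should match.
    return () => false;
  }
  return path => {
    let kept: boolean | undefined;
    let negatives = 0;
    for (const {negated, isMatch} of compiled) {
      const matched = isMatch(path);
      if (negated) {
        negatives++;
        if (matched) {
          kept = false;
        }
      } else if (matched) {
        kept = true;
      }
    }
    // With an empty list, `negatives === compiled.length` is 0 === 0, so
    // this expression evaluates to true for every path.
    return negatives === compiled.length ? kept !== false : kept === true;
  };
}

const withoutFastPath = combine([], false)('src/index.js'); // true
const withFastPath = combine([], true)('src/index.js'); // false
```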

I've been profiling running Jest with code coverage at Airbnb, and noticed that shouldInstrument is called often and is fairly expensive. It also seems to call micromatch and `new RegExp` repeatedly, both of which can be optimized by caching the work to convert globs and strings into matchers and regexes. I profiled this change by running a set of 27 fairly simple tests. Before this change, about 6-7 seconds was spent in shouldInstrument. After this change, only 400-500 ms is spent there. I would expect this delta to increase along with the number of tests and size of their dependency graphs.
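A minimal sketch of the string-to-RegExp caching idea, assuming a module-level `Map` keyed by the pattern source. The names `cachedRegex` and `matchesAnyPattern` are hypothetical and are not taken from shouldInstrument itself.

```typescript
// Module-level cache: each distinct pattern string is compiled at most once.
const regexCache = new Map<string, RegExp>();

function cachedRegex(source: string): RegExp {
  let regex = regexCache.get(source);
  if (regex === undefined) {
    regex = new RegExp(source);
    regexCache.set(source, regex);
  }
  return regex;
}

// Hypothetical hot-path check that is called once per file.
function matchesAnyPattern(
  filename: string,
  patterns: ReadonlyArray<string>,
): boolean {
  return patterns.some(pattern => cachedRegex(pattern).test(filename));
}

const ignorePatterns = ['/node_modules/'];
const ignored = matchesAnyPattern('/repo/node_modules/a.js', ignorePatterns);
const notIgnored = matchesAnyPattern('/repo/src/a.js', ignorePatterns);
```

Because the cache is keyed by the pattern string, repeated calls with the same configuration reuse the same RegExp instance no matter how many files are checked.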

I was profiling some Jest runs at Airbnb and noticed that on my MacBook Pro, we can spend over 30 seconds after running Jest with code coverage as the coverage reporter adds all of the untested files. I believe that this will grow as the size of the codebase increases. Looking at the call stacks, it appears to be calling micromatch repeatedly, which calls picomatch, which builds a regex out of the globs. It seems that the parsing and regex building also triggers the garbage collector frequently. Since this is in a tight loop and the globs won't change between checks, we can greatly improve the performance here by using our new and optimized globsToMatcher function, which avoids re-parsing globs unnecessarily. This optimization reduces the block of time here from about 30s to about 10s. The aggregated total time of coverage reporter's onRunComplete goes from 23s to 600ms.
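To make the cost difference in a tight loop concrete, the sketch below counts how often globs get parsed when the parsing happens inside the loop versus once before it. `compileGlob` is a hypothetical stand-in for the real glob parsing (it only understands `*`), and the file list is invented for the example.

```typescript
type Matcher = (path: string) => boolean;

let compileCount = 0;

const escapeRegex = (text: string): string =>
  text.replace(/[.+?^${}()|[\]\\]/g, '\\$&');

// Hypothetical stand-in for glob parsing; counts how often it runs.
function compileGlob(glob: string): Matcher {
  compileCount++;
  const source = glob.split('*').map(escapeRegex).join('.*');
  const regex = new RegExp('^' + source + '$');
  return path => regex.test(path);
}

const files = ['src/a.js', 'src/b.js', 'src/c.css'];
const globs = ['src/*.js'];

// Before: the glob is re-parsed for every file in the loop.
compileCount = 0;
const slowResult = files.filter(file =>
  globs.some(glob => compileGlob(glob)(file)),
);
const compilesInsideLoop = compileCount;

// After: the globs are parsed once, and the matchers are reused.
compileCount = 0;
const matchers = globs.map(glob => compileGlob(glob));
const fastResult = files.filter(file =>
  matchers.some(isMatch => isMatch(file)),
);
const compilesHoisted = compileCount;
```

Both versions select the same files, but the first parses the glob once per file, while the second parses it once in total.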

The logic here might be a little confusing, so I am adding some comments that I hope will help make it easier for future explorers to understand. While I was doing this, I noticed a small way to simplify this function even more.

Commits on Jun 23, 2020

Update globsToMatcher.ts

cpojer authored Jun 23, 2020

Commit 03c8004

Commits on Jun 7, 2020

Commits on Jun 8, 2020

Commits on Jun 9, 2020
