Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Performance issues when using multiple threads due to regex scratch space contention #986

Closed
orf opened this issue Aug 12, 2023 · 0 comments · Fixed by #996
Closed

Performance issues when using multiple threads due to regex scratch space contention #986

orf opened this issue Aug 12, 2023 · 0 comments · Fixed by #996

Comments

@orf
Copy link
Contributor

orf commented Aug 12, 2023

There appears to be a lot of mutext contention when using libcst with multiple threads. I discovered this while parsing a lot of code across only 8 threads on my local machine. I was averaging about ~1.5k files per second with a flamegraph that spent 80% of the time inside TextPattern.match_len:

flamegraph-mutext

This is because the scratch-space for each regex defined in libcst is shared between all threads. As a proof of concept I switched the regex compilations to use thread locals, and immediately saw a ~10x performance jump to ~12k files per second:

flamegraph

This is because the pattern scratch-space is no longer shared between threads, and thus there is no contention.

I'm happy to make a pull request with these changes, but I wanted to spur a discussion here because the trade-offs are not clear. If you create/destroy a lot of threads then this change would have a negative impact, however if your threads are long-lived and you are doing a lot of individual parsing calls then this is a great benefit.

@orf orf changed the title Performance issues when using multiple threads Performance issues when using multiple threads due to regex scratch space contention Aug 12, 2023
@zsol zsol closed this as completed in #996 Aug 26, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

1 participant