guykisel changed the title from "check_case_conflict is slow in a repo with many files/directories" to "check_case_conflict is slow in a repo with many files/directories on windows" on Jul 9, 2021
@guykisel good to see you again! sorry we made this slower! that profile looks good, pre-commit will always call the tool with forward slashes and if git returns paths consistently then I think we can do an optimization similar to the one you've proposed
there may be a smarter way to do this as well (for instance grouping by the depth of things might improve speed?) but I think your proposed change is probably good enough
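A minimal sketch of what a string-based optimization along these lines might look like, assuming every path is relative and separated by forward slashes as described above. The name `parents_split` is hypothetical and chosen for illustration; it is not necessarily the code that was proposed or merged.

```python
from typing import Iterator


def parents_split(file: str) -> Iterator[str]:
    """Yield each parent directory of `file`, deepest first.

    Assumption: `file` is a relative path that uses '/' as its only
    separator, so plain string operations can stand in for os.path.
    """
    parts = file.split('/')
    parts.pop()  # drop the file name itself
    while parts:
        yield '/'.join(parts)
        parts.pop()


# Usage: collect the parents of a nested path (made-up example path).
print(list(parents_split('src/game/cards/card.json')))
```

The trade-off is that this skips the drive and separator handling that `os.path` performs, so it is only safe when the forward-slash assumption actually holds.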
Hello!
I recently updated to a more recent version of this repo for my team's git hooks and we noticed that check_case_conflict got a lot slower. I believe I've tracked down what's going on and I have a suggested improvement, though I'm not sure if it's a great approach.
For context, this is the Legends of Runeterra game's monorepo :) It's got around 1m files and some are pretty deeply nested in directories (a little bit more on this at https://technology.riotgames.com/news/legends-runeterra-cicd-pipeline if anyone's curious)
Testing done using Python 3.6.3 on Windows 10 with `pre-commit-hooks==4.0.1` and `pre-commit==2.13.0`
#575 introduced also checking directories, which is great! But this snippet ends up being slow because `os.path.dirname` is slow (at least on windows :( )

I used `cProfile` to run this against a single file:

results:
I messed with it a bit locally and this approach seems to perform significantly better:
But it doesn't really feel good doing this instead of using the built-in `os.path` stuff. I'm not really sure how correct/robust/multi-OS-compatible this is.

I'm curious if anyone has alternative ideas! But also if this is simply out of scope for this tool that's fine too :) Thanks for reading!
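For anyone who wants to try a similar measurement, here is a rough sketch that profiles the two styles of parent lookup with `cProfile`. It calls `ntpath.dirname` directly, which is what `os.path.dirname` resolves to on Windows, so the comparison can be run on any OS. The function names and the synthetic paths are assumptions made for illustration, and the numbers will differ from the real repo.

```python
import cProfile
import io
import ntpath
import pstats


def count_parents_ntpath(paths):
    """Walk up each path with ntpath.dirname and count the parents seen."""
    total = 0
    for path in paths:
        path = ntpath.dirname(path)
        while path:
            total += 1
            path = ntpath.dirname(path)
    return total


def count_parents_rpartition(paths):
    """Same walk using str.rpartition; assumes forward-slash relative paths."""
    total = 0
    for path in paths:
        path = path.rpartition('/')[0]
        while path:
            total += 1
            path = path.rpartition('/')[0]
    return total


def profile(func, paths):
    """Run func(paths) under cProfile; return (result, report text)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, paths)
    out = io.StringIO()
    pstats.Stats(profiler, stream=out).sort_stats('cumulative').print_stats(5)
    return result, out.getvalue()


# Synthetic data: 2000 files nested 10 directories deep (made-up names).
prefix = '/'.join(f'd{i}' for i in range(10))
paths = [f'{prefix}/file{n}.txt' for n in range(2000)]

for func in (count_parents_ntpath, count_parents_rpartition):
    result, report = profile(func, paths)
    print(func.__name__, 'visited', result, 'parents')
    print(report)
```

Both functions visit the same number of parents, so any difference in the reports comes from the cost of the lookup itself.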