Performance #34
Comments
Example of performance as it currently stands.
Initial tests did not look good. Simply adding fan-out to the work produced no speed improvement. Checking against a single file shows that the initial pass takes ~600ms, so there is a lot to be gained there as well. The whole pipeline probably needs to be reworked to achieve a speedup here.
Most of the time is spent in contains and index comparisons. It might be faster to move over to byte comparisons.
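A minimal sketch of what the byte-comparison idea could look like, assuming the tool is written in Go and the file content is already held as a `[]byte`. The name `containsAll` and the keyword list are hypothetical, not taken from the project. The likely gain is from skipping the `string(content)` conversion for each file rather than from the search itself, so this would need benchmarking before drawing any conclusion.

```go
package main

import (
	"bytes"
	"fmt"
)

// containsAll reports whether every keyword occurs in content.
// content stays a []byte (for example as returned by os.ReadFile),
// so no string conversion or copy is made per file.
// Hypothetical helper for illustration only.
func containsAll(content []byte, keywords [][]byte) bool {
	for _, k := range keywords {
		if !bytes.Contains(content, k) {
			return false
		}
	}
	return true
}

func main() {
	content := []byte("Permission is hereby granted, free of charge, to any person obtaining a copy")
	keywords := [][]byte{[]byte("hereby granted"), []byte("free of charge")}

	// Contains-style check across all keywords.
	fmt.Println(containsAll(content, keywords))

	// Index-style check: position of the first match, or -1 if absent.
	fmt.Println(bytes.Index(content, []byte("free of charge")))
}
```

Keeping the keywords as `[][]byte` up front means they are converted once at startup instead of on every comparison.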
https://blog.sourced.tech/post/gld/ Called out publicly... oh, it's on now. Time to double down on performance.
Although one of their goals is to "Favor false positives over false negatives (target data mining instead of compliance)", which I did not want to do.
http://web.archive.org/web/20180904032703/https://blog.sourced.tech/post/gld/ Updated link, because the original went away.
Although it seems to live on, somewhat, at https://github.com/go-enry/go-license-detector
https://github.com/src-d/go-license-detector/blob/master/licensedb/dataset.zip Link to a file for testing.
Performance could be a lot better through the use of fan-out. It might also be possible to speed up the matching by using byte comparisons rather than string comparisons. Both need investigating, as the tool can be quite slow at times.
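For reference, a rough sketch of the fan-out pattern in Go, assuming the per-file work is independent. `process`, `result` and `fanOut` are hypothetical names for illustration; `process` is a stand-in for the real matching. Fan-out only helps if that per-file work dominates the run time, so it is not a guaranteed win.

```go
package main

import (
	"fmt"
	"sync"
)

// result pairs an input path with the outcome of processing it.
type result struct {
	path  string
	score int
}

// process is a hypothetical stand-in for the real per-file work.
func process(path string) int { return len(path) }

// fanOut spreads paths across a fixed number of workers and collects
// the results. Result order is not guaranteed.
func fanOut(paths []string, workers int) []result {
	if workers < 1 {
		workers = 1
	}
	in := make(chan string)
	out := make(chan result)

	// Start the workers; each pulls paths until the input channel closes.
	var wg sync.WaitGroup
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for p := range in {
				out <- result{path: p, score: process(p)}
			}
		}()
	}

	// Feed the workers, then signal that there is no more input.
	go func() {
		for _, p := range paths {
			in <- p
		}
		close(in)
	}()

	// Close the output once every worker has finished.
	go func() {
		wg.Wait()
		close(out)
	}()

	var results []result
	for r := range out {
		results = append(results, r)
	}
	return results
}

func main() {
	results := fanOut([]string{"LICENSE", "COPYING", "README.md"}, 4)
	fmt.Println(len(results))
}
```

A fixed worker count keeps memory and open file handles bounded. Setting it close to `runtime.NumCPU()` is a common starting point for CPU-bound work.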