Increase go-git cache size #440
Comments
@jfontan how does this scale with increased memory in relation to repo size? The reason I am asking is that I would much rather have a significantly faster response and have a customer use 100 GB of RAM for a query. High-memory instances are cheap these days. |
@eiso, maybe it could heuristically assign a size, but the first step will be adding an option to change it. |
...or maybe we can start from something like 20% of RAM as default for gitbase |
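To make the 20% idea concrete, here is a rough sketch in Go. The helper name `cacheSizeFor` and the decision to never go below the current 96 MiB default are assumptions for illustration, not anything gitbase or go-git defines. Detecting total RAM is platform-specific, so the sketch takes it as a parameter.

```go
package main

import "fmt"

const (
	MiB = 1 << 20
	GiB = 1 << 30

	// defaultCacheSize mirrors the 96 MiB go-git default mentioned in this issue.
	defaultCacheSize = 96 * MiB
)

// cacheSizeFor returns 20% of totalRAM (in bytes), but never less than
// the current default. Both the name and the lower bound are assumptions.
func cacheSizeFor(totalRAM uint64) uint64 {
	size := totalRAM / 5
	if size < defaultCacheSize {
		return defaultCacheSize
	}
	return size
}

func main() {
	for _, ram := range []uint64{256 * MiB, 8 * GiB, 100 * GiB} {
		fmt.Printf("RAM %6d MiB -> cache %6d MiB\n", ram/MiB, cacheSizeFor(ram)/MiB)
	}
}
```

With a percentage-based default, small machines keep today's behaviour and large instances get a proportionally larger cache without any configuration.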
Maybe we should be able to configure cache size not only for |
@jfontan my question was more about whether the improvement from extra memory stays the same regardless of the size of the repository (which I am assuming is a yes) |
I'll try to implement it by adding support for a couple of environment variables, e.g.: I think it's more flexible (it can skip a couple of levels of passing options from one constructor to another) and it doesn't require any extra integration. |
@eiso unfortunately I've seen the speedup change from repo to repo, but I don't know the cause. Maybe some repos have a more complex commit tree, which makes iterating it more expensive. We would have to measure more to know for sure. |
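A minimal sketch of the environment variable approach, assuming a single variable that holds the cache size in MiB. The variable name `EXAMPLE_CACHE_SIZE_MIB` and both helper functions are hypothetical. They are not names that gitbase or go-git define.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
)

// cacheSizeEnv is a hypothetical variable name used only for this sketch.
const cacheSizeEnv = "EXAMPLE_CACHE_SIZE_MIB"

// defaultCacheMiB mirrors the 96 MiB default mentioned in this issue.
const defaultCacheMiB = 96

// parseCacheSizeMiB converts a value given in MiB to bytes. Empty,
// non-numeric, and zero values fall back to the default size.
func parseCacheSizeMiB(raw string) uint64 {
	mib, err := strconv.ParseUint(raw, 10, 64)
	if err != nil || mib == 0 {
		return defaultCacheMiB << 20
	}
	return mib << 20
}

// cacheSizeFromEnv reads the variable once. Any constructor could call it
// directly, without the size being threaded through intermediate layers.
func cacheSizeFromEnv() uint64 {
	return parseCacheSizeMiB(os.Getenv(cacheSizeEnv))
}

func main() {
	fmt.Println("cache size in bytes:", cacheSizeFromEnv())
}
```

The trade-off is that the setting becomes process-wide: every repository opened in the same process gets the same size.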
yeah, I thought about the lowest level. So, in this case |
@kuba-- You are totally right about If we go the env variable way, maybe changing the default value is enough to cover both normal repository opens and direct packfile opens. |
Exactly. This is my intention: to avoid tons of silly changes just to pass yet another static option. |
Maybe we should allow passing a cache instead of just cache size. That would allow:
|
Right now we have just one type of cache (LRU):
Moreover passing down go-git
|
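For reference, this is roughly what "passing a cache instead of a size" could look like. It is a standard-library-only sketch. The `Cache` interface and the `LRU` type here are hypothetical and much simpler than go-git's own cache package. The point is only that callers can share one instance or swap in a different eviction policy.

```go
package main

import (
	"container/list"
	"fmt"
)

// Cache is a hypothetical minimal interface a storer could accept.
type Cache interface {
	Put(key string, value []byte)
	Get(key string) ([]byte, bool)
}

type entry struct {
	key   string
	value []byte
}

// LRU evicts the least recently used entries once maxSize bytes are exceeded.
type LRU struct {
	maxSize, size int
	order         *list.List // front = most recently used
	items         map[string]*list.Element
}

var _ Cache = (*LRU)(nil)

func NewLRU(maxSize int) *LRU {
	return &LRU{maxSize: maxSize, order: list.New(), items: map[string]*list.Element{}}
}

func (c *LRU) Put(key string, value []byte) {
	if len(value) > c.maxSize {
		return // a value larger than the whole cache is never stored
	}
	if el, ok := c.items[key]; ok { // drop the old copy before re-inserting
		c.size -= len(el.Value.(*entry).value)
		c.order.Remove(el)
		delete(c.items, key)
	}
	for c.size+len(value) > c.maxSize { // evict from the back until it fits
		oldest := c.order.Back()
		old := oldest.Value.(*entry)
		c.size -= len(old.value)
		c.order.Remove(oldest)
		delete(c.items, old.key)
	}
	c.items[key] = c.order.PushFront(&entry{key, value})
	c.size += len(value)
}

func (c *LRU) Get(key string) ([]byte, bool) {
	el, ok := c.items[key]
	if !ok {
		return nil, false
	}
	c.order.MoveToFront(el) // mark as recently used
	return el.Value.(*entry).value, true
}

func main() {
	var shared Cache = NewLRU(768 << 20) // one instance could be shared by several storers
	shared.Put("object-a", []byte("data"))
	_, ok := shared.Get("object-a")
	fmt.Println("hit:", ok)
}
```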
Only I believe that for now we can benefit from being able to pass a cache to the storer as @smola says. We already have https://github.com/src-d/gitbase/blob/master/repository_pool.go#L68 So in the end we will only have to pass the cache in three places, two in We may need a better cache interface in the future but I would stick with the one that |
I also don't want to over-engineer it, so that's why I preferred env vars (but I see some mental no to env vars :)) |
So far, this issue is blocked till |
Right now there is no way of changing the default cache size in go-git, and its size is too small (96 MiB). I've been running tests changing this value, and performance improved a lot.
Repositories: linux (2013), numpy, tensorflow
Number of rows: 395709
Query:
SELECT count(*) FROM commits c NATURAL JOIN ref_commits r WHERE r.ref_name = 'HEAD';
Default cache:
1 row in set (54 min 22.17 sec)
Cache size * 8:
1 row in set (20 min 43.69 sec)
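As a quick check of what these two timings mean, the following Go snippet computes the speedup. It assumes "Cache size * 8" means eight times the 96 MiB default, i.e. 768 MiB. The helper names are made up for the example.

```go
package main

import "fmt"

// seconds converts a "min sec" timing, as printed by the SQL client, to seconds.
func seconds(min, sec float64) float64 {
	return min*60 + sec
}

// speedup returns how many times faster the second run was.
func speedup(before, after float64) float64 {
	return before / after
}

func main() {
	defaultCache := seconds(54, 22.17) // 54 min 22.17 sec
	largerCache := seconds(20, 43.69)  // 20 min 43.69 sec

	fmt.Printf("speedup: %.2fx\n", speedup(defaultCache, largerCache))
	fmt.Printf("cache size: %d MiB\n", 96*8) // assumes 8 x the 96 MiB default
}
```

That works out to roughly a 2.6x faster query for this set of repositories.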
Memory consumption is also not too big. gitbase used 1.3 GiB in this query.
We should add an option to go-git Open to select cache size.