Slow version calculation times on freshly cloned pull request #1850
Comments
#1838 should help, which will be included in the 5.1.0 release of GitVersion. If you want to test it now, you can download the build artifacts and give them a spin. |
@asbjornu We tried it with 5.0.2-beta1.77. You're right in that it improved, however it still seems to take an unreasonably long time - 7 minutes (compared to 1.5 - 2.5 for a "normal" branch).
We also tried with 5.0.2-beta1.95 with no further improvement. Given that the normalization from a detached head state only takes 17s, do you know what it's doing differently when calculating base versions? |
@hanzworld, since the author of #1838, @erikbra, has been looking at this code from a performance viewpoint recently, perhaps he has an idea of what the problem you're seeing might be? |
I did have a theory about simplifying how base revisions are found (see #1839), but I don't know exactly how it should be done. I just have a hunch that in a large share of cases (at least on our own codebase), the correct base revision is found after very few node traversals up the tree, and backtracking through the whole tree looking for versions is not necessary. That said, I am very humble about not knowing all the rules. I have started some work on caching what we read from LibGit2, to avoid reading from native code so many times, but it is only just started and I haven't got far. My theory is that in most cases, calculating the base version could be done in a few seconds (maybe sub-second), not hundreds. And you are right: on some branches (or PRs), calculating the base version is much heavier than on e.g. master. So, in conclusion, I have some ideas, but they are not verified yet, and the problem is not solved. I hope to be able to look more at it in the not-so-distant future. But, you know, work, life, and all that... ;) |
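To make the "few node traversals" hunch concrete, here is a minimal Python sketch of an early-terminating walk. It is a toy model, not GitVersion's algorithm: the `parents` and `tags` dictionaries, the commit names and the tag values are all hypothetical, and a real base-version search has more rules than "nearest tagged ancestor".

```python
from collections import deque


def nearest_tagged_ancestor(parents, tags, start):
    """Breadth-first walk up the parent links, stopping at the first tagged commit.

    Returns (commit, tag, visited_count), or (None, None, visited_count)
    if no reachable commit carries a tag.
    """
    seen = {start}
    queue = deque([start])
    visited = 0
    while queue:
        commit = queue.popleft()
        visited += 1
        if commit in tags:
            # Stop here instead of backtracking through the rest of the history.
            return commit, tags[commit], visited
        for parent in parents.get(commit, ()):
            if parent not in seen:
                seen.add(parent)
                queue.append(parent)
    return None, None, visited


# Hypothetical linear history c0 <- c1 <- ... <- c999, tagged at c0 and c997.
parents = {f"c{i}": [f"c{i - 1}"] for i in range(1, 1000)}
tags = {"c0": "1.0.0", "c997": "4.2.0"}

commit, tag, visited = nearest_tagged_ancestor(parents, tags, "c999")
print(commit, tag, visited)
```

In this toy history the walk stops after three commits even though a thousand are reachable, which is the behaviour the hunch describes. Whether GitVersion's rules allow stopping this early is exactly the open question.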
@erikbra, I really appreciate the time you're spending on this. With regards to caching LibGit2's data, please see #1243 and #1244 for @JakeGinnivan's attempt to do the same thing. I completely agree that building our own immutable in-memory model of the Git tree is something we should do to speed things up, reduce bugs and create a LibGit2-independent abstraction that serves GitVersion's needs. |
Thanks for the tips on the different PRs. I was thinking about just caching the IEnumerable from LibGit2 with the repo information, but maybe a full abstraction layer is a better way to go. |
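As a rough illustration of the "just cache the enumerable" idea, the following Python sketch materialises a lazy sequence once and replays it afterwards. It is only an analogy for caching an IEnumerable: `CachedSequence` and `read_commits_from_native` are hypothetical names, and the generator merely stands in for an expensive native read.

```python
class CachedSequence:
    """Wraps a factory of iterables and reads the underlying source at most once."""

    def __init__(self, factory):
        self._factory = factory
        self._items = None

    def __iter__(self):
        if self._items is None:
            # First iteration: pull everything from the source and keep it.
            self._items = tuple(self._factory())
        return iter(self._items)


native_reads = 0


def read_commits_from_native():
    """Hypothetical stand-in for an expensive enumeration over the repository."""
    global native_reads
    native_reads += 1
    yield from ("a1", "b2", "c3")


commits = CachedSequence(read_commits_from_native)
first = list(commits)
second = list(commits)
print(first == second, native_reads)
```

The trade-off is memory for time: the cached tuple is immutable and never refreshed, which suits a single version calculation but would need invalidation if the repository changed underneath it.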
@hanzworld (or anyone else) - is your repository public? Or do you know of any other public repository that demonstrates the same issues? I have access to a private one that has the same issues, but it would be really great to have a large, public one to run tests on, as it would make the discussions easier. |
Unfortunately no, my repository is an internal GitHub Enterprise
repository, and I don’t know of any others, although I’m guessing it’s
simply the 13 years of git history on a 2m line codebase that qualify it
for experiencing this issue.
I wonder if Linux or similar would be a suitable test candidate?
…On Wed, 16 Oct 2019 at 7:16 PM, Erik A. Brandstadmoen < ***@***.***> wrote:
@hanzworld <https://github.com/hanzworld> (or anyone else) - is your
repository public? Or do you know of any other public repository that
demonstrates the same issues? I have access to a private one that we have
the same issues with, but it would be really great to have a public, large
one to perform tests on, it will make the discussions easier.
|
I don't know if it's too big. Other stuff I can think of is the asp.net core codebase. But it might be too big as well. It should have a few branches, and ideally use different strategies for versioning (labels, commit messages, etc), to test the different strategies. |
@erikbra I'm afraid I don't really understand the inner workings of GitVersion. Would you be able to explain to me why this takes so long in a detached head state? Or more specifically, why even when we resolve the detached head state (e.g. create a branch) it still takes longer than a non-detached-head-initial-state scenario? |
The main issue is that GitVersion needs to understand what the source branch for your branch is. There is no way to know for sure because git is a directed acyclic graph of commits, and branches just point to a single commit. Take this example.
* feature2
| * feature 3
|/
* feature1
|
* master
Given this graph, what is the parent of feature 3: is it feature1, feature2, or master? We can easily look at that and say, well, of course I'd want master. What about this one?
* master
| * feature1
|/
*
In this scenario feature1 was branched off master, but now master has moved on. GitVersion still needs to figure out that it needs to use master for version calculation and not one of the many other branches which were branched off at different points in the graph. |
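The ambiguity in the first example can be reproduced with a small Python sketch. It assumes a toy commit graph (one hypothetical commit per branch: `m`, `f1`, `f2`, `f3`) and a naive "closest shared commit" heuristic. This is not how GitVersion chooses a source branch; it only shows why the graph alone does not settle the question.

```python
def ancestors_with_distance(parents, tip):
    """Map every commit reachable from `tip` to its shortest distance from `tip`."""
    dist = {tip: 0}
    frontier = [tip]
    while frontier:
        next_frontier = []
        for commit in frontier:
            for parent in parents.get(commit, ()):
                if parent not in dist:
                    dist[parent] = dist[commit] + 1
                    next_frontier.append(parent)
        frontier = next_frontier
    return dist


def candidate_sources(parents, branches, current):
    """For each other branch, the distance from `current` to the closest shared commit."""
    mine = ancestors_with_distance(parents, branches[current])
    result = {}
    for name, tip in branches.items():
        if name == current:
            continue
        theirs = ancestors_with_distance(parents, tip)
        shared = [mine[c] for c in mine if c in theirs]
        if shared:
            result[name] = min(shared)
    return result


# Toy version of the first graph: master <- feature1 <- {feature2, feature3}.
parents = {"f1": ["m"], "f2": ["f1"], "f3": ["f1"]}
branches = {"master": "m", "feature1": "f1", "feature2": "f2", "feature3": "f3"}

print(candidate_sources(parents, branches, "feature3"))
```

Here feature1 and feature2 tie at distance 1 and master comes last at distance 2, so a purely graph-based heuristic picks the opposite of what a person would want. Note also that this sketch walks the history once per branch, so its cost grows with both history length and branch count.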
Stop
…On Mon, Oct 21, 2019, 7:49 PM Jake Ginnivan ***@***.***> wrote:
The main issue is that GitVersion needs to understand what the source
branch for your branch is.
There is no way to know for sure because git is a directed acyclic graph
of commits, and branches just point to a single commit.
Take this example.
* feature2
| * feature 3
|/
* feature1
|
* master
Given this graph, what is the parent of feature 3, is it feature1,
feature2, or master? We can easily look at that and say well of course i'd
want master.
What about this one?
- master
| * feature1
|/
-
In this scenario feature1 was branched off master, but now master has moved on. GitVersion still needs to figure out that it needs to use master for version calculation and not one of the many other branches which were branched off different points in the graph.
|
This issue has been automatically marked as stale because it has not had recent activity. After 30 days from now, it will be closed if no further activity occurs. Thank you for your contributions. |
Hi team
We have a large repository (800MB) which experiences incredibly slow GitVersion-ing when run on pull requests. However we can't understand how to fix it.
Command:
gitversion.exe /output buildserver /nofetch /UpdateAssemblyInfo true
Symptom:
- When run on a pull request (refs/pull/9752/merge, which means the git repository is in a detached HEAD state), it takes approximately 30 minutes to execute.
- When run on a normal branch (feature/thing), the problem does not occur and it takes between 1m30s and 2m30s.
I've worked out it's to do with the fact the repository is in a detached HEAD state. I know that GitVersion goes through a process to remedy such a scenario, and we can see this process in the logs:
Begin: Normalizing git directory for branch 'refs/pull/9752/merge'
However, according to the logs it takes < 1 minute to normalise the git directory by creating local branches, and the remaining 20+ minutes follow the "normal" GitVersion process but very, very slowly. If we rerun the same process on the same repo (not freshly cloned) it runs very quickly.
Timings on a sample run (taken straight from logs):
We have tried the following with no difference.
Let me know what else you need - the log is 2MB, but I'm happy to sanitize it and make available if of use.