revset: add `fork_point` function #4795
base: main
441f09c to 6afd5b2
I don't think GCA is the term used for this, it's lowest (or sometimes last) common ancestor. I would probably be fine with |
Perhaps it's the "greatest" in terms of generation numbers? I feel both `gca` and `lca` are equally cryptic. `common_ancestor()`, `best_ancestor()`, `closest_ancestor()`?
Git calls it best common ancestor (via
|
6afd5b2 to 9264792
Title changed: `gca` function to `common_ancestors` function
c713fe9 to a1ced5b
Does anyone else have any thoughts on the naming? Also, should the name be in singular or plural form ( |
If we don't say "best", "greatest", etc., a singular form would suggest that it won't include all common ancestors. So I prefer singular |
Now that I think about it, I agree that |
That seems good, but isn't it too wordy? I'm okay with either |
Would |
This is an interesting point, I'm not familiar enough with graph theory (?) to say if the best common ancestors of an empty set of nodes is the root of the graph. |
I think the best common ancestor of a single commit should be itself, and the best common ancestor of an empty set ought to be none (it doesn't exist) |
Hmm, you might actually be right. We want the property that |
Something like |
How can a commit be its own ancestor? |
We simply define it such that a commit's ancestors include the commit itself. We do that because it's often useful in practice. Mercurial does the same. |
In the same way that you can define the ancestors of a person as:
which is consistent but gives more answers than a definition that excludes the first reflexive condition. It’s the same reason that the sum of one number is itself and the product of one number is itself, even though there’s no addition or multiplication going on there; similarly why we define the sum of no numbers to be 0, and the product of no numbers to be 1. (In mathematical terms, we’re trying to make this operation form a monoid, because that results in good properties.) |
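The base cases mentioned above (the sum or product of one number, and of no numbers) can be checked directly. This is only a minimal Python sketch; `fold` is a hypothetical helper introduced for illustration, not something from the PR.

```python
import math
import operator
from functools import reduce


def fold(op, identity, items):
    """Combine items with op, starting from the operation's identity element."""
    return reduce(op, items, identity)


# A fold over one element returns that element unchanged.
assert sum([7]) == 7
assert math.prod([7]) == 7

# A fold over no elements returns the identity of the operation.
assert sum([]) == 0
assert math.prod([]) == 1

# Because the empty case is the identity, a list can be split anywhere
# (even into an empty part and the rest) without changing the result.
xs = [2, 3, 4]
whole = fold(operator.mul, 1, xs)
split = fold(operator.mul, 1, xs[:0]) * fold(operator.mul, 1, xs[0:])
assert whole == split == 24
```

The last assertion is the practical payoff of the monoid-style definition: callers never need a special case for empty or single-element input.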
Or, to motivate it directly in terms of a defining property: the greatest common ancestor of a set of commits is the commit that all elements of the set are descendants of, such that there is no descendant of the ancestor that also satisfies that property. For a single commit, that’s the commit itself. (This still leaves the choice of the ancestor of an empty set somewhat arbitrary, though it already makes it clear that there can be no single unique choice for many commit graphs, which motivates |
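The defining property above can be sketched in a few lines of Python. This is not the implementation from the PR. The graphs, the `parents` mapping, and both function names are assumptions made for illustration, and the empty input is assumed to yield the empty set.

```python
def ancestors(parents, commit):
    """Return every ancestor of `commit`, including `commit` itself."""
    seen = set()
    stack = [commit]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents[current])
    return seen


def fork_point(parents, commits):
    """Return the common ancestors that have no descendant which is also a common ancestor."""
    ancestor_sets = [ancestors(parents, c) for c in commits]
    if not ancestor_sets:
        return set()  # assumption: no input commits means no fork point
    common = set.intersection(*ancestor_sets)
    # `common` is closed under taking parents, so a common ancestor has a
    # common descendant exactly when it is a parent of another member.
    shadowed = {p for c in common for p in parents[c]}
    return common - shadowed


# Hypothetical graph: B and C are children of A, and D is a child of C.
PARENTS = {"A": [], "B": ["A"], "C": ["A"], "D": ["C"]}
print(fork_point(PARENTS, ["D", "B"]))  # {'A'}
print(fork_point(PARENTS, ["A"]))       # {'A'}
print(fork_point(PARENTS, []))          # set()

# Hypothetical criss-cross: M1 and M2 both merge P and Q.
CRISS_CROSS = {"A": [], "P": ["A"], "Q": ["A"], "M1": ["P", "Q"], "M2": ["P", "Q"]}
print(sorted(fork_point(CRISS_CROSS, ["M1", "M2"])))  # ['P', 'Q']
```

The criss-cross case shows why the result is a set: both P and Q satisfy the property, so no single commit can be singled out.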
I think if it includes itself, it shouldn't be called ancestor. Commits don't appear out of thin air, do they? Wouldn't there at least be |
In graph theory, when talking about things like lowest common ancestor (LCA) you consider nodes to be descendants of themselves. |
It’s also the same case as the greatest common divisor of a single number being itself (which Python’s |
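The divisor analogy is easy to check in Python. One standard-library function with this behaviour is `math.gcd`; whether that is the function the comment had in mind is an assumption. Calling it with one argument or with none requires Python 3.9 or later.

```python
import math

print(math.gcd(12))      # 12: the gcd of a single number is the number itself
print(math.gcd(12, 18))  # 6
print(math.gcd())        # 0: the gcd of no numbers is the identity element
print(math.gcd(0, 12))   # 12: combining with 0 leaves the other value unchanged
```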
If this is simply for the name of a function in code, call it whatever and explain with a comment. |
It’s the standard graph theory terminology, not a VCS‐specific quirk, and the mathematically obvious definition. People anyway aren’t likely to call |
This is mostly a tangent, but the empty case is a bit iffy and you end up having to be careful in how you define what a common ancestor is (to avoid issues of vacuous truths). E.g. for a set of nodes " But this behaves poorly if If you make the (sensible) constraint that a common ancestor needs to be an ancestor of some node in |
Some more motivation of the base cases here: If we desire that

More pragmatically, a common practical use of this operation – Git’s

Base cases are hard. It wasn’t that long ago that people found the idea of zero very alien, and rejected formulas like n + 0 = n, n × 0 = 0, or the idea that you could sum up no integers at all and get 0, or take the product of no integers at all and get 1. But once you work out the rules it’s clear that there’s only one correct answer, and that skipping defining it because it seems weird and unintuitive at first reduces the power and convenience of your operations and forces you to introduce special cases elsewhere in the system.

(Corollary of base cases being hard: I might have the reasoning wrong here! But I’m pretty confident that returning the input for a single commit is the correct behaviour, that any other concrete choice would make things worse and break things you would reasonably want to do, and that declining to define it would make things more awkward and less useful in general.) |
There's an edge case that
That also sounds good. It would reflect the user's intent. |
Actually, maybe even just EDIT: But an important difference between them is that |
Yes, maybe we can add |
There doesn't seem to be a clear consensus and no one seems to feel strongly. To avoid this just getting stalled, I'm fine with @bnjmnt4n making the final decision on the name. Sounds good? Or do others feel like there's a name that's clearly best (and/or names that they really don't want)? |
a1ced5b to 3fee11e
Title changed: `common_ancestors` function to `fork_point` function
I like the suggested |
Maybe it makes sense to give the example of the two commit case (which I think is |
We have an "examples" section in the doc. |
This can be used to find the fork point (best common ancestors) of a revset with an arbitrary number of commits, which cannot be expressed currently in the revset language.
3fee11e to 4c5e8e5
* `fork_point(D|B)` ⇒ `{A}`
* `fork_point(B|C)` ⇒ `{A}`
* `fork_point(A)` ⇒ `{A}`
* `fork_point()` ⇒ `{}`
nit: maybe `fork_point(none())` so users don't think it can be called with no arguments
This can be used to find the best common ancestors of a revset with an arbitrary number of commits, which cannot be done currently.