-
-
Notifications
You must be signed in to change notification settings - Fork 158
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[RFC 0088] Nixpkgs Breaking Change Policy #88
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good initiative to take this further. Please do limit the line length to make it easier to comment.
The RFC does not go sufficiently into size/scope of PR's this RFC covers nor how it relates to the different development branches (master/staging/staging-next) we have. Motivations are missing for the specific steps in the procedure. Maybe they should be added there or in a separate section.
When talking about what is an important and unimportant change it would be very helpful to first have defined what is considered core and what not. I think it is difficult to define the policy well in this RFC without defining package tiers.
# Unresolved questions | ||
## Critical Packages and Tests | ||
|
||
What if a breaking change breaks NixOS tests? There must be packages and tests so critical that we can not merge without them passing? In that case do we leave the PR open until fixed? or use the `staging` branch? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is the scope of the change we are talking about? Staging is not for testing or dumping changes in the hope they get fixed; one is expected to follow their change and fix regressions especially when it has reached staging-next.
There are important NixOS tests and tests that are pretty unimportant.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What are your suggestions? Should we have a list of tests that can't be marked broken? In that case I think we would need to block the initial PR.
If that is desired I can hard-wrap it on the next push.
Right now I am focused on
This is being discussed here: #88 (comment) |
This specifies the expected workflow as a base for workflows and future automation.
Updated:
|
dependent packages involved in the review to review if the breakage is | ||
justified. | ||
4. The maintainer will contact the maintainers of all dependent, broken | ||
packages (herein called sub-maintainers). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
packages (herein called sub-maintainers). | |
packages (herein called sub-maintainers). | |
1. The sub-maintainers will be contacted preferably by the `github` field, | |
followed by the `email` field of [maintainer-list.nix | |
](https://github.com/NixOS/nixpkgs/blob/master/maintainers/maintainer-list.nix). | |
If neither of those fields are set than contacting the maintainer is not | |
required and the package should be marked as broken after 7 days. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we wait? Or remove this maintainer from the blocking list right away? What is the odds that they randomly notice that we are breaking their package?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall, I'd say what this RFC lacks most of all is a clear definition of a “breaking change”. From reading it, I get the impression that it is used as “a change which causes a build failure in a reverse dependency”. Clarification of this would be important in my eyes.
Very important would be to clarify that this policy doesn't apply to the introduction of (partial) evaluation errors by e. g. changing an internal interface since we can't work around such a problem using meta attributes and the channel would be held back until all breakage would be fixed.
Additionally, I think we can not just ignore breakage for users, as outlined in a comment.
6. The maintainer must merge the target branch into their PR branch. | ||
7. The maintainer must mark all still-broken packages as | ||
[broken](https://nixos.org/manual/nixpkgs/stable/#sec-standard-meta-attributes). | ||
8. The maintainer can now merge to the target branch. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe there should be someone with commit access assigned to managing this process if the maintainer of the derivation that causes the breakage doesn't have commit access, so we can guarantee these relatively strict time frames.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am slightly hesitant to giving special privileges to changes that require dependent updates but maybe this is justified as packages with more dependents are likely more important.
I am also hesitant to put the full burden of merging these changes on an individual.
I guess this is blocking if this policy puts additional difficulties to getting the change merged. But as I see the additional burden is on the author/maintainer, not on the actual process of getting it merged.
The RFC is open for shepherd nominations btw |
Co-authored-by: asymmetric <[email protected]>
Nominations for the shepherd team are welcome! @kevincox maybe you have someone in mind who you think could help move this forward? |
@blaggacao would you be interested and able? |
From RFC36 about Shepherd Team:
I think I can do that and accept nomination. |
@nrdxp you have valuable experience (bors, hydra) and also afaics aligned interests to further this RFC, would you accept nomination for Shepherd? |
We still need more nomination for this RFC to proceed. |
Shepherds Meeting, Monday, 23th of August 2021
In addition to the above discussion the existing feedback was summarily reviewed. Most of it was reported as addressed. I'd encourage the parties that have given feedback in the past to revise the document under the light of the above discussion and the recent changes and potentially renew their questions / observations to the author. Otherwise, in my opinion, we might be slowly approaching FCP. Please also consider expanding on the broader context of this RFC in the author's recent blog post: https://kevincox.ca/2021/08/21/nixpkgs-ci/ |
I think most contributors who have tried to push for large changes, and have needed to wait for input from many individuals will agree that some convention about a grace period is needed. In general, this should only affect PRs against staging, as most PRs against master can generally be tested more thoroughly, and the effects of the changes usually carry much lower risk of breaking downstream packages (because there's significantly less of them). I think we should move this into FCP. |
Co-authored-by: Jörg Thalheim <[email protected]>
Co-authored-by: Théo Zimmermann <[email protected]>
@jonringer has the shepherd team agreed to move this to FCP with disposition to accept? If so, let us know and we'll update the label. |
@lheckemann I confirm the shepherd team has agreed to move this to FCP with disposition to accept. |
sorry, got buried under other notifications, yes. We would like to move forward, with disposition to accept. |
We are now in the Final Comment Period (FCP). Unless any major objections are raised, this will be accepted and merged in 10 days from now, on November 13th! Please post any final comments on this pull request. |
This pull request has been mentioned on NixOS Discourse. There might be relevant details there: https://discourse.nixos.org/t/rfc-0088-fcp-nixpkgs-breaking-change-policy/15876/1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree with pretty much everything proposed here. It's good to have some formal guidelines and documentation to point users to. Two considerations:
-
I think the grace period shouldn't be applied strictly. It should be a guideline or made flexible depending on the importance of the change and the sub-package: if a sub-package is not essential but important (some popular library, application, etc.) I'd rather delay a bit the update than to break it.
-
I fear this could increase the burden on maintainers of mid to large packages (I mean those between being a leaf and core package). Those maintaining core packages are probably already well equipped and eat mass-rebuilds at breakfast, but not everyone else. It would be good if ofBorg, nixpkgs-review or some other new tool would automate doing some of the step in the proposed workflow before this RFC is enforced:
- detecting broken packages
- producing a list of contacts to ping the sub-maintainers
- keeping track of who replied
- marking packages as broken
5. The maintainer must provide a grace period of at least 7 days from when the | ||
sub-maintainers were notified. | ||
6. The maintainer must merge the target branch into their PR branch. | ||
7. The maintainer must mark all still-broken packages as |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's not clear to me if "still-broken" packages refers to all the packages or just those in the sample.
In any case, marking many broken packages (even if <100) for every PR seems like a lot of work: it would be good if this could be facilitated by a tool like nixpkgs-review.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Eventually it should be all packages. My not-so-secret goal is that one day every single package will be considered "channel blocking" unless it is marked broken, basically meaning that there are no unintentional breakages.
It is expected that tooling will manage this. This policy is not going to be enforced until we have made it reasonably easy to comply. This RFC is a first step to agree agree upon a policy first, then we can build up the tooling to implement the policy.
that a critical output file was not forgotten) Note that sometimes it **is** | ||
necessary to break all dependent packages, and the maintainer is not required | ||
to avoid this. | ||
1. A sample of at least 100 packages or 3h of build time is recommend as a guideline. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This may be a bit too demanding (for potato hardware like mine), but I guess it's ok for a guideline.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm okay with committers and active contributors to use my server: https://github.com/jonringer/server-configuration . If you're worried about compute. Since I'm not opening this up to the general public, it's not a solution, just a mitigation; but the offer is there.
Obviously, this is guideline is meant "within expectations" of a user's ability. Having some sort nixpkgs-review
does make it a lot easier for reviewers to merge.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is why we added the time limit and called it a guideline. The idea is even if you only own a potato we hope that you can spare 3h of your potato or find someone else to do so for big changes. If you think 3h is still to muck to ask I'd be happy to consider other thresholds, it was picked mostly arbitrarily to provide a rough target.
Sounds like nothing FCP-breaking has happened. |
|
||
- Maintainers of dependencies have a clear framework for handling changes that | ||
break dependants. | ||
- Maintainers of dependants have a clear framework for how dependency breacking |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- Maintainers of dependants have a clear framework for how dependency breacking | |
- Maintainers of dependants have a clear framework for how dependency breaking |
- Avoid putting more burden than necessary on the dependency maintainer. If the | ||
maintainers of core derivations face toil proportionally to the number of | ||
transitive dependencies they will quickly become overloaded. These | ||
maintainers are arguably the most critical to Nixpkgs and their load needs |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maintainers are arguably the most critical to Nixpkgs and their load needs | |
maintainers are arguably the most critical to Nixpkgs and their load needs |
* Create RFC for nixpkgs Breaking Change Policy This specifies the expected workflow as a base for workflows and future automation. * Update rfcs/0088-nixpkgs-breaking-change-policy.md Co-authored-by: asymmetric <[email protected]> * Reformat metadata * Update and clarify. - Highlight the high-level goal. - Highlight the low-level goal of this specific RFC. - Wording fixes. * Implement suggestions from shepherds meeting. * Set shepherd leader Co-authored-by: Jörg Thalheim <[email protected]> * Typo fix Co-authored-by: Théo Zimmermann <[email protected]> Co-authored-by: asymmetric <[email protected]> Co-authored-by: Linus Heckemann <[email protected]> Co-authored-by: Eelco Dolstra <[email protected]> Co-authored-by: Jörg Thalheim <[email protected]> Co-authored-by: Théo Zimmermann <[email protected]>
This specifies the expected workflow as a base for workflows and future automation.
Rendered