-
-
Notifications
You must be signed in to change notification settings - Fork 5
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Turning on branch protection for all tracks #174
Comments
Are there any tracks with a single maintainer / mentor that would not have anyone to approve their PRs? Overall what you propose sounds fine. |
There are quite a few tracks where that is a problem. I'm working on a dashboard to help with this, and am hoping to make a team of "guardians" who can help mitigate the problem until we can actively go find more maintainers for these tracks. |
My support with no reservations
My historically bad experiences with this setting cause me to wish it were not necessary, but I know that it is. I have no better suggestion that still achieves the desired goal. My support, with a wish that things could be better: I wish the up-to-date setting could take advantage of the fact that Travis CI checks the result of merging each PR with master. Therefore, if I could make a feature suggestion for GitHub, it would be to detect whether Travis CI's merge check was against an up-to-date master, as an alternative to having the branch be based off an up-to-date master. I understand that this suggestion that does not help when there are N PRs that were made simultaneously since then the result of merging them with each other is not checked unless the builds are explicitly restarted after each one is merged. The feature would ideally take that into account and allow it if the build was explicitly restarted. Since I see the setting is a necessary one, when it is a problem (>= 3 PRs simultaneously in flight) I have historically dealt with the problem by octomerging. It might be the case that I will be doing that a lot more in the future. It would therefore also be interesting if GitHub had a UI for octomerges... but that seems very unlikely, given their increased complexity. No feature request listed in this comment is a blocker for requiring up-to-date branches. |
@petertseng these are all great observations.
One of the problems with this is that this is an external integration. The only part of this that is under GitHub's control is the final status of, not what happens to set that status (e.g. you could have a bot that sets a green check mark and says your commits are bananas!, and all GitHub knows is that the status is good). Now... we could likely write some automation around this if we wanted to, given that we know which exact part of the Travis integration we care about. That said, the N PRs in flight thing is a problem and we've actually run into this where two PRs both pass the tests individually and against master, but are incompatible with each other. So yeah. I see your point, and I also don't see an immediate way around that. I am going to think about whether we could write an automation for the case that you describe, though. |
We had I do hope, he can help me to get the nextercism tree right. But also, when I was the only maintainer, and we didn't even made reviews necessary, waiting for travis was quite a burden, especially when we were operating on the 5 builds limitation, because of automated mass-PRs through blazon. I often had to manually set alarms or reminders on a third party service or my mobile. I missed some of them, which really slowed everything down. Therefore, I have to say, that this might really be a show stopper or at least slow downer for "undermaintained" tracks. Especially when we require reviews (who shall review PRs of the sole maintainer?) We shouldn't activate this setting before it is ensured that there will always someone available for reviews! |
I also see this as a problem on branches where there are 3 maintainers when 2 of the tree maintainers may be swamped. Has the "social contract" of not pushing to master except for trivial commits been problematic? As it is, only those with "maintainer" privileges (assumed, not tested) are able to do so, and on the tracks that I am on, I have not seen this abused. |
@NobbZ This is a very good point. I'm going to think about this one for a bit.
Not in the tracks with seasoned, experienced maintainers, no. In new tracks (without an experienced buddy), very much so. |
I'd also add that a small mistake in a trivial commit could have a much bigger effect now. For example, someone misplacing a couple of brackets and therefore mis-structuring the Again - very unlikely with seasoned maintainers/devs (but I wouldn't always bet on myself to not make mistakes!), more likely with less experienced. |
I know I don't trust myself. I'll pay close attention every single time, except the time when I... don't for some reason. |
As a side note, I'm going to take a look at what we can do to speed up Travis (use caching where possible, abort builds as soon as possible if we know we need to build a newer one, etc). |
If you're interested, the Java (and Kotlin) tracks recently looked into ways to speed up our CI builds; see exercism/java#410 for some discussion. exercism/java#326 is an issue we have identified but are yet to fix; it could save significant time when the build is broken if addressed. |
I would guess that a time when speeding up Travis is helpful is when a PR is made to every track. Here is a thing that has happened when this has been done, historically:
If Travis CI would not build the branch, the number of Travis builds would be cut in half. I thought of two ways to achieve that.
|
Also - we should investigate what it would take to up the limit from 5 to x. Rust-lang has 15 according to this post and could pay for more. Do we know anyone at Travis we could reach out to to ask what the max free level could be, and how much it would cost to extend that? |
I think it were best if @kytrinyx just asked for a sponshorship at travis. Perhaps they can tune us to 10 builds in exchange for a small badge? |
Update: octomerge still possible if "build branch updates" is off: https://github.com/petertseng/octotest/pull/6 |
This is a great observation. To me it makes no difference where I make the PR from, I script everything anyway.
Yeah, this is a good point. I've never thought about it much. I have thought about how to clean up old branches (as evidenced here: https://splice.com/blog/cleaning-git-branches/), but not how to avoid them in the first place. I'm pretty sure I'm convinced you're right.
We could, but I see no reason why I (and others) shouldn't make the PRs from forks. That would solve half the problem.
I'll ask the folks at Travis! I want to do my homework first, though, and do all that I can to speed things up first. Then if we get some extra cycles, we'll be making them count instead of wasting them. |
We were talking about branches as well, instead, on the Delphi track. Having master for the base, but push / merge to Publish branch for active web display. |
@kotp you mean for the exercism.io website? I could see this complexity to the backend, and I'm not sure the trade-offs are worth it. I'd be open to hearing the pros/cons, though. |
@rpottsoh has some thoughts, not quite hashed out yet, about potentially double publishing from "master" view and "published" view. Where any changes that make it to master can be viewed before going to the 'normal' view that users are aware of. Gives a safety check, of how everything works together before going 'live', while still in a live way. But not really hashed out yet. |
@kotp @rpottsoh it sounds like what you're thinking about is some sort of way to verify things in a staging environment. Rather than something that involves automatically having multiple main branches (master for staging, published for production), I would like to encourage people to think about what it would take to build a stand-alone track-visualizer that could be used with any branch on the repository on GitHub. I think that would be more light-weight and potentially more useful than a full-on staging set-up. |
I think so essentially. Instead of the site pulling from Master it would instead pull from Published. Whenever Master is updated a maintainer would go to "http://exercism.io/test/Ruby", for example, or some other flag so that the site they arrived on would be utilizing their tracks Master branch. If no ill affects were observed the maintainer would then advance Published forward to Master.... That is the basic premise of what @kotp and I were discussing yesterday. |
My gut reaction to having a full-fledged master/published pipeline is that this sounds like a lot of work for maintainers. My suspicion is that for the most part, people will be fine with going straight to production, so long as they've reviewed the pull request. There are always some changes that we're more wary of, and I'd love to find a system that lets us verify these without imposing a change across the board on our workflow. |
FWIW I've had success with git hooks before, e.g. we could set it up such that you can't push to master unless the tests pass and the configlet exits with |
@yurrriq I've also had great success with git hooks—my problem with them is that it's too easy to not have everyone using them. |
@kytrinyx, right. I feel like that could be a pro and a con though 😄. Pros: simple, easy to use. Cons: simple, easy to ignore. |
@yurrriq too true! 😂 |
I forked all the track repositories and then discovered that you can turn off branch updates on Travis CI. Now I've updated all the Travis settings for all the language tracks to the following: This should improve things somewhat. There are some great suggestions in the Java thread that @stkent linked to (exercism/java#410). The two that jump out at me are:
|
Running only those tests that seem to be necessary might be hard to detect. We had to scan the git tree since the last forking from master, see which files were altered and associate them with an exercise. Those detections might be error prone and take probably as much time as running and building the tests. Per my observation, often the culprit lies in downloading and compiling dependencies. For some languages we need to repeat this even per exercise. If those languages were able to use a local cache instead then a detector wouldn't need to be created anymore but tests were much faster than before. Some tracks even require compiling and installing a complete language (as idris), since they aren't officially supported by Travis. Those languages should be somehow ported to Travis as a possible base language. |
This is true, but I think that we want to build PRs. We just want to build them once, not twice (as in exercism/haskell#580 - I see "2 checks passed", "continuous-integration/travis-ci/pr The Travis CI build passed Details", and "continuous-integration/travis-ci/push The Travis CI build passed Details") |
This is correct, and desirable. What we were seeing was that we would create a branch on the repo (not a fork), push to the branch (the branch gets built), create a PR (a PR gets built). We were trying to eliminate the first build. I don't want people to have to do manual work, but I also don't want to have twice the number of builds. It seems unnecessary to make everyone create forks. I would like to suggest the following settings.
This is similar to my first suggestion, but I've removed the bit about "at least one approval" because we don't have enough maintainers to make that possible. Another option for master is to do what @Stargator suggested in the Dart track and add However, I really don't want us to be pushing to master, both for the sake of visibility (notifications happen on PRs not on pushes to master) and for the sake of safety (v2 updates the website based on master automatically and immediately. The site would break even before a build completes, if we get something wrong). |
Readers who only wanted to see that summary can stop reading; the below only expands on the above two points. My understanding of the properties we desire:
The suggestion:
maintains both properties. I strongly dislike the "be up-to-date with master" setting, but I have no alternative that maintains the desired properties. My suggestion of turning "Build branch updates" on for all branches has no effect on Safety, but interferes with At-most-once because everything will get built twice and maybe thrice.
The second build is what we desire in all cases, but the first and third are not. The suggestion of exercism/dart#35 (comment) (Turn "Build branch updates" on but use At this point the only disadvantage I have left to offer of leaving "Build branch updates" off is that any Travis CI badge (The ones that say "Build: passing", see e.g. https://github.com/exercism/ruby which uses https://travis-ci.org/exercism/ruby.svg?branch=master) are relatively useless since we will not have any branch with up-to-date builds for that image to use. Perhaps the issue can be remedied by using https://docs.travis-ci.com/user/cron-jobs/ to build master periodically. I will trial this immediately. The only alternatives I have to offer to preserve Safety:
On using forks versus Exercism's copy of the repo:
Well, every would-be contributor who doesn't have write permissions to the repo must create a fork anyway. In either case, I still think what I say in #174 (comment) applies:
I don't have any force behind those bullet points or any strong principles on why they're a good idea; just vague feelings of "if there are branches on a repo that people might clone, they should only be branches that people would benefit from having", and I think a lot of the branches we've made are those that people do not benefit from having. |
As a contributor on GitHub, I'm used to forking repos and keeping up work. |
@petertseng thank you, this is an excellent summary.
I'd like to understand this stance better. What do you dislike about it?
I understand the technicality of it, however I'm not certain I understand why these branches are a burden. I do my best to prune and keep unnecessary/dead/finished branches to a minimum. If we had hundreds or (as I've sometimes seen, thousands, many of which are years old) then yes—definitely a problem. But if there are a handful of branches, I don't understand what problem they're causing. Would you please help me see your point of view?
Yes, that's completely fair. That said, most contributors only need to fork and clone one or two or at most a small handful of repositories. I think that if I just needed to create one or two repositories I wouldn't mind, but I need to have a fork of 60 different repositories on my personal account. I did this for a while, and invariably I would miss something, even when I scripted things as much as possible. Every time I bootstrap a new repository, I must also fork it to my personal account. I need to make sure to keep master on my local copies up to date with 60 different upstream repositories, and also update my 60 forks. In my case it adds a not-insignificant additional burden in order to do janitorial work that I most often wish I didn't have to do in the first place. |
Consider the case when I have N PRs simultaneously in flight. Without "be up-to-date with master":
With "be up-to-date with master", we have two choices:
[1]: It does not matter whether we rebase on master, or use GitHub's "Update from master" (merges master into the branch) with the intent to squash later; the parallel -> serial still applies. As you can see, although it is necessary for Safety, it does slow things down. As long as I can still octomerge so that I'm only slowed down by O(1) instead of O(N), I accept the slowdown for the sake of Safety. If there is any setting that prevents my ability to octomerge, I will not be able to accept the slowdown. (You may look at #174 (comment) for a past discussion, but I have put the relevant information here as well, so you do not need to)
Me neither. Are they? I can speak for myself. I don't like to keep old branches around, and maybe I don't like to see other people old branches around. Let's say we do development on exercism's branch B. Contributor C forks the repo during that time, and now they have branch B. We merge branch B and then delete it. Ah, but Contributor C still has branch B on C's fork, and there is no way to delete that unless C explicitly takes action. So, is that a problem? I know such a thing bugs me personally, but for the purposes of this discussion maybe it's only a problem if Contributor C cares about it. We haven't heard anyone complain, have we?
You would have to do this no matter whether your local copy's
I personally don't think it's necessary to keep the master branch of the fork (as it appears on GitHub) up to date. It's necessary to keep it up to date locally, of course. I set up my local repos such that the master branch (and only the master branch) pulls from Exercism's copy.
On this point, it does make a difference. (I have no additional comments to add but am just pointing this out to contrast with the above point)
I wondered if it would be too sardonic to say "then just don't do it", but I know that not doing it is not an option when the change blocks a configlet release or a website change, so I have no alternative to offer. So, I really don't have any force behind the suggestion, and I fear my reasoning just boils down to "it bugs me but I can't explain why". Making branches on Exercism's repo bugs me at about a level of 4/10, so if making forks bugs you at any level >= 4, I take your word for it. |
It's not an option. I'm already a huge bottleneck. I'm trying my best to make it so that I am not, but it's a long, long road.
That's completely fair.
I'll think on it some more. I really appreciate you taking the time to explore the nuances in this. |
I am trying to see if someone (well, someone on the internet) has written anything on the subject matter, but I have not yet been able to find anything. I may have inherited my attitude from someone on my team who says "I want the main repo's branches to be pristine so people looking at it don't get confused by the branch list", but I suppose we don't have so many branches as to make that a concern. I asked about that and heard in response that a certain project really wanted to ensure that their main repo has as few refs as possible so that it wouldn't be so slow to clone. Again, I don't think we have that many branches that it's a concern. It seems that these concerns are useful in larger repos, but not as important here. So I guess that's why it bugs me without me being able to explain why. Since there's less mental overhead for me if I keep the same workflow for all the projects I work on, it's why I continue to use forks for Exercism, but the same doesn't necessarily apply to anyone else. |
Here's the thing, GitHub won't let you merge in changes if there are conflicts with the PR and master. The setting to be "up to date with master" is meant to avoid conflicts. But any PR reviewer should see the notification by GitHub that there's a conflict. Given then, unless git detects a conflict I think it is fine to work without the "up to date with master" setting. My team has been using GitHub for just two years. At some point, we started using the setting, then switched it off because GitHub prevents PR merges (through GitHub, not command-line). So maybe consider that. |
I agree that this statement is true.
I disagree that this statement is true... depending on the definition of "conflict". It doesn't do anything with regard to conflicts in the |
Yes, this is exactly it. We've run into this problem on several occasions, where the conflict is conceptual, not an actual line/merge conflict. |
Thank you for elaborating, @petertseng. In this context, where the repos are quite small and the branches are low in overhead both mentally (looking at the list of branches and seeing what's going on) and resource-wise (in terms of cloning), I'm going to go ahead and make the call that maintainers can (but are not forced to of course) work on branches on the main repository. @petertseng -- the proposed settings would still allow octomerge, correct? |
As far as I can tell, yes. I have tested this on a repo where I changed the settings to those proposed. |
Here is a potential concern with "require branches to be up-to-date with master". The potential impact increases as the repo has more simultaneous PRs in flight: I wonder if this will tempt some people to use the "Update branch" button, making a merge commit, and then also use "Merge pull request" that makes another merge commit. Again, this bugs me but I'm not sure if I should be bugged by it. The best explanation I can come up with is that I want at most one merge commit, because I think having two is redundant and I like to avoid redundancy when possible. I like one merge commit if the PR has multiple commits since it acts as a record of what PR the commits came from, but I don't really have a justification for two merge commits. There are many ways that avoid doubling up on merge commits. So the concern is mitigated if we are aware of the below ways.
|
This bugs me too, but I've decided to live with it, because it gives me safety (in other words, it's not truly redundant, because it provides safety, but it feels redundant, because it has extra merge commits).
Yes, that's correct. I think we do not need to make this a required setting for that repository. |
Frankly, I use Squash and Merge. A PR shouldn't be so big that the effort needs to be broken up into commits. If so, then the settings could be changed to allow whatever approach seems best at the time and then revert back. |
We've imported this issue to the https://github.com/exercism/exercism.io repository. |
Background
A couple of months ago @nicolechalmers started doing sketches, designs, and wireframes based on the design documents that Thalamus and I have spent the past 6+ months working on. More recently, @ccare and @iHiD started implementing a prototype based on those designs. We've made some design decisions that we believe will be a vast improvement over the current website, particularly with respect to how changes in track repositories are reflected on the site.
Current Situation
In the current website, track repositories are all submodules in a Ruby gem which has some additional logic related to tracks and exercises. This means that we need to
In other words, this involves committing to three different repositories and deploying two different codebases.
There are a lot of places where this can go wrong, not to mention I have to actually deploy. Sometimes I get busy, and which point all the hard work that all the maintainers do gets held up.
That's not acceptable.
Changes in Nextercism
In the nextercism prototype we're working directly with the tracks' git repositories themselves. We will be updating these continuously, so any changes to the tracks should be reflected on the website near-instantaneously (well, it might take a minute, but probably not much longer than that. Compared to the day-to-week lag that you all are used to operating with, this will probably feel like a bit of a miracle).
On the other hand, this means that if we screw up master in any way on a track, that will be reflected on the website immediately.
Proposal
So. In order to protect the website I wish to turn on branch protection for the master branch of all tracks. That will mean that:
This means that even trivial changes will need to go into a branch and a pull request, and be approved by one of the other maintainers.
This might be a little bit annoying, but I think it's worth it.
If you have any counter-arguments or can think of anything I've missed, or other ways to protect the website from bad changes, I'd love to hear your thoughts.
If you think this seems reasonable but don't have anything to add to the discussion, please add a 👍 reaction.
The text was updated successfully, but these errors were encountered: