Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Turning on branch protection for all tracks #174

Closed
kytrinyx opened this issue Jul 23, 2017 · 55 comments
Closed

Turning on branch protection for all tracks #174

kytrinyx opened this issue Jul 23, 2017 · 55 comments

Comments

@kytrinyx
Copy link
Member

Background

A couple of months ago @nicolechalmers started doing sketches, designs, and wireframes based on the design documents that Thalamus and I have spent the past 6+ months working on. More recently, @ccare and @iHiD started implementing a prototype based on those designs. We've made some design decisions that we believe will be a vast improvement over the current website, particularly with respect to how changes in track repositories are reflected on the site.

Current Situation

In the current website, track repositories are all submodules in a Ruby gem which has some additional logic related to tracks and exercises. This means that we need to

  • ... in the gem:
    • update all the submodules
    • bump the version
    • publish a new version to rubygems
  • ... in the website (and the api):
    • update the version of the gem in the lockfile
    • commit
    • deploy

In other words, this involves committing to three different repositories and deploying two different codebases.

There are a lot of places where this can go wrong, not to mention I have to actually deploy. Sometimes I get busy, and which point all the hard work that all the maintainers do gets held up.

That's not acceptable.

Changes in Nextercism

In the nextercism prototype we're working directly with the tracks' git repositories themselves. We will be updating these continuously, so any changes to the tracks should be reflected on the website near-instantaneously (well, it might take a minute, but probably not much longer than that. Compared to the day-to-week lag that you all are used to operating with, this will probably feel like a bit of a miracle).

On the other hand, this means that if we screw up master in any way on a track, that will be reflected on the website immediately.

Proposal

So. In order to protect the website I wish to turn on branch protection for the master branch of all tracks. That will mean that:

  • master cannot be pushed to directly
  • before they can be merged, all pull requests must
    • have at least one approval
    • have a passing build
    • be up-to-date with master

This means that even trivial changes will need to go into a branch and a pull request, and be approved by one of the other maintainers.

This might be a little bit annoying, but I think it's worth it.

If you have any counter-arguments or can think of anything I've missed, or other ways to protect the website from bad changes, I'd love to hear your thoughts.

If you think this seems reasonable but don't have anything to add to the discussion, please add a 👍 reaction.

@rpottsoh
Copy link
Member

Are there any tracks with a single maintainer / mentor that would not have anyone to approve their PRs? Overall what you propose sounds fine.

@kytrinyx
Copy link
Member Author

There are quite a few tracks where that is a problem.

I'm working on a dashboard to help with this, and am hoping to make a team of "guardians" who can help mitigate the problem until we can actively go find more maintainers for these tracks.

@petertseng
Copy link
Member

petertseng commented Jul 23, 2017

  • master cannot be pushed to directly
  • before they can be merged, all pull requests must
    • have at least one approval
    • have a passing build

My support with no reservations

  • be up-to-date with master

My historically bad experiences with this setting cause me to wish it were not necessary, but I know that it is. I have no better suggestion that still achieves the desired goal. My support, with a wish that things could be better:

I wish the up-to-date setting could take advantage of the fact that Travis CI checks the result of merging each PR with master. Therefore, if I could make a feature suggestion for GitHub, it would be to detect whether Travis CI's merge check was against an up-to-date master, as an alternative to having the branch be based off an up-to-date master.

I understand that this suggestion that does not help when there are N PRs that were made simultaneously since then the result of merging them with each other is not checked unless the builds are explicitly restarted after each one is merged. The feature would ideally take that into account and allow it if the build was explicitly restarted.

Since I see the setting is a necessary one, when it is a problem (>= 3 PRs simultaneously in flight) I have historically dealt with the problem by octomerging. It might be the case that I will be doing that a lot more in the future. It would therefore also be interesting if GitHub had a UI for octomerges... but that seems very unlikely, given their increased complexity. No feature request listed in this comment is a blocker for requiring up-to-date branches.

@kytrinyx
Copy link
Member Author

@petertseng these are all great observations.

I wish the up-to-date setting could take advantage of the fact that Travis CI checks the result of merging each PR with master. Therefore, if I could make a feature suggestion for GitHub, it would be to detect whether Travis CI's merge check was against an up-to-date master, as an alternative to having the branch be based off an up-to-date master.

One of the problems with this is that this is an external integration. The only part of this that is under GitHub's control is the final status of, not what happens to set that status (e.g. you could have a bot that sets a green check mark and says your commits are bananas!, and all GitHub knows is that the status is good).

Now... we could likely write some automation around this if we wanted to, given that we know which exact part of the Travis integration we care about.

That said, the N PRs in flight thing is a problem and we've actually run into this where two PRs both pass the tests individually and against master, but are incompatible with each other.

So yeah. I see your point, and I also don't see an immediate way around that. I am going to think about whether we could write an automation for the case that you describe, though.

@NobbZ
Copy link
Member

NobbZ commented Jul 23, 2017

We had master protected for a few months on the erlang track. That has been in a time, when I was the only active maintainer. The current second active maintainer is only semi-active and learning erlang as he goes along and doesn't implement or care for new exercises, but tries to get ordering and difficulties right.

I do hope, he can help me to get the nextercism tree right.

But also, when I was the only maintainer, and we didn't even made reviews necessary, waiting for travis was quite a burden, especially when we were operating on the 5 builds limitation, because of automated mass-PRs through blazon. I often had to manually set alarms or reminders on a third party service or my mobile. I missed some of them, which really slowed everything down.

Therefore, I have to say, that this might really be a show stopper or at least slow downer for "undermaintained" tracks. Especially when we require reviews (who shall review PRs of the sole maintainer?)

We shouldn't activate this setting before it is ensured that there will always someone available for reviews!

@kotp
Copy link
Member

kotp commented Jul 23, 2017

I also see this as a problem on branches where there are 3 maintainers when 2 of the tree maintainers may be swamped.

Has the "social contract" of not pushing to master except for trivial commits been problematic?

As it is, only those with "maintainer" privileges (assumed, not tested) are able to do so, and on the tracks that I am on, I have not seen this abused.

@kytrinyx
Copy link
Member Author

waiting for travis was quite a burden

@NobbZ This is a very good point. I'm going to think about this one for a bit.

Has the "social contract" of not pushing to master except for trivial commits been problematic?

Not in the tracks with seasoned, experienced maintainers, no. In new tracks (without an experienced buddy), very much so.

@iHiD
Copy link
Member

iHiD commented Jul 24, 2017

Has the "social contract" of not pushing to master except for trivial commits been problematic?

I'd also add that a small mistake in a trivial commit could have a much bigger effect now. For example, someone misplacing a couple of brackets and therefore mis-structuring the config.json while tidying the indentation or something, could mean that half the exercises disappear from the website immediately.

Again - very unlikely with seasoned maintainers/devs (but I wouldn't always bet on myself to not make mistakes!), more likely with less experienced.

@kytrinyx
Copy link
Member Author

I wouldn't always bet on myself to not make mistakes

I know I don't trust myself. I'll pay close attention every single time, except the time when I... don't for some reason.

@kytrinyx
Copy link
Member Author

As a side note, I'm going to take a look at what we can do to speed up Travis (use caching where possible, abort builds as soon as possible if we know we need to build a newer one, etc).

@stkent
Copy link

stkent commented Jul 24, 2017

If you're interested, the Java (and Kotlin) tracks recently looked into ways to speed up our CI builds; see exercism/java#410 for some discussion. exercism/java#326 is an issue we have identified but are yet to fix; it could save significant time when the build is broken if addressed.

@petertseng
Copy link
Member

petertseng commented Jul 24, 2017

what we can do to speed up Travis

I would guess that a time when speeding up Travis is helpful is when a PR is made to every track. Here is a thing that has happened when this has been done, historically:

  • Commit(s) is/are pushed to a branch on Exercism's copy of the track repository.
    • Travis CI builds that branch (as it does for all branches, if the setting is enabled).
  • The PR is submitted from this branch.
    • Travis builds the result of merging the PR (as it does for all PRs, and as will be required for all PRs).

If Travis CI would not build the branch, the number of Travis builds would be cut in half.

I thought of two ways to achieve that.

  • Use a branch on a fork instead of Exercism's copy.
    • I know, I know, this requires someone who wishes to submit a PR to every track to actually fork every track.
    • I honestly think it's cleaner to use a fork anyway.
      • When a repo is forked, all branches are copied. If branches are made on Exercism's copy, we foist branches on forkers unwittingly. Mitigation: Forker deletes the branch after forking.
      • By default (unless the fetcher uses -t flag), git remotes are set up to fetch all branches. So if branches are made on Exercism's copy, again we foist branches on fetchers unwittingly. Mitigation: use -t flag.
  • Turn "build branch updates" off.
    • I have confirmed at https://github.com/petertseng/octotest/pull/6 that octomerges can still be built.
      • Just have to make a PR for the octomerge instead of simply pushing it to a branch on Exercism's copy of the repo.
      • And indeed, by using a branch I was violating cleanliness, so maybe it is fine to use a PR instead).
    • If this had prevented octomerges from being built (and therefore being mergeable), dealbreaker for me. If "up-to-date with master" is a requirement, it will become prohibitively expensive to carry out my duties as a maintainer unless I am able to octomerge.
  • A way that I did not think of.

@iHiD
Copy link
Member

iHiD commented Jul 24, 2017

Also - we should investigate what it would take to up the limit from 5 to x. Rust-lang has 15 according to this post and could pay for more. Do we know anyone at Travis we could reach out to to ask what the max free level could be, and how much it would cost to extend that?

@NobbZ
Copy link
Member

NobbZ commented Jul 24, 2017

I think it were best if @kytrinyx just asked for a sponshorship at travis. Perhaps they can tune us to 10 builds in exchange for a small badge?

@petertseng
Copy link
Member

Update: octomerge still possible if "build branch updates" is off: https://github.com/petertseng/octotest/pull/6

@kytrinyx
Copy link
Member Author

Use a branch on a fork instead of Exercism's copy.

This is a great observation. To me it makes no difference where I make the PR from, I script everything anyway.

I honestly think it's cleaner to use a fork anyway.

Yeah, this is a good point. I've never thought about it much. I have thought about how to clean up old branches (as evidenced here: https://splice.com/blog/cleaning-git-branches/), but not how to avoid them in the first place. I'm pretty sure I'm convinced you're right.

Turn "build branch updates" off.

We could, but I see no reason why I (and others) shouldn't make the PRs from forks. That would solve half the problem.

we should investigate what it would take to up the limit from 5 to x

I'll ask the folks at Travis! I want to do my homework first, though, and do all that I can to speed things up first. Then if we get some extra cycles, we'll be making them count instead of wasting them.

@kotp
Copy link
Member

kotp commented Jul 24, 2017

We were talking about branches as well, instead, on the Delphi track. Having master for the base, but push / merge to Publish branch for active web display.

@kytrinyx
Copy link
Member Author

push / merge to Publish branch for active web display.

@kotp you mean for the exercism.io website?

I could see this complexity to the backend, and I'm not sure the trade-offs are worth it. I'd be open to hearing the pros/cons, though.

@kotp
Copy link
Member

kotp commented Jul 25, 2017

@rpottsoh has some thoughts, not quite hashed out yet, about potentially double publishing from "master" view and "published" view. Where any changes that make it to master can be viewed before going to the 'normal' view that users are aware of. Gives a safety check, of how everything works together before going 'live', while still in a live way.

But not really hashed out yet.

@kytrinyx
Copy link
Member Author

@kotp @rpottsoh it sounds like what you're thinking about is some sort of way to verify things in a staging environment.

Rather than something that involves automatically having multiple main branches (master for staging, published for production), I would like to encourage people to think about what it would take to build a stand-alone track-visualizer that could be used with any branch on the repository on GitHub. I think that would be more light-weight and potentially more useful than a full-on staging set-up.

@rpottsoh
Copy link
Member

@kotp @rpottsoh it sounds like what you're thinking about is some sort of way to verify things in a staging environment.

I think so essentially. Instead of the site pulling from Master it would instead pull from Published. Whenever Master is updated a maintainer would go to "http://exercism.io/test/Ruby", for example, or some other flag so that the site they arrived on would be utilizing their tracks Master branch. If no ill affects were observed the maintainer would then advance Published forward to Master....

That is the basic premise of what @kotp and I were discussing yesterday.

@kytrinyx
Copy link
Member Author

My gut reaction to having a full-fledged master/published pipeline is that this sounds like a lot of work for maintainers. My suspicion is that for the most part, people will be fine with going straight to production, so long as they've reviewed the pull request. There are always some changes that we're more wary of, and I'd love to find a system that lets us verify these without imposing a change across the board on our workflow.

@yurrriq
Copy link
Member

yurrriq commented Jul 25, 2017

FWIW I've had success with git hooks before, e.g. we could set it up such that you can't push to master unless the tests pass and the configlet exits with 0.

@kytrinyx
Copy link
Member Author

@yurrriq I've also had great success with git hooks—my problem with them is that it's too easy to not have everyone using them.

@yurrriq
Copy link
Member

yurrriq commented Jul 25, 2017

@kytrinyx, right. I feel like that could be a pro and a con though 😄. Pros: simple, easy to use. Cons: simple, easy to ignore.

@kytrinyx
Copy link
Member Author

@yurrriq too true! 😂

@kytrinyx
Copy link
Member Author

I forked all the track repositories and then discovered that you can turn off branch updates on Travis CI.

Now I've updated all the Travis settings for all the language tracks to the following:

screen shot 2017-07-29 at 1 30 15 pm

This should improve things somewhat.

There are some great suggestions in the Java thread that @stkent linked to (exercism/java#410).

The two that jump out at me are:

  • Fail fast. If configlet complains and is going to fail the build, then don't bother running the rest of test suite. Same for any other early change, I suppose, though sometimes it's useful to see the extent of the damage.
  • Only run the tests for the exercises if some of the exercises changed (potentially also config.json, though).
  • Only run the tests for the exercises that changed (ditto on config.json)

@NobbZ
Copy link
Member

NobbZ commented Jul 29, 2017

Running only those tests that seem to be necessary might be hard to detect. We had to scan the git tree since the last forking from master, see which files were altered and associate them with an exercise. Those detections might be error prone and take probably as much time as running and building the tests.

Per my observation, often the culprit lies in downloading and compiling dependencies. For some languages we need to repeat this even per exercise. If those languages were able to use a local cache instead then a detector wouldn't need to be created anymore but tests were much faster than before.

Some tracks even require compiling and installing a complete language (as idris), since they aren't officially supported by Travis. Those languages should be somehow ported to Travis as a possible base language.

@petertseng
Copy link
Member

Whether PRs are made from the base repo or a fork seems to still "use" a build slot up.

This is true, but I think that we want to build PRs. We just want to build them once, not twice (as in exercism/haskell#580 - I see "2 checks passed", "continuous-integration/travis-ci/pr The Travis CI build passed Details", and "continuous-integration/travis-ci/push The Travis CI build passed Details")

@kytrinyx
Copy link
Member Author

Whether PRs are made from the base repo or a fork seems to still "use" a build slot up.

This is correct, and desirable. What we were seeing was that we would create a branch on the repo (not a fork), push to the branch (the branch gets built), create a PR (a PR gets built). We were trying to eliminate the first build.

I don't want people to have to do manual work, but I also don't want to have twice the number of builds. It seems unnecessary to make everyone create forks.

I would like to suggest the following settings.

  • master cannot be pushed to directly
  • before they can be merged, all pull requests must
    • have a passing build
    • be up-to-date with master

This is similar to my first suggestion, but I've removed the bit about "at least one approval" because we don't have enough maintainers to make that possible.

Another option for master is to do what @Stargator suggested in the Dart track and add branches: only: master to the travis config.

However, I really don't want us to be pushing to master, both for the sake of visibility (notifications happen on PRs not on pushes to master) and for the sake of safety (v2 updates the website based on master automatically and immediately. The site would break even before a build completes, if we get something wrong).

@petertseng
Copy link
Member

petertseng commented Sep 21, 2017

  1. Based on my understanding of the properties Exercism wishes its CI to have, I think I cannot stand behind my suggestion of turning "Build branch updates" on, unfortunately.
  2. Regardless of whether that's true, I still suggest that development branches be made on forks, not on Exercism's copy.

Readers who only wanted to see that summary can stop reading; the below only expands on the above two points.


My understanding of the properties we desire:

  • The most important, and critical, is Safety. Before a given commit is used by the website, it needs to have a passing build.
  • Also important, but not critical, is At-most-once. We'd like to avoid building commits twice, which is especially important when PRs are simultaneously made to all language tracks.

The suggestion:

  • master cannot be pushed to directly
  • before they can be merged, all pull requests must
    • have a passing build
    • be up-to-date with master

maintains both properties. I strongly dislike the "be up-to-date with master" setting, but I have no alternative that maintains the desired properties.

My suggestion of turning "Build branch updates" on for all branches has no effect on Safety, but interferes with At-most-once because everything will get built twice and maybe thrice.

  1. If a branch is made on Exercism's copy of the repo, the branch is built. I say people should not make branches on Exercism's copy, but the point stands.
  2. The PR is built before it can be merged.
  3. After it gets merged to master, master is built.

The second build is what we desire in all cases, but the first and third are not.

The suggestion of exercism/dart#35 (comment) (Turn "Build branch updates" on but use branches: only: master to make it only build the master branch) always eliminates the first build, but never the third, so it also falls short on At-most-once.

At this point the only disadvantage I have left to offer of leaving "Build branch updates" off is that any Travis CI badge (The ones that say "Build: passing", see e.g. https://github.com/exercism/ruby which uses https://travis-ci.org/exercism/ruby.svg?branch=master) are relatively useless since we will not have any branch with up-to-date builds for that image to use. Perhaps the issue can be remedied by using https://docs.travis-ci.com/user/cron-jobs/ to build master periodically. I will trial this immediately.

The only alternatives I have to offer to preserve Safety:

  • I wish the suggestion of Turning on branch protection for all tracks #174 (comment) (a published branch) would work, but in an automated fashion. If Travis-CI would, upon a passing build on master, push the commit that passed to a branch (let's say it's called the green branch), then the website can use the green branch instead of the master branch, and no manual work is required. Unfortunately, I do not see Travis-CI offer such an option.
  • If the website would inspect the build status of each commit on master, and only take the most recent one for which the build is green (rather than the most recent one, unqualified), that would do, but increases complexity. I can't recommend that the website does this (but y'all can if y'all want...).

On using forks versus Exercism's copy of the repo:

It seems unnecessary to make everyone create forks.

Well, every would-be contributor who doesn't have write permissions to the repo must create a fork anyway.

In either case, I still think what I say in #174 (comment) applies:

  • When a repo is forked, all branches are copied. If branches are made on Exercism's copy, we foist branches on forkers unwittingly. Mitigation: Forker deletes the branch after forking.
  • By default (unless the fetcher uses -t flag), git remotes are set up to fetch all branches. So if branches are made on Exercism's copy, again we foist branches on fetchers unwittingly. Mitigation: use -t flag.

I don't have any force behind those bullet points or any strong principles on why they're a good idea; just vague feelings of "if there are branches on a repo that people might clone, they should only be branches that people would benefit from having", and I think a lot of the branches we've made are those that people do not benefit from having.

@Stargator
Copy link

As a contributor on GitHub, I'm used to forking repos and keeping up work.

@kytrinyx
Copy link
Member Author

My understanding of the properties we desire:

  • The most important, and critical, is Safety. [...]
  • Also important, but not critical, is At-most-once [...]

@petertseng thank you, this is an excellent summary.

I strongly dislike the "be up-to-date with master" setting

I'd like to understand this stance better. What do you dislike about it?

When a repo is forked, all branches are copied. If branches are made on Exercism's copy, we foist branches on forkers unwittingly.

I understand the technicality of it, however I'm not certain I understand why these branches are a burden. I do my best to prune and keep unnecessary/dead/finished branches to a minimum. If we had hundreds or (as I've sometimes seen, thousands, many of which are years old) then yes—definitely a problem. But if there are a handful of branches, I don't understand what problem they're causing. Would you please help me see your point of view?

Well, every would-be contributor who doesn't have write permissions to the repo must create a fork anyway.

Yes, that's completely fair. That said, most contributors only need to fork and clone one or two or at most a small handful of repositories. I think that if I just needed to create one or two repositories I wouldn't mind, but I need to have a fork of 60 different repositories on my personal account. I did this for a while, and invariably I would miss something, even when I scripted things as much as possible. Every time I bootstrap a new repository, I must also fork it to my personal account. I need to make sure to keep master on my local copies up to date with 60 different upstream repositories, and also update my 60 forks. In my case it adds a not-insignificant additional burden in order to do janitorial work that I most often wish I didn't have to do in the first place.

@petertseng
Copy link
Member

petertseng commented Sep 25, 2017

I strongly dislike the "be up-to-date with master" setting

I'd like to understand this stance better. What do you dislike about it?

Consider the case when I have N PRs simultaneously in flight.

Without "be up-to-date with master":

  • As soon as the build passes for each PR, it can be merged.
  • Dangerous (violates Safety) if multiple PRs pass CI individually but fail CI when merged.

With "be up-to-date with master", we have two choices:

  1. Merge the first PR, rebase[1] the second, wait for CI to run, merge the second, rebase[1] the third, wait for CI to run...
    • Turned a parallel operation into a serial operation.
    • Preserves Safety by serially building the next after each other PR is merged.
  2. Octomerge all PRs simultaneously.
    • Incurs the cost of one additional CI build (the build that ensures the octomerge result is good), rather than N.
    • Preserves Safety by building the result of merging all the PRs together.
    • I highly prefer choice 2 over choice 1.
    • Examples of when this has been done in Exercism in the past:

[1]: It does not matter whether we rebase on master, or use GitHub's "Update from master" (merges master into the branch) with the intent to squash later; the parallel -> serial still applies.

As you can see, although it is necessary for Safety, it does slow things down. As long as I can still octomerge so that I'm only slowed down by O(1) instead of O(N), I accept the slowdown for the sake of Safety. If there is any setting that prevents my ability to octomerge, I will not be able to accept the slowdown.

(You may look at #174 (comment) for a past discussion, but I have put the relevant information here as well, so you do not need to)


When a repo is forked, all branches are copied. If branches are made on Exercism's copy, we foist branches on forkers unwittingly.

I understand the technicality of it, however I'm not certain I understand why these branches are a burden.

Me neither. Are they?

I can speak for myself. I don't like to keep old branches around, and maybe I don't like to see other people old branches around. Let's say we do development on exercism's branch B. Contributor C forks the repo during that time, and now they have branch B. We merge branch B and then delete it. Ah, but Contributor C still has branch B on C's fork, and there is no way to delete that unless C explicitly takes action.

So, is that a problem? I know such a thing bugs me personally, but for the purposes of this discussion maybe it's only a problem if Contributor C cares about it. We haven't heard anyone complain, have we?

I need to make sure to keep master on my local copies up to date with 60 different upstream repositories

You would have to do this no matter whether your local copy's origin is a personal fork or Exercism's copy, so on this point it seems to me it makes no difference whether you fork: the operations performed locally are still the same.

and also update my 60 forks.

I personally don't think it's necessary to keep the master branch of the fork (as it appears on GitHub) up to date. It's necessary to keep it up to date locally, of course. I set up my local repos such that the master branch (and only the master branch) pulls from Exercism's copy.

I need to have a fork of 60 different repositories on my personal account. I did this for a while, and invariably I would miss something, even when I scripted things as much as possible. Every time I bootstrap a new repository, I must also fork it to my personal account

On this point, it does make a difference. (I have no additional comments to add but am just pointing this out to contrast with the above point)

In my case it adds a not-insignificant additional burden in order to do janitorial work that I most often wish I didn't have to do in the first place.

I wondered if it would be too sardonic to say "then just don't do it", but I know that not doing it is not an option when the change blocks a configlet release or a website change, so I have no alternative to offer.

So, I really don't have any force behind the suggestion, and I fear my reasoning just boils down to "it bugs me but I can't explain why". Making branches on Exercism's repo bugs me at about a level of 4/10, so if making forks bugs you at any level >= 4, I take your word for it.

@kytrinyx
Copy link
Member Author

I wondered if it would be too sardonic to say "then just don't do it", but I know that not doing it is not an option when the change blocks a configlet release or a website change

It's not an option. I'm already a huge bottleneck. I'm trying my best to make it so that I am not, but it's a long, long road.

I fear my reasoning just boils down to "it bugs me but I can't explain why".

That's completely fair.

Making branches on Exercism's repo bugs me at about a level of 4/10, so if making forks bugs you at any level >= 4, I take your word for it.

I'll think on it some more. I really appreciate you taking the time to explore the nuances in this.

@petertseng
Copy link
Member

I am trying to see if someone (well, someone on the internet) has written anything on the subject matter, but I have not yet been able to find anything.

I may have inherited my attitude from someone on my team who says "I want the main repo's branches to be pristine so people looking at it don't get confused by the branch list", but I suppose we don't have so many branches as to make that a concern.

I asked about that and heard in response that a certain project really wanted to ensure that their main repo has as few refs as possible so that it wouldn't be so slow to clone. Again, I don't think we have that many branches that it's a concern.

It seems that these concerns are useful in larger repos, but not as important here. So I guess that's why it bugs me without me being able to explain why. Since there's less mental overhead for me if I keep the same workflow for all the projects I work on, it's why I continue to use forks for Exercism, but the same doesn't necessarily apply to anyone else.

@Stargator
Copy link

When a repo is forked, all branches are copied. If branches are made on Exercism's copy, we foist branches on forkers unwittingly.

Here's the thing, GitHub won't let you merge in changes if there are conflicts with the PR and master. The setting to be "up to date with master" is meant to avoid conflicts. But any PR reviewer should see the notification by GitHub that there's a conflict.

Given then, unless git detects a conflict I think it is fine to work without the "up to date with master" setting.

My team has been using GitHub for just two years. At some point, we started using the setting, then switched it off because GitHub prevents PR merges (through GitHub, not command-line).

So maybe consider that.

@petertseng
Copy link
Member

Here's the thing, GitHub won't let you merge in changes if there are conflicts with the PR and master.

I agree that this statement is true.

The setting to be "up to date with master" is meant to avoid conflicts

I disagree that this statement is true... depending on the definition of "conflict". It doesn't do anything with regard to conflicts in the git merge sense, since you always have to resolve those no matter whether "up to date with master" setting is enabled. One thing the setting helps to avoid is the sort of... we'll call it logical conflict where two commits that branched off of master at the same point pass CI individually but fail CI when merged together.

@kytrinyx
Copy link
Member Author

kytrinyx commented Oct 1, 2017

One thing the setting helps to avoid is the sort of... we'll call it logical conflict where two commits that branched off of master at the same point pass CI individually but fail CI when merged together.

Yes, this is exactly it. We've run into this problem on several occasions, where the conflict is conceptual, not an actual line/merge conflict.

@kytrinyx
Copy link
Member Author

kytrinyx commented Oct 1, 2017

It seems that these concerns are useful in larger repos, but not as important here.

Thank you for elaborating, @petertseng.

In this context, where the repos are quite small and the branches are low in overhead both mentally (looking at the list of branches and seeing what's going on) and resource-wise (in terms of cloning), I'm going to go ahead and make the call that maintainers can (but are not forced to of course) work on branches on the main repository.

@petertseng -- the proposed settings would still allow octomerge, correct?

@petertseng
Copy link
Member

the proposed settings would still allow octomerge, correct?

As far as I can tell, yes. I have tested this on a repo where I changed the settings to those proposed.

@petertseng
Copy link
Member

Here is a potential concern with "require branches to be up-to-date with master". The potential impact increases as the repo has more simultaneous PRs in flight:

I wonder if this will tempt some people to use the "Update branch" button, making a merge commit, and then also use "Merge pull request" that makes another merge commit.

Again, this bugs me but I'm not sure if I should be bugged by it. The best explanation I can come up with is that I want at most one merge commit, because I think having two is redundant and I like to avoid redundancy when possible. I like one merge commit if the PR has multiple commits since it acts as a record of what PR the commits came from, but I don't really have a justification for two merge commits.

There are many ways that avoid doubling up on merge commits. So the concern is mitigated if we are aware of the below ways.

  • Merger uses "Squash and Merge" button. This is only possible if the PR makes sense as a single commit.
  • Merger uses "Rebase and Merge" button. The only thing I don't like about this approach is that as far as I can tell it is then no longer possible to tell what PR the commits came from if I don't have an internet connection. I don't know if that is an actual concern; I have not actually had to make use of this information without an internet connection, so it is possible that Rebase and Merge is the way to go if the PR must remain as multiple commits (Squash and Merge is thus not an option).
  • Merger rebases the branch on master and pushes it to contributor's fork, and only then uses "Merge Pull Request", creating one merge commit. Not possible if contributor explicitly chose to disable that (I typically don't see people disable it).
  • Well, one repo on which I think this concern may surface itself is problem-specifications. BUT, maybe we decide we do not need "Require branches to be up-to-date" for the problem-specifications repo, because the live site does not depend on it (right?)

@kytrinyx
Copy link
Member Author

kytrinyx commented Oct 1, 2017

Again, this bugs me but I'm not sure if I should be bugged by it.

This bugs me too, but I've decided to live with it, because it gives me safety (in other words, it's not truly redundant, because it provides safety, but it feels redundant, because it has extra merge commits).

one repo on which I think this concern may surface itself is problem-specifications. BUT, maybe we decide we do not need "Require branches to be up-to-date" for the problem-specifications repo, because the live site does not depend on it (right?)

Yes, that's correct. I think we do not need to make this a required setting for that repository.

@Stargator
Copy link

Frankly, I use Squash and Merge. A PR shouldn't be so big that the effort needs to be broken up into commits. If so, then the settings could be changed to allow whatever approach seems best at the time and then revert back.

@kytrinyx
Copy link
Member Author

kytrinyx commented Aug 3, 2018

We've imported this issue to the https://github.com/exercism/exercism.io repository.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests