Clean up git history #350
Comments
Does that mean we should merge or close all open pull requests before doing this? |
I used qgit to look at your repo and it still shows merged branches properly. A lot of the early history seems to be reformatting, but there's also some useful stuff there, like the introduction of the I also wonder about branches like the 0.3 and 0.4 branches and release tags; are these going to need recreating? As @vks says, it doesn't seem sensible to do this migration while we have many PRs open. |
Most of these PRs are from me, and I'll manage 😄. Then there are three PRs left. With the exploration I am doing in dhardy#82, I don't think we will take #198. #144 would need some work, but shouldn't be hard to rebase. I don't care much yet for #152, but we could always recreate that one. Recreating branches and tags if necessary should not be all that much effort. |
I would be in favor of it, assuming it is not too much work. |
Updated https://github.com/pitdicker/rand_clean_history/. But somewhere along the way git started uploading the 100 MB of old history, and I am not yet sure why. |
Created a new repo https://github.com/pitdicker/rand_clean/. It is less than 2 MB. |
This should be the operation to bring it over to this repo (tested on my rand_clean_history repo):
git clone https://github.com/pitdicker/rand_clean
cd rand_clean
git remote add nursery https://github.com/rust-lang-nursery/rand
# From the github interface:
# - make gh-pages the default branch
# - delete all other branches
# delete all tags (one by one -- ugly!)
git push --delete nursery 0.1.1
git push --delete nursery 0.1.2
git push --delete nursery 0.1.3
git push --delete nursery 0.1.4
git push --delete nursery 0.2.0
git push --delete nursery 0.2.1
git push --delete nursery 0.3.0
git push --delete nursery 0.3.1
git push --delete nursery 0.3.10
git push --delete nursery 0.3.11
git push --delete nursery 0.3.12
git push --delete nursery 0.3.13
git push --delete nursery 0.3.14
git push --delete nursery 0.3.15
git push --delete nursery 0.3.16
git push --delete nursery 0.3.17
git push --delete nursery 0.3.18
git push --delete nursery 0.3.19
git push --delete nursery 0.3.2
git push --delete nursery 0.3.20
git push --delete nursery 0.3.21-pre.0
git push --delete nursery 0.3.22
git push --delete nursery 0.3.3
git push --delete nursery 0.3.4
git push --delete nursery 0.3.5
git push --delete nursery 0.3.6
git push --delete nursery 0.3.7
git push --delete nursery 0.3.8
git push --delete nursery 0.3.9
git push --delete nursery 0.4.0-pre.0
git push --delete nursery 0.4.1
git push --delete nursery 0.4.2
git push --delete nursery derive_rand-0.1.1
git push --delete nursery rand_core-0.1.0-pre.0
git push --delete nursery rand_derive-0.3.0
git push --delete nursery rand_derive-0.3.1
git push --delete nursery rand_macros-0.1.10
git push --delete nursery rand_macros-0.1.2
git push --delete nursery rand_macros-0.1.3
git push --delete nursery rand_macros-0.1.4
git push --delete nursery rand_macros-0.1.5
git push --delete nursery rand_macros-0.1.6
git push --delete nursery rand_macros-0.1.7
git push --delete nursery rand_macros-0.1.9
# restore branches and tags
git push nursery master 0.4 0.3 --tags
# From the github interface:
# - make master the default branch |
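The one-by-one tag deletion above could also be generated from a list of tag names. The following is only a sketch, not what was actually run: `tag_delete_commands` is a hypothetical helper, and it prints the commands rather than executing them so they can be reviewed first.

```shell
# Hypothetical helper: read tag names from stdin (one per line) and print
# one "git push --delete" command per tag for the given remote.
# Nothing is deleted here; the output is meant to be reviewed first.
tag_delete_commands() {
    remote=$1
    while IFS= read -r tag; do
        if [ -n "$tag" ]; then
            # refs/tags/ makes sure a tag is deleted, never a branch
            printf 'git push --delete %s refs/tags/%s\n' "$remote" "$tag"
        fi
    done
}

# Possible usage (assumes a remote named "nursery" as in the steps above):
#   git ls-remote --tags --refs nursery | sed 's|.*refs/tags/||' \
#       | tag_delete_commands nursery
# Append "| sh" only after checking the printed commands.
```

Printing instead of executing keeps the destructive step explicit, which fits the "very, very carefully" approach discussed in this thread.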
@dhardy What do you think, would it be okay if I execute the commands above this afternoon? Very, very carefully and with plenty of backups... |
@pitdicker make a local clone of the old repo first, then go ahead. But you don't need to delete remote branches; you can simply overwrite: Also, please check these are actually equivalent first, i.e. I think we should also keep prior history around for a while with an |
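The exact commands from this comment were not preserved. As an assumption about what such an equivalence check might look like, one common approach is to compare the file trees that the old and new branch tips point at; `same_tree` is a hypothetical helper name.

```shell
# Hypothetical helper: succeed (exit 0) when two refs point at identical
# file trees, no matter how different the commit history behind them is.
# Returns 2 if either ref cannot be resolved, 1 if the trees differ.
same_tree() {
    a=$(git rev-parse --verify --quiet "$1^{tree}") || return 2
    b=$(git rev-parse --verify --quiet "$2^{tree}") || return 2
    [ "$a" = "$b" ]
}

# Possible usage (branch and remote names are assumptions):
#   same_tree nursery/master master && echo "contents are identical"
# Overwriting a remote branch with rewritten history is then usually a
# force push, for example:
#   git push --force nursery master
```

Comparing tree ids checks only the final contents, which is the property that should survive a history rewrite; it says nothing about intermediate commits.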
Done. I checked with
Yes, good idea. No need to hurry here. |
Oops, apparently GitHub doesn't like PRs with more than 10,000 commits that differ. I will rebase #320 and document the steps. |
Great. |
It is now 3½ months later, and things have gone reasonably smoothly. I just found out the For some reason GitHub shows the
How long should we keep the |
I don't think we should care much about
|
Hi, I just cloned the repo and was surprised git downloaded 85 MB of data. I guess it's linked to this issue? |
Is it already so large? But yes, this effort is the reason it's not even bigger. I don't believe we can reduce it again without a lot of disruption; however, we should consider migrating some sub-crates to new repos to stop the issue getting worse. |
What would you suggest doing about this? Rebasing the history again? I'm not keen; we have quite a few active PRs and many complete ones whose history would be messed up. |
I think the old-master branch still needs to be deleted. |
I don't know the rand code base at all, sorry. I just tried to pinpoint some commits adding a few MB to the repo. Since you said
I was thinking, if those commits are from that old_master branch (that I don't know) maybe deleting it would save quite a lot of MB. |
Actually I just checked and most of those heavy files seem to be only accessible from |
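The commands used to find the heavy files were not preserved in this thread. One way such a search could be done, as a sketch under that assumption, is to list the largest blobs reachable from a given ref; `large_blobs` is a hypothetical helper name.

```shell
# Hypothetical helper: list blobs of at least MIN_BYTES reachable from REF,
# largest first. Output columns: size in bytes, object id, path.
large_blobs() {
    ref=$1
    min_bytes=$2
    git rev-list --objects "$ref" |
        git cat-file --batch-check='%(objecttype) %(objectsize) %(objectname) %(rest)' |
        sed -n 's/^blob //p' |
        awk -v min="$min_bytes" '$1 + 0 >= min + 0' |
        sort -k1,1nr
}

# Possible usage (the ref name and threshold are assumptions):
#   large_blobs origin/master 1000000
# To see which remote branches can reach a given commit:
#   git branch -r --contains <commit>
```

Running it once per branch shows whether the large objects are reachable only from a branch that is a candidate for deletion.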
Aha. Thanks @pitdicker. Well, we haven't needed that branch for a long time so we should be able to safely delete it (I will keep a local copy for a while just in case). |
Seems to have fixed it. Thanks @mpizenberg for bringing this up! |
When the repo was created, two places where rand lived in the rustc repo were combined: one from librand, and one from the standard library (writing from memory here). Things went a bit messy, and rand ended up with two git roots. git filter-branch did remove a lot, but we still have most of the commits to the rust repo until 2014.
I made some effort to clean things up (and that took a surprising amount of work) in https://github.com/pitdicker/rand_clean_history. That was half a year ago, so it would need updating.
Do we want to clean up the git history? Right now we need a huge checkout for what could be a couple of MB, and a large part of the history is basically undecipherable.
The downside is that it would cause breakage, and I don't know how bad. It would bring the forks and branches from others badly out of sync. Would merged PRs still be as explorable via GitHub?
So I am not sure it is worth the effort, but would sure like to see this.