Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cache the hash in compilation options #62289

Merged
merged 9 commits into from
Jul 1, 2022

Conversation

CyrusNajmabadi
Copy link
Member

Also, produce far less garbage when computing the hash. Together, this saves around 110MB of allocations in incremental testing:

image

@@ -68,6 +68,30 @@ internal static int CombineValues<T>(IEnumerable<T>? values, int maxItemsToHash
return hashCode;
}

internal static int CombineValues<TKey, TValue>(ImmutableDictionary<TKey, TValue> values, int maxItemsToHash = int.MaxValue)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

these helpers ensure that when we do compute the hashcode we dont' incur allocations going through the CombineValues<T>(IEnumerable<T>) path that existing calls are going through.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

they are copied from teh IEnumerable version, just tweaked accordingly. we really need shapes :)

@@ -1,3 +1,4 @@
*REMOVED*override Microsoft.CodeAnalysis.CSharp.CSharpCompilationOptions.GetHashCode() -> int
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is safe. it's jsut a removal of hte override.

@@ -261,6 +261,8 @@ protected set

private readonly Lazy<ImmutableArray<Diagnostic>> _lazyErrors;

private int _hashCode;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any particular reason you're using 0 as the sentinel, not making this int?? Just easier for multithreading purposes?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

force of habit. it's just a pattern i got very accustomed to for some reason over my years :) happy to change to nullable if you'd like :)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's also likely that i trust ints more (with nullable, not sure what the multithreading concerns may be) :)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's fine to leave is as, was more for understanding why this approach was taken.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think an int won't tear but an int? might.

If we're concerned about multithreading, is there a need for some kind of memory barrier when we write to _hashCode?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't believe that there's anything we need to be concerned about with int. Reads and writes are atomic, and since the calculation is stable, the worst that could happen is potentially multiple writes of the same value.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yup. i like int because it's just so safe.

@jcouv
Copy link
Member

jcouv commented Jun 30, 2022

@CyrusNajmabadi Is there a reason this is on auto-merge rather than auto-squash?

@CyrusNajmabadi
Copy link
Member Author

Yes. Like with all my change here I'm continually cross merging across about a dozen branches. Squashing destroys the entire flow and leads to horrendous merge conflicts. Afaict, it seems to buy nothing and is just a royal pita. Is there value in squashing?

@davidwengier
Copy link
Contributor

Is there value in squashing?

This came up yesterday as it happens... #62272

@RikkiGibson
Copy link
Contributor

Squashed PR commits make my life easier as a tiger when I'm looking through the history to try and figure out which recent change broke the VS insertion. I do acknowledge it's tougher to avoid merge conflicts for multiple ongoing sets of changes when you squash each change as it goes in.

This is a workflow I tend to use when I have multiple sets of outstanding changes:

  1. merge the latest bits from the PR head and base into my "subsequent work" branch
  2. squash merge the PR on GitHub
  3. back-up my "subsequent work" branch e.g. git branch subsequent-work-2
  4. reset to the latest commit on the base branch. e.g. git reset --hard features/whatever
  5. check out the "subsequent work" version of the files e.g. git checkout subsequent-work-2 -- .
  6. git commit -m "Subsequent work"

There are probably better workflows out there for this and I can't fault you for sticking with what works for you. That's why we let the author decide whether to squash.

@CyrusNajmabadi
Copy link
Member Author

Squashed PR commits make my life easier as a tiger when I'm looking through the history to try and figure out which recent change broke the VS insertion.

It feels like what we care about is what PRs flowed into main. So we can about the commit log at a depth of 1. Anything beyond that isn't relevant right?

@CyrusNajmabadi
Copy link
Member Author

This is a workflow I tend to use when I have multiple sets of outstanding changes:

That's actually pretty nice. I can see myself doing that for 1-2 cross branches. This has been more than a dozen though. Which feels exactly like what normal merges are good at (one step instead of 6).

@CyrusNajmabadi
Copy link
Member Author

This came up yesterday as it happens... #62272

Strange, does github not have a way to limit depth? Given how oriented around PRs it is, I'm surprised it doesn't have a merge centric UI here.

@RikkiGibson
Copy link
Contributor

This came up yesterday as it happens... #62272

Strange, does github not have a way to limit depth? Given how oriented around PRs it is, I'm surprised it doesn't have a merge centric UI here.

I haven't seen a way on github itself. We do have various command line tools/helper functions that search the commit log for merge commits e.g. when we generate insertions. But there's no webpage that I can plug the commit SHAs into and get the PRs in between.

Squashed PR commits make my life easier as a tiger when I'm looking through the history to try and figure out which recent change broke the VS insertion.

It feels like what we care about is what PRs flowed into main. So we can about the commit log at a depth of 1. Anything beyond that isn't relevant right?

I think that's right. I don't know how to limit the depth of e.g. a git log command though. It would definitely be handy to be able to "ignore commits brought in by the 2nd parent" when the log contains a merge commit. (not sure if I have the terminology precisely right there, but you probably get what I'm going for.)

@jcouv
Copy link
Member

jcouv commented Jun 30, 2022

I constantly run into issues dealing with non-squashed PRs. Rikki listed a few. This is the compiler guideline at the moment and we're able to work just fine. Bring up with the team if you'd like to discuss further.

Some examples which I'd mentioned to you before:

  • had to figure out whether any changes between two commits was significant enough to hold off a build, this is what this looks like.
  • git blame will list meaningless history such as "Update src/Compilers/Core/Portable/InternalUtilities/Hash.cs"

@jcouv jcouv disabled auto-merge June 30, 2022 23:42
@CyrusNajmabadi CyrusNajmabadi enabled auto-merge (squash) July 1, 2022 02:25
@CyrusNajmabadi
Copy link
Member Author

This is what i see when i try to locally figure out that information:

C:\github\roslyn [main ≡]> git log --oneline --first-parent 0c2fb6c0e941a32b519da371bc4b253d6cfad375...96beab67e4899a615b401a0803f076697dc747b2
96beab67e48 Merge pull request #61702 from DoctorKrolic/empty-types-quick-info
e2953249c85 Merge pull request #61701 from DoctorKrolic/return-in-regular-top-level-statements
b1fbb442e69 Merge pull request #61568 from CyrusNajmabadi/localFuncindexing
d79fe4eee78 Nullable annotate the lexer and a few related files (#61688)
945f9df54c3 Merge pull request #61683 from DoctorKrolic/no-nullable-enums-in-switch
0a7a1c05ba7 Merge pull request #46349 from sharwell/update-xunit-analyzers
e450bea81de Change target branch (#61696)
5e218ce3ac8 Stop attempting to include InteractiveComponents VSIX (#45159)
2a9938ad11e Updates based on Visual Studio 2019 Version 16.10 (#53966)
57147bcd929 Merge pull request #61630 from CyrusNajmabadi/solutionSync2
fe81f93cad9 Merge pull request #51474 from Youssef1313/remove-unneeded-proj-refs-expeval
eda19291e72 Merge pull request #51476 from Youssef1313/remove-unneeded-ivt-interactive
99195eac553 Merge pull request #61649 from CyrusNajmabadi/cacheSkeletonSet
b26a47f3b6d Merge pull request #61689 from DoctorKrolic/throw-as-control-keyword-vb
c34d129c928 Merge pull request #39006 from TIHan/loc-fixes
dbbf6a971e4 Merge pull request #48161 from Youssef1313/patch-46
36cf1ba1f25 Merge pull request #61685 from DoctorKrolic/throw-in-vb
318c113ac94 Merge pull request #61584 from DoctorKrolic/initializers-in-top-level-code
cc4bf245c9e do not attempt to add ErrorCode.WRN_ShouldNotReturn twice (#61682)
eba89630362 Merge pull request #61681 from dotnet/merges/release/dev17.3-to-main
28814d3ba94 Merge pull request #61676 from dotnet/merges/release/dev17.3-to-main
ceb02a6b7d3 Merge pull request #61671 from CyrusNajmabadi/safeEnumeration
85103ec1b2e Merge pull request #61673 from dotnet/jamesnk/virtualcharservice-static-property
bfe247943db Merge pull request #61664 from CyrusNajmabadi/underlineCrash
071e86af4df Merge pull request #61666 from CyrusNajmabadi/extractBaseClassPriority
3cca4fdc3b1 [LSP] Utilize list of all classification types + expose types to Razor (#61395)
6a8cd63884a Merge pull request #61656 from sharwell/batch-remove
2699c039de5 Mention sharplab in bug template (#61637)
d4cbcb54a33 Emit error when function pointer invocation is encountered in an expression tree (#61644)
f2abe2ca16b Report a warning for multiple matches for implicit interface implementation of static member (#61607)
dc866eddeae Perform language version check for an implicit implementation of a static virtual member (#61601)
9722863bcfc Merge pull request #61634 from CyrusNajmabadi/synchronizeSimplification
6d5ce2dc3b2 Merge pull request #61641 from dotnet/merges/release/dev17.3-to-main
ca780bbba00 Move the semantics checks of Inheritance margin to OOP (#61592)
9c99ebe3875 Mask count in shift operations on native integers (#61341)
2a97d0d771c Merge pull request #61614 from jasonmalinowski/fix-comment
30055853724 Merge pull request #61632 from dotnet/dev/jorobich/update-publishdata
c025a20d55e Merge pull request #61608 from CyrusNajmabadi/solutionSync
e95c4956afe Merge pull request #61621 from dotnet/merges/release/dev17.3-to-main
d81c03cdca3 Update config file for 17.3 P2 snapping (#61619)
74360dcd9f4 Merge pull request #61613 from CyrusNajmabadi/skeletonClone
57827824b22 Merge pull request #61549 from CyrusNajmabadi/renameAsync2

@CyrusNajmabadi
Copy link
Member Author

CyrusNajmabadi commented Jul 1, 2022

git blame

Every time i look at a commit that blame shows me, it gives me that info:

image

Bring up with the team if you'd like to discuss further.

I'm ok with this if that's your preference. But it's much more difficult to do work. It basically grinds parallel development to a halt needing to continually make these merged changes work properly across all branches.

My preference would be:

  1. if you are not doing interrelated parallel work, do a squash.
  2. if you are doing parallel work which involves lots of cross merges, then just do a merge.

In this case i'm constantly doing small branches off of branches so that i can do lots of A/B testing of different potential fixes, then cross merging to see how different overall direction paths compare against each other. This works well because it's so seamless (given how attuned git is to the merge-based workflow). Once you start squashing then it becomes a nightmare where all that merge understanding is lost and you effectively have to resolve all the exact same changes you made (which can be really unpleasant depending on how many files were touched). I have had to deal with this, and i've screwed things up during that resolution and ended up broken and having to revert back to start several times.

@davidwengier
Copy link
Contributor

davidwengier commented Jul 1, 2022

needing to continually make these merged changes work properly across all branches

Could you explain, or give an example of what you mean by this, because I don't follow. I probably just don't work that way.

Though the other day I had a PR open, and another PR that was based off the first. When I squash merged the first one, GitHub correctly showed the second PR as having only the diff between it and the first. Admittedly the commit list in GitHub for that second PR did show way more than it should have, but I didn't worry about that, since I knew I was going to squash that one too.

UPDATE: I forgot about conflicts... and I had one yesterday in my example, so yes, that can definitely be a pain.

@CyrusNajmabadi
Copy link
Member Author

I'm going to try out Rikki's approach and will see if i can script something up to do it for me.

@CyrusNajmabadi CyrusNajmabadi enabled auto-merge (squash) July 1, 2022 16:11
@CyrusNajmabadi CyrusNajmabadi merged commit cec2aed into dotnet:main Jul 1, 2022
@ghost ghost added this to the Next milestone Jul 1, 2022
@CyrusNajmabadi CyrusNajmabadi deleted the hashCache branch July 1, 2022 17:13
333fred added a commit that referenced this pull request Jul 5, 2022
…ures/semi-auto-props

* upstream/main: (887 commits)
  Ensure elastic trivia for reusable syntax in field generator (#62346)
  Fix typos in the incremental generators doc (#62343)
  Theme The "Generate Overrides" Dialog (#62244)
  Walk green-nodes in incremental-generator attribute-finding path (#62295)
  Cache the hash in compilation options (#62289)
  Respect dotnet_style_namespace_match_folder (#62310)
  Remove unreachable condition
  Specify builder capacities in incremental generation to avoid wasted scratch arrays. (#62285)
  Skip the test (#62287)
  Revert "Revert "Add Move Static Member To Existing Type (#61519)"" (#62284)
  Highlight the search term in the options page (#61301)
  Synch handlers with fix (#62209)
  Disable integration tests
  Fix
  Set capacity of builder to avoid expensive garbage.
  Add public APIs for opened and closed event handling for non-source documents
  Handle possible null symbols in `getAttributeTarget` (#62137)
  Perform a lookahead rather than a parsing attempt in order to determine if current token starts a conversion operator declaration. (#62240)
  Fix a race in CachingDictionary. (#62248)
  Simplify
  ...
@allisonchou allisonchou modified the milestones: Next, 17.4 P1 Jul 26, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants