Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow BatchNodes to reuse prior computed arrays if they would generate hte same value #62169

Merged
merged 21 commits into from
Jun 30, 2022

Conversation

CyrusNajmabadi
Copy link
Member

@CyrusNajmabadi CyrusNajmabadi commented Jun 27, 2022

Followup to #62100

This drops memory allocations by around 500 MB:

image

to

image

The reason for this is that we're constantly producing the exact same GlobalAlias[] array as the result of our .Collect call in the incremental generator. The change here lets us peek back at the prior array and then reuse it if we would produce the same.

Here we literally go from 5000 allocations to a single one that is shared from that point onwards. Only if the global aliases actually change would we need to perform a new allocation.

@CyrusNajmabadi CyrusNajmabadi marked this pull request as ready for review June 28, 2022 00:06
@CyrusNajmabadi CyrusNajmabadi requested a review from a team as a code owner June 28, 2022 00:06
sourceValuesBuilder.Builder.SequenceEqual(previousTable.Single().item, EqualityComparer<TInput>.Default))
{
return (previousTable.Single().item, sourceInputs);
}
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this part is the new logic that reuses from the previousTable if possible.

@CyrusNajmabadi
Copy link
Member Author

@jcouv @dotnet/roslyn-compiler for eyes. Thanks!

@CyrusNajmabadi
Copy link
Member Author

@chsienki for eyes. Thanks!

@CyrusNajmabadi
Copy link
Member Author

@cston have taken your suggested approach. will comment inline with what's going on. thanks!

sourceInputsBuilder?.Add((entry.Step!, entry.OutputIndex));

if (entry.State != EntryState.Removed)
entryCount++;
}
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

first pass determiens how many entries we have. note: this will allocate an IEnumerable. If we worry about the cost of that, we could have a dedicated method on SourceTable (called something like "CountOfNonRemovedEntries").


return (result, sourceInputs);

ImmutableArray<TInput> tryReusePreviousTableValues(int entryCount)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this does non-allocating pass looking to see if we can reuse the prior array. it checks the lengths are the same then checks all non-removed items against the prior array. if a match, we can reuse.

}

return newTable.ToImmutableAndFree();
ImmutableArray<TInput> computeCurrentTableValues(int entryCount)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

finally, we do a pass where we allocate a builder. however, importantly, we know the exact amount of items we need, so the builder doesn't create a scratch array that has the chance of not being pooled if too large.

NodeStateTable<ImmutableArray<TInput>>.Builder newTable)
{
// Do an initial pass to both get the steps, and determine how many entries we'll have.
var sourceInputsBuilder = newTable.TrackIncrementalSteps ? ArrayBuilder<(IncrementalGeneratorRunStep InputStep, int OutputIndex)>.GetInstance() : null;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we initialize this with sourceTable.Count as well?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes. that would be a reasonable approximation. The only reasons i didn't were that:

  1. it's not exactly correct. Tables have entries, and entries have steps. So it's quite possible to have way more steps tahn .Count returns.
  2. TrackingIncrementalSteps is done in tests, so being that efficient is not a pressing concern on my part. It doesn't show up in my benchmarks either as i'm not turning on that capability :)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fair enough.

Copy link
Member

@333fred 333fred left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Love how much simpler this is. Just one question.

@CyrusNajmabadi CyrusNajmabadi merged commit 67d572d into dotnet:main Jun 30, 2022
@ghost ghost added this to the Next milestone Jun 30, 2022
@CyrusNajmabadi CyrusNajmabadi deleted the pooling3 branch June 30, 2022 02:36
@allisonchou allisonchou modified the milestones: Next, 17.4 P1 Jul 26, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants