Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Move all reflection caches out of statics to a centrally-stored location #1341

Merged
merged 14 commits into from
Sep 19, 2022

Conversation

alistairjevans
Copy link
Member

@alistairjevans alistairjevans commented Aug 21, 2022

This PR moves all of our reflection caches into an instance of the ReflectionCache class. It also incorporates some of @RogerKratz's changes in #1339 to cache assembly types during scanning.

I'm not finished with comments or tests yet, but wanted to push this up for discussion/thoughts.

A ReflectionCache is a representation of a set of IReflectionCacheStore. Each instance of IReflectionCacheStore is usually a derived implementation of ConcurrentDictionary<TKey, TValue>, where TKey and TValue can vary based on what is being cached in each store.

The ReflectionCache holds its own set of named IReflectionCacheStore instances in a ConcurrentDictionary<string, IReflectionCacheStore>, however there are a number of internal-only paths where it makes little sense to add another whole dictionary lookup to access a cache we will always be using. Hence, there is the InternalReflectionCaches class; an internal property on IReflectionCache exposes an instance of this class that contains a set of known caches pulled from the rest of the codebase.

I tried to figure out a way to keep the definition of our existing caches close to the use of the cache that doesn't hurt performance, but I could not come up with anything better than putting them all in InternalReflectionCaches as I have (that didn't really hamper readability/comprehension).

So, ReflectionCache is laid out as two 'sets':

  • Internal Cache Stores (manual properties of IReflectionCacheStore)
  • Named Cache Stores (usable by consumers of our code, and by integrations, if desired).

ReflectionCache intentionally does not have an IReflectionCache interface, and is sealed. Two reasons:

  • There is no need to ever mock the class (user code can simply use new ReflectionCache());
  • The internal set of caches must exist in ReflectionCache for our internal code to use them, no alternative implementation will do.

ReflectionCache Lifetime

One of the challenges I tried to consider is how we keep the existing cache-happy behaviour, useful in unit tests where different containers can share the same cache, and balance the desire to reclaim the memory Autofac uses once the container is collected.

With these changes, when a new ContainerBuilder is created, DefaultRegisteredServicesTracker will reach out to ReflectionCache.Shared, and store what it finds there, to 'capture' the cache instance across the ContainerBuilder and then the Container.

ReflectionCache.Shared is backed by a WeakReference. When the last Container that holds a reference to it gets GC'ed, so will the memory allocated for this shared reflection cache. In unit tests, in most cases there is unlikely to be a GC between unit tests, in which case the cache continues to be preserved across all the tests.

In the Autofac test suite comprising about 3500 tests, a new shared ReflectionCache was still only allocated once per project+platform combo (i.e process).

This should also mean (not verified yet), that when the last container referencing the shared cache is GC'ed, the AssemblyLoadContext that loaded types into the container should then be unloadable (have not tested that just yet, but I intend to).

It's important that ReflectionCache.Shared is not captured as a static field anywhere, otherwise the allocated cache will never be GC'ed. However, on balance it will be exposed publicly, to avoid a big chunk of breaking changes that can technically be avoided.

Caches only used for registration

IReflectionCacheStore has a property Usage, that indicates to ReflectionCache whether the cache store holds information used only at registration, resolution or both. This can be used to determine that the contents of that particular store can be safely discarded once the container is built. By default, ReflectionCache will clear all such caches at container build time (more on that below).

The InternalReflectionCaches.AssemblyScanAllowedTypes cache store has Usage set to ReflectionCacheUsages.All.

Allowing registration-only caches to be cleared and shared as appropriate

I mulled for some time over the nicest/best way to support caches such as scanned assemblies proposed in #1339, that are only used for registration, without using a bunch of unnecessary memory in apps that only build a single container.

The approach of allowing users to specify a ReflectionCache via a method on ContainerBuilder got annoying fairly quickly, particularly with the ability to use ContainerBuilder in nested scopes; the definition of what happens to the reflection cache between parent/child scopes caused me some distress. So I sought an alternate approach.

Where I ended up is the following set of assumptions:

  • If only one container is built per-process, there is no point hanging on to cache's only used in registration.

  • If at least 2 containers have been built in a given process, it's pretty likely that more than 2 containers will be built.

So, we now have everything we need to maintain a single shared cache, without requiring users to provide a custom instance of ReflectionCache, by tracking how many times we've constructed ContainerBuilder, then determining whether to clear the registration-only caches on container build.

If new ContainerBuilder() is invoked more than once, it means that more than 1 container is being initialised in the current process, and subsequent builds of containers will not clear the registration-only caches.

In practical terms this means that in unit tests that use assembly scanning, the methods that load types from assemblies will be invoked at most twice before that data starts to be cached between containers.


Hopefully that gives some context.

@alistairjevans
Copy link
Member Author

alistairjevans commented Aug 21, 2022

Breaking Changes

Old

Are as follows:

  • ConstructorBinder constructor is now

    ConstructorBinder(ConstructorInfo constructorInfo, ReflectionCache reflectionCache)
  • IConstructorFinder.FindConstructors signature is now

    ConstructorInfo[] FindConstructors(Type targetType, ReflectionCache reflectionCache)
    
  • New properties on IComponentRegistry, IComponentRegistryServices and IComponentRegistryBuilder.

    @tillig, I did consider using the existing Properties object to store ReflectionCache and avoid a breaking change, but it would mean that every attempt to access the cache would require a dictionary lookup, and because the properties dictionary is mutable, users can change the cache whenever they like.

  • abstract ImplicitRegistrationSource has a new constructor argument:

    protected ImplicitRegistrationSource(Type type, ReflectionCache reflectionCache) 

The only really "this is a proper breaking change" item there is the IConstructorFinder.FindConstructors call.

Latest set of changes have no breaking changes.

@tillig
Copy link
Member

tillig commented Aug 22, 2022

This appears to be fairly non-trivial to really grok so I'm going to have to take some time here. I'll probably try to wrap up the other issues I'm on first before tucking in on this.

My super-high-level TLDR scan of it only got a couple of thoughts coming up:

  • Simply accessing the shared cache is an indicator that it's used more than once; I can see us shooting ourselves in the foot by forgetting to store the reference and never use it again. It feels a little side-effect-ish to me. Maybe it's fine.
  • Would UsedAtRegistrationOnly be better as a flags enum of places things could get used? ReflectionCacheUsage.Registration | ReflectionCacheUsage.Resolution sort of thing. Possibly doesn't matter.
  • We have a lot of caches. I wonder if we could actually combine them into a performant single cache instead of a sort of composition of a bunch of caches.
  • Reflection in later .NET is way, way faster than it used to be in .NET 1.x. Are there any caches we could drop while still maintaining perf? (Possibly during registration-only operations maybe?)

@alistairjevans
Copy link
Member Author

I'm going to see if there's a stable way to have that ReflectionCache.Shared be accessible, and move the "is there more than one" check somewhere else.

…stance, and move the "do we need to clear registration caches" understanding to the `ContainerBuilder`.
@alistairjevans
Copy link
Member Author

alistairjevans commented Aug 27, 2022

I've taken another approach to this that:

  • Avoids all breaking changes.
  • Still retains the GC'able cache.
  • Moves the "are we creating multiple containers?" question to ContainerBuilder.
  • Introduces the (now I think about it, minor) downside that a user could technically take a static field reference to ReflectionCache.Shared and therefore prevent it being GC'ed.

I've updated the initial summary with the new approach.

@codecov
Copy link

codecov bot commented Aug 27, 2022

Codecov Report

Base: 77.76% // Head: 78.01% // Increases project coverage by +0.24% 🎉

Coverage data is based on head (efce62c) compared to base (f56d1fe).
Patch coverage: 94.11% of modified lines in pull request are covered.

Additional details and impacted files
@@             Coverage Diff             @@
##           develop    #1341      +/-   ##
===========================================
+ Coverage    77.76%   78.01%   +0.24%     
===========================================
  Files          190      195       +5     
  Lines         5488     5582      +94     
  Branches      1097     1121      +24     
===========================================
+ Hits          4268     4355      +87     
- Misses         711      714       +3     
- Partials       509      513       +4     
Impacted Files Coverage Δ
...ration/ScopeRestrictedRegisteredServicesTracker.cs 80.00% <ø> (ø)
...ac/Util/Cache/ReflectionCacheAssemblyDictionary.cs 62.50% <62.50%> (ø)
...tofac/Util/Cache/ReflectionCacheTupleDictionary.cs 72.72% <72.72%> (ø)
...rc/Autofac/Util/Cache/ReflectionCacheDictionary.cs 81.81% <81.81%> (ø)
src/Autofac/ContainerBuilder.cs 90.00% <100.00%> (+0.52%) ⬆️
...ctivators/Reflection/AutowiringPropertyInjector.cs 100.00% <100.00%> (ø)
...ac/Core/Activators/Reflection/ConstructorBinder.cs 91.93% <100.00%> (ø)
.../Activators/Reflection/DefaultConstructorFinder.cs 90.90% <100.00%> (-9.10%) ⬇️
src/Autofac/Core/ImplicitRegistrationSource.cs 86.84% <100.00%> (+0.35%) ⬆️
src/Autofac/Core/InternalReflectionCaches.cs 100.00% <100.00%> (ø)
... and 11 more

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

☔ View full report at Codecov.
📢 Do you have feedback about the report comment? Let us know in this issue.

@tillig
Copy link
Member

tillig commented Aug 30, 2022

Looking good. I like some of the changes. Admittedly still sucking this in through osmosis into the old noggin.

The pattern that appears to be emerging is:

  • There's a central accessor/repository for the set of reflection caches.
  • Components can GetOrCreate themselves a reflection cache of a given type as needed given a unique key.
  • Reflection caches can specify when they're used (registration/resolution) and must support a Clear() functionality.

I see the 'well-known' ones want to avoid the dictionary lookup (the Internal cache set) but I wonder if we could do that by adding some sort of default initialization and holding the reference to the dictionary item.

public InternalReflectionCaches(ReflectionCache cacheHolder)
{
  this.IsGenericOrEnumerableInterface = cacheHolder.GetOrCreateCache<ReflectionCacheDictionary<Type, bool>>("IsGenericEnumerableInterface");
  this.IsGenericListOrCollectionInterfaceType = cacheHolder.GetOrCreateCache<ReflectionCacheDictionary<Type, bool>>("IsGenericListOrCollectionInterfaceType");
}

The benefit of that is that we don't have to treat the internal ones differently from the others when enumerating, spinning through the Clear() logic, etc. It also means those become somewhat "publicly available" to other things to pre-populate if needed by getting them using the key. (Maybe that last bit isn't desirable?)

I admit I may also be getting hung up on terminology, like, there's a ReflectionCache that holds... ReflectionCaches? I wonder if there's a way to better disambiguate "this thing holds caches" from "this thing is a cache" in here. ReflectionCacheStore? (I'm sooooo not glued to that, but the idea is there.)

@alistairjevans
Copy link
Member Author

Ahhh....excellent suggestion about holding the internal caches in pre-fetched dictionary entries. Love it, will implement. I knew there was a better way to do that, I just couldn't see it.

Agreed on the name; perhaps could flip the relationship to ReflectionCacheSet and IReflectionCache?

@tillig
Copy link
Member

tillig commented Aug 31, 2022

Oh, yeah "cache set" vs "cache" works.

@alistairjevans alistairjevans marked this pull request as ready for review September 3, 2022 17:19
@alistairjevans
Copy link
Member Author

Ran the full set of benchmarks on this change compared to develop across net6, net48 and netcoreapp3.1; no significant changes either way.

The benchmarks took 1hr30 for each run, 321 total benchmarks!

Related to your point about still needing caches @tillig, did you happen to catch some of the reflection perf improvements in net7 coming out?

image

https://devblogs.microsoft.com/dotnet/performance_improvements_in_net_7/#reflection

The method invoke time is wayyy down now. However obviously only users on the latest framework version are going to be able to take advantage.

@tillig
Copy link
Member

tillig commented Sep 8, 2022

I hadn't seen the updates to reflection in .NET 7 but that's pretty hot. At some point it may be interesting to see what it'd look like to totally ditch caches in .NET 7. It'd suck to go back to #ifdef sorts of things, but it might help to call out which code could go when we start dropping older frameworks.

Copy link
Member

@tillig tillig left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is looking good. I added a few comments around fairly nitpicky stuff, but I think the larger pattern here is emerging nicely so there aren't any major issues to pick on.

src/Autofac/Core/ReflectionCacheSet.cs Outdated Show resolved Hide resolved
src/Autofac/Core/ReflectionCacheSet.cs Outdated Show resolved Hide resolved
src/Autofac/Core/ReflectionCacheSet.cs Show resolved Hide resolved
src/Autofac/Util/AssemblyExtensions.cs Show resolved Hide resolved
src/Autofac/Core/ReflectionCacheSet.cs Outdated Show resolved Hide resolved
Copy link
Member

@tillig tillig left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't have anything left to really nitpick. This looks really good. I had one question about test parallelism and whether or not that may invalidate some of the tests involved with clearing the shared cache, but that's about it.

test/Autofac.Test/Core/ReflectionCacheSetTests.cs Outdated Show resolved Hide resolved
… the confusing, probably useless concurrency tests.
Copy link
Member

@tillig tillig left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is good. It may be worth addressing the comment in #1324 about whether one can clear the cache on disposal of a child lifetime scope. It seems like there is some concern there; unclear on whether we need a unit test specifically showing/proving that or not.

@alistairjevans
Copy link
Member Author

#1324 won't be truly solved until we can isolate the registered services tracker in the root container from registrations in child scopes; that issue is "partly" about reflection caching, but really the services tracker is the more thorny problem, and a separate issue to this one really.

@tillig
Copy link
Member

tillig commented Sep 19, 2022

I'll buy that. Baby steps. I'll go ahead and merge this, then.

@tillig tillig merged commit 86c36ab into develop Sep 19, 2022
@tillig
Copy link
Member

tillig commented Sep 19, 2022

I'll let you delete the branch so I don't nuke it out from under you. :)

@alistairjevans alistairjevans deleted the feature/reflection-cache-clear branch September 25, 2022 11:30
@RogerKratz
Copy link
Contributor

this is looking good! would it be possible for you guys to make a new official nuget release now?

@alistairjevans
Copy link
Member Author

I'm hoping to complete #1350 before the next release, giving proper unloadability support out-of-the-box.

@kirk-marple
Copy link

Hey all, I just tried updating to Autofac 6.5.0 from 6.4.0, and ran into a very strange issue with another package that we pull in via our DI services. (Aspose.Total.NET)

I went down a rathole for a few hours thinking it was something Aspose was doing, because I was getting a FileNotFoundException on an assembly that I believe they're dynamically unpacking. But then reverted everything, and realized it was simply the updated Autofac package which was causing it.

Reading over the release notes here, I saw this 'reflection cache' feature, and thought maybe it could be related.

Is this being written to disk, or just to memory? Curious if there's anything involved with it which could affect assemblies which dynamically unpack themselves to disk?

I've reverted back to 6.4.0 for now, but wanted to let y'all know. It's a bit too vague for a bug report, so figured I'd add to this discussion.

@alistairjevans
Copy link
Member Author

Hi @kirk-marple , are you able to provide a minimal repro for the issue relating to dynamically unpacked assemblies? If so, suggest raising a new issue with that repro.

@kirk-marple
Copy link

Hi @alistairjevans, unfortunately it's not trivial - but is theoretically possible. It's licensed software from a 3rd party vendor (Aspose), so I'd have to request a trial license from them to fully provide you a repro (and bug report).

I'm happy to followup with them, and see if I can triangulate the issue, but was trying to better understand any impacts of this caching change.

Is there any disk footprint to the caching, which potentially could be changing the current working directory?

@alistairjevans
Copy link
Member Author

No, definitely no disk footprint, and certainly not something I'd expect to lead to a FileNotFoundException. We moved our reflection cache from static fields to instance (basically).

Anything in your code or Aspose that depended on Autofac internals, like reaching in and clearing our caches via reflection?

@kirk-marple
Copy link

kirk-marple commented Nov 17, 2022

Thanks, appreciate the insight. Our use of Autofac is pretty simple, basically registering assembly modules with ContainerBuilder:

        builder.RegisterAssemblyModules<ModuleBase>(AppDomain.CurrentDomain.GetAssemblies());

A few of our DI services use Aspose (for file format conversion), and I'm not sure if they use Autofac directly. Their code is obfuscated. We definitely don't do anything with Autofac and Aspose via reflection.

When I get some more time, I'll try and setup a simple repro of loading one of our DI services with the new Autofac, and try it with and without Aspose.

It may be a red herring that it ends up with an exception in their stuff. (I had a separate issue when them writing some stuff to disk at runtime, within an Azure Function, so I know there's some magic they're doing when their assembly loads.)

@tbjerknes
Copy link

Hi @kirk-marple . We are also experiencing problems when upgrading to 6.5.0. Probably a dumb question, but how did you figure out Aspose was the problem in your case? We don't use this library, so it must be something else in our case.

When I run locally with 6.5.0 i get this below, and a 403 error in the webpage
image

With 6.4.0 the same section looks like this, and it is all roses.
image

@tillig
Copy link
Member

tillig commented Nov 23, 2022

If anyone does figure out the Aspose issue or can at least create a repro, please file a new issue with the info so it's not lost/buried down here. Thanks!

@kirk-marple
Copy link

Hi @kirk-marple . We are also experiencing problems when upgrading to 6.5.0. Probably a dumb question, but how did you figure out Aspose was the problem in your case? We don't use this library, so it must be something else in our case.

Hi @tbjerknes, it surfaces when trying to resolve our DI services, and one of the services wouldn't load since it got a FileNotFoundException on an assembly which I tracked down into Aspose. The really odd part is that the DLL doesn't actually exist on disk anywhere, after a build, so I think they're unpacking the appropriate DLL at runtime - maybe based on the platform it's running on? They obfuscate their licensed assemblies, so I couldn't get very far in a call stack.

My guess was something of the 'context' was changing with 6.5.0, where this dynamic unpacking behavior was failing, so that DLL wasn't there to bind to.

I haven't had any time to try and dig further into this, and just sticking with 6.4.0 for the time being.

@tbjerknes
Copy link

thanks for the info. I will do a bit of digging when I have some time

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants