Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix unloadability issue in AppDomain::FindAssembly #58948

Merged
merged 1 commit into from
Sep 10, 2021

Conversation

janvorli
Copy link
Member

There is a bug in AppDomain::FindAssembly code path that iterates over
the domain assemblies. The iteration loop leaves the refcount of the
LoaderAllocator related to the returned DomainAssembly bumped, but
nothing ever decrements it. So when a code that needs to be unloaded
ends up in that code path, all the managed things like managed
LoaderAllocator, LoaderAllocatorScout are destroyed, but the unloading
doesn't complete due to the refcount.

We have never found it before as this code path is never executed in any
of the coreclr tests even with unloadability testing option.

The problem was reported yesterday by @StephenMolloy who has found
it when working on making Xml serializer behave with unloadability.

There is a bug in AppDomain::FindAssembly code path that iterates over
the domain assemblies. The iteration loop leaves the refcount of the
LoaderAllocator related to the returned DomainAssembly bumped, but
nothing ever decrements it. So when a code that needs to be unloaded
ends up in that code path, all the managed things like managed
LoaderAllocator, LoaderAllocatorScout are destroyed, but the unloading
doesn't complete due to the refcount.

We have never found it before as this code path is never executed in any
of the coreclr tests even with unloadability testing option.
@janvorli janvorli added this to the 6.0.0 milestone Sep 10, 2021
@janvorli janvorli self-assigned this Sep 10, 2021
@janvorli janvorli merged commit 8b38c19 into dotnet:main Sep 10, 2021
@janvorli janvorli deleted the fix-dynamic-assemblies-unload branch September 10, 2021 17:47
@ghost ghost locked as resolved and limited conversation to collaborators Oct 10, 2021
@agocke
Copy link
Member

agocke commented Nov 11, 2021

@janvorli We're currently examining the risk for backporting #61266 (comment) and trying to decide if the value is worth it. It sounds like without this change, the XmlSerializer DLLs wouldn't be unloadable. If we were to bring this change back too, what would the risk look like?

@janvorli
Copy link
Member Author

@agocke it is kind of hard to asses the risk here, even though I believe this was wrong before and the fix should not introduce any failure modes. But none of our coreclr or libraries tests execute this code path, so the only testing that was done was using the unload of the XmlSerializer where the unload was consistently failing before this fix and I believe it is consistently succeeding after the fix.
The problematic code path is specific to unloadability and the comment that was there indicated that there is no need for added reference, yet it was adding one via the not obvious handling of references in the CollectibleAssemblyHolder. So the original call to Extract instead of GetValue seems like an overlook of that fact.

@StephenMolloy
Copy link
Member

@agocke and @janvorli, has there been any more conversation around porting this back to 6.0?

We do have customers who have been asking for quite some time for the XmlSerializer fix that tripped on this, and they have noticed it isn't in .Net 6 yet. While I do think there is value in the XmlSerializer fix on it's own, as it enables an experience that is less broken than the previous 'can't do this at all' story... having this fix in 6.0 would really be helpful, especially since 6.0 is LTS.

@ericstj
Copy link
Member

ericstj commented Dec 14, 2021

Pinging @janvorli / @jkotas - what are your thoughts on shipping this in 6.0 servicing. Effectively adding to #61266

@jkotas
Copy link
Member

jkotas commented Dec 15, 2021

I think it is to fine to ship this fix in 6.0 servicing, but I would like to see us to gain more confidence in the fix as part of it, e.g. create a small regression test that hits the bug and demonstrates that the bug is fixed with this change.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants