Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WIP: Initial provisions for semantic Crossgen2 PDB validation #75172

Closed
wants to merge 7 commits into from

Conversation

trylek
Copy link
Member

@trylek trylek commented Sep 7, 2022

This Quality Week work item is motivated by my findings from a few months back that Crossgen2 PDB generator had been producing bogus symbol files for almost half a year due to a trivial bug and no testing in place was able to catch that. This change adds initial provisions for semantic PDB validation.

In this initial commit I'm adding a new managed app PdbChecker that uses the DIA library to read a given PDB and optionally check it for the presence of given symbols. In parallel I'm making test infra changes that enable selective PDB validation support in individual tests including checks for the presence of expected symbols.

This change by itself introduces rudimentary PR / CI validation of PDB files produced by Crossgen2. As next step I plan to introduce additional provisions for running this logic for all tests to be added to one of the Crossgen2-specific pipelines, and validation of PDBs produced during compilation of the framework assemblies.

Thanks

Tomas

This Quality Week work item is motivated by my findings from a few
months back that Crossgen2 PDB generator had been producing bogus
symbol files for almost half a year due to a trivial bug and no
testing in place was able to catch that. This change adds initial
provisions for semantic PDB validation.

In this initial commit I'm adding a new managed app PdbChecker
that uses the DIA library to read a given PDB and optionally check
it for the presence of given symbols. In parallel I'm making test
infra changes that enable selective PDB validation support in
individual tests including checks for the presence of expected
symbols.

This change by itself introduces rudimentary PR / CI validation of
PDB files produced by Crossgen2. As next step I plan to introduce
additional provisions for running this logic for all tests to be
added to one of the Crossgen2-specific pipelines, and validation
of PDBs produced during compilation of the framework assemblies.

Thanks

Tomas
@ghost
Copy link

ghost commented Sep 7, 2022

Tagging subscribers to this area: @hoyosjs
See info in area-owners.md if you want to be subscribed.

Issue Details

This Quality Week work item is motivated by my findings from a few months back that Crossgen2 PDB generator had been producing bogus symbol files for almost half a year due to a trivial bug and no testing in place was able to catch that. This change adds initial provisions for semantic PDB validation.

In this initial commit I'm adding a new managed app PdbChecker that uses the DIA library to read a given PDB and optionally check it for the presence of given symbols. In parallel I'm making test infra changes that enable selective PDB validation support in individual tests including checks for the presence of expected symbols.

This change by itself introduces rudimentary PR / CI validation of PDB files produced by Crossgen2. As next step I plan to introduce additional provisions for running this logic for all tests to be added to one of the Crossgen2-specific pipelines, and validation of PDBs produced during compilation of the framework assemblies.

Thanks

Tomas

Author: trylek
Assignees: -
Labels:

area-Infrastructure-coreclr

Milestone: -

@ghost ghost assigned trylek Sep 7, 2022
@trylek
Copy link
Member Author

trylek commented Sep 7, 2022

Fixes: #13134

Copy link
Member

@jkoritzinsky jkoritzinsky left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we do these tests as unit tests like https://github.com/dotnet/runtime/tree/main/src/coreclr/tools/aot/ILCompiler.Compiler.Tests instead of layering this on top of our runtime test suite?

If not, what type of failures/bugs do we think we’ll test by having this as part of the runtime test tree as compared to having it as unit tests?

@trylek
Copy link
Member Author

trylek commented Sep 7, 2022

@jkoritzinsky - my thinking has been that adding the optional PDB check step after Crossgen2 compilation gives us more freedom w.r.t. extent of the testing. AFAIK the ILCompiler tests are typically small unit tests targeting particular components of the compiler like the dependency analyzer or type system using a handwritten set of individual cases to test; by being able to run the PDB check on arbitrary compiled tests and ultimately the framework we should be able to test many more PDBs that more closely resemble end customer apps including a wide variety of constructs like different generic instantiations and a broad range of executable and PDB sizes. @cshung is working on a similar mechanism using R2RDump to disassemble the various Crossgen2-produced binaries to test its robustness. If you think I misunderstood your original suggestion or if you have additional arguments to support it, please elaborate.

@cshung
Copy link
Member

cshung commented Sep 7, 2022

FYI, my proposed change to add r2rdump is already out for PR as part of the hot-cold splitting upstreaming change here. This is considered part of the quality week work.

In particular, I changed src/tests/Common/CLRTest.CrossGen.targets to include an execution of the R2RDump tool, as well as the ValidateRuntimeFunctions method in src/coreclr/tools/aot/ILCompiler.Reflection.ReadyToRun/ReadyToRunReader.cs to check for some invariants that the crossgen2 compiler should guarantee. (e.g. the runtime functions must be sorted by RVA in the runtime function table)

The idea is that when some invariants ca statically checked, checking them statically is way cheaper than executing the tests many times and hoping that when the list is not sorted then sometimes the binary search might fail.

@EugenioPena who authored the ValidateRuntimeFunctions method.

@jkoritzinsky
Copy link
Member

I'm just wondering how much benefit we're actually getting by just throwing new phases and types of testing at the src/tests suite instead of deliberately authoring tests to cover the various scenarios we support. Especially with a tightening test budget, we should be more deliberate about our testing instead of just throwing everything at the wall and seeing what sticks.

I get that this is the easiest way to cover a wide range of scenarios, but even for that, we know the src/tests tree doesn't cover everything. The JIT team runs the libraries tests against JitStress to catch bugs that the src/tests tree doesn't. Additionally, each new test variation that we add to the runtime test scripts adds more complexity and makes them more difficult to understand to anyone that we want to onboard to the infra team. I'm fine with us starting here, but I strongly feel that we should add deliberate testing instead of just pointing at these tests and hoping that we get enough coverage that way.

I also think that adding unit tests for "did we order this collection correctly" would be a better (and likely cheaper over time) test mechanism for the hot-cold splitting work as well. The asserts in ValidateRuntimeFunctions are good, but I feel that writing tests that explicitly test that things are ordered correctly would be a better test than just throwing things at the src/tests tree.

@trylek
Copy link
Member Author

trylek commented Sep 7, 2022

Thanks Jeremy for your additional feedback. I think I mostly agree with you on the following accounts:

  • runtime tests have many specifics and limitations and an expectation that just making them pass in a new build mode amounts to its full validation is very naive;
  • arbitrarily adding new build modes for the runtime tests exhibits a form of exponential explosion that makes our lab budget suffer;
  • making the test script logic more complicated is in general a pain for ramp-up and maintenance.

My personal feeling regarding the PDB testing from these perspectives is as follows:

  • While I completely agree that runtime tests are by no means a fair proxy for customer projects, I sincerely doubt there is an easy way to put together something more comprehensive in terms of the ILCompiler tests; we're not targeting a particular regression in PDB generation for which we could create a directed unit test, as a first step we want to improve our overall confidence that PDBs are not broken altogether (as was the case for about five months earlier this year without anyone even noticing);
  • I'm certainly not suggesting we should double our PR testing with these new modes; as you can see in this PR, for the purpose of PR / CI testing I'm proposing a surgical addition of this logic to the one crossgen2smoke test, we can expand it to several more tests but that's about it. Beyond that I've been thinking about a broader albeit less frequent validation as part of some of the crossgen2-specific pipelines - according to the measurements Jared recently published, these are small potatoes compared to today biggest Helix / Azure consumers;
  • As you can see yourself, the actual change to the test build scripts is pretty minimal in comparison with their current content, if we decide to clean them up and simplify them I suspect there are more complex places where we could start, I did some of that in the past (e.g. by unifying much of the logic in build.cmd/sh by moving it to the common proj file).

For now I'm inclined to continue working on this PR unless it turns out to be somehow detrimental to overall lab perf (which I deem improbable) as it has the potential to introduce at least some PDB testing to our system that is long overdue. I can try to experiment with leveraging ILCompiler tests for this purpose but I'm afraid it's going to be at least as complicated as this proposed change for the following reasons:

  • The hypothetical ILCompiler test will need the same underlying infrastructure i.e. Dia2Lib et al to parse the produced PDB;
  • IIRC in today lab we're running the ILCompiler test suite on Linux which won't work as PDB is a Windows-specific technology so we'd need to tweak the lab to cater for that somehow;
  • If I'm not mistaken, today the ILCompiler tests run on the AzDO build machines that don't support coverage for all four architectures (x64, x86, arm, arm64).

Thanks

Tomas

@jkoritzinsky
Copy link
Member

jkoritzinsky commented Sep 7, 2022

That all makes sense, and since this isn't adding a new pipeline or scenario flavor, I think that this is okay to add. Your reasoning around hooking up the underlying platform logic for DIA also is quite convincing. Let's go with this for now and we can look back in the future to decide if changing this is worthwhile.

@trylek trylek mentioned this pull request May 3, 2023
46 tasks
@stephentoub
Copy link
Member

@trylek, this is a year old. Should it be closed?

@trylek
Copy link
Member Author

trylek commented Sep 5, 2023

Sure, makes sense, I'll reopen this once I get to figuring out the problem that makes it still fail in the lab.

@trylek trylek closed this Sep 5, 2023
@ghost ghost locked as resolved and limited conversation to collaborators Oct 5, 2023
@trylek
Copy link
Member Author

trylek commented Jan 12, 2024

For the "problem in the lab", according to my discussion with @brianrob we shouldn't be using CLSID-based component creation for DIA, instead we should manually use its public exports similar to what we do in diagnostic tests, cf.

https://github.com/microsoft/perfview/blob/afd33a906961167ba995a6d71f35de9ea6a2bbdd/src/TraceEvent/Symbols/NativeSymbolModule.cs#L1794

using a static copy of dia2lib.dll.

/cc @ivdiazsa

@dotnet dotnet unlocked this conversation Jan 17, 2024
@dotnet dotnet locked and limited conversation to collaborators Jan 17, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants