Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add event to RuntimeEventSource for AppContext switches #57303

Merged
merged 20 commits into from
Aug 20, 2021

Conversation

agocke
Copy link
Member

@agocke agocke commented Aug 12, 2021

Fixes #56142

@agocke
Copy link
Member Author

agocke commented Aug 12, 2021

@josalem Could you give me some advice on how to test this?


lock (s_switches)
{
foreach (var (k, v) in s_switches)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is only going to log switches set using AppContext.SetSwitch . It won't log switches set using build properties and runtime.json that I think are much more common these days. Do we need to log those too?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I would like to include those as well -- do you know which API they go through if not SetSwitch?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AppDomain.SetData (#47922 tracks adding AppContext.SetData for completeness)

Also, the runtime startup goes over everything that is in runtime.json and calls AppDomain.SetData for it.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In that case I will also add everything in s_dataStore which has a boolean value. I'm trying to avoid grabbing everything in s_dataStore to limit the amount of potential user data that's recorded.

Copy link
Contributor

@josalem josalem left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To test this, I would create an EventListener that enables this EventSource and then parses the LogAppContextSwitch.

Since you don't know what settings are going to be set in a given test leg, I'd use AppContext.SetSwitch to set a specific switch and look for it in the events. Alternatively, you could compare the events directly to the values in the dictionaries, but that might be harder to set up due to field visibility.

The listener might look like this:

public sealed class AppContextListener : EventListener
{
	override protected void OnEventSourceCreated(EventSource source)
	{
		if (source.Name == "System.Runtime")
		{
			// the below will change if you add a keyword
			EnableEvents(source, EventLevel.Verbose, (EventKeywords)(-1));
		}
	}

	protected override void OnEventWritten(EventWrittenArgs eventData)
	{
		if (eventData.EventName == "LogAppContextSwitch" && eventData.Payload is not null)
		{
			string switchName = string.Empty;
			int switchValue = -1;
			for (int i = 0; i < eventData.Payload.Length; i++)
			{
				if (eventData.PayloadNames[i] == "switchName")
					switchName = eventData.Payload[i] as string;
				else if (eventData.PayloadNames[i] == "value")
					switchValue = eventData.Payload[i] as int;
			}

			if (switchName != string.Empty && switchValue != -1)
				// validate the switch event
		}
	}
}

@@ -50,6 +50,12 @@ public static void Initialize()
// as you can't make a constructor partial.
private RuntimeEventSource(int _) { }

[Event(2, Level = EventLevel.Informational)]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For ease of use, it might be best to assign this a keyword.

When you got to enable this, it'll create the counters. They won't get turned on unless you specify EventCounterIntervalSec as a filter parameter, though. That being said, we don't want to have the presence of the filter string be the switch for enabling counters or not.

You can do this by adding a nested class like this:

public static class Keywords
{
	public const EventKeyword AppContext = (EventKeyword)0x1;
}

Then you can assign it to the event by updating the EventAttribute to have the EventKeyword parameter.

This way a user can turn on just this event without having to turn on all the counters.

CC @brianrob to weigh in on the experience of this.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I'm not understanding this at all. Are you saying that we should add an additional registration key for just grabbing the AppContext switches?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let me try to re-state @josalem's position and see if I get it right. I think that @josalem is saying that right now, everything in this EventSource gets turned on all at once because there is no ability to filter. The counters get created but not actually enabled because a filter string is required to turn them on, but I think he's saying that in the future, he'd like for this not to be the case, and so wants to future proof this with a keyword. Keywords are one of the constructs that let you filter, and so I believe he's just proposing adding a keyword to allow this to be turned on and off independently of the counters. @josalem did I get that right?

I have another question on this - is this where we would want this and other future managed code events to live, when their functionality is in CoreLib, or do we think that the event should live somewhere like FrameworkEventSource?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I should also add that I think this event is a great idea, and I will absolutely turn this on by default in PerfView.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is a reasonable place for the AppContext switch events to live -- I'm not sure I'd be bold enough to take a stronger position than that.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fair enough. I was aiming that question more at @josalem. :)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Your assessment is spot-on, Brian.

I think it is fine for now. If we add a keyword for the AppContext switches, we can at least turn on just those events. Not specifying EventCounterIntervalSec means we can turn on the world without enabling the counters, but it's not ideal as an enabling mechanism.

@agocke
Copy link
Member Author

agocke commented Aug 12, 2021

Is there an easy way to run just the tracing tests (src/tests/tracing/runtimeeventsource/*) locally? It looks like there's an existing event listener in there that I'd like to augment, but I'm having trouble running just those test cases

@josalem
Copy link
Contributor

josalem commented Aug 12, 2021

Is there an easy way to run just the tracing tests (src/tests/tracing/runtimeeventsource/*) locally?

The fastest way I've used for running the tests is to build them directly and bypass the xunit wrapper cruft:

  1. build.{cmd|sh} clr+libs -rc checked -lc release
  2. ./src/tests/build.{cmd|sh} generatelayoutonly checked
  3. pushd ./src/tests/tracing/runtimeeventsource
  4. ../../../.dotnet/dotnet build <test to build>.csproj -c Checked
  5. pushd <path to built test>
  6. set any env vars you need (I think this test only needs COMPlus_TieredCompilation=0)
  7. runtimeeventsource.{cmd|sh} -coreroot=<path from step 2>

It's ugly, but better than building all the tests. If anyone knows a better way...

@stephentoub
Copy link
Member

Various AppContext switches represent something that's also settable via environment variables. Do we have the ability to reflect those as well easily? I understand we likely don't want to just dump out all environment variables, to avoid dumping out arbitrary user data.

@agocke
Copy link
Member Author

agocke commented Aug 12, 2021

../../../.dotnet/dotnet build .csproj -c Checked

No dice. This seems to just restore the project -- it doesn't build anything

@josalem
Copy link
Contributor

josalem commented Aug 12, 2021

it doesn't build anything

It didn't drop the built test somewhere in the artifacts directory?

@agocke
Copy link
Member Author

agocke commented Aug 12, 2021

Nope. From MSBuild it looks like it imported tests/Common/nobuild.targets so it must think it didn't need to build. I'll see if I can figure out why

@agocke
Copy link
Member Author

agocke commented Aug 12, 2021

Still no idea why it didn't build, but passing -p:_WillCLRTestProjectBuild=true overrode the logic

[Event((int)EventId.AppContextSwitch, Level = EventLevel.Informational)]
internal void LogAppContextSwitch(string switchName, int value)
{
base.WriteEvent((int)EventId.AppContextSwitch, switchName, value);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Insight into GitHub's C# colorizer algorithm:
image
😄

@agocke agocke marked this pull request as ready for review August 16, 2021 00:04
@agocke
Copy link
Member Author

agocke commented Aug 16, 2021

I think this is ready for re-review

@agocke
Copy link
Member Author

agocke commented Aug 17, 2021

ping @josalem @brianrob for review

@josalem
Copy link
Contributor

josalem commented Aug 17, 2021

Looking now

Copy link
Contributor

@josalem josalem left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another round of comments. I'm going to approve it assuming you address the comment about switch precedence, so we can expedite getting this merged 😄

src/tests/tracing/eventcounter/runtimecounters.cs Outdated Show resolved Hide resolved
src/tests/tracing/eventcounter/runtimecounters.cs Outdated Show resolved Hide resolved
}

[Event((int)EventId.AppContextSwitch, Level = EventLevel.Informational, Keywords = Keywords.AppContext)]
internal void LogAppContextSwitch(string switchName, int value)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@brianrob Would a single event with all of the data be better for something like this?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think one event per switch is good. It makes it easy to consume the data, and I can't imagine we'll have so many that the cost of this event will be painful. If/when we choose to create events were we dump data in bulk, it makes both the event production and event consumption code more complex. On the production side, you have to chunk the data to ensure that you don't run up against size limitations (e.g. ETW has a 64K size limit per event). On the consumption side, you have to be more sophisticated in how you parse the event. There are definitely places were it's valuable - for example BulkType events in the runtime when we log heap snapshots. There are just tons of types, and so emitting each one in its own event is very costly.

@@ -87,6 +103,8 @@ protected override void OnEventCommand(EventCommandEventArgs command)
_ilBytesJittedCounter ??= new PollingCounter("il-bytes-jitted", this, () => System.Runtime.JitInfo.GetCompiledILBytes()) { DisplayName = "IL Bytes Jitted", DisplayUnits = "B" };
_methodsJittedCounter ??= new PollingCounter("methods-jitted-count", this, () => System.Runtime.JitInfo.GetCompiledMethodCount()) { DisplayName = "Number of Methods Jitted" };
_jitTimeCounter ??= new IncrementingPollingCounter("time-in-jit", this, () => System.Runtime.JitInfo.GetCompilationTime().TotalMilliseconds) { DisplayName = "Time spent in JIT", DisplayUnits = "ms", DisplayRateTimeScale = new TimeSpan(0, 0, 1) };

AppContext.LogSwitchValues(this);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is just logging values at a specific point in time (when the event source is enabled) - is that the desire?

I'm thinking of the possibility of scenarios like:

  • Initial values do not have SwitchX set
  • Event source is enabled, logging that SwitchX is not set
  • App sets SwitchX
  • Runtime (or app/library) reads SwitchX and is now configured a certain way

or

  • Initial values do not have SwitchX set
  • Runtime (or app/library) reads SwitchX and is now configured a certain way
  • App sets SwitchX
  • Event source is enabled, logging that SwitchX is set

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think most of those are edge cases. The 90% usage is:

  • Switch set in the entry point
  • Switch set in runtime.config.json (which comes from the csproj)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Definitely agree that those should not be the common case, but wanted to call it out.

  • Switch set in the entry point

I think part of my question is what does this end up looking like when tracing startup (where I'd expect the event source is enabled before the app's entry is is run)? Could it end up being the first scenario I had above?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure, @brianrob @josalem do event sources OnCommand run before Main executes when you attach the event source at startup (which not many do).

@elinor-fung These are great questions BTW 👍🏾

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They definitely can, but not all will. If the EventSource comes into existence before Main is called and it is told to turn on (e.g. EventPipe, ETW, or some EventListener tells it to), then it will trigger before Main. By design, there isn't anything that requires that Main run before events can be emitted. A good example here is FrameworkEventSource which often comes into existence before Main.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To add to this, if you want to ensure that these can be emitted early, make sure that the EventSource that emits them comes into existence early. That may already be the case, but if not, make sure it gets instantiated early, even if it doesn't log any events when instantiated. Just the instantiation will allow it to be enabled immediately (at least by ETW, but shortly by EventPipe once it comes up).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@elinor-fung, just read your first scenario - I think that the right way to handle a race between logging the current state and changes to switches is to dump the set of switches at the beginning of the tracing session as this code does. Additionally, any time a switch is configured, it should be logged immediately. This ensures that if switches change value or get set after the session is created, we see the change. This, combined with instantiating the EventSource early will ensure that we don't miss any data.

@davidfowl
Copy link
Member

@agocke Can you make sure this goes into the 6.x branches as well once we merge?

{
if (eventSource.Name.Equals("Microsoft-Windows-DotNETRuntime"))
if (!ObservedEvents[key])
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe have separate checks and failure messages not having switch logged at all versus having the switch logged with the wrong value?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not opposed, but I'm hoping to take this for RC 1, so I'd prefer to make test improvements in a later iteration

@agocke
Copy link
Member Author

agocke commented Aug 19, 2021

@josalem @brianrob I think this is ready, could you take a last look?

Copy link
Member

@brianrob brianrob left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Also, confirmed that RuntimeEventSource is initialized in ProcessStartupHooks, so you should always get switches logged if requested prior to Main.

@agocke agocke merged commit 235915e into dotnet:main Aug 20, 2021
@agocke agocke deleted the appcontext-telemetry branch August 20, 2021 17:18
@agocke
Copy link
Member Author

agocke commented Aug 20, 2021

/backport to release/6.0-rc1

@github-actions
Copy link
Contributor

Started backporting to release/6.0-rc1: https://github.com/dotnet/runtime/actions/runs/1151503566

@lewing
Copy link
Member

lewing commented Aug 21, 2021

@agocke this appears to have broken:

runtime (Build Android x64 Release AllSubsets_Mono_RuntimeTests)
runtime (Build Browser wasm Release AllSubsets_Mono_RuntimeTests)

08-20 22:09:17.695  8753  8771 D DOTNET  : assembly_preload_hook: System.Runtime.InteropServices  /data/user/0/net.dot.tracing_runtimeeventsource/files
      08-20 22:09:17.756  8753  8771 E mono    : 
      08-20 22:09:17.756  8753  8771 E mono    : Unhandled Exception:
      08-20 22:09:17.756  8753  8771 E mono    : System.Collections.Generic.KeyNotFoundException: The given key 'appContextSwitch' was not present in the dictionary.
      08-20 22:09:17.756  8753  8771 E mono    :    at System.Collections.Generic.Dictionary`2[[System.String, System.Private.CoreLib, Version=6.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e],[System.Boolean, System.Private.CoreLib, Version=6.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e]].get_Item(String key)
      08-20 22:09:17.756  8753  8771 E mono    :    at RuntimeEventListener.Verify()
      08-20 22:09:17.756  8753  8771 E mono    :    at Program.<Main>$(String[] args)
      08-20 22:09:17.756  8753  8771 E DOTNET  : UnhandledException: System.Collections.Generic.KeyNotFoundException The given key 'appContextSwitch' was not present in the dictionary.    at System.Collections.Generic.Dictionary`2[[System.String, System.Private.CoreLib, Version=6.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e],[System.Boolean, System.Private.CoreLib, Version=6.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e]].get_Item(String key)
      08-20 22:09:17.756  8753  8771 E DOTNET  :    at RuntimeEventListener.Verify()
      08-20 22:09:17.756  8753  8771 E DOTNET  :    at Program.<Main>$(String[] args)
      08-20 22:09:17.790  1819  2179 I ActivityManager: Process net.dot.tracing_runtimeeventsource (pid 8753) has died: fore FGS 
      08-20 22:09:17.791  1819  2179 W ActivityManager: Crash of app net.dot.tracing_runtimeeventsource running instrumentation ComponentInfo{net.dot.tracing_runtimeeventsource/net.dot.MonoRunner}
      08-20 22:09:17.791  1819  2179 I ActivityManager: Force stopping net.dot.tracing_runtimeeventsource appid=10114 user=0: finished inst
      08-20 22:09:17.791  1819  6795 W Binder  : Outgoing transactions from this process must be FLAG_ONEWAY
      08-20 22:09:17.791  1819  6795 W Binder  : java.lang.Throwable
      08-20 22:09:17.791  1819  6795 W Binder  : 	at android.os.BinderProxy.transact(BinderProxy.java:480)
      08-20 22:09:17.791  1819  6795 W Binder  : 	at android.app.IInstrumentationWatcher$Stub$Proxy.instrumentationFinished(IInstrumentationWatcher.java:205)
      08-20 22:09:17.791  1819  6795 W Binder  : 	at com.android.server.am.InstrumentationReporter$MyThread.run(InstrumentationReporter.java:86)
      08-20 22:09:17.793  8743  8743 D AndroidRuntime: Shutting down VM
      08-20 22:09:17.803  1549  1549 I Zygote  : Process 8753 exited cleanly (1)
      08-20 22:09:17.803  8743  8774 E app_process: Thread attaching to non-existent runtime: Binder:8743_3
      08-20 22:09:17.803  8743  8774 I AndroidRuntime: NOTE: attach of thread 'Binder:8743_3' failed
      08-20 22:09:17.830  1819  1846 I libprocessgroup: Successfully killed process cgroup uid 10114 pid 8753 in 39ms
      Expected: True
      Actual:   False

Seems to have started here.

@agocke
Copy link
Member Author

agocke commented Aug 22, 2021

Woops, this shouldn't be running on those platforms. I thought I had successfully disabled it but apparently not. I'll fix.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Feature Request]: Diagnostics for App Context switches
8 participants