Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Namespaces changed between 2.4.0 and 2.5.0 #727

Closed
xt0rted opened this issue Mar 8, 2018 · 35 comments
Closed

Namespaces changed between 2.4.0 and 2.5.0 #727

xt0rted opened this issue Mar 8, 2018 · 35 comments
Assignees
Labels
Milestone

Comments

@xt0rted
Copy link

xt0rted commented Mar 8, 2018

I reported this in the webjobs sdk repo (Azure/azure-webjobs-sdk#1619), but I wanted to bring it up here too. Between versions 2.4.0 and 2.5.0 the SamplingPercentageEstimatorSettings class, which the webjobs AI logger uses, moved to a new namespace (I'm sure there's many other classes that did as well).

Because of this change if you try to use version 2.1.0 of the webjobs sdk with AI 2.5.0+ it'll compile, but crash when the AI logger starts up.

@TimothyMothra
Copy link
Member

Sorry for the inconvenience.
There had been some shuffling around, but it wasn't supposed to affect anything public.
Give me a couple days to look into this.
Thank you for linking the specific files, that will help my investigation.

@pharring
Copy link
Member

pharring commented Mar 8, 2018

I thought we had the public API checker analyzer to avoid this sort of thing. Or perhaps it wasn't added until later.

@TimothyMothra
Copy link
Member

@pharring we do have a checker and I think the expectation is that this should have been caught.

@SergeyKanzhelev
Copy link
Contributor

SergeyKanzhelev commented Mar 8, 2018 via email

@julKali
Copy link

julKali commented Apr 1, 2018

Hi, do you know any date when this will be fixed?

@TimothyMothra
Copy link
Member

Sorry for the delay, finally getting back to this...

Looks like this changed happened on 2017-11-20 as a part of PR #656.

SamplingPercentageEstimatorSettings's namespace was changed

  • from: Microsoft.ApplicationInsights.WindowsServer.Channel.Implementation.
  • to: Microsoft.ApplicationInsights.WindowsServer.TelemetryChannel.Implementation

(Channel -> TelemetryChannel)

This class is in the "TelemetryChannel" directory, so this change isn't incorrect... but because it's a public class it has affected some customers (see link above).


@SergeyKanzhelev, would it be better to revert this change, or should we reach out to webjobs and ask them to update? Who would we contact to discuss with them?

@SergeyKanzhelev
Copy link
Contributor

Do we have stable version of SDK with those classes moved already? If so - we would need to add classes into older SDK which are basically a proxies to the ones in new namespaces. If it was released in -beta it would be correct to revert the change back. Just follow the semver rules

@xt0rted
Copy link
Author

xt0rted commented Apr 16, 2018

Based on what I found while tracking this down the change happened prior to releasing AI 2.5.0 so the webjobs team needs to update their package, release a new major or minor version with the AI dependency set to >= 2.5.0, and then do a patch release of the effected version that locks the AI version to < 2.5.0.

@TimothyMothra
Copy link
Member

@xt0rted We're going to take ownership and fix it on our end.
We're planning to release 2.6.0 in the first half of May so this makes sense to fix with that.

@xt0rted
Copy link
Author

xt0rted commented May 10, 2018

@MS-TimothyMothra I just updated to 2.6.1 of app insights and with both 2.1.0 and 2.2.0 of the webjobs sdk it crashes on startup. Did the changes in #775 not make it into this release?

This is my error:

System.TypeLoadException
  HResult=0x80131522
  Message=Could not load type 'Microsoft.ApplicationInsights.WindowsServer.Channel.Implementation.SamplingPercentageEstimatorSettings' from assembly 'Microsoft.AI.ServerTelemetryChannel, Version=2.6.1.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35'.
  Source=Microsoft.Azure.WebJobs.Logging.ApplicationInsights
  StackTrace:
   at Microsoft.Extensions.Logging.ApplicationInsightsLoggerExtensions.AddApplicationInsights(ILoggerFactory loggerFactory, String instrumentationKey, Func`3 filter)
   at EquineExchange.Tasks.Emailer.Program.Main(String[] args) in C:\dev\equineexchange\EEI\src\EquineExchange.Tasks.Emailer\Program.cs:line 23

@xt0rted
Copy link
Author

xt0rted commented May 10, 2018

Looks like that PR along with others merged between April 13th and May 2nd exist in the develop branch but not master where 2.6.1 was released from.

@cijothomas
Copy link
Contributor

@xt0rted We'll do a patch release 2.6.2 to include this. Thanks for taking time to report/investigate the issue!

@TimothyMothra
Copy link
Member

@xt0rted I'm sorry this got left out of the previous release.
Thank you for your patience and persistence!

I'm starting our release process now, and I would expect to see 2.6.2 on NuGet either Friday or Monday.

@TimothyMothra
Copy link
Member

Hi @xt0rted, I just published 2.6.2 to NuGet.
Give it about an hour for NuGet to index it.

If you have any issues, feel free to re-open this ticket.
Sorry for all the delays, but thank you for sticking with us and bringing this to our attention!

@xt0rted
Copy link
Author

xt0rted commented May 17, 2018

@MS-TimothyMothra, @paulbatum this update didn't fix the issue, just introduced a different error.

Calling it either way results in an exception.

loggerFactory.AddApplicationInsights(instrumentationKey, null);

loggerFactory.AddApplicationInsights(new DefaultTelemetryClientFactory(instrumentationKey, new SamplingPercentageEstimatorSettings(), null));
Method not found: 'Void Microsoft.ApplicationInsights.WindowsServer.TelemetryChannel.AdaptiveSamplingTelemetryProcessor..ctor(Microsoft.ApplicationInsights.WindowsServer.Channel.Implementation.SamplingPercentageEstimatorSettings, Microsoft.ApplicationInsights.WindowsServer.Channel.Implementation.AdaptiveSamplingPercentageEvaluatedCallback, Microsoft.ApplicationInsights.Extensibility.ITelemetryProcessor)'.
   at Microsoft.Azure.WebJobs.Logging.ApplicationInsights.DefaultTelemetryClientFactory.<>c__DisplayClass10_0.<InitializeConfiguration>b__2(ITelemetryProcessor next)
   at Microsoft.ApplicationInsights.Extensibility.Implementation.TelemetryProcessorChainBuilder.Build()
   at Microsoft.Azure.WebJobs.Logging.ApplicationInsights.DefaultTelemetryClientFactory.InitializeConfiguration()
   at Microsoft.Azure.WebJobs.Logging.ApplicationInsights.DefaultTelemetryClientFactory.Create()
   at Microsoft.Azure.WebJobs.Logging.ApplicationInsights.ApplicationInsightsLoggerProvider..ctor(ITelemetryClientFactory clientFactory)
   at Microsoft.Extensions.Logging.ApplicationInsightsLoggerExtensions.AddApplicationInsights(ILoggerFactory loggerFactory, ITelemetryClientFactory telemetryClientFactory)
   at EquineExchange.Tasks.Emailer.Program.Main(String[] args)

I think the only way to fix this at this point is for the webjobs sdk to bump the app insights dependency to 2.6.2 so it's compiled against the new version.

I can't try wiring this up manually either because ApplicationInsightsLogger and ApplicationInsightsLoggerProvider in the webjobs sdk are marked as internal 😢

@paulbatum
Copy link

@xt0rted I am not sure I follow your reasoning that this needs to be fixed on the webjobs side. If a method not found error is occurring that means this release still has breaking changes and failed to follow semver.

@paulbatum
Copy link

Hmm, I guess the other possibility is that we are getting into a state where two versions of app insights are loaded side by side and that is somehow contributing to the above error. Functions has a binding redirect for app insights that applies up to 2.4.0:

https://github.com/Azure/azure-functions-host/blob/b4313d45ebf47d9a94cc8a6243204dbeb6240d42/src/WebJobs.Script.WebHost/Web.config#L199

@xt0rted
Copy link
Author

xt0rted commented May 17, 2018

I'm using the SDK style projects so there's no binding redirects in my app.config.

Test 1

  • Pulled down the v2 branch of the webjobs sdk
  • Updated the AI references to 2.6.2
  • Made zero code changes
  • Built the project

In my project I commented out the webjobs nuget references and referenced the dlls directly:

<ItemGroup>
    <Reference Include="Microsoft.Azure.WebJobs">
      <HintPath>..\..\..\..\GitHub\azure-webjobs-sdk\src\Microsoft.Azure.WebJobs.Logging.ApplicationInsights\bin\Debug\Microsoft.Azure.WebJobs.dll</HintPath>
    </Reference>
    <Reference Include="Microsoft.Azure.WebJobs.Host">
      <HintPath>..\..\..\..\GitHub\azure-webjobs-sdk\src\Microsoft.Azure.WebJobs.Logging.ApplicationInsights\bin\Debug\Microsoft.Azure.WebJobs.Host.dll</HintPath>
    </Reference>
    <Reference Include="Microsoft.Azure.WebJobs.Logging.ApplicationInsights">
      <HintPath>..\..\..\..\GitHub\azure-webjobs-sdk\src\Microsoft.Azure.WebJobs.Logging.ApplicationInsights\bin\Debug\Microsoft.Azure.WebJobs.Logging.ApplicationInsights.dll</HintPath>
    </Reference>
</ItemGroup>

I then loaded up my webjob and it ran as expected with no crashes.

Test 2

I switched back to the 2.6.2 nuget packages and modified my code as follows:

loggerFactory.AddApplicationInsights(
    new DefaultTelemetryClientFactory(
        instrumentationKey,
        samplingSettings: null,
        filter: null));

I then loaded up my webjob and it ran as expected with no crashes.

Test 3

I reverted my code back to the original setup:

loggerFactory.AddApplicationInsights(instrumentationKey, null);

And then added manual binding redirects to my app.config

<runtime>
    <assemblyBinding xmlns="urn:schemas-microsoft-com:asm.v1">
        <dependentAssembly>
            <assemblyIdentity name="Microsoft.ApplicationInsights" publicKeyToken="31bf3856ad364e35" culture="neutral" />
            <bindingRedirect oldVersion="0.0.0.0-2.6.2.0" newVersion="2.6.2.0" />
        </dependentAssembly>
        <dependentAssembly>
            <assemblyIdentity name="Microsoft.AI.ServerTelemetryChannel" publicKeyToken="31bf3856ad364e35" culture="neutral" />
            <bindingRedirect oldVersion="0.0.0.0-2.6.2.0" newVersion="2.6.2.0" />
        </dependentAssembly>
    </assemblyBinding>
</runtime>

I then loaded up my webjob and received the same error as without the binding redirects.

Results

Using v2.6.2 of AI you can use the AI logger, but you have to use the AddApplicationInsights(ITelemetryClientFactory) version of the extension method and make sure to pass null instead of new SamplingPercentageEstimatorSettings() to the DefaultTelemetryClientFactory.

Here's the live metrics in the portal during tests 1 and 2.

image

The underlying fix to 2.6.x was to re-add a SamplingPercentageEstimatorSettings class to the original namespace and inherit from the moved copy of the class. It's my understanding that this is a breaking change which requires a recompile along the lines of if you adjusted a method's parameter to be optional. On the surface it looks ok but the clr isn't going to handle dropping in a new version of the DLL it without a recompile.

It's possible I'm reading all of this wrong and there's something else going on, but updating the AI reference and recompiling does fix the issue for me.

Since 2.5.1 fixed a critical bug, and until now I've been unable to update to it, I'm going to try updating my webjobs to use the setup that passes null for the sampling settings. That seems to be an acceptable workaround for the time being.

@TimothyMothra
Copy link
Member

@paulbatum This is totally our fault, we should have never changed this class's namespace.
I tried fixing it but obviously, that didn't account for this scenario.

Are you willing to update your reference to our 2.6.2?
If not, I need to revert this change entirely and quickly release a 2.6.3.

@TimothyMothra TimothyMothra reopened this May 17, 2018
@paulbatum
Copy link

@MS-TimothyMothra Ahh I was confused, I thought 2.6.2 reverted the change, now I understand. Thanks for helping with this.

Updating the reference for us is technically possible but somewhat scary because when we do this, all of our customers are automatically rolled forward and sometimes even minor updates can have subtle changes that impact applications. So we generally only do this when our hand is forced (such as a security update).

Fully reverting the change is definitely more appealing from my perspective. I would recommend you consider unlisting on NuGet all the intermediate packages that contained the semver break.

@TimothyMothra
Copy link
Member

Yes, please don't upgrade to 2.6.2. After further discussion that would make this worse.
I'm writing a full description and plan of action...

@TimothyMothra
Copy link
Member

Okay, this is the landscape we have created:

  • 2.4 - Channel.SamplingPercentageEstimatorSettings
  • 2.5 - TelemetryChannel.SamplingPercentageEstimatorSettings
  • 2.6.2 (released today)
    • TC.SPES (base class)
    • C.SPES (derived class)
  • 2.6.3 (next hotfix release)
    • C.SPES
    • TC.SPES (duplicate class marked obsolete)

What happened:

The namespace for SamplingPercentageEstimatorSettings was accidentially renamed during some code maintinance. We've since put extra safeguards in place to prevent this from happening.

The fix for 2.6.2 was expected to be a drop in solution, both classes could co-exist. Unfortunately we didn't plan for scenarios where two versions of the SDK could co-exist.


What we are doing:

We will revert all namespaces back to the state of things in 2.4.
However we cannot simply remove the new class because it was a part of the Public API - to solve this we will leave it as a duplicate class (no inheritance) and mark it as Obsolete to gently move customers back to the original class.


For Consideration:

2.4 and 2.5 cannot co-exist because the types (namespaces) are different.
2.4 and 2.6.2 cannot co-exist, because AdaptiveSamplingTelemetryProcessor will expect the other type.
2.4 and 2.6.3 can co-exist because types will be identical.
(2.5-2.6.2) and 2.6.3 cannot co-exist, because AdaptiveSamplingTelemetryProcessor will expect the other type.

@TimothyMothra
Copy link
Member

@paulbatum Does this plan sound good to you?

@xt0rted Would you be available to test a private copy of 2.6.3 before I publish it to NuGet?
I would have this available as early as Friday morning.

@xt0rted
Copy link
Author

xt0rted commented May 17, 2018

yea, whenever it's ready just let me know

@paulbatum
Copy link

This sounds good. Tagging @brettsam as an FYI. I'll rely on the results reported by @xt0rted because the idea is that this is a no-op on the webjobs/functions side.

@TimothyMothra
Copy link
Member

@xt0rted Good Morning, I have the private bits ready for testing, they're on our private package feed (install command below).

I recommend you do not use this package feed for your production app, we frequently post develop builds here for testing!


If yours is a web project, you can just install this and it will install all dependencies.

  • Microsoft.ApplicationInsights.Web
    Install-Package Microsoft.ApplicationInsights.Web -Version 2.6.3 -Source https://www.myget.org/F/applicationinsights/api/v3/index.json

Otherwise you'll want to install these individually

  • Microsoft.ApplicationInsights
    Install-Package Microsoft.ApplicationInsights -Version 2.6.3 -Source https://www.myget.org/F/applicationinsights/api/v3/index.json
  • Microsoft.ApplicationInsights.WindowsServer.TelemetryChannel
    Install-Package Microsoft.ApplicationInsights.WindowsServer.TelemetryChannel -Version 2.6.3 -Source https://www.myget.org/F/applicationinsights/api/v3/index.json

Once I hear back from you I'll continue with our release and should get these published to NuGet early next week!

@xt0rted
Copy link
Author

xt0rted commented May 18, 2018

@MS-TimothyMothra I just did two tests:

  1. Updated one of my webjob projects to 2.6.3 and that resulted in the same Method not found exception that I got with 2.6.2.
  2. Created a new .net core console app and retargeted it to .net 4.7.1 like my other projects are and that resulted in the same Method not found exception that I got with 2.6.2.

Comparing the two versions of AdaptiveSamplingTelemetryProcessor (2.4.0 and 2.6.3) does show some slight differences, but my knowledge of the clr is limited so I can't say for sure if that's causing this.

The only change in SamplingPercentageEstimatorSettings between the two versions is a spelling fix in one of the comments so that shouldn't be enough to cause this, at least I don't think it should.

Diff of AdaptiveSamplingTelemetryProcessor with 2.4.0 on the left and 2.6.3 on the right.

image

Diff of SamplingPercentageEstimatorSettings with 2.4.0 on the left and 2.6.3 on the right.

image

I'm going to clone this repo and play with the class a bit to see if I can get it to run without touching the webjobs sdk code. Just to confirm 2.6.3 is targeting the master branch correct?

@xt0rted
Copy link
Author

xt0rted commented May 18, 2018

@MS-TimothyMothra I think I figured it out. The method not found exception is pointing to the AdaptiveSamplingTelemetryProcessor class which is directly used by the webjobs sdk at DefaultTelemetryClientFactory.cs#L93.

That class uses the AdaptiveSamplingPercentageEvaluatedCallback delegate located in SamplingPercentageEstimatorTelemetryProcessor.cs which also had its namespace change between 2.4.0 to 2.6.3.

Moving just this delegate to a new file using the Microsoft.ApplicationInsights.WindowsServer.Channel.Implementation namespace looks to have fixed the issue.

I didn't do an expansive test, but nothing's crashing yet. I'm also not sure if there's other classes that may need to change back to the old namespace, but this one delegate does for sure.

@TimothyMothra
Copy link
Member

@xt0rted 2.6.4 has been pushed to MyGet

  • Microsoft.ApplicationInsights
    Install-Package Microsoft.ApplicationInsights -Version 2.6.4 -Source https://www.myget.org/F/applicationinsights/api/v3/index.json
  • Microsoft.ApplicationInsights.WindowsServer.TelemetryChannel
    Install-Package Microsoft.ApplicationInsights.WindowsServer.TelemetryChannel -Version 2.6.4 -Source https://www.myget.org/F/applicationinsights/api/v3/index.json

If you need the Web portion, I'm prepping that now and will release before the EOD.


Explaining what happened with AdaptiveSamplingPercentageEvaluatedCallback.
It was included in that same batch of namespace renamings. I missed it because I was only looking for "classes" when I was working on this fix.
This morning I used the object browser and inspected everything in the TelemetryChannel namespace.
I'm more confident that we got it all this time, and of course, testing will confirm :)

@xt0rted
Copy link
Author

xt0rted commented May 21, 2018

@MS-TimothyMothra I'm referencing the Microsoft.ApplicationInsights.WindowsServer package so I just switched to the Microsoft.ApplicationInsights package and it worked 🎉

I've only tested one webjob real quick, but there's no crashes and I'm seeing stats in the live metrics view.

@TimothyMothra
Copy link
Member

@xt0rted that is great news for a Monday! tears of joy
I'll continue with the release process. We should have this published to NuGet by Thurs or Friday.

I can't thank you enough for your help!

@xt0rted
Copy link
Author

xt0rted commented May 22, 2018

@MS-TimothyMothra happy to help and glad to see it's finally resolved 🤞

The 2.6.4 packages on myget, are they the same ones that'll be going up to nuget? If so I'll update my project tomorrow with them and do a deploy.

@TimothyMothra
Copy link
Member

Answering your question: Yes, the bits on MyGet now are the same that will get published to NuGet (assuming they pass acceptance testing).

Out of an abundance of caution, I ask customers to please wait until we publish to NuGet before deploying to your production environment.

@TimothyMothra
Copy link
Member

@xt0rted I just published 2.6.4 to NuGet. It should be available within an hour. :)

@xt0rted
Copy link
Author

xt0rted commented May 23, 2018

Just tested all of my webjobs and there were no issues, so it's going out to production now.

TimothyMothra pushed a commit that referenced this issue Oct 25, 2019
Dev to master for 2.4.0-beta4
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

7 participants