Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove Optimal #124

Open
wants to merge 4 commits into
base: master
Choose a base branch
from

Conversation

kamilkowalski
Copy link
Contributor

Overview

Fixes #100 by removing opts validation entirely, which is my preferred solution to the Optimal performance problem. Inspecting other tracing libraries for dynamically typed languages, I didn't see any performing validation of options - it always carries a performance penalty and that is something we should avoid in a tracing library.

That said, I'm opening this as a draft - it's quite a change and it is breaking. I'm open to discussing other options.

Why not use nimble_options?

An alternative to this approach would be to use something like dashbitco/nimble_options as suggested by @zachdaniel, but I still think that's a performance penalty added to every trace I'd like us to avoid.

If Spandex's interface was more conservative and didn't allow for modifying every option of the tracer when starting a trace/span, then we could allow passing most options only on tracer initialization, validate them on application startup, then skip validation of the limited set of options passed when creating a trace/span. That's an option I think could work.

Why not make optimal optional?

Another option mentioned by @GregMefford was to make optimal optional, and if it's detected - enable opts validation. The reason for this would be to help developers during test/dev to configure everything appropriately, then remove optimal to ensure maximum performance.

That's a practice I haven't seen previously, and I don't think it would be widely used - after all, one could as well simply test the deployment and its integration with Datadog. At our company we don't even enable Spandex in dev/test - we only do that in a test deployment.

@zachdaniel
Copy link
Member

My only hesitation about fully removing option validation is how many people already have trouble knowing if the values they provide are valid, so removing more validation would likely exacerbate that issue.

However, I think that you're right and that we should just remove this. What should be added, I think, is a testing api adapter for Datadog that runs synchronously and validates traces, raising errors if it would send something invalid (e.g no service, invalid metadata) is provided. That has the added benefit of not having to disable the tracer in test, but instead just configuring your adapter. Seems like it would be the best of both worlds.

@zachdaniel
Copy link
Member

Not suggesting you should add that as a part of this effort, just that we should do that eventually to solve the general problem.

@kamilkowalski
Copy link
Contributor Author

Such a neat idea with the adapter! 👏

@kamilkowalski
Copy link
Contributor Author

If we're all for it, I'll test this change with one of our services and open for review once it's confirmed working.

List of changes:
- Sort deps
- Use common source url
- Source reference by version
- Fix typos
- Add changelog to html doc
- Changelog before readme
- Add logo to nav bar
- Remove sourcelevel 404 badge
- Badges and more badges!
@kamilkowalski
Copy link
Contributor Author

@zachdaniel thanks for your patience - I didn't manage to test the change before going on holiday, but I've plugged it in today on a test environment at Fresha and everything seems to be working fine - no regressions, and I see traces being reported to Datadog successfully.

That said, I've encountered one problem - spandex_phoenix and spandex_datadog both depend on optimal without requiring it in deps - so Optimal was available only as a transitive dependency of spandex. Should we release this change to Spandex as a minor version bump, it will break spandex_phoenix and spandex_datadog - it's what happened to me before I switched both libraries to a version that didn't use Optimal.

Anyway, I'm opening this as a regular pull request since I've verified it to be working - what to do with the change is another matter that I'd appreciate your opinion on.

@kamilkowalski kamilkowalski marked this pull request as ready for review January 18, 2021 14:01
@zachdaniel
Copy link
Member

@GregMefford should we wait for a major version bump for this then?

@GregMefford
Copy link
Member

I suppose we could just release a major version whenever we want to. I need to find some time to spend thinking about and trying out the code in this PR to understand what the impact would be for an actual user... but if it works exactly the same except it throws less errors, that doesn't feel breaking to me. If you just have errors getting thrown and now they get thrown in a different way because we stop validating errors... their code already wouldn't have worked so how much effort do we need to invest to make it continue to break in exactly the same way. 🤔

@zachdaniel
Copy link
Member

The main issue is that they would update spandex and nothing would tell them that they need to update spandex_datadog. But they do, because by updating to spandex they no longer have anything in their mix.exs file that depends on optimal, but spandex_datadog still calls to Optimal

@GregMefford
Copy link
Member

Oh that's a good call-out I wasn't aware of. spandex_datadog should directly depend on optimal if it does in fact depend on it...

@GregMefford
Copy link
Member

We could release a spandex_datadog version that either removes optimal or declares the dependency though, and then release a spandex version that depends on at least that version of spandex_datadog.

@zachdaniel
Copy link
Member

Hmm....can you say "if you have this dependency, it must be more than x"? I guess thats just {:spandex_data, ">= x.x.x", optional: true} right?

@zachdaniel
Copy link
Member

If so then I'm down with that plan 👍

@GregMefford
Copy link
Member

GregMefford commented Jan 28, 2021

Yes, that's how optional deps work. 👍 So that's one way to solve it.
It's also possible to say {:spandex_datadog, "~> 1.2 or ~> 2.0"} if you want to allow either 2.x or 1.y where y must be at least 2, so 1.1.0 would not be allowed.

@kamilkowalski
Copy link
Contributor Author

I'm wondering whether that's the correct approach because adding spandex_datadog to spandex's deps would look like a circular dependency - wouldn't dependency resolution complain about it? Also, spandex_datadog wouldn't be used anywhere in spandex so it's not really a dependency - the usual case for optional deps that I know of is when a library extends its functionality if a given dependency is present, which isn't the case here. Do you know of a precedent where optional deps are used in the way you're suggesting?

Copy link
Member

@GregMefford GregMefford left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I took a closer look through this PR and I'm still actually thinking this is not that big of a deal. The only breaking change I can see is if people were, for some reason, doing something like:

case Tracer.update_span(opts_that_might_not_be_valid) do
  {:error, ["some Optimal error"]} -> Logger.error("Tried to send invalid opts")
  {:ok, _span} -> :ok
end

I don't honestly know why someone would do that, rather than explicitly sending the opts that they wanted, though, because in the case of a validation error, you'd just have to do the same data-processing that you should have done prior to making the call in the first place.

The breaking change is that it will now silently ignore bad data rather than returning validation errors, in exchange for doing the processing faster. We want to make sure that this doesn't cause something else to crash further down the line due to malformed data, but I think we can do that by just explicitly handling the things we do support and ignoring the rest. I suspect that, most of the time, when people are getting these validation errors today, they aren't handling them anyway, like this:

Tracer.update_span(opts_that_might_not_be_valid)

which will simply do nothing today in the case where there is a validation error.

I don't see why spandex would need to even optionally depend on spandex_datadog. Are you saying that removing the protection from spandex means that we need to rely on protection in spandex_datadog so that it doesn't crash when trying to encode the request to the DD API?

@zachdaniel
Copy link
Member

zachdaniel commented Feb 3, 2021

If you try using it yourself you'd see the problem. spandex_datadog has a call to Optimal but doesn't include Optimal in its deps. So if you just upgrade spandex which removes optimal, your application would not compile. (You'd also probably have to mix deps.unlock) something

@GregMefford
Copy link
Member

Oh ok I'm tracking then. That's a bug in spandex_datadog though, so I agree that we should avoid a circular dependency to solve that problem.

@zachdaniel
Copy link
Member

The only issue is that, without the optional dependency on spandex_datadog, nothing will really tell anyone that they need to upgrade, it will just fail to compile and then they'll have to figure it out from there. We could potentially publish a new version of spandex_datadog that doesn't use optimal (or that does and adds it to its deps for now) and then retire the current version. https://hexdocs.pm/hex/Mix.Tasks.Hex.Retire.html

That should produce a warning that people can read and we can add a message so we could tell them what to do.

@kamilkowalski
Copy link
Contributor Author

@zachdaniel retiring might be a good compromise. In any case, we first need to ship spandex_datadog and spandex_phoenix without Optimal - I'll get to those PRs first (I have the code ready too).

@hkrutzer
Copy link

How is this PR doing? 🙂

@dynaum
Copy link

dynaum commented Dec 31, 2021

@kamilkowalski would be amazing to add the service_version here too! 😍

@koudelka
Copy link

i'd also love to see this merged ;_;

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Performance problems with Optimal dependency
7 participants