Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Resource sdk Implementation #502

Merged
merged 37 commits into from
Jan 22, 2021
Merged

Resource sdk Implementation #502

merged 37 commits into from
Jan 22, 2021

Conversation

lalitb
Copy link
Member

@lalitb lalitb commented Jan 5, 2021

Initial release of Resource SDK .

Supported -

  • Create resource
  • Merge resource
  • Empty resource creation
  • Base resource detection package.
  • Specify resource information from env variable ( OTEL_RESOURCE_ATTRIBUTES )
  • Integrating with TracerProvider and Traces

To be added as separate PR:

  • Adding in SpanData and propagating to exporters.
  • Resource detection logic from multiple resources, and merging all.

#334

@lalitb lalitb requested a review from a team January 5, 2021 16:30
@codecov
Copy link

codecov bot commented Jan 5, 2021

Codecov Report

Merging #502 (8cd247e) into master (ad9e57f) will decrease coverage by 0.02%.
The diff coverage is 94.11%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #502      +/-   ##
==========================================
- Coverage   94.45%   94.42%   -0.03%     
==========================================
  Files         189      194       +5     
  Lines        8388     8509     +121     
==========================================
+ Hits         7923     8035     +112     
- Misses        465      474       +9     
Impacted Files Coverage Δ
sdk/include/opentelemetry/sdk/trace/tracer.h 100.00% <ø> (ø)
.../include/opentelemetry/sdk/trace/tracer_provider.h 100.00% <ø> (ø)
sdk/src/trace/span.h 100.00% <ø> (ø)
sdk/src/trace/tracer_provider.cc 65.00% <50.00%> (-5.59%) ⬇️
sdk/src/resource/resource.cc 88.00% <88.00%> (ø)
sdk/test/resource/resource_test.cc 95.65% <95.65%> (ø)
ext/test/zpages/tracez_data_aggregator_test.cc 97.35% <100.00%> (+<0.01%) ⬆️
ext/test/zpages/tracez_processor_test.cc 98.71% <100.00%> (+<0.01%) ⬆️
sdk/include/opentelemetry/sdk/resource/resource.h 100.00% <100.00%> (ø)
...ude/opentelemetry/sdk/resource/resource_detector.h 100.00% <100.00%> (ø)
... and 12 more


std::unique_ptr<Resource> Merge(const Resource &other);

static std::unique_ptr<Resource> Create(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we have both Create methods and constructors?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see, probably because of creating unique pointers. Should we make the constructors private then, to prevent the creation of unmanaged resources?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is challenge to do this in C++. Ideally, I would like to make constructor as private, and make OTELResourceDetector as friend class of Resource, but we actually need all derivations of ResourceDetector have access to this constructor.
opentelementry-java does it using AutoValue annotation, and opentelemetry-dotnet does it using internal keyword.

Copy link
Member Author

@lalitb lalitb Jan 12, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok. Please ignore above comment. I have now made constructor as protected ( so that it can be used in test cases ), and made OTELResourceDetector as friend class. It seems we don't need to invoke Resource constructor in any other detector, so we should be fine.

Comment on lines 23 to 36
Resource(const opentelemetry::common::KeyValueIterable &attributes) noexcept;

Resource(const opentelemetry::sdk::trace::AttributeMap &attributes) noexcept;

template <class T,
nostd::enable_if_t<common::detail::is_key_value_iterable<T>::value> * = nullptr>
Resource(const T &attributes) noexcept : Resource(common::KeyValueIterableView<T>(attributes))
{}

Resource(const std::initializer_list<std::pair<nostd::string_view, common::AttributeValue>>
attributes) noexcept
: Resource(nostd::span<const std::pair<nostd::string_view, common::AttributeValue>>{
attributes.begin(), attributes.end()})
{}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's your opinion about replacing those with a single:

Resource(const std::map<std::string, opentelemetry::sdk::trace::SpanDataAttributeValue> attributes) noexcept;

As resources are an SDK only concept, we don't need to worry about ABI compatibility here.

Also, we should move the attribute_utils.h (containing the AttributeMap) from trace to common. Resources will be used by metrics as well.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point,
Using std::unordered_map<std::string, opentelemetry::sdk::common::OwnedAttributeValue directly, and getting rid of AttributeMap.

Comment on lines 59 to 69
opentelemetry::sdk::trace::AttributeMap resource_attributes;
std::istringstream iss(attributes);
std::string token;
while (std::getline(iss, token, ','))
{
size_t pos = token.find('=');
std::string key = token.substr(0, pos);
std::string value = token.substr(pos + 1);
resource_attributes.SetAttribute(key, value);
}
return std::unique_ptr<Resource>(new Resource(resource_attributes));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would be better to have this logic in the OTELResourceDetector, as it's related to the format of the environment variable.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correct. Have changed this accordingly now.

return default_resource.Merge(*(OTELResourceDetector().Detect()));
}

std::unique_ptr<Resource> Resource::CreateEmpty()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think when we start to use resources, we have to make them shared pointers. Resources are supposed to be attached to every span. We don't want to have unique pointers there, but share a common resource with a shared pointer.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right. Have changed std::unique_ptr to std::shared_ptr_ptr now.

Copy link
Member Author

@lalitb lalitb Jan 18, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok. based on suggestion from @jsuereth , Resources::Merge() and Resources::Create() are modified to return stack allocated Resource by value. And this resource is then owned by sdk::tracer_provider and passed as reference to span. Please refer to examples/simple/main.cc in this PR on how the usage would be, and let me know if any concerns.


std::unique_ptr<Resource> Resource::CreateEmpty()
{
return Create({});
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. Using static local for GetDefault() and GetEmpty() now.

class ResourceDetector
{
public:
virtual std::unique_ptr<opentelemetry::sdk::resource::Resource> Detect() = 0;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason to return a unique_ptr over just using the stack here?

Resource is non-virtual class, so it might be a simple API to just return it by-value.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the motivation behind that is to avoid excessive copying of a resource. A resource might be appended to each span. Having a shared_ptr (I think a unique pointer will not do) enables us to avoid copying the resource and its attributes.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Based on suggestion from @jsuereth , Resources::Merge() and Resources::Create() are modified to return stack allocated Resource by value. And this resource is then owned by sdk::tracer_provider and passed as reference to span. Please refer to examples/simple/main.cc in this PR on how the usage would be, and let me know if any concerns.


const opentelemetry::sdk::trace::AttributeMap &GetAttributes() const noexcept;

std::unique_ptr<Resource> Merge(const Resource &other);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NIt: Here and elsewhere in this file. Can we add method documentation about the expectations of calling this?

E.g. for Merge I expect attributes already defined to not be overridden by other's attributes, but want to confirm this is expected.

For this review, yes I'm looking it up in the spec, but we should document it here for other maintainers of C++.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure. Will add the method documentation.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have added the documentation now.


const ResourceAttributes &GetAttributes() const noexcept;

std::shared_ptr<Resource> Merge(const Resource &other);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not just return a stack-allocated resource here? Wouldn't that be a simpler api?

Similar for Create.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As it is possible to share resources across multiple spans, it would be better to use shared_ptr and avoid un-necessary copying.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This method doesn't need to determine how resources are shared.

A few points:

I think it's reasonable for this method to return on the Stack. We then have our MetricProvider + TraceProvider have a Resource attribute they own, and Trace/Metrics generated from those provides can use Pointers or references to the resource associated with a provider.

Resources should really only be a startup concern and not something we expect users to pass around. I'd rather see a Resource&& argument on *Provider classes just to make ownership clear. This is similar to other APIs.

References:

Copy link
Member Author

@lalitb lalitb Jan 15, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree here. Resource need to be associated with TracerProvider at sdk level, and we can actually pass the reference of same to traces generated from the provider. And in that case, we don't need to manage smart pointers. Let me add other half in this PR, and it would be easier to comprehend.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, Resources::Merge() and Resources::Create() are modified to return stack allocated Resource by value. And this resource is then owned by sdk::tracer_provider and passed as reference to span. Please refer to examples/simple/main.cc in this PR on how the usage would be, and let me know if any issues now.

class ResourceDetector
{
public:
virtual std::shared_ptr<opentelemetry::sdk::resource::Resource> Detect() = 0;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same nit: Is there a reason to return a shared_ptr here? AFAIK Resource detection happens once and generally isn't passing pointers between different memory owners.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As mentioned in earlier comment - As it is possible to share resources across multiple spans, it would be better to use shared_ptr and avoid un-necessary copying.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd like to avoid the counting of shared_ptr references if we can just use the attached TraceProvider.

I.e. shared_ptr has a cost, and I think we don't need it in this scenario. If we still want to use shared_ptr within the SDK to associate resource from *Provider to Span, that's fine but a different PR.

I think without seeing the other half of the change it's hard to fully comment on merits of shared_ptr.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I did revisit again, and probably we mayn't need smart pointer here ( including unique_ptr). Let me add other half of implementation in this PR, so that it would be easy to comprehend and review.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changes are done as advised ( refer above comments ).

{
Resource tmp_resource(attributes);
auto merged_resource = tmp_resource.Merge(default_resource);
return merged_resource->Merge(*(OTELResourceDetector().Detect()));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit:

Would it make sense to have both the OTEL resource + default resource be statics within this function?

{
  static auto default_resource = Resource::GetDefault();
  static auto otel_env_resource = OTELResourceDetector().Detect();
  ...
}

This would avoid re-running the OTEL resource detection for every created resource.
  

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point, I have changed it accordingly.

  static auto otel_resource = OTELResourceDetector().Detect();
  auto default_resource     = Resource::GetDefault();

and

Resource &Resource::GetDefault()
{
  static Resource default_resource({{TELEMETRY_SDK_LANGUAGE, "cpp"},
                                    {TELEMETRY_SDK_NAME, "opentelemetry"},
                                    {TELEMETRY_SDK_VERSION, OPENTELEMETRY_SDK_VERSION}});
  return default_resource;
}

class Resource
{
public:
Resource(const ResourceAttributes &attributes = ResourceAttributes()) noexcept;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The specification says that ANY created resource should go through the environment variable additions + default logic you have in Create. Should we hide this constructor to only the OTELResourceDetector + getDefault ?

Copy link
Member Author

@lalitb lalitb Jan 11, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is challenge to do this in C++. Ideally, I would like to make constructor as private, and make OTELResourceDetector as friend class of Resource, but we actually need all derivations of ResourceDetector have access to this constructor.
opentelementry-java does it using AutoValue annotation, and opentelemetry-dotnet does it using internal keyword.

Copy link
Member Author

@lalitb lalitb Jan 12, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok. Please ignore above comment. I have now made constructor protected ( so that it can be used in test cases ), and made OTELResourceDetector as its friend class. It seems we don't need to invoke Resource constructor in any other detector, so we should be fine.

#include "opentelemetry/sdk/version/version.h"
#include "opentelemetry/version.h"

#include <cstdlib>
Copy link
Contributor

@ThomsonTan ThomsonTan Jan 13, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this include necessary?

Copy link
Member Author

@lalitb lalitb Jan 13, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, we actually need it in resource.cpp ( to read env variable through std::getenv ). Will move it there. Thanks for noticing it.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is done now.

* @returns the newly merged Resource.
*/

std::shared_ptr<Resource> Merge(const Resource &other) noexcept;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we have to implement these (internal to SDK surface) methods.. can we make it a map - inherited from standard library map (either map or unordered_map), and follow semantics of existing APIs available, e.g. map::merge .. Then we are not inventing some new Visual Basic-style API, but follow existing Standard Library semantics everywhere. This will be easier to comprehend by experienced modern C++ developer... Plus - I think it would be consistent if our container classes are compatible with existing algorithm templates described here

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The spec defines some strict rules around merging, which I think are not satisfied by map::merge:

  • If the value on the primary resource is an empty string, the result has the value of the secondary resource.
  • Otherwise, the value of the primary resource is used.

TEST(ResourceTest, create)
{

std::map<std::string, std::string> expected_attributes = {
Copy link
Contributor

@maxgolov maxgolov Jan 14, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you supplement this with a test that handles strongly-typed attributes of numeric (integer) type? i.e. map of variants. Also - the cost parameter here should be a double, not string. Somebody who is onboarding to SDK should be able to freely express strongly-typed parameters in general. Just like it's done in nlohmann/json, where you'd provide developers with initializer list constructor, that is capable of accepting arbitrary variant types described within the OpenTelemetry spec.. Maybe not going as far as implementing the array types. But at least all primitive types in spec + std::string.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, sure that would be good test to showcase. I will add it.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have modified the test to include both integer and double resource attributes. Please see if this is fine.

Copy link
Contributor

@jsuereth jsuereth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks really good now! Just a few minor nits.

std::shared_ptr<Sampler> sampler = std::make_shared<AlwaysOnSampler>()) noexcept;
std::shared_ptr<Sampler> sampler = std::make_shared<AlwaysOnSampler>(),
const opentelemetry::sdk::resource::Resource &resource =
opentelemetry::sdk::resource::Resource::Create({})) noexcept;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this one may leak a nullptr exception from the default value.

I.e. the default argument won't be retained. AFAIK it should be ok to not give this argument a default value and require the TracerProvider to pass one into the tracer.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, default argument for reference can leak. Have removed default now.

@@ -108,6 +109,7 @@ Span::Span(std::shared_ptr<Tracer> &&tracer,
recordable_->SetSpanKind(options.kind);
recordable_->SetStartTime(NowOr(options.start_system_time));
start_steady_time = NowOr(options.start_steady_time);
// recordable_->SetResource(resource_); TODO
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As an alternative idea (that might be harder?) we could instead push the Resource into the Exporter via the Processor. I think that's not really specified (and actually the lack of specification around HOW Resource propagates may give us room to do something different if you think it'd be cleaner).

This approach looks fine, just wanted to mention that you have options AFAICT from the spec.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another possibility: one could just iterate over the attributes in the resource and add them to the recordable at this point. I think that's the least amount of work that would make this PR functionally complete, and we'd have the ability to elaborate on that later on.

I like the idea about propagating the resource into the exporter (that would make it possible to do some optimization, some vendors have a section of common attributes for batches), on the other hand it would make processor and exporter interfaces and the lives of exporter developers harder.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jsuereth - Good point, this way we don't need to propagate resource every time as part of span, but there would be extra task at exporter side to integrate this resource with span data before exporting.
@pyohannes - My initial plan was to add Recordable::SetResource(). But let me explore more and add propagation in next PR as planned if that's fine ?

@@ -4,4 +4,5 @@ add_executable(batch_span_processor_example main.cc)

target_link_libraries(
batch_span_processor_example ${CMAKE_THREAD_LIBS_INIT} ${CORE_RUNTIME_LIBS}
opentelemetry_exporter_ostream_span opentelemetry_trace)
opentelemetry_exporter_ostream_span opentelemetry_trace
opentelemetry_resources)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we link this into opentelemetry_trace, like we do opentelemetry_common? Then we wouldn't need to specify it alongside opentelemetry_trace everywhere. opentelemetry_trace has a hard dependency on opentelemetry_resource anyway.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point, done.

@@ -57,6 +60,7 @@ class Tracer final : public trace_api::Tracer, public std::enable_shared_from_th
private:
opentelemetry::sdk::AtomicSharedPtr<SpanProcessor> processor_;
const std::shared_ptr<Sampler> sampler_;
const opentelemetry::sdk::resource::Resource &resource_;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we shouldn't make this a reference, but hold the resource by value. Theoretically, we support using a Tracer without a TracerProvider (I remember that was a requirement that came up early in the design process). Having a copy of the resource in the Tracer makes this design safer and doesn't cost much.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed.

Comment on lines 20 to 30
ResourceAttributes merged_resource_attributes(attributes_);
for (auto &elem : other.attributes_)
{
if ((merged_resource_attributes.find(elem.first) == merged_resource_attributes.end()) ||
(nostd::holds_alternative<std::string>(attributes_[elem.first]) &&
nostd::get<std::string>(attributes_[elem.first]).size() == 0))
{
merged_resource_attributes[elem.first] = elem.second;
}
}
return Resource(merged_resource_attributes);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There were some recent spec changes that require this to be revisited: open-telemetry/opentelemetry-specification#1345

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for noticing it. It's kind of reverse of original logic. Have incorporated these changes.

@@ -108,6 +109,7 @@ Span::Span(std::shared_ptr<Tracer> &&tracer,
recordable_->SetSpanKind(options.kind);
recordable_->SetStartTime(NowOr(options.start_system_time));
start_steady_time = NowOr(options.start_steady_time);
// recordable_->SetResource(resource_); TODO
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another possibility: one could just iterate over the attributes in the resource and add them to the recordable at this point. I think that's the least amount of work that would make this PR functionally complete, and we'd have the ability to elaborate on that later on.

I like the idea about propagating the resource into the exporter (that would make it possible to do some optimization, some vendors have a section of common attributes for batches), on the other hand it would make processor and exporter interfaces and the lives of exporter developers harder.


static Resource &GetDefault();

protected:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is protected necessary or just private is fine?

Copy link
Member Author

@lalitb lalitb Jan 21, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

protected helps creating unit test case - to subclass Resource and create instance of it for testing. We are doing this in resources_test.cc. I have modified the comment to indicate it's protected :)

namespace resource
{

const std::string TELEMETRY_SDK_LANGUAGE = "telemetry.sdk.language";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Name the constants like kTelemetrySdkLanguage?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. fixed now :)

Copy link
Contributor

@pyohannes pyohannes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me.

sdk/src/resource/resource.cc Outdated Show resolved Hide resolved
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants