[API] Add user facing Logging API and Benchmarks #2094

ThomsonTan · 2023-04-13T02:18:10Z

Fixes #2093

Changes

This PR adds a new user-facing logging API that application developers can use to instrument their code. The API is designed for production environments where a large volume of code is instrumented and high throughput must be maintained.

Here are the main design considerations:

Add an optional event ID to the log call on the Logger object, because categorization is very important for large volumes of log data. It helps with runtime filtering in the SDK and exporters, and with grouping and querying in the backend.
Add an Enabled predicate on Logger to check whether a log should be sent to the SDK for processing. Most instrumented logs are assumed to be turned off, so this check is the fast path for such code and adds minimal runtime overhead.
Build multi-thread safety into the design, so the same logger object can be used from multiple threads/cores.
Add a benchmark to measure the runtime cost of the API calls. It shows that the fast-path check takes only a few machine cycles (~5).

This PR contains only the API change; the SDK change will be added later.
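To make the first two design points concrete, here is a minimal sketch of how an event ID and an Enabled predicate might fit together. All names (`SketchLogger`, `Record`, `CountByEvent`) are hypothetical stand-ins for illustration, `std::string` replaces the `nostd` types, and the record store is only there so the effect is observable; none of this is the code added by the PR.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-ins for illustration; not the opentelemetry-cpp types.
enum class Severity : std::uint8_t { kTrace = 1, kDebug = 5, kInfo = 9, kWarn = 13, kError = 17 };

struct EventId
{
  std::int64_t id;
  std::string name;
};

struct Record
{
  Severity severity;
  std::int64_t event_id;
  std::string message;
};

class SketchLogger
{
public:
  explicit SketchLogger(Severity minimum) : minimum_(minimum) {}

  // Cheap predicate the caller can test before building a log statement.
  bool Enabled(Severity severity) const noexcept { return severity >= minimum_; }

  // Stores the record; a real SDK would hand it to processors and exporters.
  void Log(Severity severity, const EventId &event, std::string message)
  {
    if (Enabled(severity))
    {
      records_.push_back({severity, event.id, std::move(message)});
    }
  }

  // Counts stored records carrying a given event id (backend-style grouping).
  std::size_t CountByEvent(std::int64_t id) const
  {
    std::size_t n = 0;
    for (const auto &r : records_)
    {
      if (r.event_id == id) { ++n; }
    }
    return n;
  }

private:
  Severity minimum_;
  std::vector<Record> records_;
};
```

With a logger constructed at `Severity::kInfo`, a `kDebug` call is dropped while a `kError` call with the same `EventId` is kept and can later be grouped by its numeric ID.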

For significant contributions please make sure you have completed the following items:

CHANGELOG.md updated for non-trivial changes
Unit tests have been added
Changes in public API reviewed

codecov · 2023-04-13T02:32:09Z

Codecov Report

Merging #2094 (d6e96af) into main (22d0448) will not change coverage.
The diff coverage is n/a.

Additional details and impacted files

@@           Coverage Diff           @@
##             main    #2094   +/-   ##
=======================================
  Coverage   87.19%   87.19%           
=======================================
  Files         166      166           
  Lines        4784     4784           
=======================================
  Hits         4171     4171           
  Misses        613      613

api/include/opentelemetry/logs/logger.h

api/include/opentelemetry/common/macros.h

api/include/opentelemetry/logs/severity.h

api/test/logs/logger_test.cc

marcalff

There is a point in the design that I do not understand.

Method Logger::SetMinimumSeverity() is protected, so it cannot be invoked by the instrumented application; it is not part of the API.

It sounds like minimum_severity_, a class member, is part of the implementation already.

Why make this code visible in the API, instead of implementing the minimum_severity_ member in the SDK, in a subclass of Logger?

Instead of using atomic macros in the API, the SDK could use std::atomic, which would avoid exposing all the per-platform and per-compiler code in the API itself.

Please clarify the intent here.

ThomsonTan · 2023-04-17T21:00:10Z

Yes, SetMinimumSeverity() is not part of the user-facing API. It is intended for internal use by the SDK, or by anyone else who implements this API.

And you are right that minimum_severity_ is part of the implementation rather than the API. It is added here to get the best performance on the common path: no virtual function call is needed if the logger is not enabled, which I assume is the common case for most components. In brief, implementing the minimum severity check in this abstract logger eliminates the virtual dispatch in the common case, so the runtime cost of instrumentation is minimal.

It is also correct that atomic support would be much easier if minimum_severity_ were moved to the SDK, but calling into the SDK adds the performance overhead that I want to avoid here.

Please let me know if I missed anything here.
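The trade-off described above can be sketched as follows. This is an illustration under stated assumptions, not the PR's code: it uses `std::atomic` where the PR uses its own atomic macros, the default threshold of 255 and the class names `FastPathLogger` and `CountingLogger` are invented for the example, and the counter exists only to show when the virtual call happens.

```cpp
#include <atomic>
#include <cstdint>

enum class Severity : std::uint8_t { kTrace = 1, kDebug = 5, kInfo = 9, kWarn = 13, kError = 17 };

class FastPathLogger
{
public:
  virtual ~FastPathLogger() = default;

  // Non-virtual and inline: a disabled severity costs one atomic load and a compare.
  bool Enabled(Severity severity) const noexcept
  {
    if (static_cast<std::uint8_t>(severity) < minimum_severity_.load(std::memory_order_relaxed))
    {
      return false;  // common case: no virtual dispatch
    }
    return EnabledImplementation(severity);
  }

protected:
  // Only implementers (for example an SDK subclass) adjust the threshold.
  void SetMinimumSeverity(Severity severity) noexcept
  {
    minimum_severity_.store(static_cast<std::uint8_t>(severity), std::memory_order_relaxed);
  }

  // Slow path for finer-grained filtering; reached only after the fast check passes.
  virtual bool EnabledImplementation(Severity) const noexcept { return true; }

private:
  // Assumption for this sketch: start above every severity, so nothing is enabled by default.
  std::atomic<std::uint8_t> minimum_severity_{255};
};

// Hypothetical SDK-side subclass that counts how often the virtual slow path runs.
class CountingLogger : public FastPathLogger
{
public:
  void Configure(Severity severity) noexcept { SetMinimumSeverity(severity); }
  mutable int slow_path_calls = 0;

protected:
  bool EnabledImplementation(Severity) const noexcept override
  {
    ++slow_path_calls;
    return true;
  }
};
```

After `Configure(Severity::kWarn)`, calling `Enabled(Severity::kDebug)` returns false without touching the virtual method, while `Enabled(Severity::kError)` goes through it once.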

reyang · 2023-04-18T00:36:08Z

api/include/opentelemetry/logs/logger.h

+
+protected:
+  // TODO: discuss with community about naming for internal methods.
+  virtual bool EnabledImplementation(Severity /*severity*/,


Instead of having a function that child classes override, I wonder if it makes sense to use a function pointer (similar to the severity level field, which can be set by implementations).

I assume this is for the internal implementation, so it doesn't block the PR.

ThomsonTan · 2023-04-18

The virtual function here looks like a restricted function pointer stored in the virtual table. If we store the pointer directly in class Logger, it is more flexible: not only the SDK but any other library can provide a function to do the real Enabled() work. It might also be riskier: if the function pointer is compromised, it can invoke arbitrary code. The virtual function is safer in this regard because it cannot be patched at runtime, and it is usually in the protected path.

reyang · 2023-04-19

I actually consider the ability to patch a blessing rather than a curse. (And I don't see why virtual functions cannot be patched.)

ThomsonTan · 2023-04-19

Virtual function tables live in a read-only area of memory on modern operating systems, so they normally cannot be patched.

But I think the idea still applies: provide a mechanism for any related component to register its own Enabled() check with the API Logger. Opened #2105 to track this, because it is NOT an API-breaking change.
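For readers following the function-pointer idea, a rough sketch of what such a registration mechanism could look like is given here. It is purely hypothetical: `CallbackLogger`, `RegisterEnabledCheck`, and the callback signature are invented for illustration, and nothing like this is part of the PR (the idea is only tracked in #2105).

```cpp
#include <atomic>

using EnabledCallback = bool (*)(int severity);

// Default callback: nothing is enabled until someone registers a check.
inline bool NeverEnabled(int) { return false; }

class CallbackLogger
{
public:
  // Any component (not only an SDK subclass) can install its own check.
  void RegisterEnabledCheck(EnabledCallback callback) noexcept
  {
    callback_.store(callback != nullptr ? callback : &NeverEnabled, std::memory_order_release);
  }

  bool Enabled(int severity) const noexcept
  {
    // The load is atomic because registration may race with logging threads.
    EnabledCallback callback = callback_.load(std::memory_order_acquire);
    return callback(severity);
  }

private:
  std::atomic<EnabledCallback> callback_{&NeverEnabled};
};

// Hypothetical check supplied by some other library.
inline bool WarnAndAbove(int severity) { return severity >= 13; }
```

The atomic load keeps the pointer read itself free of data races, but the sketch does nothing to guarantee that the target function is still loaded when it is called, for example if it lives in a shared library that has been unloaded.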

marcalff

More comments / questions

api/test/logs/logger_benchmark.cc

marcalff · 2023-04-18T04:23:42Z

api/include/opentelemetry/logs/logger.h

+  inline void Trace(const EventId &event_id,
+                    nostd::string_view format,
+                    const common::KeyValueIterable &attributes) noexcept
+  {
+    this->Log(Severity::kTrace, event_id, format, attributes);
+  }


I am confused about the intended usage for this helper:

Is the caller supposed to check Enabled for each call, as in:

if (logger->Enabled(xxx)) { logger->Trace(xxx); }

or is Trace() responsible to check the severity flag itself, as in:

logger->Trace(xxx);

The benchmarks show both usages. Is this code writing unwanted traces in the second case?
The check for Enabled() seems to be missing from the code path.

Besides, if the whole point of Logger::minimum_severity_ is to check the flags in the API before making a virtual function call, then why make calls here to the virtual functions Log() and EmitLogRecord()?

Suggested changes:

Call Enabled() in every user-facing API.

Do not call a user-facing API like Log() from another user-facing API like Trace(); call EmitLogRecord() directly, so the Enabled() flags are not checked multiple times.

For example:

inline void Trace(xxx) { if (Enabled(xxx)) { EmitLogRecord(xxx); } }

Edit: If EmitLogRecord() is also considered user facing (it is in the spec), then:

inline void Trace(xxx) { if (Enabled(xxx)) { DoEmitLogRecord(xxx); } }

where DoEmitLogRecord() is a protected virtual method.

This will add clarity:

user-facing APIs are public, inline, not virtual, and test Enabled() up front

user-facing APIs delegate to non-user-facing ones, after the Enabled() test

non-user-facing APIs are protected, virtual, and do not test Enabled()
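The three rules above describe what is often called the non-virtual interface pattern. A minimal sketch, assuming a simplified signature (a plain `std::string` message instead of the format string and attributes) and hypothetical class names `NviLogger` and `VectorLogger`:

```cpp
#include <string>
#include <vector>

enum class Severity { kTrace = 1, kInfo = 9 };

class NviLogger
{
public:
  explicit NviLogger(Severity minimum) : minimum_(minimum) {}
  virtual ~NviLogger() = default;

  bool Enabled(Severity severity) const noexcept { return severity >= minimum_; }

  // Public, inline, non-virtual: test Enabled() once, then delegate.
  void Trace(const std::string &message)
  {
    if (Enabled(Severity::kTrace)) { DoEmitLogRecord(Severity::kTrace, message); }
  }

  void Info(const std::string &message)
  {
    if (Enabled(Severity::kInfo)) { DoEmitLogRecord(Severity::kInfo, message); }
  }

protected:
  // Protected, virtual, and free of Enabled() checks.
  virtual void DoEmitLogRecord(Severity severity, const std::string &message) = 0;

private:
  Severity minimum_;
};

// Hypothetical implementation that records what reaches the virtual method.
class VectorLogger : public NviLogger
{
public:
  using NviLogger::NviLogger;
  std::vector<std::string> emitted;

protected:
  void DoEmitLogRecord(Severity, const std::string &message) override
  {
    emitted.push_back(message);
  }
};
```

With a `VectorLogger` constructed at `Severity::kInfo`, `Trace()` returns before any virtual call and only `Info()` reaches `DoEmitLogRecord()`.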

ThomsonTan · 2023-04-18

Thanks for the suggestion @marcalff.

Call Enabled() in every user facing api.

Enabled() should be called by the user's code rather than from all our wrappers, and this is intentional. The idea is to avoid computing any parameters that are needed only for logging. Even in the simple case, common::MakeAttributes({{"key1", "value1"}, {"key2", "value2"}}) takes significantly more time than the Enabled() check, and there could be many other expensive query results passed as parameters. Calling Enabled() in the user's code eliminates such expensive queries in the first place.

logger->Trace("some message template", common::MakeAttributes({{"some_value_takes_time_calculate", long_time_query()}}));

If we are not going to call Enabled() in the user-facing API, then inlining the call to DoLogRecord in all the wrappers seems unnecessary? I don't have a strong preference, since as the name implies they are just wrappers. I can also revisit this part with the SDK support, because changing the implementation will not break the user-facing API.

Please let me know if I missed anything here.
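The argument for a caller-side check can be illustrated with a small sketch. Everything here is a hypothetical stand-in (`GuardedLogger`, `LongTimeQuery`, the global counter); the point is only that C++ evaluates function arguments before the call, so a check inside `Trace()` comes too late to save the cost of building them.

```cpp
#include <string>

// Hypothetical logger stand-in; the real Logger takes a format string and attributes.
struct GuardedLogger
{
  bool trace_enabled = false;
  int emitted        = 0;

  bool Enabled() const noexcept { return trace_enabled; }

  // Drops the record when disabled, but its argument has already been built by then.
  void Trace(const std::string & /*message*/)
  {
    if (trace_enabled) { ++emitted; }
  }
};

// Counts how often the expensive computation actually runs.
int query_calls = 0;

// Stand-in for an expensive computation needed only for logging.
inline std::string LongTimeQuery()
{
  ++query_calls;
  return "result";
}

// The argument is evaluated even though the logger then drops the record.
inline void LogUnguarded(GuardedLogger &logger) { logger.Trace(LongTimeQuery()); }

// The guard skips the expensive argument entirely when tracing is off.
inline void LogGuarded(GuardedLogger &logger)
{
  if (logger.Enabled()) { logger.Trace(LongTimeQuery()); }
}
```

With tracing disabled, `LogUnguarded()` still pays for `LongTimeQuery()` once, whereas `LogGuarded()` never calls it.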

owent · 2023-04-20

Just want to discuss this.
We previously had a log macro like this in our project:

#define WLOGDEBUG(logger, ...) \
  if (logger->check_level(log_level::DEBUG)) { logger->log(log_level::DEBUG, __VA_ARGS__); }

When we use WLOGDEBUG(logger, "Debug message: {}", protobuf_message.DebugString()); and the debug log is not enabled, protobuf_message.DebugString(), which is a high-cost function, is not called.

Is there any way we can achieve a similar effect?

ThomsonTan · 2023-04-20

Perhaps we can define a similar macro to help users reduce boilerplate code while still avoiding the logging cost. Based on the logging statement you shared, the variable protobuf_message might be needed only for logging, and it could require heavy initialization (such as mapping or loading string resources). That happens at the definition of protobuf_message, which a macro cannot optimize away.

WLOGDEBUG(logger, "Debug message: {}", protobuf_message.DebugString());

While providing macros could help, the point of our API is to make the Enabled check explicit to users, so that they are aware of it and encouraged to do this optimization to achieve high-performance logging.
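A macro of the kind discussed here might look like the following sketch. `SKETCH_LOG`, `MacroLogger`, and `Payload` are hypothetical names invented for the example; no such macro is provided by the API in this PR.

```cpp
#include <string>
#include <vector>

enum class Level { kDebug = 5, kInfo = 9 };

// Hypothetical logger stand-in that records accepted messages.
struct MacroLogger
{
  Level minimum;
  std::vector<std::string> lines;

  bool Enabled(Level level) const noexcept { return level >= minimum; }
  void Log(Level, const std::string &message) { lines.push_back(message); }
};

// Hypothetical convenience macro, not part of the API in this PR.
// do/while(0) makes it a single statement; the message expression is
// evaluated only when the level is enabled.
#define SKETCH_LOG(logger, level, message) \
  do                                       \
  {                                        \
    if ((logger).Enabled(level))           \
    {                                      \
      (logger).Log((level), (message));    \
    }                                      \
  } while (0)

// Stand-in for a message type with a costly DebugString().
struct Payload
{
  mutable int debug_string_calls = 0;

  std::string DebugString() const
  {
    ++debug_string_calls;
    return "payload";
  }
};
```

The macro only defers evaluation of the message expression. As noted above, any work done earlier to construct the object being logged is outside its reach.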

marcalff · 2023-04-18T04:37:41Z

Yes, SetMinimumSeverity() is not part of the user-facing API. It is intended to SDK or any one which implements this API to use internally.

Thanks for the clarifications.

I can see how multiple threads may use the same Logger instance, if GetLogger() returns a shared object used concurrently.

What is missing here are details about when SetMinimumSeverity() is called.

Is it really the case that SetMinimumSeverity() can be called concurrently on a logger that has already been created? How?

Could Logger::minimum_severity_ be set in the Logger constructor instead, assuming it is constant for a given logger instance?

This would remove all the atomic code entirely.

lalitb · 2023-04-18T06:08:42Z

api/include/opentelemetry/logs/event_id.h

+  {
+    id_   = id;
+    name_ = nostd::unique_ptr<char[]>{new char[name.length() + 1]};
+    std::copy(name.begin(), name.end(), name_.get());


This goes against the standard practice of not doing any memory copy at the API/SDK surface. The recordable interface design ensures that we can serialize the data directly at the exporter, without creating any intermediate copies at the API/SDK level. One way to avoid this copy would be to get rid of the EventId class and pass the event ID (as int64_t) and event name (as nostd::string_view) directly to the Log methods. Also, we have so far used only primitive or non-owning nostd types as API arguments. Even though one can argue that passing a reference or pointer to a user-created class may not break the ABI, it is generally good to avoid it if we can.

ThomsonTan · 2023-04-18

This goes against the standard practice of not doing any memory copy at the API/SDK surface.

No memory is allocated in our logging APIs. The EventId object is supposed to be created by the user with a more or less global lifetime, so it copies the string instead of holding a string_view on an existing string, which would require the user to manage the original string object explicitly. Allocating and freeing such an object is outside the scope of the logging API, which only takes a const reference to it.

ThomsonTan · 2023-04-18T09:40:52Z

Logger::minimum_severity_ will only be available for the SDK to configure (the SDK can choose how to present this configuration to the user; I am working on this). It is not supposed to be constant after the logger object is created: the SDK can change it at runtime, so atomic reads and writes are required.

owent · 2023-04-20T02:44:56Z

Just wondering, do you think we can add a new class to contain these functions? The name event seems different from the event in the specification.

ThomsonTan · 2023-04-20T03:12:09Z

The OTel event API is still experimental and, like the log bridge API, I think it is not a user-facing API. As per the spec issue below, the log bridge API is reaching a stable state, which we can build our user-facing API upon.

open-telemetry/opentelemetry-specification#2911

In the layered architecture, our user-facing API sits in the upper layer, so ideally we should consider what users need in order to write logging code, even though we have already reviewed the current bridge API spec and made sure it is sufficient to implement our API. The bridge API sits in the middle layer and supports the user-facing API, so even if the event API becomes stable in the future, we can just update our SDK to send logs that carry an event ID to that event API, which lets logging users upgrade seamlessly.

Finally, I think offering users two sets of similar APIs, one for logs and one for events, will cause more confusion than benefit. It also makes it harder to switch between logs and events, which adds unnecessary complexity. For example, if something starts as a log and, as the code evolves, it makes sense to turn it into an event, the approach with two APIs and separate classes requires the user to acquire a new EventLogger and call totally different methods. With our current approach, the only change is adding event_id to the parameter list.

Let me know if I missed anything here.

owent · 2023-04-21T02:44:19Z

api/include/opentelemetry/logs/event_id.h

+
+public:
+  int64_t id_;
+  nostd::unique_ptr<char[]> name_;


Why not use std::string name_; here? The usage of this variable above looks just like std::string, and it does not add a trailing \0?

ThomsonTan · 2023-04-21

EventId can be read from both user code and SDK code, which requires it to be ABI compatible. std::string cannot guarantee ABI compatibility across the boundary between user code and the SDK, so it cannot be used. I would have expected a nostd::string, but we don't have one, so I switched to a primitive char buffer wrapped in nostd::unique_ptr. Let me know if this addresses the concern.

owent · 2023-04-21

Thanks for the explanation. On the other hand, should we set name_[name.length()] = 0; here?

ThomsonTan · 2023-04-21

Yes, that makes sense. Set the last element of name_ to 0.
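The agreed fix can be sketched as below. This is an illustration, not the PR's `EventId`: `std::unique_ptr` and a (pointer, length) pair stand in for `nostd::unique_ptr` and `nostd::string_view`, and `NameLength()` is a hypothetical helper added only to show why the terminator matters.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <memory>

// Sketch only: std::unique_ptr and (pointer, length) stand in for the nostd types.
class EventIdSketch
{
public:
  EventIdSketch(std::int64_t id, const char *name, std::size_t length)
      : id_(id), name_(new char[length + 1])
  {
    assert(name != nullptr || length == 0);
    std::copy(name, name + length, name_.get());
    name_[length] = '\0';  // new char[n] leaves the buffer uninitialized
  }

  // Relies on the terminator written in the constructor.
  std::size_t NameLength() const noexcept { return std::strlen(name_.get()); }

  std::int64_t id_;
  std::unique_ptr<char[]> name_;
};
```

Without the explicit terminator, any consumer that treats `name_.get()` as a C string would read past the copied characters.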

marcalff

Thanks for the clarifications.

Looks ok for now, so approving the PR.

Some items here in fact depend on the intended implementation in the SDK, so I expect we will be able to revisit particular points later, when doing so affects only the implementation in the logger subclass and not the usage of the logger.

marcalff · 2023-04-21T16:11:58Z

api/include/opentelemetry/logs/logger.h

+
+protected:
+  // TODO: discuss with community about naming for internal methods.
+  virtual bool EnabledImplementation(Severity /*severity*/,


About the name, EnabledImplementation, discussion is deferred to the SDK patch if I understand correctly.

Using a function callback in a class member instead of a virtual method is asking for too much trouble in my opinion:

What if it changes? Should the pointer itself be read using atomics, similar to the severity?

Even after an atomic read, is the pointer still valid (for all we know, it can point into a shared library), and so is it safe to invoke?

In short, the current proposal (virtual method) looks good to me, apart from the name to discuss later.

marcalff · 2023-04-21T16:17:59Z

api/include/opentelemetry/logs/logger.h

+  virtual void Log(Severity severity,
+                   int64_t event_id,
+                   nostd::string_view format,
+                   const common::KeyValueIterable &attributes) noexcept
+  {


Do we need each of these helpers to be virtual functions?

A simple inline function seems sufficient.

marcalff · 2023-04-21T16:21:33Z

api/include/opentelemetry/logs/logger.h

+  inline void Trace(const EventId &event_id,
+                    nostd::string_view format,
+                    const common::KeyValueIterable &attributes) noexcept
+  {
+    this->Log(Severity::kTrace, event_id, format, attributes);
+  }


Let's continue the discussion about Enabled() in the next patch, for the SDK part.

The fact that the caller MAY use Enabled() to optimize paths that evaluate complex arguments is good.

Still we can't force that the caller MUST use Enabled() all the time, so the implementation will need to check it again at some point.

owent

LGTM and thanks.

Add user facing Logging API and Benchmarks
60fb322

ThomsonTan requested a review from a team April 13, 2023 02:18

owent reviewed Apr 13, 2023
api/include/opentelemetry/logs/logger.h

marcalff reviewed Apr 13, 2023
api/include/opentelemetry/common/macros.h (outdated)
api/include/opentelemetry/logs/severity.h

ThomsonTan added 3 commits April 14, 2023 16:59
Make minimum_severity_ mutable for interlocked read/write and add more tests for coverage
a613863
Merge branch 'main' into LogAPIBenchmark_PR
4f6b360
Format with clang-10
9c5080b

lalitb reviewed Apr 15, 2023
api/test/logs/logger_test.cc

Fix unused parameters
6ab98fe

ThomsonTan mentioned this pull request Apr 17, 2023
Make OPENTELEMETRY_LIKELY_IF more general #2098 (Open)

ThomsonTan added 2 commits April 17, 2023 08:36
Remove unused parameter in logger_benchmark
ca51896
Remove extra unused parameters
d2a1a45

marcalff reviewed Apr 17, 2023

reyang reviewed Apr 18, 2023

marcalff reviewed Apr 18, 2023

lalitb reviewed Apr 18, 2023

Address feedback
fc92d60

Merge branch 'main' into LogAPIBenchmark_PR
c6e2296

ThomsonTan mentioned this pull request Apr 19, 2023
Provide function pointers in logger as catch call check #2105 (Open)

Assign explicit number to Severity enum
8e82391

Merge branch 'main' into LogAPIBenchmark_PR
11ced70

owent reviewed Apr 21, 2023

End event name with explicit 0
8e9c884

marcalff approved these changes Apr 21, 2023

ThomsonTan added 2 commits April 21, 2023 10:34
Update changelog
b9846bf
Merge branch 'main' into LogAPIBenchmark_PR
d6e96af

owent approved these changes Apr 23, 2023

lalitb approved these changes Apr 23, 2023

ThomsonTan merged commit a39e8b5 into open-telemetry:main Apr 23, 2023

ThomsonTan deleted the LogAPIBenchmark_PR branch April 23, 2023 14:52

ThomsonTan mentioned this pull request Apr 27, 2023
Support new user-facing log API in SDK #2122 (Closed)

marcalff changed the title from "Add user facing Logging API and Benchmarks" to "[API] Add user facing Logging API and Benchmarks" May 23, 2023

chenrui333 mentioned this pull request May 28, 2023
opentelemetry-cpp 1.9.1 Homebrew/homebrew-core#132204 (Closed)

ThomsonTan mentioned this pull request Jun 1, 2023
[API] Announce stable logging API #2172 (Closed)

punya mentioned this pull request Aug 28, 2024
Is formating for log messages actually implemented? #2628 (Open)
