Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[exporter/elasticsearch] Flesh out encoding log records with mapping.mode: ecs #31694

Merged
merged 35 commits into from
May 7, 2024

Conversation

ycombinator
Copy link
Contributor

@ycombinator ycombinator commented Mar 12, 2024

Description:

This PR is a follow up to #31553. In that PR, the foundations were laid for encoding log records with ECS fields when the user specifies mapping.mode: ecs in their exporter configuration.

This PR adds ECS conversions for several additional fields.

I expect there will be more followup PR(s) to add more conversions, especially ones that might not be straightforward.

Link to tracking issue: #29742

Testing: Added unit tests

Documentation: The mapping.mode: ecs configuration setting is already documented.

Comment on lines 85 to 86
semconv.AttributeServiceName: "service.name",
semconv.AttributeServiceVersion: "service.version",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like we're mapping it to the same value here. I think we can omit mappings where SemConv and ECS align.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

++ done in 3ea4512c3e3451d73708fbed7d0ff2576cf3b304.

@@ -137,3 +182,17 @@ func scopeToAttributes(scope pcommon.InstrumentationScope) pcommon.Map {
}
return attrs
}

func mapLogAttributesToECS(document *objmodel.Document, attrs pcommon.Map, conversionMap map[string]string) {
attrs.Range(func(k string, v pcommon.Value) bool {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Based on the mappings so far, I'd expect the conversionMap to be smaller than the attributes. If you agree with that statement, it might make sense to iterate over the conversionMap instead. This may reduce the overhead a bit by having to iterate over fewer values.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Internally pcommon.Map is an array of key-value pairs (https://github.com/open-telemetry/opentelemetry-collector/blob/f0473ca0a55293b92374cc77e0275026537342da/pdata/pcommon/map.go#L26), so even if conversionMap is smaller we'd likely end up iterating attrs multiple times that way.

Copy link
Contributor Author

@ycombinator ycombinator Mar 12, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Special case: I did add an early return for the special case where conversionMap was empty: 47905d4e98ddccb243dcc3eadd2ac8ed1dc6e097.

@ycombinator ycombinator marked this pull request as ready for review March 12, 2024 21:19
@ycombinator ycombinator requested review from a team and atoulme and removed request for axw and felixbarny March 12, 2024 21:19
Copy link
Contributor

@axw axw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mostly LGTM, just a few comments so far.

resourceAttrsConversionMap := map[string]string{
semconv.AttributeServiceInstanceID: "service.node.name",
semconv.AttributeDeploymentEnvironment: "service.environment",
semconv.AttributeTelemetrySDKName: "agent.name",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In apm-data we've been setting agent.name to ${telemetry.sdk.name}/${telemetry.sdk.language}, when both are set. Should we be doing the same here? Also with elastic/apm-data@0e072ce we include telemetry.distro.name. If I were starting from scratch I wouldn't do that, as it's redundant information...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Implemented this logic, including that for telemetry.distro.name, in 4b9cca0bdf. Would you mind taking a look at the unit test cases, particularly the ones in TestEncodeLogECSModeAgentName to make sure I got the logic right? Thanks!

exporter/elasticsearchexporter/model.go Outdated Show resolved Hide resolved
semconv.AttributeHostName: "host.hostname",
semconv.AttributeHostArch: "host.architecture",
semconv.AttributeProcessExecutablePath: "process.executable",
semconv.AttributeOSType: "os.platform",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we be vetting the values here? ECS expects os.platform to have one of a restricted set of values -- https://www.elastic.co/guide/en/ecs/current/ecs-os.html#field-os-type

For iOS and Android, I think we should set os.platform from os.name ("Android" or "iOS") rather than os.type ("linux" or "darwin")

See https://github.com/elastic/apm-data/blob/0e072ceaadae2ef8f61d75c3a27c47cc5ce81257/input/otlp/metadata.go#L320-L338

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Implemented in d36a76fff8.

var document objmodel.Document

// First, try to map resource-level attributes to ECS fields.
resourceAttrsConversionMap := map[string]string{
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think what's missing is the translation of k8s.* SemConv resource attributes. See also https://github.com/elastic/apm-data/blob/0e072ceaadae2ef8f61d75c3a27c47cc5ce81257/input/otlp/metadata.go#L145-L165

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, I didn't include them as they're not really part of ECS. But as per the off-PR discussion with @axw and you, I will include them now for practical reasons.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done in 468043f0494859176c0fd89d3c1d6c62466068d2.

Comment on lines 84 to 97
resourceAttrsConversionMap := map[string]string{
semconv.AttributeServiceInstanceID: "service.node.name",
semconv.AttributeDeploymentEnvironment: "service.environment",
semconv.AttributeTelemetrySDKName: "agent.name",
semconv.AttributeTelemetrySDKVersion: "agent.version",
semconv.AttributeTelemetrySDKLanguage: "service.language.name",
semconv.AttributeCloudPlatform: "cloud.service.name",
semconv.AttributeContainerImageTags: "container.image.tag",
semconv.AttributeHostName: "host.hostname",
semconv.AttributeHostArch: "host.architecture",
semconv.AttributeProcessExecutablePath: "process.executable",
semconv.AttributeOSType: "os.platform",
semconv.AttributeOSDescription: "os.full",
}
Copy link
Member

@gregkalapos gregkalapos Mar 13, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The mapping of resource level attributes can be shared across logs, trace, and metrics(not yet mapped in the exporter). Maybe some of the record level mapping can be shared as well.

Currently this code is in the encodeLogECSMode function.

No strong opinion from my side, and for the PR this is fine, but I think mid-term we should think about start sharing this logic across signals.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I moved this mapping out of this function and into package scope in bd90fbf4076586501b57b7799e6b12797cd30828 so, in the future, conversion code for other signals can use it too.

@ycombinator
Copy link
Contributor Author

@axw @felixbarny @gregkalapos Thank you for your feedback. I have addressed all of it now and this PR is ready for another round of review. Thanks!

Comment on lines 269 to 273
// Set service language name in document.
if telemetrySdkLanguage == "" {
telemetrySdkLanguage = "unknown"
}
document.AddString("service.language.name", telemetrySdkLanguage)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// Set service language name in document.
if telemetrySdkLanguage == "" {
telemetrySdkLanguage = "unknown"
}
document.AddString("service.language.name", telemetrySdkLanguage)

One thing I think we should not carry across from apm-data is setting service.language.name when telemetry.sdk.language is unspecified. That's only really meaningful for application observability, and not all logs or metrics.

Maybe we can just skip setting service.language.name altogether for now, since it's included in agent.name?

@felixbarny WDYT?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One thing I think we should not carry across from apm-data is setting service.language.name when telemetry.sdk.language is unspecified.

Sounds good. But maybe we have to adapt some of the assumptions we have in the APM UI about this field always being present.

Maybe we can just skip setting service.language.name altogether for now, since it's included in agent.name?

The service.language.name attribute is used a lot in our UI. Since it directly translates to service.language.name I think I'd just add it to the resourceAttrsConversionMap.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The service.language.name attribute is used a lot in our UI. Since it directly translates to service.language.name I think I'd just add it to the resourceAttrsConversionMap.

This PR is focused on logs though, and I think the service.language.name usage in the (APM) UI is not relevant here. Could we start without it, and add it in later if there's a need?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let's do that

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed code for setting service.language.name in cb5a3b88bb.

semconv.AttributeTelemetrySDKName: "",
semconv.AttributeTelemetryDistroName: "",
semconv.AttributeTelemetrySDKLanguage: "",
semconv.AttributeTelemetrySDKVersion: "agent.version",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
semconv.AttributeTelemetrySDKVersion: "agent.version",
semconv.AttributeTelemetrySDKVersion: "",

We'll need to handle telemetry.distro.version too: agent.version gets set to telemetry.distro.version if specified, and telemetry.sdk.version otherwise. Can't say I love this behaviour, but it's what exists in ECS and apm-data now...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Implemented in 04a681bd49.

semconv.AttributeServiceInstanceID: "service.node.name",
semconv.AttributeDeploymentEnvironment: "service.environment",
semconv.AttributeTelemetrySDKName: "",
semconv.AttributeTelemetryDistroName: "",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
semconv.AttributeTelemetryDistroName: "",
semconv.AttributeTelemetryDistroName: "",
semconv.AttributeTelemetryDistroVersion: "",

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Implemented as part of 04a681bd49.

Copy link
Contributor

@axw axw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@JaredTan95
Copy link
Member

wait #31553 get merged.

@andrzej-stencel andrzej-stencel merged commit 3987074 into open-telemetry:main May 7, 2024
162 checks passed
@github-actions github-actions bot added this to the next release milestone May 7, 2024
@ycombinator ycombinator deleted the exp-es-ecs-r2 branch May 7, 2024 07:58
TylerHelmuth pushed a commit that referenced this pull request May 7, 2024
…#32917)

**Description:** <Describe what has changed.>
<!--Ex. Fixing a bug - Describe the bug and how this fixes the issue.
Ex. Adding a feature - Explain what this achieves.-->
Fix a flaky test from #31694 due to non-deterministic JSON key order.

**Link to tracking Issue:** Closes #32910
@BenEfrati
Copy link

BenEfrati commented May 27, 2024

Hi @ycombinator , @andrzej-stencel

There is a regression after this change - duplicate key in json:
before this change request to ELK was:

{
...
   "host":{
      "name":"hostname"
   },
   "os":{
      "description":"Windows 11 10.0",
      "type":"windows"
   },
...
}

after this change:

{
...
   "host":{
      "hostname":"hostname",
      "os":{
         "full":"Windows 11 10.0",
         "platform":"windows"
      }
   },
...
 "host":{
      "os":{
         "type":"windows"
      }
   },
...
}

in this example host key is duplicated, result in invalid json and error from ELK

 "error": {
  "reason": "failed to parse",
  "type": "mapper_parsing_exception"
  },
  "status": 400

Using same resource attributes:

Resource attributes:
host.name: Str(hostname)
os.description: Str(Windows 11 10.0)
os.type: Str(windows)

this is from debug exporter

// The JSON parsing in Elasticsearch does not support parsing JSON documents
// with duplicate fields. The fields in the docuemt can be sort and duplicate entries

if ecsHostOsType == "" {
return
}
document.AddString("host.os.type", ecsHostOsType)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in case host is already part of document it'll result in duplicate host key in json request:
Resource attributes:

host.name: Str(hostname)
os.description: Str(Windows 11 10.0)
os.type: Str(windows)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

10 participants