[receiver/datadog] Add support for sketches #34662

carrieedwards · 2024-08-13T18:35:15Z

Description:
This PR adds support for translating Datadog sketches into Exponential Histograms.

The full version of the code can be found in the cedwards/datadog-metrics-receiver-full branch, or in Grafana Alloy: https://github.com/grafana/alloy/tree/main/internal/etc/datadogreceiver

Link to tracking Issue:
#18278

Testing:
Unit tests, as well as an end-to-end test, have been added.

krajorama · 2024-08-15T17:41:13Z

receiver/datadogreceiver/internal/translator/sketches.go

+
+	negativeBuckets, positiveBuckets, zeroCount := mapSketchBucketsToHistogramBuckets(sketch.K, sketch.N)
+
+	dp.SetZeroCount(zeroCount)


Does the zero threshold line up?

I think we should set it to the minimum value of the sketch boundaries, which I think is e^((1-1338)/(1/gamma)), about 9.941..E-10.
EDIT: the value is OK, but the math is actually e^((1-1338)/(1/ln(gamma))).
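
To make the difference between the two formulas concrete, here is a minimal sketch that evaluates both. It assumes gamma = 1 + 2*(1/128), as in the playground snippet posted later in this review, and the offset 1338 quoted above; the function names are made up for illustration and are not taken from the PR.

```go
package main

import (
	"fmt"
	"math"
)

// agentSketchOffset is the offset quoted in the review comment (1338).
const agentSketchOffset = 1338

// gamma is an assumption taken from the playground snippet in this
// review: relative accuracy 1/128, so gamma = 1 + 2/128.
var gamma = 1 + 2*(1.0/128)

// zeroThresholdWithGamma (hypothetical name) divides by 1/gamma,
// as the formula was first written in the comment.
func zeroThresholdWithGamma() float64 {
	return math.Exp(float64(1-agentSketchOffset) / (1 / gamma))
}

// zeroThresholdWithLogGamma (hypothetical name) divides by 1/ln(gamma),
// as in the EDIT.
func zeroThresholdWithLogGamma() float64 {
	return math.Exp(float64(1-agentSketchOffset) / (1 / math.Log(gamma)))
}

func main() {
	fmt.Printf("1/gamma:     %g\n", zeroThresholdWithGamma())
	fmt.Printf("1/ln(gamma): %g\n", zeroThresholdWithLogGamma())
}
```

With 1/gamma the exponent is roughly -1358, which underflows float64 to 0, so the zero threshold would effectively be unset; with 1/ln(gamma) the result is the small positive value mentioned in the comment.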

fedetorres93

LGTM

fionaliao · 2024-08-16T09:32:48Z

receiver/datadogreceiver/internal/translator/sketches.go

+		// In some cases, the exponential histogram index that is mapped from the Sketch index corresponds to a bucket that
+		// has a lower bound that is higher than the Sketch bucket's lower bound. In this case, it is necessary to start


Is this possible? We find a histogram index that covers the lower bound with sketchLowerBoundToHistogramIndex(sketchLowerBound), so the histogram bucket's lower bound shouldn't ever be higher than the sketch's lower bound.

I think you're right, Fiona. We're taking the Frexp of the sketch lower bound (B). There are two possibilities. Either the frac part is 0.5, in which case B was a power of two and we use it as the upper bound of the exponential bucket, the lower bound obviously being lower. For example, Frexp(2) -> (0.5, 2), so we return 31, which is the (1.957, 2] bucket.

Otherwise we take floor(log_2(B)*32). When you calculate the bucket boundary, 2^(floor(log_2(B)*32)/32), it gives a number no higher than 2^(log_2(B)*32/32), which is the original number.
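
The argument above can be checked numerically. The following is a minimal sketch, assuming scale 5 (32 buckets per power of two, as the numbers in this thread imply) and the index mapping from the OpenTelemetry data model page linked in the code comment under review. The names histogramIndex and bucketLowerBound are hypothetical and are not the PR's functions.

```go
package main

import (
	"fmt"
	"math"
)

// scale 5 is an assumption: 2^5 = 32 buckets per power of two.
const scale = 5

// histogramIndex follows the "all scales" mapping from the OpenTelemetry
// data model: exact powers of two are special-cased so that they land in
// the bucket whose upper bound they are.
func histogramIndex(value float64) int {
	if frac, exp := math.Frexp(value); frac == 0.5 {
		return ((exp - 1) << scale) - 1
	}
	scaleFactor := math.Ldexp(math.Log2E, scale) // 2^scale / ln(2)
	return int(math.Floor(math.Log(value) * scaleFactor))
}

// bucketLowerBound returns the exclusive lower bound 2^(index/2^scale)
// of the bucket with the given index.
func bucketLowerBound(index int) float64 {
	return math.Exp2(float64(index) / float64(1<<scale))
}

func main() {
	for _, v := range []float64{2, 1.98, 3, 0.1, 1234.5} {
		idx := histogramIndex(v)
		fmt.Printf("value=%g index=%d lowerBound=%g\n", v, idx, bucketLowerBound(idx))
	}
}
```

For values that are not within floating-point rounding distance of a bucket boundary, the printed lower bound never exceeds the value, which is the property discussed in this thread.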

That's fair. I think the reason I implemented it to start in the bucket below the histogram index mapped from the sketch index was to guard against floating-point rounding errors. I can update it to start at the histogram index that is computed from the sketch index, though.

krajorama

Great work! I think we do need some tests that specify the expected exponential buckets and their counters, to help review the mapSketchBucketsToHistogramBuckets part and make sure that all cases are covered. I can think of these:

There's 1 sketch bucket and it fits inside an exponential bucket.
There's 1 sketch bucket and it spans two exponential buckets.
There's 2 sketch buckets and they overlap with a common exponential bucket.
Is it possible to have more overlap?
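
One way to explore the cases listed above is to compute which exponential buckets a single sketch bucket touches. This is only a sketch under stated assumptions: sketch bucket k is taken to span from lowerBound(k) to lowerBound(k+1), gamma = 1 + 2*(1/128) as in the playground snippet in this review, and the histogram uses scale 5. All function names are hypothetical and do not come from the PR.

```go
package main

import (
	"fmt"
	"math"
)

const (
	agentSketchOffset = 1338 // offset quoted in this review
	scale             = 5    // assumption: 32 buckets per power of two
)

// gamma is an assumption taken from the playground snippet in this review.
var gamma = 1 + 2*(1.0/128)

// sketchLowerBound is the lower bound of sketch bucket index (index >= 0).
func sketchLowerBound(index int) float64 {
	return math.Exp(float64(index-agentSketchOffset) * math.Log(gamma))
}

// histogramIndex maps a value to an exponential histogram bucket index,
// following the OpenTelemetry data model mapping for positive scales.
func histogramIndex(value float64) int {
	if frac, exp := math.Frexp(value); frac == 0.5 {
		return ((exp - 1) << scale) - 1
	}
	return int(math.Floor(math.Log(value) * math.Ldexp(math.Log2E, scale)))
}

// spannedHistogramBuckets returns the histogram indices that contain the
// lower and upper bound of the given sketch bucket.
func spannedHistogramBuckets(sketchIndex int) (first, last int) {
	return histogramIndex(sketchLowerBound(sketchIndex)),
		histogramIndex(sketchLowerBound(sketchIndex + 1))
}

func main() {
	for _, k := range []int{1338, 1339, 1340, 1400, 2000} {
		first, last := spannedHistogramBuckets(k)
		fmt.Printf("sketch bucket %d -> histogram buckets %d..%d\n", k, first, last)
	}
}
```

Because gamma (about 1.0156) is smaller than the histogram base 2^(1/32) (about 1.0219), a sketch bucket can cross at most one histogram boundary under these assumptions, so it touches either one or two adjacent exponential buckets.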

receiver/datadogreceiver/internal/translator/sketches.go

krajorama · 2024-08-16T14:33:32Z

receiver/datadogreceiver/internal/translator/sketches.go

+
+	negativeBuckets, positiveBuckets, zeroCount := mapSketchBucketsToHistogramBuckets(sketch.K, sketch.N)
+
+	dp.SetZeroCount(zeroCount)


I think we should set it to the minimum value of the sketch boundaries, which I think is e^((1-1338)/(1/gamma)), about 9.941..E-10.
EDIT: the value is OK, but the math is actually e^((1-1338)/(1/ln(gamma))).

krajorama · 2024-08-16T15:56:05Z

receiver/datadogreceiver/internal/translator/sketches.go

+// bucket that has a range covering that lower bound
+// See: https://opentelemetry.io/docs/specs/otel/metrics/data-model/#all-scales-use-the-logarithm-function
+func sketchLowerBoundToHistogramIndex(value float64) int {
+	if frac, exp := math.Frexp(value); frac == 0.5 {


Optimization for later: client_golang uses a lookup table after Frexp is called to get the bucket: https://github.com/prometheus/client_golang/blob/989a6d0a1a2543c2d0d8b9a8c524eced9c81fd3d/prometheus/histogram.go#L59

Find the lowest index in the table where frac is higher. E.g., if value == 1.98, then frac == 0.99, which finds the last index, index=31, and since exp==1, the exponential bucket index is 31+(exp-1) = 31, so the bucket is (1.957, 2].
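
A rough sketch of how such a lookup could work is shown below. It is an illustration only, not the client_golang code and not part of this PR: the table, the function names, and the choice to scale (exp-1) by the table length so that values outside [1, 2) also work are all assumptions, and scale 5 is assumed as elsewhere in this thread.

```go
package main

import (
	"fmt"
	"math"
	"sort"
)

// scale 5 is an assumption: 32 buckets per power of two.
const scale = 5

// bounds is a hypothetical lookup table holding the bucket boundaries
// within [0.5, 1), the range of the frac value returned by math.Frexp:
// bounds[i] = 0.5 * 2^(i/32).
var bounds = func() []float64 {
	n := 1 << scale
	b := make([]float64, n)
	for i := range b {
		b[i] = 0.5 * math.Exp2(float64(i)/float64(n))
	}
	return b
}()

// lookupIndex finds the exponential bucket index with a binary search
// over the table instead of calling math.Log.
func lookupIndex(value float64) int {
	frac, exp := math.Frexp(value)
	// SearchFloat64s returns the smallest i with bounds[i] >= frac, so
	// i-1 is the bucket whose exclusive lower bound is below frac.
	return sort.SearchFloat64s(bounds, frac) - 1 + (exp-1)*len(bounds)
}

// logIndex is the logarithm-based mapping, kept here for comparison.
func logIndex(value float64) int {
	if frac, exp := math.Frexp(value); frac == 0.5 {
		return ((exp - 1) << scale) - 1
	}
	return int(math.Floor(math.Log(value) * math.Ldexp(math.Log2E, scale)))
}

func main() {
	for _, v := range []float64{1.98, 2, 1, 3, 0.1, 1234.5} {
		fmt.Printf("value=%g lookup=%d log=%d\n", v, lookupIndex(v), logIndex(v))
	}
}
```

A side effect of the search is that exact powers of two need no special case: frac is 0.5, the search returns 0, and the result is the bucket whose upper bound is that power of two.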

receiver/datadogreceiver/receiver_test.go

krajorama · 2024-08-16T16:11:51Z

receiver/datadogreceiver/internal/translator/sketches.go

+		// In some cases, the exponential histogram index that is mapped from the Sketch index corresponds to a bucket that
+		// has a lower bound that is higher than the Sketch bucket's lower bound. In this case, it is necessary to start


I think you're right, Fiona. We're taking the Frexp of the sketch lower bound (B). There are two possibilities. Either the frac part is 0.5, in which case B was a power of two and we use it as the upper bound of the exponential bucket, the lower bound obviously being lower. For example, Frexp(2) -> (0.5, 2), so we return 31, which is the (1.957, 2] bucket.

Otherwise we take floor(log_2(B)*32). When you calculate the bucket boundary, 2^(floor(log_2(B)*32)/32), it gives a number no higher than 2^(log_2(B)*32/32), which is the original number.

jpkrohling

I believe this needs a changelog entry, given that this feature (metrics handling) is already exposed to end users.

krajorama · 2024-08-28T06:09:28Z

receiver/datadogreceiver/internal/translator/sketches.go

+	if index < 0 {
+		index = -index
+	}
+	return math.Exp((float64(index-agentSketchOffset) / (1 / math.Log(gamma))))


This returns +Inf for index values over 47096; just copy and paste the code below into the Go playground (https://go.dev/play/p/2mnSdR9G7RS).
This causes mapSketchBucketsToHistogramBuckets to go into an infinite loop.

package main

import (
	"fmt"
	"math"
)

func main() {
	relativeAccuracy := 1.0 / 128
	gamma := 1 + 2*relativeAccuracy
	var agentSketchOffset int32 = 1338
	var index int32 = 47096
	lowerBound := math.Exp((float64(index-agentSketchOffset) / (1 / math.Log(gamma))))
	fmt.Printf("%g\n", lowerBound)
}
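
One possible way to guard against this is to reject indices whose lower bound overflows before they reach the bucket-mapping loop. The following is a hedged sketch, not the fix that was committed: the two-value return and the helper firstOverflowingIndex are hypothetical, and the constants are the ones from the snippet above.

```go
package main

import (
	"fmt"
	"math"
)

const agentSketchOffset int32 = 1338

// gamma uses the relative accuracy from the snippet above (1/128).
var gamma = 1 + 2*(1.0/128)

// sketchLowerBound (hypothetical signature) returns the lower bound for a
// sketch index and reports false when the result overflows float64.
func sketchLowerBound(index int32) (float64, bool) {
	if index < 0 {
		index = -index
	}
	bound := math.Exp(float64(index-agentSketchOffset) / (1 / math.Log(gamma)))
	if math.IsInf(bound, 1) {
		return 0, false
	}
	return bound, true
}

// firstOverflowingIndex scans upward and returns the first non-negative
// index whose lower bound is +Inf, or -1 if none is found.
func firstOverflowingIndex() int32 {
	for index := int32(0); index < math.MaxInt32; index++ {
		if _, ok := sketchLowerBound(index); !ok {
			return index
		}
	}
	return -1
}

func main() {
	fmt.Println("first index that overflows:", firstOverflowingIndex())
	if _, ok := sketchLowerBound(50000); !ok {
		fmt.Println("index 50000 is rejected as invalid")
	}
}
```

A caller could skip or report buckets for which the second return value is false, so that an infinite bound is never compared against finite histogram boundaries.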

krajorama · 2024-09-05T12:33:20Z

receiver/datadogreceiver/internal/translator/sketches.go

+	dp.SetMin(sketch.Min)
+	dp.SetMax(sketch.Max)
+	dp.SetScale(scale)
+	dp.SetZeroThreshold(math.Exp(float64(1-agentSketchOffset) / (1 / gamma))) // See https://github.com/DataDog/sketches-go/blob/7546f8f95179bb41d334d35faa281bfe97812a86/ddsketch/mapping/logarithmic_mapping.go#L48


Sorry, I was making the right calculation but wrote the comment wrong. Also see line 266 down below and the playground.

Suggested change

-	dp.SetZeroThreshold(math.Exp(float64(1-agentSketchOffset) / (1 / gamma))) // See https://github.com/DataDog/sketches-go/blob/7546f8f95179bb41d334d35faa281bfe97812a86/ddsketch/mapping/logarithmic_mapping.go#L48
+	dp.SetZeroThreshold(math.Exp(float64(1-agentSketchOffset) / (1 / math.Log(gamma)))) // See https://github.com/DataDog/sketches-go/blob/7546f8f95179bb41d334d35faa281bfe97812a86/ddsketch/mapping/logarithmic_mapping.go#L48

krajorama

LGTM from a technical point of view. There are certainly opportunities for optimization, but I think there is more than enough here to do some field testing. Good job!

* Add output verification to TestDatadogMetricsV1_EndToEnd
* Add output verification to TestDatadogMetricsV2_EndToEnd
* Add output verification to TestDatadogSketches_EndToEnd
* Add output verification to TestDatadogServices_EndToEnd
* fmt
* Refactor output verifications in E2E receiver tests
* Add TestConvertBucketLayout
* Add TestMapSketchBucketsToHistogramBuckets

---------

Signed-off-by: Federico Torres <[email protected]>

**Description:** This PR adds support for translating Datadog sketches into Exponential Histograms. Follow up of open-telemetry#33631, open-telemetry#33957 and open-telemetry#34180.

The full version of the code can be found in the `cedwards/datadog-metrics-receiver-full` branch, or in Grafana Alloy: https://github.com/grafana/alloy/tree/main/internal/etc/datadogreceiver

**Link to tracking Issue:** open-telemetry#18278

**Testing:** Unit tests, as well as an end-to-end test, have been added.

---------

Signed-off-by: Federico Torres <[email protected]>
Signed-off-by: György Krajcsovits <[email protected]>
Co-authored-by: Federico Torres <[email protected]>
Co-authored-by: György Krajcsovits <[email protected]>

github-actions bot added cmd/otelcontribcol otelcontribcol command receiver/datadog testbed labels Aug 13, 2024

github-actions bot requested review from boostchicken, gouthamve, jpkrohling and MovieStoreGuy August 13, 2024 18:35

carrieedwards force-pushed the cedwards/datadog-sketches branch from 5303a82 to bf30ac8 Compare August 13, 2024 19:12

carrieedwards marked this pull request as ready for review August 13, 2024 19:12

carrieedwards requested a review from a team August 13, 2024 19:12

github-actions bot assigned mx-psi Aug 13, 2024

carrieedwards force-pushed the cedwards/datadog-sketches branch from bf30ac8 to 1c5adfd Compare August 13, 2024 19:22

mx-psi assigned jpkrohling and unassigned mx-psi Aug 14, 2024

krajorama reviewed Aug 15, 2024

fedetorres93 approved these changes Aug 15, 2024

fionaliao reviewed Aug 16, 2024

krajorama suggested changes Aug 16, 2024

carrieedwards requested review from djaglowski, andrzej-stencel, crobert-1, dashpole, atoulme, jmacd, dmitryax, codeboten, fatsheep9146, TylerHelmuth, yurishkuro and mx-psi as code owners August 26, 2024 16:35

jpkrohling removed request for TylerHelmuth, bryan-aguilar, andrzej-stencel and crobert-1 August 27, 2024 08:15

jpkrohling changed the title from "[chore] [receiver/datadog] Add support for sketches" to "[receiver/datadog] Add support for sketches" Aug 27, 2024

jpkrohling reviewed Aug 27, 2024

krajorama reviewed Aug 28, 2024

krajorama reviewed Sep 5, 2024

krajorama approved these changes Sep 5, 2024

carrieedwards force-pushed the cedwards/datadog-sketches branch from af2eb0d to de7f8dc Compare September 5, 2024 15:25

jpkrohling approved these changes Sep 6, 2024

carrieedwards and others added 14 commits September 6, 2024 09:34

Add support for sketches

de69dab

Add tests for sketch translation

9e88797

Fix linting

fb707be

Set zero threshold and update bucket mapping logic

e95544d

Add a couple of simple testcases to TestMapSketchBucketsToHistogramBu…

e1aed38

…ckets Signed-off-by: György Krajcsovits <[email protected]>

Add simple overlapping test case

486666d

Signed-off-by: György Krajcsovits <[email protected]>

Changelog

0bec4ea

Handle sketches that contain buckets with invalid indices

618a733

Fix zero threshold formula

963f05b

Format files

e136292

Fix linting

65bd4ed

Fix linting

6db2fe6

Use require.Len

8c0c054

carrieedwards force-pushed the cedwards/datadog-sketches branch from 10ff46f to 8c0c054 Compare September 6, 2024 16:34

jpkrohling merged commit 5e26464 into open-telemetry:main Sep 9, 2024
156 checks passed

github-actions bot added this to the next release milestone Sep 9, 2024

