Tempo/Vulture: GetObject S3 API operation - costs. #3700

lukasmrtvy · 2024-05-22T09:16:03Z

Hey,
This is probably not a bug, just a finding: Vulture is quite expensive in terms of S3 costs.

I'm adding a screenshot from AWS Cost Explorer (May 19: Vulture was deployed; May 21: Vulture was undeployed). The difference is a whopping ~$97 per day. This was tested in an environment with 1-2 people who were not actively querying Tempo.

At the same time, I find Vulture useful as a consistency-checking tool, but honestly, I am not sure it's worth the price. Thoughts?

Tempo ( distributed ): helm-chart 1.9.9
Vulture: helm-chart 0.4.1

Thanks

EDIT:
The May 21 cost for StandardStorage is probably not correct yet (it will end up around ~$20), because it takes some time to propagate in AWS. It's still a huge difference.

Updated:

joe-elliott · 2024-05-22T11:41:51Z

I honestly have no idea how much vulture costs per day. Two options that may help reduce spend:

Increase the time between calls by using these params:

https://github.com/grafana/tempo/blob/main/cmd/tempo-vulture/main.go#L71-L72

Add a bloom/footer cache to reduce GETs on trace-by-ID lookups.
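To get a rough feel for how the read interval affects request volume, here is a back-of-the-envelope sketch. It assumes each trace-by-ID lookup issues roughly one GET per block checked and ignores caching and any other object reads. The function name, intervals, and block counts are illustrative assumptions, not Vulture's actual defaults or behavior.

```python
SECONDS_PER_DAY = 86_400


def daily_get_requests(read_interval_s, blocks_per_lookup):
    """Estimate S3 GETs per day for periodic trace-by-ID lookups.

    Assumes (hypothetically) one GET per block checked per lookup,
    with no cache hits.
    """
    lookups_per_day = SECONDS_PER_DAY / read_interval_s
    return lookups_per_day * blocks_per_lookup


# Illustrative numbers only: one lookup every 30 s against 300 blocks.
baseline = daily_get_requests(30, 300)   # 864000.0
# Same block count, lookups four times less often.
slower = daily_get_requests(120, 300)    # 216000.0
```

Under these assumptions the request count scales linearly with both the lookup frequency and the number of blocks touched per lookup, so either lever reduces the bill proportionally.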

bmteller · 2024-06-06T09:23:23Z

We noticed a similar issue with some of our internal tooling that does trace-ID lookups. If you know the timestamp at which the trace occurred, you can significantly reduce the number of blocks Tempo has to process by specifying a start and end window around that timestamp. We have ~300 blocks, so an unbounded lookup means ~300 requests to S3 for the bloom filter lookups alone. Specifying a window of +/- 20 minutes around the trace reduced the number of blocks checked to 1, although the window can straddle two blocks, in which case 2 blocks are checked. Our lookups were also for traces in the past, so I suspect that for something like Vulture a window would mean no blocks are checked at all, because the trace would not yet be committed to any block.

https://github.com/grafana/tempo/blob/main/cmd/tempo-vulture/main.go#L469
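A minimal sketch of the windowing idea, for illustration only. It assumes the trace-by-ID endpoint (`/api/traces/<traceID>`) accepts `start` and `end` query parameters in Unix epoch seconds. The base URL, trace ID, and block layout below are hypothetical, and `blocks_in_window` only models the overlap check rather than reproducing Tempo's internals.

```python
from urllib.parse import urlencode


def blocks_in_window(blocks, start, end):
    """Return the (block_start, block_end) ranges overlapping [start, end]."""
    return [b for b in blocks if b[0] <= end and b[1] >= start]


def trace_lookup_url(base_url, trace_id, trace_ts, window_s=20 * 60):
    """Build a trace-by-ID URL limited to +/- window_s around trace_ts."""
    query = urlencode({
        "start": int(trace_ts) - window_s,
        "end": int(trace_ts) + window_s,
    })
    return f"{base_url.rstrip('/')}/api/traces/{trace_id}?{query}"


# Hypothetical layout: 300 consecutive one-hour blocks.
blocks = [(i * 3600, (i + 1) * 3600 - 1) for i in range(300)]

# Trace in the middle of block 150: the window touches a single block.
mid_ts = 150 * 3600 + 1800
one = blocks_in_window(blocks, mid_ts - 1200, mid_ts + 1200)

# Trace near a block boundary: the window straddles two blocks.
edge_ts = 150 * 3600 + 600
two = blocks_in_window(blocks, edge_ts - 1200, edge_ts + 1200)

url = trace_lookup_url("http://tempo.example:3200", "abc123", 1_000_000)
```

With this layout `one` holds 1 block and `two` holds 2, compared with all 300 for an unbounded lookup.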

This was referenced Jul 17, 2024

feat: improve trace id lookup by setting a date range #3873 (Closed)

feat: improve trace id lookup by defining a time range #3874 (Merged)

joe-elliott closed this as completed in #3874 Jul 17, 2024
