Datadog v2 API fails when using formulas in query #2813

tukak · 2023-05-26T10:04:09Z

Checklist:

I've included steps to reproduce the bug.
I've included the version of argo rollouts.

Describe the bug

Using a formula in an AnalysisTemplate query with the Datadog v2 API fails with this API error:

received non 2xx response code: 400 {"errors":["Functions are not supported in metrics data source field. Use the formulas payload."]}

To Reproduce

Use a query with a formula in the Datadog provider of an AnalysisTemplate and set apiVersion to v2. The same query works with v1.

apiVersion: argoproj.io/v1alpha1
kind: AnalysisTemplate
metadata:
  name: rollout-test-success-rate
spec:
  metrics:
    - name: success-rate
      interval: 1m
      successCondition: default(result, 0) < 10
      provider:
        datadog:
          apiVersion: v2
          interval: 1m
          query: |
            sum:trace.rack.request.hits{env:staging AND service:service_name AND http.status_code:5*}.as_count() /
            sum:trace.rack.request.hits{env:staging AND service:service_name AND http.status_code:2*}.as_count() * 100

Expected behavior

It should work with v2 as well, or the note in https://argo-rollouts.readthedocs.io/en/stable/analysis/datadog/ should warn about using formulas (now it says that "If you switch to v2, you will not need to change any other field aside from apiVersion.")

Version

1.5.1

Logs

time="2023-05-26T10:02:02Z" level=info msg="Taking 1 Measurement(s)..." analysisrun=rollout-test-6b986bcd5b-7.1 namespace=default
time="2023-05-26T10:02:02Z" level=info msg="Measurement Completed. Result: Error" analysisrun=rollout-test-6b986bcd5b-7.1 metric=success-rate namespace=default
time="2023-05-26T10:02:02Z" level=warning msg="Measurement had error: received non 2xx response code: 400 {\"errors\":[\"Functions are not supported in metrics data source field. Use the formulas payload.\"]}" analysisrun=rollout-test-6b986bcd5b-7.1 metric=success-rate namespace=default

Message from the maintainers:

Impacted by this bug? Give it a 👍. We prioritize the issues with the most 👍.


alexef · 2023-06-02T07:41:39Z

Formulas also need separate named queries, so in your case it could look like this:

      provider:
        datadog:
          apiVersion: v2
          interval: 1m
          queries:
            a: sum:trace.rack.request.hits{env:staging AND service:service_name AND http.status_code:5*}.as_count()
            b: sum:trace.rack.request.hits{env:staging AND service:service_name AND http.status_code:2*}.as_count() * 100
          formula: a/b

This is NOT yet supported by argo-rollouts; I will update the docs accordingly.
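To make the "formulas payload" from the error message more concrete, here is a minimal Python sketch of how named queries and a formula might be assembled into a request body. It is not argo-rollouts code. The function name `build_scalar_request` is hypothetical, and the field names (`scalar_request`, `data_source`, `aggregator`, `formulas`) reflect my understanding of Datadog's public v2 scalar query API, so treat them as assumptions and check them against Datadog's documentation. The sketch also assumes the `* 100` belongs in the formula rather than in a named query.

```python
import time


def build_scalar_request(queries, formula, window_seconds, now=None):
    """Build a request body for a Datadog v2 scalar query (assumed shape).

    queries: mapping of name -> metric query, e.g. {"a": "sum:...", "b": "sum:..."}
    formula: expression over those names, e.g. "a / b * 100"
    window_seconds: how far back the query window reaches
    now: optional Unix timestamp in seconds, useful for deterministic tests
    """
    now_ms = int((time.time() if now is None else now) * 1000)
    return {
        "data": {
            "type": "scalar_request",
            "attributes": {
                "from": now_ms - window_seconds * 1000,
                "to": now_ms,
                # One entry per named query; the "last" aggregator is an assumption.
                "queries": [
                    {
                        "data_source": "metrics",
                        "name": name,
                        "query": query,
                        "aggregator": "last",
                    }
                    for name, query in sorted(queries.items())
                ],
                # Arithmetic between queries lives here, not in the query strings.
                "formulas": [{"formula": formula}],
            },
        }
    }
```

With the two queries from the comment above, `build_scalar_request({"a": "...5xx query...", "b": "...2xx query..."}, "a / b * 100", 60)` would yield a body with two named queries and a single formula.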

alexef · 2023-06-02T08:34:05Z

created: #2819

Update datadog.md - clarify formulas Signed-off-by: Alex Eftimie <[email protected]>

meeech · 2023-07-13T11:23:54Z

I'm interested in picking this one up if no one has yet.

joey100 · 2023-07-17T07:53:26Z

Formulas also need separate named queries, so in your case it could look like this:

      provider:
        datadog:
          apiVersion: v2
          interval: 1m
          queries:
            a: sum:trace.rack.request.hits{env:staging AND service:service_name AND http.status_code:5*}.as_count()
            b: sum:trace.rack.request.hits{env:staging AND service:service_name AND http.status_code:2*}.as_count() * 100
          formula: a/b

This is NOT yet supported by argo-rollouts; I will update the docs accordingly.

Is there a plan to support this, and when will we be able to use it? This matters for anyone switching to API version v2, because many Datadog metric queries use functions and formulas. Without this support we won't be able to use API version v2.

meeech · 2023-07-17T08:07:18Z

@joey100 I just picked up the ticket the other day, so I can't give you a timeframe. That said, one reason I picked it up is that we rely on formulas as well and would like to move to v2.
I will add details once I have some to share.

Sineaggi · 2023-07-21T00:34:51Z

Related, would it also make sense to use the datadog-api-client-go library to make our calls to Datadog?

meeech · 2023-08-06T23:32:19Z

@Sineaggi I'll look into that as well and see if it makes sense

meeech · 2023-08-29T17:32:02Z

Adding more context: I will follow up on this with datadog.

The reason we think v1 is deprecated is based on @alexef's experience (from Slack):
"I don’t have a link, we were getting 429 throttled by the API, and when asking for a rate limit increase, we were told to switch to v2 as they can increase the rate there, but not on v1"
"in communication to datadog support they referred to v1 as the “legacy one”, that’s all I have"

I will follow up with DD, since when working with their Go lib I get the warning "2023/08/29 13:21:49 WARNING: Using unstable operation 'v2.QueryTimeseriesData'", so I want to get all this info sorted out so that we have clear guidance.

aleksey-360 · 2023-10-11T17:08:21Z

What is the status of this issue?

meeech · 2023-10-12T15:16:34Z

@asmyrnov360 PR is ready and itching to be reviewed :D

* Add note in CONTRIBUTING.md that I would have found useful.
Signed-off-by: mitchell amihod <[email protected]>

* Fix gen-openapi to be more portable - make sure it includes the GOPATH in the call.
Signed-off-by: mitchell amihod <[email protected]>

* Update docs.
* Expand working with v2 information
* Contacted Datadog to get latest info re: v1 deprecation, api limits.
* Add tips about rate limits, using helm for templates
* Add more example templates
Signed-off-by: mitchell amihod <[email protected]>

* Update Datadog Analysis Type
* Make Query omitempty since now possible it won't exist
* Add some descriptions
* Add new properties we need for v2
  Queries: We can pass in key:query for queries
  Formula: Makes formulas using the keys from queries
* Defaults! Use annotations to declare defaults for some fields. This lets us remove some guard rails from the code itself
  Interval: 5m - Move this from code to here
  ApiVersion: v1 - Move this from code to here
* Enums! Much like defaults, having enums lets us make assumptions about the incoming metric so we dont need as many guardrails.
  ApiVersion: Enum to restrict to v1 or v2
Signed-off-by: mitchell amihod <[email protected]>

* Output of make codegen
  Everything looks ok.
Signed-off-by: mitchell amihod <[email protected]>

* Pass in metric to provider factory. Validate metric.
  Validating the metric on initialization, rather than spread out throughout. You get earlier feedback if you have a bad metric defined. (Not perfect, but there's limitations with our annotation generator for the rules in the crd. eg: If we could use oneOf, we wouldn't need a lot of this validation)
  We check all the mutually exclusive props. The props where one requires another.
  We don't have to check for defaults and set them anymore, since they are guaranteed by the crd.
  rules:
  - ensure we have only query OR queries
  - restrict v1 to query only
  - make sure you only provide a formula with queries
  - make sure multiple queries are accompanied by a formula
Signed-off-by: mitchell amihod <[email protected]>

* Remove DefaultApiVersion, remove impossible AnalysisPhaseError
  ApiVersion is guaranteed to have value, and the enum ensures its v1/v2 when user provided.
  Updated v1 tests to reflect some of these new realities
Signed-off-by: mitchell amihod <[email protected]>

* extract urlBuilding from run
  run was getting a bit long according to the checks
Signed-off-by: mitchell amihod <[email protected]>

* Remove some unnecessary stuff for interval.
  It is a straightline to initialize since default is set to 5m for incoming metrics where it is not set.
Signed-off-by: mitchell amihod <[email protected]>

* Update createRequest
  Split into createRequest v1/v2
  v1 : pretty much unchanged. just extracted
  v2: support for v2/query/scalar
  We don't need all the timeseries. I did some testing fetching both scalar and timeseries, and they pretty much lined up. Also confirmed with DD:
  From support ticket with DD: "...I have also tested the scalar api endpoint with the last aggregator as well as the timeseries api endpoint. They do indeed return the same values when retrieving the values via the api endpoints. Observing the time it takes to retrieve the values, they remain relatively the same..."
  re: query + v2: Keep backwards compat. if we get in a query, we turn it into a queries object to pass on to requestv2 queries into the QueriesPayload.
Signed-off-by: mitchell amihod <[email protected]>

* Handle v2 scalar responses
* update the datadogResponseV2 for scalar values
* handle no results so it has parity with v1 - empty will now usually result in `[]` unless something goes very wrong on dd side
Signed-off-by: mitchell amihod <[email protected]>

* update v1 no data tests to better reflect reality
Signed-off-by: mitchell amihod <[email protected]>

* update v2 tests
* add some new test cases
* update mock server to handle queries / formulas validation
* update no data tests to reflect reality
* stop all values being the same. it makes it difficult to know find which test case failed. move meaning from comments into the metric name.
* stop using deprecated ioutil
Signed-off-by: mitchell amihod <[email protected]>

* re-codegen
Signed-off-by: zachaller <[email protected]>

* fix lint
Signed-off-by: zachaller <[email protected]>

---------

Signed-off-by: mitchell amihod <[email protected]>
Signed-off-by: zachaller <[email protected]>
Co-authored-by: zachaller <[email protected]>
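The validation rules and the backwards-compatibility step described in the PR above can be sketched in a few lines. This is an illustrative Python version, not the Go code from the PR. The function names `validate_metric` and `to_queries`, the error messages, and the key used for a wrapped single query are all hypothetical. Rejecting a metric with neither `query` nor `queries` is my assumption; the PR text only says "only query OR queries".

```python
def validate_metric(api_version="v1", query=None, queries=None, formula=None):
    """Raise ValueError if the field combination breaks one of the listed rules."""
    if api_version not in ("v1", "v2"):
        raise ValueError("apiVersion must be v1 or v2")
    if query and queries:
        raise ValueError("use either query or queries, not both")
    if not query and not queries:
        # Assumption: the PR text does not spell this case out.
        raise ValueError("one of query or queries is required")
    if api_version == "v1" and queries:
        raise ValueError("v1 supports query only")
    if formula and not queries:
        raise ValueError("formula requires queries")
    if queries and len(queries) > 1 and not formula:
        raise ValueError("multiple queries require a formula")


def to_queries(query=None, queries=None):
    """Backwards compatibility: wrap a lone query as a one-entry queries mapping."""
    if queries:
        return dict(queries)
    return {"query": query}  # the key name "query" is hypothetical
```

Under these rules, a v2 metric with a single `query` still validates and is wrapped into a one-entry mapping, while a v1 metric that sets `queries`, or a metric with two queries and no formula, is rejected up front.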

dieend · 2024-02-13T18:54:13Z

Hi, when will this change be included in a release?

We're eager to use it and would love it in the next available version, otherwise we have to fork the repo and build our own image

meeech · 2024-02-13T19:58:44Z

It should go out with 1.7

aleksey-360 · 2024-05-28T00:47:34Z

Any news on when this is expected to be available?

tukak added the bug Something isn't working label May 26, 2023

tukak changed the title ~~Datadog V2 API fails when using formulas in query~~ Datadog v2 API fails when using formulas in query May 26, 2023

zachaller pushed a commit that referenced this issue Jun 6, 2023

docs: Update datadog.md - clarify formulas #2813 (#2819)

b853c29

Update datadog.md - clarify formulas Signed-off-by: Alex Eftimie <[email protected]>

kostis-codefresh added the analysis Related to Analysis CRD label Jun 7, 2023

tico24 assigned tico24 and unassigned tico24 Jul 13, 2023

zachaller assigned meeech Jul 13, 2023

meeech mentioned this issue Sep 2, 2023

fix(metricprovider): support Datadog v2 API Fixes #2813 #2997

Merged

6 tasks

zachaller closed this as completed in #2997 Oct 14, 2023
