Datadog v2 analysis ends in Error due to 500 from Datadog #2771

Closed
tukak opened this issue May 10, 2023 · 9 comments · Fixed by #2775
Labels
bug Something isn't working

Comments

@tukak
Contributor

tukak commented May 10, 2023

Checklist:

  • [x] I've included steps to reproduce the bug.
  • [x] I've included the version of argo rollouts.

Describe the bug

Switching an analysis to the Datadog v2 API causes a 500 error in the AnalysisRun; Datadog returns a web page with JavaScript instead of an API response.

To Reproduce

Take a working analysis and add apiVersion: v2 to spec.metrics[].provider.datadog, i.e.

apiVersion: argoproj.io/v1alpha1
kind: AnalysisTemplate
metadata:
  name: rollout-test-success-rate
spec:
  metrics:
    - name: success-rate
      interval: 1m
      successCondition: default(result, 0) < 10
      provider:
        datadog:
          apiVersion: v2
          interval: 1m
          query: |
            avg:rollout.test.failures{*}.as_count()

Expected behavior

The analysis should finish as Failed or Successful, as it does when apiVersion is v1 or not set.

Version

argo-rollouts:v1.5.0

Logs

Metric "success-rate" assessed Error due to consecutiveErrors (5) > consecutiveErrorLimit (4): "Error Message: received non 2xx response code: 500

Attached full logs as a file due to size.

rollouts_log.txt
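
The log line above reads as a consecutive-error check: five errors in a row exceeded a limit of four, so the metric was assessed as Error. A minimal Go sketch of that kind of check follows. It is a simplified model, not Argo Rollouts' actual code; the function name `exceedsConsecutiveErrorLimit` and the boolean input are hypothetical.

```go
package main

import "fmt"

// exceedsConsecutiveErrorLimit is a hypothetical, simplified model of the
// check implied by the log line. Each element of measurements is true when
// that measurement ended in an error (for example, a non-2xx response).
// It reports whether any run of consecutive errors is longer than limit.
func exceedsConsecutiveErrorLimit(measurements []bool, limit int) bool {
	consecutive := 0
	for _, errored := range measurements {
		if errored {
			consecutive++
			if consecutive > limit {
				return true
			}
		} else {
			// A measurement without an error resets the streak.
			consecutive = 0
		}
	}
	return false
}

func main() {
	// Five errors in a row against a limit of 4, as in the log.
	fmt.Println(exceedsConsecutiveErrorLimit([]bool{true, true, true, true, true}, 4)) // true

	// Four errors, one clean measurement, then one more error: streak never exceeds 4.
	fmt.Println(exceedsConsecutiveErrorLimit([]bool{true, true, true, true, false, true}, 4)) // false
}
```

Under this model, every v2 query returning a 500 is enough to end the run in Error rather than Failed or Successful.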

Message from the maintainers:

Impacted by this bug? Give it a 👍. We prioritize the issues with the most 👍.

@tukak tukak added the bug Something isn't working label May 10, 2023
@zachaller
Collaborator

@daniddelrio Just letting you know since you implemented this

@alexef
Member

alexef commented May 11, 2023

for the record, this is where the feature was implemented: #2592

I am currently trying to reproduce the issue - we reached out to Datadog for more support as the error response is not helpful

@tukak
Contributor Author

tukak commented May 11, 2023

Thanks for looking into it. Let me know if you need any input about our setup or any test I can run.

@alexef
Member

alexef commented May 11, 2023

@tukak if you can build and validate the fix in #2775 I would appreciate it

@tukak
Contributor Author

tukak commented May 11, 2023

@alexef built and tested, now it fails with

received non 2xx response code: 400 {"errors":["Queries before 2010 are invalid"]}

@alexef
Member

alexef commented May 11, 2023

awesome. this could mean that the way we pack timestamps is wrong. I'll try to come up with another fix for that

@alexef
Member

alexef commented May 11, 2023

@tukak see updated PR with milliseconds

@tukak
Contributor Author

tukak commented May 12, 2023

@alexef the updated PR (861c38c) works great in our case. Thanks a lot for the fix

@alexef
Member

alexef commented May 12, 2023

@tukak thank you for confirming this works for you. let's wait for @zachaller to review and merge the fixes 🙏

zachaller pushed a commit that referenced this issue May 14, 2023
* Datadog: properly wrap request body

Signed-off-by: Alex Eftimie <[email protected]>

* Use milliseconds in v2 calls to datadog

Signed-off-by: Alex Eftimie <[email protected]>

---------

Signed-off-by: Alex Eftimie <[email protected]>
zachaller pushed a commit that referenced this issue May 24, 2023
* Datadog: properly wrap request body

Signed-off-by: Alex Eftimie <[email protected]>

* Use milliseconds in v2 calls to datadog

Signed-off-by: Alex Eftimie <[email protected]>

---------

Signed-off-by: Alex Eftimie <[email protected]>

3 participants