experiments: recovered from panic, on v1.4 #2608
@Lykathia Can you try v1.4.1?
@zachaller Yep, the PR is in. I will get it reviewed and tested on Monday and report back. Thanks!
1.4.1 does not appear to have addressed the issue.
@Lykathia I would love to see your Rollout object with the status field intact, if you could share it. I also want to know whether you are trying to use a weight on your experiment, which I should be able to tell from the Rollout object once you share it.
Sure thing! Some nouns edited and some annotations / labels removed:

apiVersion: argoproj.io/v1alpha1
kind: Rollout
metadata:
  annotations:
    argocd.argoproj.io/sync-wave: '10'
    rollout.argoproj.io/revision: '46'
    rollout.argoproj.io/workload-generation: '52'
  creationTimestamp: '2022-06-03T17:50:51Z'
  generation: 39
  labels:
    app.kubernetes.io/instance: production-eks-prod-examplens-example
  name: example-rollout-c89ac2e8
  namespace: examplens
  resourceVersion: '614921626'
  uid: 86e21dfc-7adf-40f5-a9de-b5d368b0a843
spec:
  analysis:
    successfulRunHistoryLimit: 1
    unsuccessfulRunHistoryLimit: 1
  progressDeadlineAbort: true
  replicas: 2
  restartAt: '2022-11-09T19:40:00Z'
  revisionHistoryLimit: 5
  strategy:
    canary:
      steps:
        - experiment:
            analyses:
              - args:
                  - name: api-root-url
                    value: >-
                      http://example-rollout-preview.examplens.svc.cluster.local:8080
                  - name: request-timeout-milliseconds
                    value: '1000'
                name: example-smoke-test-analysis
                requiredForCompletion: true
                templateName: example-smoke-test-analysis
            duration: 5m
            templates:
              - metadata:
                  labels:
                    app.kubernetes.io/name: example-rollout-preview
                name: preview
                replicas: 1
                selector:
                  matchLabels:
                    app.kubernetes.io/name: example-rollout-preview
                specRef: canary
        - setWeight: 5
        - pause:
            duration: 10m
      trafficRouting:
        istio:
          destinationRule:
            canarySubsetName: canary
            name: example-mesh-destination-c833f030
            stableSubsetName: stable
          virtualService:
            name: example-mesh-virtualservice-c8e2aa97
  workloadRef:
    apiVersion: apps/v1
    kind: Deployment
    name: example-svc-deployment-c8e30830
status:
  HPAReplicas: 2
  availableReplicas: 2
  blueGreen: {}
  canary:
    weights:
      canary:
        podTemplateHash: 5ccdfd574c
        weight: 0
      stable:
        podTemplateHash: 5ccdfd574c
        weight: 100
  conditions:
    - lastTransitionTime: '2023-02-27T14:45:05Z'
      lastUpdateTime: '2023-02-27T14:45:05Z'
      message: Rollout has minimum availability
      reason: AvailableReason
      status: 'True'
      type: Available
    - lastTransitionTime: '2023-02-27T15:12:21Z'
      lastUpdateTime: '2023-02-27T15:12:21Z'
      message: Rollout is paused
      reason: RolloutPaused
      status: 'False'
      type: Paused
    - lastTransitionTime: '2023-02-27T15:12:32Z'
      lastUpdateTime: '2023-02-27T15:12:32Z'
      message: RolloutCompleted
      reason: RolloutCompleted
      status: 'True'
      type: Completed
    - lastTransitionTime: '2023-02-27T15:13:02Z'
      lastUpdateTime: '2023-02-27T15:13:02Z'
      message: Rollout is healthy
      reason: RolloutHealthy
      status: 'True'
      type: Healthy
    - lastTransitionTime: '2023-02-27T15:12:21Z'
      lastUpdateTime: '2023-02-27T15:13:02Z'
      message: >-
        ReplicaSet "example-rollout-c89ac2e8-5ccdfd574c" has successfully
        progressed.
      reason: NewReplicaSetAvailable
      status: 'True'
      type: Progressing
  currentPodHash: 5ccdfd574c
  currentStepHash: 75dcbd8f68
  currentStepIndex: 3
  observedGeneration: '39'
  phase: Healthy
  readyReplicas: 2
  replicas: 2
  restartedAt: '2022-11-09T19:40:00Z'
  selector: app.kubernetes.io/name=example-svc-c8333369
  stableRS: 5ccdfd574c
  updatedReplicas: 2
  workloadObservedGeneration: '52'

Everything seems to complete fine and as expected, just the logs are exploding w/ the NPEs.
Maybe related: #2734
I saw similar exceptions occurring when calculateWeightDestinationsFromExperiment was called. My guess was that Argo was calling this function to determine the weight the experiment should have, and it was returning [...]. I solved it by adding a [...]
When running an experiment, a number of nil pointer exceptions are thrown before the experiment eventually passes and moseys on its merry way.
We use Istio in our environment.
It seems to resolve itself after a dozen+ failures.
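For readers unfamiliar with how a Go controller ends up logging "recovered from panic" while still finishing its work, here is a minimal, self-contained sketch. It is not Argo Rollouts code: the `template` type, its optional `weight` pointer, and the functions `weightOf`, `safeWeightOf`, and `reconcile` are all hypothetical names chosen for illustration, and the thread does not establish which pointer is actually nil. The sketch only shows the general pattern: an unguarded dereference of an optional pointer panics, a deferred `recover` turns that panic into a logged error, and a nil check avoids the panic altogether.

```go
package main

import "fmt"

// template is a hypothetical stand-in for an experiment template whose
// weight is optional. It is NOT the Argo Rollouts type.
type template struct {
	name   string
	weight *int32 // nil when no weight is configured
}

// weightOf dereferences the optional weight without a nil check,
// so it panics when weight is nil.
func weightOf(t *template) int32 {
	return *t.weight
}

// safeWeightOf guards the dereference and reports whether a weight was set.
func safeWeightOf(t *template) (int32, bool) {
	if t == nil || t.weight == nil {
		return 0, false
	}
	return *t.weight, true
}

// reconcile wraps the unguarded call and converts a panic into an error,
// similar in spirit to a "recovered from panic" log line.
func reconcile(t *template) (w int32, err error) {
	defer func() {
		if r := recover(); r != nil {
			err = fmt.Errorf("recovered from panic: %v", r)
		}
	}()
	return weightOf(t), nil
}

func main() {
	five := int32(5)
	templates := []*template{
		{name: "preview"},                 // no weight set
		{name: "weighted", weight: &five}, // weight set
	}
	for _, t := range templates {
		// Unguarded path: the panic is recovered and surfaced as an error.
		if _, err := reconcile(t); err != nil {
			fmt.Println(t.name, "->", err)
		}
		// Guarded path: no panic, the missing weight is simply skipped.
		if w, ok := safeWeightOf(t); ok {
			fmt.Println(t.name, "weight", w)
		} else {
			fmt.Println(t.name, "has no weight; skipping")
		}
	}
}
```

Under this pattern the work can still complete on a later pass, which would be consistent with logs that fill up with recovered panics while the rollout itself ends up healthy.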
To Reproduce
Expected behavior
The logs should not contain nil pointer exceptions.
Version
1.4
Logs