Search before asking
I searched the issues and found no similar issues.
KubeRay Component
ci
What happened + What you expected to happen
When make test is run several times, the following error occurs intermittently (the 409 error appeared 14 times in 100 runs):
Status: "Failure",
Message: "Operation cannot be fulfilled on rayservices.ray.io \"rayservice-sample\": the object has been modified; please apply your changes to the latest version and try again",
Reason: "Conflict",
Details: {
Name: "rayservice-sample",
Group: "ray.io",
Kind: "rayservices",
UID: "",
Causes: nil,
RetryAfterSeconds: 0,
},
Code: 409,
From the k8s api-conventions document:
Kubernetes leverages the concept of resource versions to achieve optimistic concurrency. All Kubernetes resources have a "resourceVersion" field as part of their metadata. This resourceVersion is a string that identifies the internal version of an object that can be used by clients to determine when objects have changed. When a record is about to be updated, its version is checked against a pre-saved value, and if it doesn't match, the update fails with a StatusConflict (HTTP status code 409).
This error is most likely caused by someone else (for example, another client) modifying the object between the last Get and the Update.
From the k8s api-conventions document:
In the case of a conflict, the correct client action at this point is to GET the resource again, apply the changes afresh, and try submitting again.
The api-conventions document suggests a retry strategy (though it puts more emphasis on reading first and then writing). It is doubtful whether this strategy should be applied in the operator, where it may be better to simply fail the update and let the next reconciliation decide what to do. In my view, however, it is a good fit for the tests.
So, a possible solution (open for discussion) is to use RetryOnConflict for every update operation in the tests.
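To make the failure mode concrete, here is a minimal sketch of the optimistic-concurrency check and the "GET again, apply the changes afresh, resubmit" recovery. It does not talk to a real API server: the `store` and `object` types and `errConflict` are hypothetical stand-ins invented for illustration, and the version is an int here although Kubernetes treats resourceVersion as an opaque string.

```go
package main

import (
	"errors"
	"fmt"
)

// errConflict stands in for the API server's 409 Conflict response.
var errConflict = errors.New("conflict: the object has been modified")

// object is a hypothetical, stripped-down resource.
type object struct {
	ResourceVersion int // an int keeps the sketch short; the real field is a string
	Replicas        int
}

// store is a hypothetical in-memory stand-in for the API server.
type store struct{ current object }

// Get returns a copy of the stored object, including its version.
func (s *store) Get() object { return s.current }

// Update accepts the write only if the caller saw the latest version.
func (s *store) Update(o object) error {
	if o.ResourceVersion != s.current.ResourceVersion {
		return errConflict
	}
	o.ResourceVersion++
	s.current = o
	return nil
}

func main() {
	s := &store{current: object{ResourceVersion: 1, Replicas: 1}}

	stale := s.Get() // the test reads the object

	other := s.Get() // another client updates it in between
	other.Replicas = 2
	_ = s.Update(other)

	stale.Replicas = 3
	fmt.Println(s.Update(stale)) // rejected: the version is stale

	fresh := s.Get() // Get again, reapply the change, resubmit
	fresh.Replicas = 3
	fmt.Println(s.Update(fresh), s.Get().Replicas)
}
```

The second Update fails only because the copy it was built from is out of date, which matches the intermittent nature of the failure in the test runs: it depends on whether another writer gets in between the Get and the Update.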
How do others deal with Update?
client-go already has a helper function, RetryOnConflict, and an example that uses the above strategy. RetryOnConflict retries with backoff until the update succeeds, the error is something other than a 409 conflict, or the backoff is exhausted.
azure-databricks-operator (also link) takes the plain approach: no retry, it just calls Update and checks whether it succeeded.
Reproduction script
Run make test several times.
Anything else
No response
Are you willing to submit a PR?