SDK Warmup requests should be hedged with the AvailabilityStrategy #4672

adamnova · 2024-09-11T14:17:46Z

Is your feature request related to a problem? Please describe.
When the Cosmos SDK starts up, it makes a warm-up request for pkranges to what seems to be a built-in database (the name appears to be base64 encoded). This request fails if the primary Cosmos region is unavailable for any reason. Since it is a read request, I believe it should be covered by the hedging availability strategy, so that clients are not blocked from making GET calls even if the primary region is temporarily unavailable.

During testing I specified two application preferred regions and blocked requests to the first one to simulate an outage. If I make the first request before the simulated outage, everything works correctly and my ReadItemAsync falls back to the secondary region. But if I simulate the outage before the first request, the request for pkranges (https://<your-cosmos-db-account>.documents.azure.com/dbs/<database-id>/colls/<collection-id>/pkranges) fails and never falls back to the secondary region, even though it is a read request.

Describe the solution you'd like
All Cosmos SDK warm-up requests that read from the database should be able to fall back to a secondary region.
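
To make the requested behavior concrete, here is a minimal sketch of cross-region hedging for a read, written as a standalone Python simulation. It is not the SDK's implementation or API: `hedged_read`, `fake_pkranges_read`, the region names, and the 50 ms threshold are all hypothetical and only illustrate the idea that a warm-up read should move on to the next preferred region when the first one fails or is too slow.

```python
import asyncio


async def hedged_read(read, regions, threshold=0.05):
    """Try regions in preference order.

    The next region is started when the in-flight attempts have all failed,
    or when none of them has succeeded within `threshold` seconds.
    (Hypothetical helper, not part of any Cosmos SDK.)
    """
    remaining = list(regions)
    pending = set()
    errors = []
    while remaining or pending:
        if remaining:
            pending.add(asyncio.create_task(read(remaining.pop(0))))
        # Apply the hedging threshold only while there are more regions to try.
        timeout = threshold if remaining else None
        done, pending = await asyncio.wait(
            pending, timeout=timeout, return_when=asyncio.FIRST_COMPLETED
        )
        for task in done:
            if task.exception() is None:
                # First successful response wins; drop the slower attempts.
                for other in pending:
                    other.cancel()
                return task.result()
            errors.append(task.exception())
    raise RuntimeError(f"all regions failed: {errors}")


async def fake_pkranges_read(region):
    # Hypothetical stand-in for the warm-up pkranges request:
    # the primary region is simulated as unreachable.
    if region == "primary":
        raise ConnectionError("primary region unreachable")
    return f"pkranges from {region}"


result = asyncio.run(hedged_read(fake_pkranges_read, ["primary", "secondary"]))
```

In this simulation the read succeeds from the secondary region even though the outage starts before the first request, which is the behavior asked for above.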

Describe alternatives you've considered
I do not think there are any alternatives.


kirankumarkolli · 2024-09-13T01:39:05Z

@adamnova thank you for reporting it.

It's a good one to follow-up on.

Trying to understand the impact you have seen: is this a livesite issue resulting in availability loss for your service, or is it from test validation?

adamnova · 2024-09-13T07:20:06Z

In this case I ran into it while evaluating the new AvailabilityStrategy API. But this is something that happens in production as well, and increasing startup resiliency is always a good idea to avoid scale-up issues.

microsoft-github-policy-service bot added the needs-investigation and customer-reported labels Sep 11, 2024

kirankumarkolli added this to the Backlog milestone Sep 13, 2024
