Is your feature request related to a problem? Please describe.
During a regional outage, the Cosmos DB (CDB) client received 503 errors from the service in the primary region. The account is configured with 3 read replicas, but the client did not fall back to the other replicas until the primary region was marked offline. It kept receiving 503 errors for about an hour.
During the investigation, we found that the ApplicationRegion or ApplicationPreferredRegions property has to be set for reads to fall back on a 503. Otherwise, the client considers only the primary region for reads. https://learn.microsoft.com/en-us/azure/cosmos-db/nosql/troubleshoot-sdk-availability
So the client handles 503 errors differently depending on whether ApplicationRegion is set. This is not intuitive, and it ignores the multiple replicas configured in the database.
Describe the solution you'd like
Solution 1:
“If the application region is set, reads go to the preferred available region closest to the application region. If the application region is not set, reads go to the preferred available database region in order of failover priority (as configured in the database).”
This ensures consistent behavior whether or not AppRegion is set and improves the success rate, at the cost of a potential increase in latency during failover.
Describe alternatives you've considered
Solution 2:
Given how important the regional preferences (ApplicationRegion or ApplicationPreferredRegions) are for the client to be resilient to a regional outage, could one of them be made a required field during client initialization?
This ensures callers always set this information, so it cannot be missed, and the behavior becomes the same in all scenarios.
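To make Solution 2 concrete, here is a minimal sketch of what "required at initialization" could look like. It is a plain Python model, not SDK code: the `RegionPreference` class and its field names are hypothetical and only mirror the ApplicationRegion and ApplicationPreferredRegions options discussed above.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class RegionPreference:
    """Hypothetical holder for the two regional preference options."""

    application_region: Optional[str] = None
    application_preferred_regions: Optional[Sequence[str]] = None

    def __post_init__(self) -> None:
        # Reject construction when neither preference is supplied, so a
        # client cannot be created without any fallback information.
        if not self.application_region and not self.application_preferred_regions:
            raise ValueError(
                "Set application_region or application_preferred_regions "
                "so that reads can fall back during a regional outage."
            )
```

With this shape, `RegionPreference(application_region="East US")` succeeds, while `RegionPreference()` fails fast at startup instead of silently pinning reads to one region.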
Additional context
More context to Solution 1:
| Scenario | User Action | Current Behavior | Proposed Change | Expected Outcome |
| --- | --- | --- | --- | --- |
| 1 | User sets AppRegion | All read regions are populated as preferred, sorted by proximity to AppRegion. Fallback to next closer region on 503. | | |
| 2 | User does not set AppRegion | SDK defaults to first DB region (irrespective of the latency). No fallback on 503. | SDK uses first DB region and considers all DB regions for fallback on 503 (in the order of failover priority) | Consistent behavior whether AppRegion is set or not; improved success rate. Potential increase in latency during failover. |
For example,
Consider a Cosmos DB instance 'CDB A' with three regions: East US (EUS) with priority 0, West US (WUS) with priority 1, and West Europe (WEU) with priority 2.
Currently, if a user sets ApplicationRegion to EUS, the SDK populates all read regions as preferred regions, sorted by their proximity to EUS. In the event of a 503 error from EUS, the SDK falls back to WUS, the next closest region.
If the user does not set ApplicationRegion, the SDK defaults to the first database region, EUS. On 503 errors from EUS, the SDK has no fallback mechanism and the reads fail.
The proposed enhancement is to consider all the read replicas and fall back to the next replica (in order of failover priority) when ApplicationRegion is not set.
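The difference between the current and proposed behavior in this example can be sketched as a small model. This is illustrative Python, not SDK code: `ACCOUNT_REGIONS`, `PROXIMITY`, `read_region_order`, and `first_available` are hypothetical names, and the proximity ranking is an assumed input rather than something the SDK exposes.

```python
from typing import Dict, List, Optional, Set

# Account layout from the example: region -> failover priority.
ACCOUNT_REGIONS: Dict[str, int] = {"East US": 0, "West US": 1, "West Europe": 2}

# Assumed proximity ranking per application region (closest first).
# Illustrative values only; how the SDK ranks regions is not modeled here.
PROXIMITY: Dict[str, List[str]] = {
    "East US": ["East US", "West US", "West Europe"],
}


def read_region_order(
    application_region: Optional[str] = None, proposed: bool = False
) -> List[str]:
    """Return the regions a read may try, in order."""
    by_priority = sorted(ACCOUNT_REGIONS, key=lambda r: ACCOUNT_REGIONS[r])
    if application_region is not None:
        # Scenario 1: all read regions, sorted by proximity to the app region.
        return [r for r in PROXIMITY[application_region] if r in ACCOUNT_REGIONS]
    if proposed:
        # Scenario 2, proposed: all regions in failover-priority order.
        return by_priority
    # Scenario 2, current: only the first database region, no fallback.
    return by_priority[:1]


def first_available(order: List[str], unavailable: Set[str]) -> Optional[str]:
    """Pick the first region in the order that is not returning 503."""
    for region in order:
        if region not in unavailable:
            return region
    return None


outage = {"East US"}
print(first_available(read_region_order("East US"), outage))
print(first_available(read_region_order(), outage))
print(first_available(read_region_order(proposed=True), outage))
```

Under an East US outage, the model falls back to West US when the application region is set or when the proposed ordering is used, and finds no region at all in the current unset case, which matches the failure described above.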