[BUG]: Handling 403:1008 for address resolution calls. #4710

Open
Tracked by #4703
jeet1995 opened this issue Sep 19, 2024 · 3 comments

@jeet1995
Member

We are continuously improving the SDK. If possible, make sure the problem persists in the latest SDK version.

Describe the bug
After a region has been removed, the SDK may still reach out to it. Address resolution calls made against the removed region fail with 403:1008, which can cause the whole operation to fail.

To Reproduce
Either use the chaos framework in Gateway mode to inject 403:1008 responses for address resolution requests, or run a workload against a single-write, multi-region account and fail over the write region.
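
For anyone without access to the chaos framework, the first repro path can be approximated with a small fault-injecting wrapper. This is a minimal Python sketch, not the chaos framework or any SDK API: `Response`, `is_address_request`, and `with_injected_403_1008` are hypothetical names, and the `/addresses/` path check is an assumption based on the request URIs in the traces in this thread.

```python
from typing import Callable, NamedTuple


class Response(NamedTuple):
    """Hypothetical minimal response: HTTP status plus Cosmos substatus."""
    status: int
    sub_status: int


def is_address_request(path: str) -> bool:
    # Assumption: address resolution requests target ".../addresses/?...",
    # as in the request URIs shown in this thread.
    return "/addresses/" in path


def with_injected_403_1008(
    send: Callable[[str, str], Response], faulty_region: str
) -> Callable[[str, str], Response]:
    """Wrap a send(region, path) callable so that address resolution
    requests to faulty_region get 403:1008 instead of reaching the backend."""

    def wrapped(region: str, path: str) -> Response:
        if region == faulty_region and is_address_request(path):
            return Response(403, 1008)
        return send(region, path)

    return wrapped
```

Only address resolution requests to the chosen region are affected; document requests and requests to other regions pass through unchanged.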

Expected behavior
Ideally, address resolution performed inline with a document operation should be covered by that operation's ClientRetryPolicy and be retried against other available regions.
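
To make the expectation concrete, here is a minimal Python sketch of the desired behavior. It is not the SDK's ClientRetryPolicy: `GatewayError`, `is_account_not_found_in_region`, and `resolve_with_region_fallback` are hypothetical names, and the region list is assumed to already be in preference order.

```python
from typing import Callable, Optional, Sequence, TypeVar

T = TypeVar("T")

FORBIDDEN = 403
DATABASE_ACCOUNT_NOT_FOUND = 1008  # substatus reported in this issue


class GatewayError(Exception):
    """Hypothetical error carrying the status, substatus, and region."""

    def __init__(self, status: int, sub_status: int, region: str) -> None:
        super().__init__(f"{status}:{sub_status} from {region}")
        self.status = status
        self.sub_status = sub_status
        self.region = region


def is_account_not_found_in_region(err: GatewayError) -> bool:
    return err.status == FORBIDDEN and err.sub_status == DATABASE_ACCOUNT_NOT_FOUND


def resolve_with_region_fallback(
    regions: Sequence[str], resolve: Callable[[str], T]
) -> T:
    """Try address resolution in each region in order.

    A 403:1008 moves on to the next region; any other error is re-raised
    immediately. If every region returns 403:1008, the last error is raised.
    """
    last_error: Optional[GatewayError] = None
    for region in regions:
        try:
            return resolve(region)
        except GatewayError as err:
            if not is_account_not_found_in_region(err):
                raise
            last_error = err
    if last_error is None:
        raise ValueError("regions must not be empty")
    raise last_error
```

The point of the sketch is that a 403:1008 from address resolution is treated as a signal to try the next region rather than as a terminal failure for the document operation.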

@kirankumarkolli
Member

kirankumarkolli commented Oct 23, 2024

Related exception stack:

2024-09-24T17:10:00.008184Z	CosmosDbRequestEndWithClientFailure	CosmosItemDataProvider.Query.PrivateLinkAssociation	Response status code does not indicate success: Forbidden (403); Substatus: 1008; ActivityId: 123fc2f9-43ae-4fa7-bcd1-5b284e0e7354; Reason: ( RequestUri: https://xxxx.documents.azure.com/dbs/LxkPAA==/colls/LxkPAMPhU1A=/pkranges; RequestMethod: GET; Header: authorization Length: 86; Header: x-ms-date Length: 29; Header: x-ms-max-item-count Length: 2; Header: A-IM Length: 16; Header: x-ms-activity-id Length: 36; Header: Cache-Control Length: 8; Header: User-Agent Length: 94; Header: x-ms-version Length: 10; Header: x-ms-cosmos-sdk-supportedcapabilities Length: 1; Header: Accept Length: 16; ActivityId: 123fc2f9-43ae-4fa7-bcd1-5b284e0e7354, Request URI: /dbs/LxkPAA==/colls/LxkPAMPhU1A=/pkranges, RequestStats: Microsoft.Azure.Cosmos.Tracing.TraceData.ClientSideRequestStatisticsTraceDatum, SDK: Windows/10.0.20348 cosmos-netstandard-sdk/3.34.4);
    at Microsoft.Azure.Cosmos.GatewayStoreClient.<ParseResponseAsync>d__9.MoveNext() --- End of stack trace from previous location where exception was thrown ---
    at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
    at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
    at Microsoft.Azure.Cosmos.GatewayStoreClient.<InvokeAsync>d__5.MoveNext() --- End of stack trace from previous location where exception was thrown ---
    at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
    at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
    at Microsoft.Azure.Cosmos.GatewayStoreModel.<ProcessMessageAsync>d__9.MoveNext() --- End of stack trace from previous location where exception was thrown ---
    at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
    at Microsoft.Azure.Cosmos.GatewayStoreModel.<ProcessMessageAsync>d__9.MoveNext() --- End of stack trace from previous location where exception was thrown ---
    at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
    at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
    at Microsoft.Azure.Cosmos.Routing.PartitionKeyRangeCache.<ExecutePartitionKeyRangeReadChangeFeedAsync>d__12.MoveNext() --- End of stack trace from previous location where exception was thrown ---
    at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
    at Microsoft.Azure.Documents.BackoffRetryUtility`1.<ExecuteRetryAsync>d__6`2.MoveNext() --- End of stack trace from previous location where exception was thrown ---
    at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
    at Microsoft.Azure.Documents.ShouldRetryResult.ThrowIfDoneTrying(ExceptionDispatchInfo capturedException)
    at Microsoft.Azure.Documents.BackoffRetryUtility`1.<ExecuteRetryAsync>d__6`2.MoveNext() --- End of stack trace from previous location where exception was thrown ---
    at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
    at Microsoft.Azure.Documents.BackoffRetryUtility`1.<ExecuteRetryAsync>d__6`2.MoveNext() --- End of stack trace from previous location where exception was thrown ---
    at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
    at Microsoft.Azure.Cosmos.Routing.PartitionKeyRangeCache.<GetRoutingMapForCollectionAsync>d__11.MoveNext() --- End of stack trace from previous location where exception was thrown ---
    at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
    at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
    at Microsoft.Azure.Cosmos.AsyncCacheNonBlocking`2.<GetAsync>d__8.MoveNext() --- End of stack trace from previous location where exception was thrown ---
    at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
    at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
    at Microsoft.Azure.Cosmos.Routing.PartitionKeyRangeCache.<TryLookupAsync>d__9.MoveNext() --- End of stack trace from previous location where exception was thrown ---
    at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
    at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
    at Microsoft.Azure.Cosmos.Routing.PartitionKeyRangeCache.<TryGetOverlappingRangesAsync>d__7.MoveNext() --- End of stack trace from previous location where exception was thrown ---
    at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
    at Microsoft.Azure.Cosmos.IRoutingMapProviderExtensions.<TryGetOverlappingRangesAsync>d__3.MoveNext() --- End of stack trace from previous location where exception was thrown ---
    at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
    at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
    at Microsoft.Azure.Cosmos.CosmosQueryClientCore.<GetTargetPartitionKeyRangesAsync>d__14.MoveNext() --- End of stack trace from previous location where exception was thrown ---
    at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
    at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
    at Microsoft.Azure.Cosmos.Query.Core.ExecutionContext.CosmosQueryExecutionContextFactory.<GetTargetPartitionKeyRangesAsync>d__18.MoveNext() --- End of stack trace from previous location where exception was thrown ---
    at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
    at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
    at Microsoft.Azure.Cosmos.Query.Core.ExecutionContext.CosmosQueryExecutionContextFactory.<TryCreateFromPartitionedQueryExecutionInfoAsync>d__10.MoveNext() --- End of stack trace from previous location where exception was thrown ---
    at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
    at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
    at Microsoft.Azure.Cosmos.Query.Core.ExecutionContext.CosmosQueryExecutionContextFactory.<TryCreateCoreContextAsync>d__9.MoveNext() --- End of stack trace from previous location where exception was thrown ---
    at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
    at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
    at Microsoft.Azure.Cosmos.Query.Core.AsyncLazy`1.<GetValueAsync>d__7.MoveNext() --- End of stack trace from previous location where exception was thrown ---
    at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
    at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
    at Microsoft.Azure.Cosmos.Query.Core.Pipeline.LazyQueryPipelineStage.<MoveNextAsync>d__7.MoveNext() --- End of stack trace from previous location where exception was thrown ---
    at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
    at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
    at Microsoft.Azure.Cosmos.Query.Core.Pipeline.NameCacheStaleRetryQueryPipelineStage.<MoveNextAsync>d__10.MoveNext() --- End of stack trace from previous location where exception was thrown ---
    at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
    at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
    at Microsoft.Azure.Cosmos.Query.Core.Pipeline.CatchAllQueryPipelineStage.<MoveNextAsync>d__1.MoveNext(

@kirankumarkolli
Member

This is a cold-start scenario:

  • The query started on a cold client.
  • It tried to read the collection's partition key ranges and got 403/1008 (DatabaseAccount not found in that region).
    • It's a newly added region.

@kirankumarkolli
Member

Another related exception:

Microsoft.Azure.Cosmos.CosmosException : Response status code does not indicate success: Forbidden (403); Substatus: 1008; ActivityId: ba79d888-129d-4424-a108-5c92fdc2a1a4; Reason: (
RequestUri: https://XXX.documents.azure.com//addresses/?$resolveFor=dbs%2fXzQbAA%3d%3d%2fcolls%2fXzQbAInIqrI%3d%2fdocs&$filter=protocol eq rntbd&$partitionKeyRangeIds=0;
RequestMethod: GET;
Header: Authorization Length: 1593;
Header: x-ms-date Length: 29;
Header: x-ms-force-refresh Length: 4;
Header: Cache-Control Length: 8;
Header: User-Agent Length: 79;
Header: x-ms-version Length: 10;
Header: x-ms-cosmos-sdk-supportedcapabilities Length: 1;
Header: Accept Length: 16;

ActivityId: ba79d888-129d-4424-a108-5c92fdc2a1a4, Request URI: //addresses/?$resolveFor=dbs%2fXzQbAA%3d%3d%2fcolls%2fXzQbAInIqrI%3d%2fdocs&$filter=protocol%20eq%20rntbd&$partitionKeyRangeIds=0, RequestStats: Microsoft.Azure.Cosmos.Tracing.TraceData.ClientSideRequestStatisticsTraceDatum, SDK: Windows/10.0.20348 cosmos-netstandard-sdk/3.32.0);
   at Microsoft.Azure.Cosmos.GatewayStoreClient.ParseResponseAsync(HttpResponseMessage responseMessage, JsonSerializerSettings serializerSettings, DocumentServiceRequest request)
   at Microsoft.Azure.Cosmos.Routing.GatewayAddressCache.GetServerAddressesViaGatewayAsync(DocumentServiceRequest request, String collectionRid, IEnumerable`1 partitionKeyRangeIds, Boolean forceRefresh)
   at Microsoft.Azure.Cosmos.Routing.GatewayAddressCache.GetAddressesForRangeIdAsync(DocumentServiceRequest request, PartitionAddressInformation cachedAddresses, String collectionRid, String partitionKeyRangeId, Boolean forceRefresh)
   at Microsoft.Azure.Cosmos.AsyncCacheNonBlocking`2.AsyncLazyWithRefreshTask`1.CreateAndWaitForBackgroundRefreshTaskAsync(Func`2 createRefreshTask)
   at Microsoft.Azure.Cosmos.AsyncCacheNonBlocking`2.UpdateCacheAndGetValueFromBackgroundTaskAsync(TKey key, AsyncLazyWithRefreshTask`1 initialValue, Func`2 callbackDelegate, String operationName)
   at Microsoft.Azure.Cosmos.AsyncCacheNonBlocking`2.GetAsync(TKey key, Func`2 singleValueInitFunc, Func`2 forceRefresh)
   at Microsoft.Azure.Cosmos.Routing.GatewayAddressCache.TryGetAddressesAsync(DocumentServiceRequest request, PartitionKeyRangeIdentity partitionKeyRangeIdentity, ServiceIdentity serviceIdentity, Boolean forceRefreshPartitionAddresses, CancellationToken cancellationToken)
   at Microsoft.Azure.Cosmos.AddressResolver.TryResolveServerPartitionAsync(DocumentServiceRequest request, ContainerProperties collection, CollectionRoutingMap routingMap, Boolean collectionCacheIsUptodate, Boolean collectionRoutingMapCacheIsUptodate, Boolean forceRefreshPartitionAddresses, CancellationToken cancellationToken)
   at Microsoft.Azure.Cosmos.AddressResolver.ResolveAddressesAndIdentityAsync(DocumentServiceRequest request, Boolean forceRefreshPartitionAddresses, CancellationToken cancellationToken)
   at Microsoft.Azure.Cosmos.AddressResolver.ResolveAsync(DocumentServiceRequest request, Boolean forceRefreshPartitionAddresses, CancellationToken cancellationToken)
   at Microsoft.Azure.Cosmos.Routing.GlobalAddressResolver.ResolveAsync(DocumentServiceRequest request, Boolean forceRefresh, CancellationToken cancellationToken)
   at Microsoft.Azure.Documents.AddressSelector.ResolveAddressesAsync(DocumentServiceRequest request, Boolean forceAddressRefresh)
   at Microsoft.Azure.Documents.ConsistencyWriter.WritePrivateAsync(DocumentServiceRequest request, TimeoutHelper timeout, Boolean forceRefresh)
   at Microsoft.Azure.Documents.BackoffRetryUtility`1.ExecuteRetryAsync[TParam,TPolicy](Func`1 callbackMethod, Func`3 callbackMethodWithParam, Func`2 callbackMethodWithPolicy, TParam param, IRetryPolicy retryPolicy, IRetryPolicy`1 retryPolicyWithArg, Func`1 inBackoffAlternateCallbackMethod, Func`2 inBackoffAlternateCallbackMethodWithPolicy, TimeSpan minBackoffForInBackoffCallback, CancellationToken cancellationToken, Action`1 preRetryCallback)
   at Microsoft.Azure.Documents.ShouldRetryResult.ThrowIfDoneTrying(ExceptionDispatchInfo capturedException)
   at Microsoft.Azure.Doc--TRUNCATED--
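
When triaging logs like the two above, it can help to pull the status code, substatus, and ActivityId out of the exception message. The following is a minimal Python sketch that assumes the message layout shown in this thread (`(403); Substatus: 1008; ActivityId: <guid>`); `parse_cosmos_failure` is a hypothetical helper, not part of any SDK.

```python
import re
from typing import Optional, Tuple

# Matches "... (403); Substatus: 1008; ActivityId: <36-char GUID>" as it
# appears in the exception messages quoted in this thread.
_FAILURE_RE = re.compile(
    r"\((\d{3})\);\s*Substatus:\s*(\d+);\s*ActivityId:\s*([0-9a-fA-F-]{36})"
)


def parse_cosmos_failure(message: str) -> Optional[Tuple[int, int, str]]:
    """Return (status, substatus, activity_id), or None if the message
    does not follow the assumed layout."""
    match = _FAILURE_RE.search(message)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2)), match.group(3)
```

Filtering on `(403, 1008)` then separates these region-related failures from other 403 substatus codes.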
