-
Notifications
You must be signed in to change notification settings - Fork 494
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
CosmosClient Initialization: Adds implementation for opening Rntbd connections to backend replica nodes in Direct mode. #3508
CosmosClient Initialization: Adds implementation for opening Rntbd connections to backend replica nodes in Direct mode. #3508
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All good!
77fe491
to
58bf46c
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, nice work!
...oft.Azure.Cosmos/tests/Microsoft.Azure.Cosmos.EmulatorTests/ClientCreateAndInitializeTest.cs
Outdated
Show resolved
Hide resolved
90923fd
to
b2331fa
Compare
b2331fa
to
91e4ffc
Compare
...oft.Azure.Cosmos/tests/Microsoft.Azure.Cosmos.EmulatorTests/ClientCreateAndInitializeTest.cs
Outdated
Show resolved
Hide resolved
...oft.Azure.Cosmos/tests/Microsoft.Azure.Cosmos.EmulatorTests/ClientCreateAndInitializeTest.cs
Outdated
Show resolved
Hide resolved
Microsoft.Azure.Cosmos/tests/Microsoft.Azure.Cosmos.Tests/GatewayAddressCacheTests.cs
Outdated
Show resolved
Hide resolved
...oft.Azure.Cosmos/tests/Microsoft.Azure.Cosmos.EmulatorTests/ClientCreateAndInitializeTest.cs
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Left some comments around the test cases, great progress!
...oft.Azure.Cosmos/tests/Microsoft.Azure.Cosmos.EmulatorTests/ClientCreateAndInitializeTest.cs
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thanks :)
throw new ArgumentNullException(resourceName); | ||
} | ||
|
||
if (this.StoreModel != null) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When is it expected?
If so isn't it better to fail than silently easting?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think there is an scenario where StoreModel is null, we have checks like this in the same file, but can potentially be removed without issues
string containerLinkUri, | ||
CancellationToken cancellationToken) | ||
{ | ||
if (string.IsNullOrEmpty(databaseName) || |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: Personally I like one if-statement for argument for readablity.
@@ -482,5 +482,10 @@ private Uri GetFeedUri(DocumentServiceRequest request) | |||
{ | |||
return new Uri(this.endpointManager.ResolveServiceEndpoint(request), PathsHelper.GeneratePath(request.ResourceType, request, true)); | |||
} | |||
|
|||
public Task OpenConnectionsToAllReplicasAsync(string databaseName, string containerLinkUri, CancellationToken cancellationToken = default) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Argument per line
} | ||
} | ||
|
||
foreach (DocumentServiceResponse response in await Task.WhenAll(tasks)) | ||
foreach (TryCatch<DocumentServiceResponse> task in await Task.WhenAll(tasks)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is Monoid type used jsut to capture exception?
One alternative is to do simple try-catch and use Task to get the exception.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Monad is simpler/easier to work with afterwards, you cannot use try/catch inside the task with Task.WhenAll
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Additionally - We needed an approach where the task is throwing an exception, it should continue the execution and makes sure, it doesn't fail the other pending tasks. There is a way we could achieve the same using Task.ContinueWith()
but it's not a preferred way in dotnet. per this .net best practices and recommendations, it appears that await
ing on a task is a preferred design choice over .ContinueWith()
. Therefore, we decided to leverage the TryCatch
framework.
System.Diagnostics.Trace.CorrelationManager.ActivityId); | ||
try | ||
{ | ||
await openConnectionHandlerAsync( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As contract the caller is deciding to serialize creation.
Isn't it better to let the transport decide on it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is only 1 transport that would call this (Direct) though?
CancellationToken cancellationToken) | ||
{ | ||
await this.DocumentClient.EnsureValidClientAsync(NoOpTrace.Singleton); | ||
await this.DocumentClient.OpenConnectionsToAllReplicasAsync( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this.DocumentClient.InitializeCachesAsync
will take care of populating both partitionKeyRange and addresses caches.
One option is we could leverage it and start creating connections after its populated. It's a serial execution but that might suffice for current scenarios. Thoghts?
/// <param name="containerLinkUri">A string containing the container's link uri.</param> | ||
/// <param name="openConnectionHandlerAsync">The transport client callback delegate to be invoked at a later point of time.</param> | ||
/// <param name="cancellationToken">An Instance of the <see cref="CancellationToken"/>.</param> | ||
public async Task OpenConnectionsToAllReplicasAsync( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Personally, I would prefer to not overload caches with unnecessary context.
Caches are critical and already very complex. This pushed unnecessary complexity into it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe this aligns / is consistent with the approach Java took though
Pull Request Template
Description
Background:
Currently, during the
CosmosClient.CreateAndInitializeAsync()
, the .net SDK uses a method calledInitializeContainerAsync()
, which leverages dummy query on all feed ranges to open connection, with retry capability to increase the chances of touching max replica. However, there are few drawbacks with current approach, which are highlighted below.The Solution:
Instead of using the dummy query during the initialization, we could leverage the Rntbd context negotiation to finish the connection establishment. This PR is adding the necessary implementation in the v3 SDK codebase, on top of the
Cosmos.Direct
package changes v3.29.2, which contains the methods that will help establishing the Rntbd connection to the backend replica nodes.Design:
Please follow this link to understand more on the problem statement and designing the solution.
Type of change
Please delete options that are not relevant.
Closing issues
To automatically close an issue: closes #3443