-
Notifications
You must be signed in to change notification settings - Fork 4.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
HttpRequestException when using SocketsHttpHandler #26629
Comments
Most / all of these requests were using |
A repro 😄 |
That error message is for SocketError.OperationAborted, which likely means the socket was closed while the send or receive was in progress. |
@stephentoub Is there anything here that we can look at? Note that we never use The code is actually doing something like: try
{
return await httpHandler.SendAsync(...);
}
catch(HttpRequestException e)
{
} So we can catch the exception and then check state / logging / something. Anything you can recommend that we'll look at to get to the root cause? |
Is it possible you're hitting a timeout? It shouldn't manifest this way, but it would give us a place to start looking. For example, if you change your HttpClient's Timeout to Timeout.InfiniteTimeSpan, does this still happen? Or if it's potentially related to connection pooling, what happens if you change the SocketHttpHandler's PooledConnectionLifetime and PooledConnectionIdleTimeout to infinite... does it still repro? |
It is all running locally, and we timed the error and it was about 11 ms from calling |
Actually, I have a theory for how this could manifest... now I just need to go look through the code for where the cause could be. Is it possible that before these errors occur for you, some other request timed out or could have otherwise been canceled (e.g. via the token you pass in)? |
Or... are you potentially disposing of the response stream from a request while it's concurrently being read? e.g. something buggy like: Task t = responseStream.ReadAsync(...);
responseStream.Dispose(); ? |
It is entirely possible that the previous request can time out, yes. |
I don't think that I have anything like the second option, but it is possible. We might be in the middle of a read and need to abort the connection. |
And... I have a repro. On my machine, this reproduce the scenario instantly. using System;
using System.Net.Http;
using System.Threading.Tasks;
namespace ConsoleApp15
{
class Program
{
static async Task Main(string[] args)
{
// http://slowwly.robertomurray.co.uk/delay/3000/url/http://www.google.co.uk
var client = new HttpClient(new SocketsHttpHandler
{
})
{
Timeout = TimeSpan.FromMilliseconds(250)
};
var tasks = new[]
{
Run(client),
Run(client),
Run(client),
Run(client),
Run(client),
Run(client),
Run(client),
Run(client),
Run(client),
};
await Task.WhenAll(tasks);
Console.WriteLine("Done");
}
public static async Task Run(HttpClient client)
{
for (int i = 0; i < 25; i++)
{
try
{
await client.GetStringAsync("http://slowwly.robertomurray.co.uk/delay/3000/url/http://www.google.co.uk");
Console.WriteLine("Should timeout");
}
catch (Exception e)
{
}
await client.GetStringAsync("http://slowwly.robertomurray.co.uk/delay/100/url/http://www.google.co.uk");
}
}
}
} |
That's hitting a timeout and throwing an OperationCanceledException... that's expected. What's not expected is getting an IOException (not wrapped in an OperationCanceledException) out of SendAsync and when nothing on the request is being canceled / timing out. |
@stephentoub There are two exceptions thrown here. The first is expected, because we are hitting the timeout. I think that this shows that the timeout on the request indeed poison the pool |
Why shouldn't the second request time out? It takes a very long time to access and the timeout is set at 250ms. (I'm trying to access it now and it's just sitting waiting.) |
Hm... looks like I was too excited. var sp = Stopwatch.StartNew();
try
{
await client.GetStringAsync("http://www.google.co.uk");
}
catch (Exception)
{
Console.WriteLine(sp.ElapsedMilliseconds);
throw;
} Always gives > 250 ms. I assumed that this was because of the previous run, not that it is (probably) waiting there. |
Okay, I think that I'm on a better track here. See the following code. I was able to reproduce the |
We have some more information (still no direct repro). We have managed to find out exactly which particular test is causing those errors. This is The server is started with random port (we are passing After this test completes, 20-30 seconds later, few tests are failing with the exception that I wrote about in the issue. Those tests are connecting to global server that is available for whole test suite (and this server is working on a different url and port). After few (2-3) of those failures the tests are starting to work in a normal way. It looks like something is going on after 20-30 seconds after process was killed and for some time it affects other socket connections. This is our CI log (the 7787 turns on a switch that writes on console url's and http methods for sanity check - to know if we are hitting appropriate server). The code is available here: https://github.com/ppekrol/ravendb/commits/v4.1-temp4 FYI, skipping this test resolves the issue.
|
I have this behavior on nodes in Service Fabric Cluster 1. In SF Cluster 2 it works OK |
We have a statistical repro. Unfortunately, it require us to run about 3,400 tests to run into this. |
We can try our logging on it. Do you have steps we can "easily" replicate locally? |
Get the code from this PR Go to This reproduce in one of five runs or so. Typically more on lower end machines. |
Most of our logging went in AFTER 2.1 :( ... I am not even sure if it is going to cover this particular scenario ... |
I'm not sure what is needed to run from 'master', but if you want to update
Also our CI is executing tests in 'Release'. |
@rmkerr will take a look on our side - I asked him to try the repro in house as first step.
|
It started to happening all of a sudden. First we thought that it might be related to 2.0 -> 2.1 migration, but at the end we think it is related to Check my explanation here: https://github.com/dotnet/corefx/issues/30691#issuecomment-401775334 |
I'm in the process of setting up the repro now. I think that the connection management logging in 2.1 actually isn't bad, so we can hopefully get some useful info there. |
@ayende are there any additional steps I need to follow to get the tests working? I'm currently:
As output I see a large number of tests skipped, and then several failures that occur every time I try to run the tests. This is running on Windows 10 RS2 Logs:D:\ravendb\test\SlowTests>dotnet xunit Detecting target frameworks in SlowTests.csproj... Building for framework netcoreapp2.1... Sparrow -> D:\ravendb\src\Sparrow\bin\Debug\netcoreapp2.1\Sparrow.dll Sparrow -> D:\ravendb\src\Sparrow\bin\Debug\netstandard2.0\Sparrow.dll Voron -> D:\ravendb\src\Voron\bin\Debug\netstandard2.0\Voron.dll Raven.Client -> D:\ravendb\src\Raven.Client\bin\Debug\netstandard2.0\Raven.Client.dll Raven.Embedded -> D:\ravendb\src\Raven.Embedded\bin\Debug\netstandard2.0\Raven.Embedded.dll Raven.Server -> D:\ravendb\src\Raven.Server\bin\Debug\netcoreapp2.1\Raven.Server.dll Tests.Infrastructure -> D:\ravendb\test\Tests.Infrastructure\bin\Debug\netcoreapp2.1\Tests.Infrastructure.dll FastTests -> D:\ravendb\test\FastTests\bin\Debug\netcoreapp2.1\FastTests.dll Raven.TestDriver -> D:\ravendb\src\Raven.TestDriver\bin\Debug\netstandard2.0\Raven.TestDriver.dll SlowTests -> D:\ravendb\test\SlowTests\bin\Debug\netcoreapp2.1\SlowTests.dll Running .NET Core 2.1.1 tests for framework netcoreapp2.1... xUnit.net Console Runner v2.4.0-beta.1.build3958 (64-bit .NET Core 4.6.26606.02) Discovering: SlowTests Discovered: SlowTests Starting: SlowTests Max number of concurrent tests is: 2 To attach debugger to test process (x64), use proc-id: 8640. SlowTests.Issues.RavenDB_3365.index_pretty_printer_ignores_whitespaces [SKIP] RavenDB-4185 To attach debugger to test process (x64), use proc-id: 8640. Url http://127.0.0.1:50061 SlowTests.Issues.RavenDB_5256.ShouldLoadUserFromAsyncSessionTwiceInEmbeddedDocumentStore [SKIP] RavenDB-6283 SlowTests.Issues.RavenDB_5256.ShouldLoadUserFromSessionTwiceInShardingDocumentStore [SKIP] RavenDB-5918, RavenDB-6283 SlowTests.Issues.RavenDB_5256.ShouldLoadUserFromAsyncSessionTwiceInShardingDocumentStore [SKIP] RavenDB-5918, RavenDB-6283 SlowTests.Issues.RavenDB_5256.ShouldLoadMultipleUsersWithIncludesFromAsyncSessionInShardingDocumentStore [SKIP] RavenDB-5918, RavenDB-6283 SlowTests.Issues.RavenDB_5256.ShouldLoadMultipleUsersWithIncludesFromSessionInShardingDocumentStore [SKIP] RavenDB-5918, RavenDB-6283 SlowTests.Issues.RavenDB_5256.ShouldLoadMultipleUsersWithIncludesFromSessionInEmbeddedDocumentStore [SKIP] RavenDB-6283 SlowTests.Issues.RavenDB_5256.ShouldLoadMultipleUsersWithIncludesFromAsyncSessionInEmbeddedDocumentStore [SKIP] RavenDB-6283 SlowTests.Issues.RavenDB_5256.ShouldLoadUserFromSessionTwiceInEmbeddedDocumentStore [SKIP] RavenDB-6283 SlowTests.MailingList.ShardedFacets.FacetTest [SKIP] RavenDB-6283 SlowTests.Issues.RavenDB_3465.get_metadata_for_async_sharded [SKIP] RavenDB-6283 SlowTests.Issues.RavenDB_3465.get_metadata_for_sharded [SKIP] RavenDB-6283 SlowTests.Tests.Queries.CanQueryOnCustomClass.UsingConverter [SKIP] RavenDB-6263 SlowTests.MailingList.StreamingHalfWay.ShouldWork [SKIP] Missing feature: /docs/stream SlowTests.Issues.RavenDB_4904.can_create_side_by_side_index_to_replace_index_with_errors [SKIP] RavenDB-5919 SlowTests.Issues.RavenDB_4904.can_create_side_by_side_index_with_errors_to_replace_index_with_errors [SKIP] RavenDB-5919 SlowTests.Issues.RavenDB_4904.can_create_side_by_side_index_with_errors_to_replace_index [SKIP] RavenDB-5919 SlowTests.Server.Documents.SqlMigration.MySQLSchemaTest.CanFetchSchema [SKIP] Test requires MySQL database SlowTests.Issues.RavenDB_4708.CanUseSingleSyncSharded [SKIP] RavenDB-6283 SlowTests.Issues.RavenDB_4708.CanUseFirstAsyncSharded [SKIP] RavenDB-6283 SlowTests.Issues.RavenDB_4708.CanUseFirstSyncSharded [SKIP] RavenDB-6283 SlowTests.Issues.RavenDB_4708.CanUseLazilySyncShaded [SKIP] RavenDB-6283 SlowTests.MailingList.JohanNilsson.WithCustomizedTagNameAndIdentityProperty [SKIP] RavenDB-6124 SlowTests.Issues.RavenDB_3609.Test [SKIP] RavenDB-6283 SlowTests.Authentication.AuthenticationChangesTests.ChangesWithAuthentication [SKIP] RavenDB-9580: Does not work on current version .NET core. SlowTests.Issues.RavenDB_3420.bulk_insert_sharded [SKIP] RavenDB-6283 SlowTests.Issues.RavenDB_3420.bulk_insert_sharded_when_specifying_default_database [SKIP] RavenDB-6283 SlowTests.MailingList.DeserializationToObjectTests.Query_GivenDbWithComplexObjects_ShouldDeserializePropertiesToOriginalType [SKIP] RavenDB-6124 SlowTests.Bugs.Indexing.IndexingEachFieldInEachDocumentSeparately.ForIndexing [SKIP] Missing feature : RavenDB-6152 SlowTests.Bugs.Indexing.WiseShrek.Isolated [SKIP] Missing features SlowTests.Issues.RavenDB_579.OneShardPerSessionStrategy [SKIP] RavenDB-6283 SlowTests.Issues.RavenDB_579.OneShardPerSessionStrategyAsync [SKIP] RavenDB-6283 SlowTests.Cluster.ClusterModesForRequestExecutorTest.Fastst_node_should_choose_the_node_without_delay [SKIP] RavenDB-9020 SlowTests.Issues.RavenDB_6596.Run [SKIP] Culture tests are disabled. Please set 'RAVEN_ENABLE_CULTURE_TESTS' environment variable to 'true' to enable them. SlowTests.MailingList.Lindblom.Test [SKIP] RavenDB-6124 SlowTests.MailingList.Kushnir.SortOnMetadata [SKIP] RavenDB-6264 SlowTests.Issues.RavenDB_2944.CanCreateTestMapReduceIndexes [SKIP] RavenDB-6572 SlowTests.Issues.RavenDB_2944.CanCreateTestMapIndexes [SKIP] RavenDB-6572 SlowTests.Server.Documents.PeriodicBackup.RavenDB_2181.upload_archive [SKIP] Requires Amazon AWS Credentials SlowTests.Server.Documents.PeriodicBackup.RavenDB_2181.put_blob_in_folder [SKIP] Azure Storage Emulator is not installed SlowTests.Server.Documents.PeriodicBackup.RavenDB_2181.can_get_correct_error_s3 [SKIP] Requires Amazon AWS Credentials SlowTests.Server.Documents.PeriodicBackup.RavenDB_2181.upload_archive_multipart [SKIP] Requires Amazon AWS Credentials SlowTests.Server.Documents.PeriodicBackup.RavenDB_2181.put_blob [SKIP] Azure Storage Emulator is not installed SlowTests.Server.Documents.PeriodicBackup.RavenDB_2181.can_get_correct_error_glacier [SKIP] Requires Amazon AWS Credentials SlowTests.Server.Documents.PeriodicBackup.RavenDB_2181.put_object_multipart [SKIP] Requires Amazon AWS Credentials SlowTests.Server.Documents.PeriodicBackup.RavenDB_2181.put_object [SKIP] Requires Amazon AWS Credentials SlowTests.Server.Documents.PeriodicBackup.RavenDB_4163.put_blob_70MB [SKIP] Azure Storage Emulator is not installed SlowTests.Server.Documents.PeriodicBackup.RavenDB_4163.put_blob_64MB [SKIP] Azure Storage Emulator is not installed SlowTests.Server.Documents.PeriodicBackup.RavenDB_4163.put_blob_256MB [SKIP] Azure Storage Emulator is not installed SlowTests.Server.Documents.PeriodicBackup.RavenDB_4163.put_blob_100MB [SKIP] Azure Storage Emulator is not installed SlowTests.Server.Documents.PeriodicBackup.RavenDB_4163.put_blob_into_folder_500MB [SKIP] Azure Storage Emulator is not installed SlowTests.Server.Documents.PeriodicBackup.RavenDB_4163.put_blob_765MB [SKIP] Azure Storage Emulator is not installed SlowTests.Server.Documents.PeriodicBackup.RavenDB_4163.put_blob_500MB [SKIP] Azure Storage Emulator is not installed SlowTests.Server.Documents.PeriodicBackup.RavenDB_4163.can_get_container_not_found [SKIP] Azure Storage Emulator is not installed SlowTests.Server.Documents.PeriodicBackup.RavenDB_4163.put_blob_into_folder_765MB [SKIP] Azure Storage Emulator is not installed SlowTests.Server.Documents.PeriodicBackup.RavenDB_4163.put_blob_into_folder_64MB [SKIP] Azure Storage Emulator is not installed SlowTests.Server.Documents.PeriodicBackup.RavenDB_4163.put_blob_into_folder_256MB [SKIP] Azure Storage Emulator is not installed SlowTests.Server.Documents.PeriodicBackup.RavenDB_4163.put_blob_into_folder_70MB [SKIP] Azure Storage Emulator is not installed SlowTests.Server.Documents.PeriodicBackup.RavenDB_4163.can_get_and_delete_container [SKIP] Azure Storage Emulator is not installed SlowTests.Server.Documents.PeriodicBackup.RavenDB_4163.put_blob_into_folder_100MB [SKIP] Azure Storage Emulator is not installed SlowTests.MailingList.ScriptedPatchBug.Test [SKIP] Missing feature: Tasks (operations) and their results SlowTests.Issues.RavenDB_3726.Test [SKIP] RavenDB-6283 SlowTests.Server.Documents.SqlMigration.RecursiveMigrationTest.CanEmbedOnParent(provider: MsSQL) [FAIL] System.InvalidOperationException : Use a valid connection Stack Trace: D:\ravendb\test\SlowTests\Server\Documents\ETL\SQL\SqlEtlTests.cs(73,0): at SlowTests.Server.Documents.ETL.SQL.SqlEtlTests.<>c.<.cctor>b__32_0() at System.Lazy`1.ViaFactory(LazyThreadSafetyMode mode) at System.Lazy`1.ExecutionAndPublication(LazyHelper executionAndPublication, Boolean useDefaultConstructor) at System.Lazy`1.CreateValue() D:\ravendb\test\SlowTests\Server\Documents\SqlMigration\SqlAwareTestBase.cs(113,0): at SlowTests.Server.Documents.SqlMigration.SqlAwareTestBase.WithMsSqlDatabase(String& connectionString, String& databaseName, String dataSet, Boolean includeData) D:\ravendb\test\SlowTests\Server\Documents\SqlMigration\SqlAwareTestBase.cs(104,0): at SlowTests.Server.Documents.SqlMigration.SqlAwareTestBase.WithSqlDatabase(MigrationProvider provider, String& connectionString, String& schemaName, String dataSet, Boolean includeData) D:\ravendb\test\SlowTests\Server\Documents\SqlMigration\RecursiveMigrationTest.cs(119,0): at SlowTests.Server.Documents.SqlMigration.RecursiveMigrationTest.CanEmbedOnParent(MigrationProvider provider) --- End of stack trace from previous location where exception was thrown --- SlowTests.Server.Documents.SqlMigration.RecursiveMigrationTest.CanEmbedOnParent(provider: MySQL) [SKIP] Test requires MySQL database SlowTests.Voron.Bugs.RavenDB_6971.Applying_new_diff_requires_to_zero_destination_bytes_first [FAIL] Raven.Client.Exceptions.Database.DatabaseLoadTimeoutException : Raven.Client.Exceptions.Database.DatabaseLoadTimeoutException: Database Applying_new_diff_requires_to_zero_destination_bytes_first_42 after 00:01:00 is still loading, try again later. Database initialization log: [Load Database] 7/11/2018 7:41:01 PM :: Database 'Applying_new_diff_requires_to_zero_destination_bytes_first_42' : Starting database initialization [Load Database] 7/11/2018 7:41:01 PM :: Database 'Applying_new_diff_requires_to_zero_destination_bytes_first_42' : Creating db.lock file [Load Database] 7/11/2018 7:41:01 PM :: Database 'Applying_new_diff_requires_to_zero_destination_bytes_first_42' : Initializing NotificationCenter [Load Database] 7/11/2018 7:41:01 PM :: Database 'Applying_new_diff_requires_to_zero_destination_bytes_first_42' : Initializing DocumentStorage [Load Database] 7/11/2018 7:41:01 PM :: Database 'Applying_new_diff_requires_to_zero_destination_bytes_first_42' : Starting Recovery [Load Database] 7/11/2018 7:41:01 PM :: Database 'Applying_new_diff_requires_to_zero_destination_bytes_first_42' : Recovering journal 6 (upto last journal 7) [Load Database] 7/11/2018 7:46:27 PM :: Database 'Applying_new_diff_requires_to_zero_destination_bytes_first_42' : Recovering journal 7 (upto last journal 7) at Raven.Server.Routing.RouteInformation.ThrowDatabaseLoadTimeoutWithLog(StringSegment databaseName, TimeSpan timeout, String log) in D:\ravendb\src\Raven.Server\Routing\RouteInformation.cs:line 143 at Raven.Server.Routing.RouteInformation.UnlikelyWaitForDatabaseToLoad(RequestHandlerContext context, Task`1 database, DatabasesLandlord databasesLandlord, StringSegment databaseName) in D:\ravendb\src\Raven.Server\Routing\RouteInformation.cs:line 122 at Raven.Server.Routing.RouteInformation.WaitForDb(Task databaseLoading) in D:\ravendb\src\Raven.Server\Routing\RouteInformation.cs:line 162 at Raven.Server.Routing.RequestRouter.HandlePath(HttpContext context, String method, String path) in D:\ravendb\src\Raven.Server\Routing\RequestRouter.cs:line 73 at System.Threading.Tasks.ValueTask`1.get_Result() at Raven.Server.RavenServerStartup.RequestHandler(HttpContext context) in D:\ravendb\src\Raven.Server\RavenServerStartup.cs:line 162 Response.StatusCode - ServiceUnavailable Stack Trace: D:\ravendb\src\Raven.Client\Http\RequestExecutor.cs(839,0): at Raven.Client.Http.RequestExecutor.ThrowFailedToContactAllNodes[TResult](RavenCommand`1 command, HttpRequestMessage request) D:\ravendb\src\Raven.Client\Http\RequestExecutor.cs(795,0): at Raven.Client.Http.RequestExecutor.ExecuteAsync[TResult](ServerNode chosenNode, Nullable`1 nodeIndex, JsonOperationContext context, RavenCommand`1 command, Boolean shouldRetry, SessionInfo sessionInfo, CancellationToken token) D:\ravendb\src\Raven.Client\Http\RequestExecutor.cs(828,0): at Raven.Client.Http.RequestExecutor.ExecuteAsync[TResult](ServerNode chosenNode, Nullable`1 nodeIndex, JsonOperationContext context, RavenCommand`1 command, Boolean shouldRetry, SessionInfo sessionInfo, CancellationToken token) D:\ravendb\src\Raven.Client\Documents\Operations\OperationExecutor.Operations.cs(22,0): at Raven.Client.Documents.Operations.OperationExecutor.SendAsync(IOperation`1 operation, SessionInfo sessionInfo, CancellationToken token) D:\ravendb\src\Raven.Client\Util\AsyncHelpers.cs(104,0): at Raven.Client.Util.AsyncHelpers.<>c__DisplayClass3_0`1.<b__0>d.MoveNext() --- End of stack trace from previous location where exception was thrown --- D:\ravendb\src\Raven.Client\Util\AsyncHelpers.cs(124,0): at Raven.Client.Util.AsyncHelpers.RunSync[T](Func`1 task) D:\ravendb\src\Raven.Client\Documents\Operations\OperationExecutor.Operations.cs(13,0): at Raven.Client.Documents.Operations.OperationExecutor.Send(IOperation`1 operation, SessionInfo sessionInfo) D:\ravendb\test\SlowTests\Voron\Bugs\RavenDB_6971.cs(75,0): at SlowTests.Voron.Bugs.RavenDB_6971.Applying_new_diff_requires_to_zero_destination_bytes_first() SlowTests.Server.Documents.ETL.RavenDB_8866.CanResetEtl [FAIL] Raven.Client.Exceptions.Commercial.LicenseLimitException : Raven.Client.Exceptions.Commercial.LicenseLimitException: Your current license doesn't include the RavenDB ETL feature at Raven.Server.Commercial.LicenseManager.AssertCanAddRavenEtl() in D:\ravendb\src\Raven.Server\Commercial\LicenseManager.cs:line 1269 at Raven.Server.Web.System.OngoingTasksHandler.AssertCanAddOrUpdateEtl(String databaseName, BlittableJsonReaderObject& etlConfiguration, JsonOperationContext context) in D:\ravendb\src\Raven.Server\Web\System\OngoingTasksHandler.cs:line 719 at Raven.Server.Documents.DatabaseRequestHandler.DatabaseConfigurations(Func`4 setupConfigurationFunc, String debug, RefAction beforeSetupConfiguration, Action`3 fillJson, HttpStatusCode statusCode) in D:\ravendb\src\Raven.Server\Documents\DatabaseRequestHandler.cs:line 64 at Raven.Server.Web.System.OngoingTasksHandler.AddEtl() in D:\ravendb\src\Raven.Server\Web\System\OngoingTasksHandler.cs:line 685 at Raven.Server.Routing.RequestRouter.HandlePath(HttpContext context, String method, String path) in D:\ravendb\src\Raven.Server\Routing\RequestRouter.cs:line 123 at System.Threading.Tasks.ValueTask`1.get_Result() at Raven.Server.RavenServerStartup.RequestHandler(HttpContext context) in D:\ravendb\src\Raven.Server\RavenServerStartup.cs:line 162 State for A-term1: Passive=>LeaderElect at 2018-07-11T19:34:45.5363246Z because I'm the only one in the cluster, so I'm the leader Stack Trace: D:\ravendb\src\Raven.Client\Exceptions\ExceptionDispatcher.cs(117,0): at Raven.Client.Exceptions.ExceptionDispatcher.Throw(JsonOperationContext context, HttpResponseMessage response, Action`1 additionalErrorInfo) D:\ravendb\src\Raven.Client\Http\RequestExecutor.cs(1036,0): at Raven.Client.Http.RequestExecutor.HandleUnsuccessfulResponse[TResult](ServerNode chosenNode, Nullable`1 nodeIndex, JsonOperationContext context, RavenCommand`1 command, HttpRequestMessage request, HttpResponseMessage response, String url, SessionInfo sessionInfo, Boolean shouldRetry, CancellationToken token) D:\ravendb\src\Raven.Client\Http\RequestExecutor.cs(786,0): at Raven.Client.Http.RequestExecutor.ExecuteAsync[TResult](ServerNode chosenNode, Nullable`1 nodeIndex, JsonOperationContext context, RavenCommand`1 command, Boolean shouldRetry, SessionInfo sessionInfo, CancellationToken token) D:\ravendb\src\Raven.Client\Http\RequestExecutor.cs(828,0): at Raven.Client.Http.RequestExecutor.ExecuteAsync[TResult](ServerNode chosenNode, Nullable`1 nodeIndex, JsonOperationContext context, RavenCommand`1 command, Boolean shouldRetry, SessionInfo sessionInfo, CancellationToken token) D:\ravendb\src\Raven.Client\Documents\Operations\MaintenanceOperationExecutor.cs(61,0): at Raven.Client.Documents.Operations.MaintenanceOperationExecutor.SendAsync[TResult](IMaintenanceOperation`1 operation, CancellationToken token) D:\ravendb\src\Raven.Client\Util\AsyncHelpers.cs(104,0): at Raven.Client.Util.AsyncHelpers.<>c__DisplayClass3_0`1.<b__0>d.MoveNext() --- End of stack trace from previous location where exception was thrown --- D:\ravendb\src\Raven.Client\Util\AsyncHelpers.cs(124,0): at Raven.Client.Util.AsyncHelpers.RunSync[T](Func`1 task) D:\ravendb\src\Raven.Client\Documents\Operations\MaintenanceOperationExecutor.cs(43,0): at Raven.Client.Documents.Operations.MaintenanceOperationExecutor.Send[TResult](IMaintenanceOperation`1 operation) D:\ravendb\test\SlowTests\Server\Documents\ETL\EtlTestBase.cs(21,0): at SlowTests.Server.Documents.ETL.EtlTestBase.AddEtl[T](DocumentStore src, EtlConfiguration`1 configuration, T connectionString) D:\ravendb\test\SlowTests\Server\Documents\ETL\RavenDB_8866.cs(43,0): at SlowTests.Server.Documents.ETL.RavenDB_8866.CanResetEtl() SlowTests.Server.Documents.ETL.ConnectionStringTests.CanGetConnectionStringByName [FAIL] System.InvalidOperationException : Use a valid connection Stack Trace: D:\ravendb\test\SlowTests\Server\Documents\ETL\SQL\SqlEtlTests.cs(73,0): at SlowTests.Server.Documents.ETL.SQL.SqlEtlTests.<>c.<.cctor>b__32_0() at System.Lazy`1.ViaFactory(LazyThreadSafetyMode mode) --- End of stack trace from previous location where exception was thrown --- at System.Lazy`1.CreateValue() D:\ravendb\test\SlowTests\Server\Documents\ETL\SQL\SqlEtlTests.cs(1132,0): at SlowTests.Server.Documents.ETL.SQL.SqlEtlTests.GetConnectionString(DocumentStore store) D:\ravendb\test\SlowTests\Server\Documents\ETL\ConnectionStringTests.cs(171,0): at SlowTests.Server.Documents.ETL.ConnectionStringTests.CanGetConnectionStringByName() |
This is expected. There are over 3500 tests there and it takes around 30-40min for them to complete. I suggest running Some of them might fail due to the missing license, or can be skipped because of missing SQL Server, but you can safely ignore that fact. |
I've pushed 1 additional commit to the branch https://github.com/ppekrol/ravendb/commits/v4.1-temp4 that disables test parallelization. After that I've ran in
Following two tests have failed with the exception that we are trying to trace:
Log is available here: https://drive.google.com/file/d/1syhM_yR2KwtWvzp7LGABag5_duU8cKLZ/view?usp=sharing |
Can you share a repro? |
We have an API in .netcore 2.2 that is talking to an internal API (.netcore 2.2). Single requests are working fine but when sending multiple requests through our automation using multiple threads it fails with this error. Is this what you are looking for? |
No, I was actually looking for code that I could run to see the issue. What you describe is very unlikely to be the same underlying problem as this issue. |
I will work on it and get back to you. How do i give the code to you? |
You can post it here. Hopefully you can create a standalone repro that highlights what you're seeing and is generally shareable. |
I also had this issue and was working on code to reproduce it, but I realized it can be reproduced fairly trivially by setting HttpClient.Timeout to an extremely low value and sending a HTTPS POST to a IIS Express server running the default ASP.Net Core project template. However that might be a "false positive", in my case there was really a timeout and the problem would be that the error message is misleading (it should clearly say there was a timeout instead of "either a thread exit or an application request") Setting AppContext.SetSwitch("System.Net.Http.UseSocketsHttpHandler", false); also caused an exception due to a timeout, it just changed the error message and stack trace. |
Correction: Not exactly the default ASP.Net Core project template, it needs to be modified to add a POST operation, and in that operation wait some delay (higher than the client's timeout) before returning a response. |
Thanks @ckarras . |
Since I’m facing this issue I wanted to add my input to help find a solution. @stephentoub I can add that i get the error in System.Net.Http.HttpConnection.SendAsyncCore Some details about my specific case: If you have any questions about the implementation of this please let me know. When I compare, I can see similarities to the previously posted logs.
|
@stephentoub I see you moved it to Future, while we had it tracking for servicing 2.1/2.2 (and 3.0). Was there new info about the scenario impact? AFAIK we had 2-3 independent customers affected. |
@karelz, no one has been able to supply a repro, and I can't repro it. |
@rmkerr was able to repro it locally. He was able to minimize the repro (to running one test massively in parallel). I asked him to transfer it to you. Did it not happen? I can check my old emails to see if I was on cc. |
@karelz, he pointed me to https://github.com/dotnet/corefx/issues/30691#issuecomment-443926134, but that's from 8 months ago, and a lot has changed since then in both the product and in the ASP.NET tests he refers to. I tried various permutations of what he suggested and never got such a failure. |
@ppekrol, have you moved to .NET Core 3.0? If yes, are you still hitting this? |
@stephentoub would it make sense to ask someone on our team to try to sync back and repro it on older 3.0 build or on 2.2? (with Max's steps) |
@stephentoub We haven't moved to 3.0 yet. We will start migration closer to the release date, probably when RC builds with hit the shelves. I might be wrong, but I see two separate issues here that ppl submit, 1st is probably expected (Operation has timeout out) and 2nd one (the one that we have submitted) is more generic (An error occurred while sending the request). |
Yes, there are multiple reasons this could occur. As you say, most are expected error conditions; that's just the nature of networking. The one you highlighted is not expected and is much more elusive. |
@karelz, if you think it will be fruitful, go for it. |
@scalablecory can you please help with it? |
Have you tried to play with |
I am running into this too if i can get any real information i will share. |
for me it stems from System.IO.Stream.ReadAsync |
Unable to replicate with current code; tried pulling old versions of RavenDb and ASP.NET and didn't happen after some playing around either. Going to close and we can open a new ticket if we find something in 3.0. Many of the replications in this thread appear to be different from what was reported so it's hard to get a solid idea of how much relevancy the issue has today. |
This issue drove me crazy for a really long time. In case it helps anyone, I added some retries with Polly: var apiExceptionPolicy = Policy
.Handle<Refit.ApiException>() // in my case I was using Refit, but change this to HttpRequestException or whatever works for you
.RetryAsync(3, (exception, retryCount) => Task.Delay(50));
await apiExceptionPolicy.ExecuteAsync(() => randomlyFailingCode()); This seems to have fixed it for me - I haven't had any random exceptions during local development or emails from the production server for several weeks now 👍 |
Hi all,
we had to drop the
SocketsHttpHandler
(ravendb/ravendb@9ad7c41)because for quite some time we started receiving random test failures on our CI (Jenkins) Agent (Hyper-V machines, 6 vCPU, 6GB RAM on WS2012R2 and WS2016 - fully updated). We tried to reproduce this on our local dev machines, but without any luck. After turning off the SocketsHttpHandler, the random failures stopped showing up.
The exception we are getting is:
We are using latest .NET Core 2.1.1 (2.1.0 was also affected).
Do you know if there are some problems in SocketsHttpHandler that could cause this? What can we do to help you solve this issue? Any pointers?
The text was updated successfully, but these errors were encountered: