Triggering a cancellation token may not cancel a running query based on timing #1065

roji · 2021-05-10T13:58:21Z

When an async cancellation token is triggered, a running query is expected to be cancelled at the server; this apparently works fine when the token is triggered during ExecuteReaderAsync, or during ReadAsync.

However, if the token happens to be triggered while no SqlClient code is in progress (e.g. ReadAsync), the next time ReadAsync is called, it would simply return a cancelled Task immediately, without cancelling the query at the server. This means that there's a race condition - depending on when the token triggering happens, the query may be cancelled (if ReadAsync happens to be in progress) or it may not.

In Npgsql, we handled this race conditions by cancelling the (running) server query from ReadAsync, even when it's called with an already-cancelled token. This is being discussed in DapperLib/Dapper#1590 (comment), where Dapper may be working around this issue by registering a callback to call SqlCommand.Cancel when the token is triggered - this should IMHO not be necessary.

An alternative strategy would be to continue monitoring the cancellation token originally passed to ExecuteReaderAsync - even after ExecuteReaderAsync itself has already completed. This would have the advantage of allowing immediate cancellation at the server at any point, and not only when ReadAsync is called.

For a bit more context, I wrote a small blog post on ADO.NET and cancellation.

cheenamalhotra · 2021-05-10T17:21:49Z

One note I'd like to make is that in SqlClient, ReadAsync() can be extremely slow, not due to server side processing delay, but due to client side delay. And it's not just a few seconds but it can go to as much as feeling like a hung-up application (#1058), we've reproduced up to 90 seconds (case taken internally from #44 (comment)).

i.e. when a normal Read() takes 2-3 seconds, ReadAsync() takes 90+ seconds.

Now at that point, a CancellationToken is "expected" to cancel the processing, because we accept it. But we do not cancel reading the packets already received from the server, since TDS is a Stream based protocol, and not reading stream packets is just not an option without breaking the reader/results. Only when DataReader is closed using CloseAsync(), Cancellation Token is accepted to cancel reading further, but it's too late if someone already triggered a ReadAsync() call and is awaiting result.

The only time cancellation occurs in ReadAsync() is in the beginning when the token is already expired. After driver processing begins, token is not registered for any further use.

roji · 2021-05-10T17:50:22Z

@cheenamalhotra thanks for your comments...

But we do not cancel reading the packets already received from the server, since TDS is a Stream based protocol, and not reading stream packets is just not an option without breaking the reader/results.

Agreed; neither cancellation nor closing of the reader should ever cause the connection to break, lose protocol sync or similar. However, the user may still wish to cancel a query once it has starting running, and it's not currently clear to me how that can be done reliably.

Only when DataReader is closed using CloseAsync(), Cancellation Token is accepted to cancel reading further, but it's too late if someone already triggered a ReadAsync() call and is awaiting result.

Can you please provide more details on this? DbDataReader.CloseAsync does not accept a cancellation token (Close/Dispose generally do not in .NET).

The only time cancellation occurs in ReadAsync() is in the beginning when the token is already expired. After driver processing begins, token is not registered for any further use.

If I'm reading the code right, if a cancelled token is passed to ReadAsync, it just returns a cancelled Task but does not do any actual cancellation of the query at the server. Further down in ReadAsync, a cancellation callback does seem to be registered.

cheenamalhotra · 2021-05-10T18:17:17Z

If I'm reading the code right, if a cancelled token is passed to ReadAsync, it just returns a cancelled Task but does not do any actual cancellation of the query at the server. Further down in ReadAsync, a cancellation callback does seem to be registered.

Yes that's right. Query processing completes when "ExecuteReaderAsync" completes, when we "Read" or "ReadAsync" there's no server-side processing of query. The results are read 1 row at a time, and if we run out of packets more packets continue to arrive till socket is open. We can only Cancel query that causes Attention signal to be passed to the server which discards the entire query results. So unless and until query needs to be discarded/canceled, Cancellation Token on "ReadAsync" will not perform server-side cancellation.

Can you please provide more details on this? DbDataReader.CloseAsync does not accept a cancellation token (Close/Dispose generally do not in .NET).

Sorry, my bad. It's the "Close()" method, not "CloseAsync()" that cancels everything. But you can find _cancelAsyncOnCloseToken in the code, which is the only "cancellationToken" that can cancel async executions: ExecuteAsyncCall. The async design is very heavy on OS and leads to numerous back-to-back replays. The only way to cancel is to "Close" the reader.

Wraith2 · 2021-05-10T21:51:44Z

Is this affected by #956 ?

cheenamalhotra · 2021-05-10T22:04:07Z

No they're different issues.
I've uploaded Repro here: ReadAsyncTest.cs

roji · 2021-05-10T22:18:40Z

So unless and until query needs to be discarded/canceled, Cancellation Token on "ReadAsync" will not perform server-side cancellation.

So to be sure I understand :) It does seem to be possible to cancel a running query by triggering the cancellation token passed to ReadAsync, right? At least, this is what I think I'm seeing in this code in ReadAsync, which shows a registration to call `SqlCommand.s_cancelIgnoreFailure?

If so, then again, the reason I opened this issue is the inherent race condition that the user faces: the code triggering the cancellation token has no way of knowing whether the code calling ReadAsync is right before the IsCancellationRequested check - in which case no server cancellation occurs - or after - in which case it does seem to occur. I hope that's clear.

Stepping back... Let's say I execute a heavy query returning lots of data. ExecuteReader (or ExecuteReaderAsync) has already completed, and I'm now at the point where I'm consuming rows by calling Read (or ReadAsync). What are my options at this point for cancelling the query - stopping its processing at the server and stopping more rows from being generated? I'm assuming SqlCommand.Cancel should work reliably, but should the async cancellation token passed to SqlDataReader.ReadAsync also work?

roji · 2021-05-10T22:22:12Z

And one additional question. You wrote above:

It's the "Close()" method, not "CloseAsync()" that cancels everything.

Does this mean that calling Close on SqlDataReader performs server cancellation of the running query (once again, cutting it short and stopping more rows from being sent)? If that's so, then there would be no reason to manually call SqlCommand.Cancel (or trigger a cancellation token) before calling SqlDataReader.Close, since the latter already performs cancellation, right? If so, does CloseAsync not perform cancellation in a similar way as Close?

Wraith2 · 2021-05-10T22:24:13Z

As I understand it, you can request cancellation but the protocol requires that results be processed and discarded until a header is encountered with a done bit set, as @cheenamalhotra said the only other way to approach it would be to instantly kill the connection.

Since when you're using a SqlDataReader it assumes control of the parsing responsibility this means that some thread somewhere needs to be responsible for keeping parsing and you're going to need to wait for at least a single new packet. You can't have the connection that the parse loop corresponds to back until that's done. So if you want to cancel things go ahead but if you want to quickly do something else then you'll need to discard the connection and get another one from the pool to avoid a wait.

roji · 2021-05-10T22:39:23Z

As I understand it, you can request cancellation but the protocol requires that results be processed and discarded until a header is encountered with a done bit set

Sure, but it cancellation is still a way of stopping a long query earlier, and making sure it generates less results, right? In other words, I'm not talking about any sort of instantaneous cancellation, and definitely not about anything that breaks the connection; just a means of cutting a query short, so that it doesn't get processed until the very end (and generates the full resultset, which would have to be sent over the network).

The problem I'm trying to outline in this issue, is that the async mechanism for doing the above (triggering the async cancellation token) seems to be unreliable because of a race condition.

cheenamalhotra · 2021-05-11T00:00:58Z

Does this mean that calling Close on SqlDataReader performs server cancellation of the running query (once again, cutting it short and stopping more rows from being sent)?

No, it's only SqlCommand execute reader APIs that when completed mark completion of query execution on server and we have a reader instantiated that parses network packets to rows of data. *~~No public API in reader communicates to server~~, not even "Close()" as closing a reader is not considered as a hard stop. The driver safely closes all ongoing operations and prepares command for next execution.

"SqlCommand.Cancel()" is the only API that will send attention signal for ongoing reader query.

*Edit: Token passed to ReadAsync() does trigger call to SqlCommand.Cancel() internally if it's found expired before row processing begins.

just a means of cutting a query short, so that it doesn't get processed until the very end (and generates the full resultset, which would have to be sent over the network).

Driver does not read to the end when a SqlDataReader instance is created from ExecuteReader APIs. It only requests packets and prepares buffer for parsing. Row pointer is at -1 till here but packets could have been all received by now. SqlDataReader starts generating rows of data by parsing packets one by one when Read()/ReadAsync() or NextResult() are called as that moves index pointer to next row/resultset. You can still close reader if you did not move pointer to the row with large data, but if you did move your pointer (by calling Read/ReadAsync) driver will need to parse entire row then, and that cannot be cancelled.

roji · 2021-05-11T00:35:25Z

No public API in reader communicates to server, not even "Close()" as closing a reader is not considered as a hard stop. [...]
"SqlCommand.Cancel()" is the only API that will send attention signal for ongoing reader query.

@cheenamalhotra I'm sorry if I'm being dense - help me understand... SqlDataReader.ReadAsync registers SqlCommand.s_cancelIgnoreFailure on the user-provided cancellation token (see code). SqlCommand.s_cancelIgnoreFailure points to SqlCommand.CancelIgnoreFailureCallback, which calls SqlCommand.CancelIgnoreFailure, which itself calls SqlCommand.Cancel.

So unless I'm misreading the code, triggering the cancellation token passed to SqlDataReader.ReadAsync internally calls SqlCommand.Cancel, no?

You can still close reader if you did not move pointer to the row with large data, but if you did move your pointer (by calling Read/ReadAsync) driver will need to parse entire row then, and that cannot be cancelled.

The question is what happens if the reader is closed when the server is still processing the query and generating rows. If I'm reading you correctly, SqlClient will wait for the query to complete at the server (without attempting to cancel it), and silently consume any additional rows that get returned from the server. For example, if I execute a very heavy query that generates a million rows, and then immediately dispose the reader, I still have to wait for the entire query to complete and for all million rows to be sent back the the client over the network, right?

Unless I'm mistaken, an alternative would be to internally call SqlCommand.Cancel in this case, making the query complete much earlier and preventing so much data from uselessly being sent over the network.

cheenamalhotra · 2021-05-11T02:27:17Z

@cheenamalhotra I'm sorry if I'm being dense - help me understand... SqlDataReader.ReadAsync registers SqlCommand.s_cancelIgnoreFailure on the user-provided cancellation token (see code). SqlCommand.s_cancelIgnoreFailure points to SqlCommand.CancelIgnoreFailureCallback, which calls SqlCommand.CancelIgnoreFailure, which itself calls SqlCommand.Cancel.

Yes it does (sorry I missed that - corrected my last statement). I guess I've been emphasizing mostly on client side delay, as there's no promise that when cancellation occurs on this token, further processing will stop. You could try to run my above repro, where data is already retrieved so sending attention to server doesn't help.

The question is what happens if the reader is closed when the server is still processing the query and generating rows.

If server is still processing query, there's no reader. Until processing completes, ExecuteReader and ExecuteReaderAsync APIs will continue to wait for response. They will be cancelled (with server side termination) on CommandTimeout or CancellationToken or when SqlCommand.Cancel() is called.

roji · 2021-05-11T09:07:11Z

Yes it does (sorry I missed that - corrected my last statement). I guess I've been emphasizing mostly on client side delay, as there's no promise that when cancellation occurs on this token, further processing will stop.

I agree that there shouldn't be any promise or expectation of immediate cancellation or similar; since cancellation communicates with the server, there will of course be a delay before it takes effect.

However, to go back to the original reason for this issue; depending on the precise timing when ReadAsync's cancellation token is triggered, cancellation may not happen at all, since ReadAsync returns immediately if passed an already-cancelled cancellation token. In other words, I'm saying it may make sense to modify that check, and to immediately call SqlCommand.s_cancelIgnoreFailure if ReadAsync is invoked with a cancelled token.

cheenamalhotra · 2021-05-12T01:41:11Z

In other words, I'm saying it may make sense to modify that check, and to immediately call SqlCommand.s_cancelIgnoreFailure if ReadAsync is invoked with a cancelled token.

I will look into that.

I agree that there shouldn't be any promise or expectation of immediate cancellation or similar; since cancellation communicates with the server, there will of course be a delay before it takes effect.

Regarding this, what happens in driver is that even though cancellation occurs > attention is sent > response is received > but driver continues to process row with large data till the end of row for 90+ seconds. There's definitely no server side delay, our test proves that. That's why there's no guarantee in the driver, that cancellation token will terminate "ReadAsync" from processing and return back control.

It may make sense to generalize statement for all ADO.NET drivers, but for SqlClient this is an issue. The only solution we know of is to fix this huge performance lag.

roji · 2021-05-12T08:51:10Z

That's why there's no guarantee in the driver, that cancellation token will terminate "ReadAsync" from processing and return back control.

@cheenamalhotra sure. Once again, it makes perfect sense for cancellation not to be immediate. But what should be guaranteed, is that the attention (cancellation?) is actually sent by SqlClient to SQL Server - and that's not currently the case (because of the race condition described above). What happens after the attention is sent - i.e. how long before the query actually is cancelled at the server, and how many additional rows are delivered to SqlClient before the query terminates - those are all unrelated questions where there are no expected guarantees.

To summarize, this issue is about making sure that SqlClient always sends a cancellation to SQL Server when a cancellation token is triggered.

binki · 2022-09-01T02:04:36Z

I was hoping I could rely on SqlDataReader.ReadAsync() sending the attention to the server, but it looks like I have to manually keep track of SqlCommand and call SqlCommand.Cancel() prior to disposing any reader until this bug is fixed. Thanks for the guidance!

jnm2 · 2022-09-29T18:37:55Z

I'm seeing this cause a wait for the entire result set inside SqlDataReader.Dispose. Instead of cancelling after the first row, this code waits for the full query (a billion rows in this case):

using Microsoft.Data.SqlClient; // 5.0.0. Same with System.Data.SqlClient on .NET Framework 4.8 too.
using System.Threading;
using System.Threading.Tasks;

public static class Program
{
    public static async Task Main()
    {
        const string longRunningQuery = @"
with TenRows as (select Value from (values (1), (2), (3), (4), (5), (6), (7), (8), (9), (10)) as TenRows (Value)),
    ThousandRows as (select A.Value as A, B.Value as B, C.Value as C from TenRows as A, TenRows as B, TenRows as C)
select *
from ThousandRows as A, ThousandRows as B, ThousandRows as C;";

        var connectionString = new SqlConnectionStringBuilder
        {
            DataSource = "(local)",
            IntegratedSecurity = true,
            TrustServerCertificate = true,
        }.ToString();

        using (var source = new CancellationTokenSource())
        using (var connection = new SqlConnection(connectionString))
        {
            await connection.OpenAsync(source.Token);

            using (var command = new SqlCommand(longRunningQuery, connection))
            using (var reader = await command.ExecuteReaderAsync(source.Token))
            {
                while (await reader.ReadAsync(source.Token))
                {
                    source.Cancel();
                }
            } // If you pause the debugger, it highlights this line
        }
    }
}

In practice, I find real apps blocking on SNIReadSyncOverAsync:

jnm2 · 2022-09-30T11:12:18Z

Can the fix be backported to .NET Framework's System.Data.SqlClient?

JRahnama · 2022-09-30T16:43:20Z

@jnm2 S.D.SqlClient is in servicing mode and is not accepting any fixes but security ones.

This was referenced May 10, 2021

Register cmd.Cancel() to token DapperLib/Dapper#1590

Open

Disposing an EF enumerator takes long time on large/complex queries on SQL Server. dotnet/efcore#24857

Closed

cheenamalhotra self-assigned this Sep 1, 2021

roji mentioned this issue Sep 8, 2021

Query: Use cancellationToken appropriately dotnet/efcore#25918

Merged

smartguest mentioned this issue Sep 18, 2021

Fix for cancel query by adding synchronous readers. microsoft/sqltoolsservice#1246

Merged

jnm2 mentioned this issue Sep 29, 2022

SqlDataReader.Close blocks in SNIReadSyncOverAsync after early breaking from reading large dataset #34

Closed

cheenamalhotra mentioned this issue Sep 30, 2022

Fixes ReadAsync() behavior to register Cancellation token action before streaming results #1781

Merged

cheenamalhotra closed this as completed in #1781 Oct 4, 2022

Shane32 mentioned this issue Nov 2, 2022

CancellationToken registration leak, regression #1810

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Triggering a cancellation token may not cancel a running query based on timing #1065

Triggering a cancellation token may not cancel a running query based on timing #1065

roji commented May 10, 2021 •

edited

Loading

cheenamalhotra commented May 10, 2021 •

edited

Loading

roji commented May 10, 2021 •

edited

Loading

cheenamalhotra commented May 10, 2021 •

edited

Loading

Wraith2 commented May 10, 2021

cheenamalhotra commented May 10, 2021

roji commented May 10, 2021

roji commented May 10, 2021

Wraith2 commented May 10, 2021

roji commented May 10, 2021

cheenamalhotra commented May 11, 2021 •

edited

Loading

roji commented May 11, 2021

cheenamalhotra commented May 11, 2021 •

edited

Loading

roji commented May 11, 2021

cheenamalhotra commented May 12, 2021 •

edited

Loading

roji commented May 12, 2021

binki commented Sep 1, 2022

jnm2 commented Sep 29, 2022 •

edited

Loading

jnm2 commented Sep 30, 2022

JRahnama commented Sep 30, 2022

Triggering a cancellation token may not cancel a running query based on timing #1065

Triggering a cancellation token may not cancel a running query based on timing #1065

Comments

roji commented May 10, 2021 • edited Loading

cheenamalhotra commented May 10, 2021 • edited Loading

roji commented May 10, 2021 • edited Loading

cheenamalhotra commented May 10, 2021 • edited Loading

Wraith2 commented May 10, 2021

cheenamalhotra commented May 10, 2021

roji commented May 10, 2021

roji commented May 10, 2021

Wraith2 commented May 10, 2021

roji commented May 10, 2021

cheenamalhotra commented May 11, 2021 • edited Loading

roji commented May 11, 2021

cheenamalhotra commented May 11, 2021 • edited Loading

roji commented May 11, 2021

cheenamalhotra commented May 12, 2021 • edited Loading

roji commented May 12, 2021

binki commented Sep 1, 2022

jnm2 commented Sep 29, 2022 • edited Loading

jnm2 commented Sep 30, 2022

JRahnama commented Sep 30, 2022

roji commented May 10, 2021 •

edited

Loading

cheenamalhotra commented May 10, 2021 •

edited

Loading

roji commented May 10, 2021 •

edited

Loading

cheenamalhotra commented May 10, 2021 •

edited

Loading

cheenamalhotra commented May 11, 2021 •

edited

Loading

cheenamalhotra commented May 11, 2021 •

edited

Loading

cheenamalhotra commented May 12, 2021 •

edited

Loading

jnm2 commented Sep 29, 2022 •

edited

Loading