SQL Server: Optimize SQL Server OUTPUT clause usage when retrieving database-generated values #27372
Comments
Thanks @ajcvickers, I was looking into IDENTITY as well. Reopened #7188 to revisit that pattern too. |
@AndriySvyryd's suggestion of trying a memory-optimized TVP does reduce the time to something much closer to regular OUTPUT. However:
Note: I also benchmarked using ExecuteReaderAsync instead of ExecuteScalarAsync, and there was no difference.
Benchmark code
BenchmarkRunner.Run<SequenceBenchmark>();
public class SequenceBenchmark
{
const string ConnectionString = "Server=localhost;Database=test;User=SA;Password=Abcd5678;Connect Timeout=60;ConnectRetryCount=0;Encrypt=false";
private SqlConnection _connection;
[GlobalSetup]
public async Task Setup()
{
_connection = new SqlConnection(ConnectionString);
await _connection.OpenAsync();
await using var cmd = new SqlCommand(@"
DROP TYPE IF EXISTS [tempTableType];
DROP TABLE IF EXISTS [Foo];
DROP SEQUENCE IF EXISTS [FooSeq];
CREATE SEQUENCE [FooSeq] AS int START WITH 1 INCREMENT BY 1 NO MINVALUE NO MAXVALUE NO CYCLE;
CREATE TABLE [Foo] (
[Id] int PRIMARY KEY NOT NULL DEFAULT (NEXT VALUE FOR FooSeq),
[BAR] int
);
CREATE TYPE [tempTableType] AS TABLE ([Id] int PRIMARY KEY NONCLUSTERED) WITH (MEMORY_OPTIMIZED = ON)", _connection);
await cmd.ExecuteNonQueryAsync();
}
[Benchmark]
public async Task NoOutput()
{
await using var cmd = new SqlCommand("INSERT INTO [Foo] ([Bar]) VALUES (8)", _connection);
_ = await cmd.ExecuteScalarAsync();
}
[Benchmark]
public async Task Output()
{
await using var cmd = new SqlCommand("INSERT INTO [Foo] ([Bar]) OUTPUT INSERTED.[Id] VALUES (8)", _connection);
_ = await cmd.ExecuteScalarAsync();
}
[Benchmark(Baseline = true)]
public async Task OutputInto()
{
await using var cmd = new SqlCommand(@"DECLARE @inserted TABLE ([Id] int);
INSERT INTO [Foo] ([Bar]) OUTPUT INSERTED.[Id] INTO @inserted VALUES (8);
SELECT [i].[Id] FROM @inserted i;
", _connection);
_ = await cmd.ExecuteScalarAsync();
}
[Benchmark]
public async Task OutputInto_MemoryOptimized_persistent_type()
{
await using var cmd = new SqlCommand(@"DECLARE @inserted [tempTableType];
INSERT INTO [Foo] ([Bar]) OUTPUT INSERTED.[Id] INTO @inserted VALUES (8);
SELECT [i].[Id] FROM @inserted i;
", _connection);
_ = await cmd.ExecuteScalarAsync();
}
[Benchmark]
public async Task OutputInto_MemoryOptimized_transient_type()
{
await using var cmd = new SqlCommand(@"CREATE TYPE [transientTempTableType] AS TABLE ([Id] int PRIMARY KEY NONCLUSTERED) WITH (MEMORY_OPTIMIZED = ON)
DECLARE @inserted [tempTableType];
INSERT INTO [Foo] ([Bar]) OUTPUT INSERTED.[Id] INTO @inserted VALUES (8);
SELECT [i].[Id] FROM @inserted i;
DROP TYPE [transientTempTableType];
", _connection);
_ = await cmd.ExecuteScalarAsync();
}
[GlobalCleanup]
public ValueTask Cleanup()
=> _connection.DisposeAsync();
} |
Why are you testing with SEQUENCE? No one uses that, not even the migrations conventions. |
@ErikEJ it doesn't really matter - it can be any sort of HasDefaultValue/HasDefaultValueSql. It's just that for IDENTITY specifically EF Core uses a different mechanism to bring back the value, so I preferred not to benchmark that. Basically, anyone who has a database-generated column but isn't using IDENTITY is affected by this. The above is also specific to inserting a few rows - when batching many rows we use MERGE. A similar issue may exist there, but more research is needed. |
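For concreteness, here is a minimal sketch of the kind of mapping affected: a key generated by a sequence via a default constraint rather than IDENTITY. The Foo entity and FooSeq names mirror the benchmark above and are otherwise placeholders, not EF's own code:

```csharp
// Hypothetical mapping sketch: a non-IDENTITY, database-generated key.
// The value comes from a default constraint backed by a sequence, so SaveChanges
// has to read it back from the database after the INSERT.
using Microsoft.EntityFrameworkCore;

public class Foo
{
    public int Id { get; set; }
    public int Bar { get; set; }
}

public class FooContext : DbContext
{
    public DbSet<Foo> Foos => Set<Foo>();

    protected override void OnModelCreating(ModelBuilder modelBuilder)
    {
        modelBuilder.HasSequence<int>("FooSeq");

        modelBuilder.Entity<Foo>()
            .Property(f => f.Id)
            .HasDefaultValueSql("NEXT VALUE FOR [FooSeq]");
    }
}
```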
Seems like quite a narrow audience. For me it is usually either IDENTITY or a client side generated guid. |
Maybe you're right, though if you have a guid key plus some other generated column, you're affected by this too. I'm looking at the batching version too (with MERGE), but we may end up deciding it's not worth addressing this. |
Yeah, but again a very uncommon edge case scenario, in my experience. |
I've taken a look at our MERGE usage, which also uses OUTPUT INTO. Here are some ideas:
The above are just ideas at this point... I do believe that 3ms is a pretty huge difference, though.
Benchmark results
Benchmark code
BenchmarkRunner.Run<SequenceBenchmark>();
public class SequenceBenchmark
{
const string ConnectionString = "Server=localhost;Database=test;User=SA;Password=Abcd5678;Connect Timeout=60;ConnectRetryCount=0;Encrypt=false";
private SqlCommand _command;
private SqlConnection _connection;
[Params(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)]
public int Rows { get; set; }
private async Task Setup()
{
_connection = new SqlConnection(ConnectionString);
await _connection.OpenAsync();
await using var cmd = new SqlCommand(@"
DROP TABLE IF EXISTS [Foo];
DROP SEQUENCE IF EXISTS [FooSeq];
CREATE SEQUENCE [FooSeq] AS int START WITH 1 INCREMENT BY 1 NO MINVALUE NO MAXVALUE NO CYCLE;
CREATE TABLE [Foo] (
[Id] int PRIMARY KEY NOT NULL DEFAULT (NEXT VALUE FOR FooSeq),
[BAR] int
);", _connection);
await cmd.ExecuteNonQueryAsync();
}
[GlobalSetup(Target = nameof(Merge_with_OutputInto))]
public async Task Setup_Merge_with_OutputInto()
{
await Setup();
_command = new SqlCommand { Connection = _connection };
var builder = new StringBuilder();
builder.AppendLine(@"DECLARE @inserted0 TABLE ([Id] int, [_Position] [int]);
MERGE [Foo] USING (
VALUES");
for (var i = 0; i < Rows; i++)
{
builder.Append($"(@p{i}, {i})");
if (i < Rows - 1)
builder.AppendLine(",");
_command.Parameters.Add(new("p" + i, i));
}
builder.Append(@") AS i ([Name], _Position) ON 1=0
WHEN NOT MATCHED THEN
INSERT ([Name])
VALUES (i.[Name])
OUTPUT INSERTED.[Id], i._Position
INTO @inserted0;
SELECT [i].[Id] FROM @inserted0 i
ORDER BY [i].[_Position];");
_command.CommandText = builder.ToString();
}
[GlobalSetup(Target = nameof(Merge_with_Output))]
public async Task Setup_Merge_with_Output()
{
await Setup();
_command = new SqlCommand { Connection = _connection };
var builder = new StringBuilder();
builder.AppendLine(@"MERGE [Blogs] USING (
VALUES");
for (var i = 0; i < Rows; i++)
{
builder.Append($"(@p{i})");
if (i < Rows - 1)
builder.AppendLine(",");
_command.Parameters.Add(new("p" + i, i));
}
builder.Append(@") AS i ([Name]) ON 1=0
WHEN NOT MATCHED THEN
INSERT ([Name])
VALUES (i.[Name])
OUTPUT INSERTED.[Id];");
_command.CommandText = builder.ToString();
}
[GlobalSetup(Target = nameof(BatchedInserts))]
public async Task Setup_BatchInserts()
{
await Setup();
_command = new SqlCommand { Connection = _connection };
var builder = new StringBuilder();
for (var i = 0; i < Rows; i++)
{
if (i > 0)
builder.Append("; ");
builder.Append($"INSERT INTO [Blogs] ([Name]) VALUES (@p{i})");
_command.Parameters.Add(new("p" + i, i));
}
_command.CommandText = builder.ToString();
}
[GlobalSetup(Target = nameof(Inserts))]
public async Task Setup_Inserts()
{
await Setup();
_command = new SqlCommand("INSERT INTO [Blogs] ([Name]) VALUES (@p0)", _connection);
_command.Parameters.Add(new SqlParameter { ParameterName = "p0" });
}
[Benchmark(Baseline = true)]
public async Task Merge_with_OutputInto()
=> _ = await _command.ExecuteScalarAsync();
[Benchmark]
public async Task Merge_with_Output()
=> _ = await _command.ExecuteScalarAsync();
[Benchmark]
public async Task BatchedInserts()
=> _ = await _command.ExecuteScalarAsync();
[Benchmark]
public async Task Inserts()
{
for (var i = 0; i < Rows; i++)
{
_command.Parameters[0].Value = i;
_ = await _command.ExecuteScalarAsync();
}
}
[GlobalCleanup]
public ValueTask Cleanup()
=> _connection.DisposeAsync();
} |
So, for generated keys, you mean pass in the temporary values for all the key columns, and then read them back out correlated with the real values? |
No need to pass them in - just ask for them to be sent back to you via the OUTPUT clause. Yeah, this adds more traffic, but given the extreme overhead of OUTPUT INTO there's a good chance it's justified. (note also the very common scenario where the ID itself is database-generated, so must be in the OUTPUT clause in any case). |
One additional note from the above: if we did switch to MERGE with OUTPUT instead of OUTPUT INTO, it becomes an efficient strategy starting with 1 row, so we could drop the INSERT/MERGE threshold and just always do MERGE (of course the above caveats still hold). |
But how can we look up the entity for which to set the generated ID when we don't know the ID until it is generated? |
Oops, you're right - I got tangled up here between the various scenarios. Will think about this some more and post something later. |
We still have the option of using MERGE with the "fake" position column, but with simple OUTPUT instead of OUTPUT INTO:
MERGE [Foo] USING (
VALUES (100, 0),
(101, 1),
(102, 2),
(103, 3),
(104, 4),
(105, 5),
(106, 6),
(107, 7),
(108, 8),
(109, 9)) AS i ([BAR], _Position) ON 1=0
WHEN NOT MATCHED THEN
INSERT ([BAR])
VALUES (i.[BAR])
OUTPUT INSERTED.[Id], i._Position;
That precludes doing a server-side ORDER BY (as we do today with OUTPUT INTO), and the ordering isn't guaranteed. But on the client we could look up the entity instance by its position and propagate the server-generated values. The cost here is transferring the additional position value to the client (per row), weighed against the 3x perf hit of using OUTPUT INTO. Though it seems pretty clear which way this goes perf-wise, I'll do a benchmark later and post it. |
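To illustrate the client-side half of this idea, here's a minimal sketch of correlating the returned rows with entity instances via the position column. This is not EF's actual implementation; the Blog type, the two-column reader shape ([Id], [_Position]) and the ordering assumption on the entities list are all assumptions:

```csharp
// Hypothetical sketch: propagate server-generated IDs back to entity instances
// using the extra _Position column, so the (unordered) OUTPUT row order doesn't matter.
using System.Collections.Generic;
using System.Threading.Tasks;
using Microsoft.Data.SqlClient;

public class Blog
{
    public int Id { get; set; }
    public string? Name { get; set; }
}

public static class GeneratedValuePropagation
{
    // `entities` must be in the same order as the rows in the MERGE ... VALUES list.
    // Assumes the command returns two int columns: INSERTED.[Id] and i._Position.
    public static async Task PropagateIdsAsync(SqlCommand command, IReadOnlyList<Blog> entities)
    {
        await using var reader = await command.ExecuteReaderAsync();
        while (await reader.ReadAsync())
        {
            var id = reader.GetInt32(0);       // INSERTED.[Id]
            var position = reader.GetInt32(1); // i._Position
            entities[position].Id = id;        // single O(1) list lookup per row
        }
    }
}
```

The per-row cost on the client is a single list index, which leaves the extra transferred column as the only real overhead to weigh against the OUTPUT INTO hit.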
See below for another, more complete benchmark run. tl;dr doing OUTPUT with position has the same perf as OUTPUT without it.
Benchmark code
BenchmarkRunner.Run<SequenceBenchmark>();
public class SequenceBenchmark
{
const string ConnectionString = "Server=localhost;Database=test;User=SA;Password=Abcd5678;Connect Timeout=60;ConnectRetryCount=0;Encrypt=false";
private SqlCommand _command;
private SqlConnection _connection;
[Params(1, 2, 3, 4, 5, 8, 10, 50, 100, 500)]
// [Params(1)]
public int Rows { get; set; }
private async Task Setup()
{
_connection = new SqlConnection(ConnectionString);
await _connection.OpenAsync();
await using var cmd = new SqlCommand(@"
DROP TABLE IF EXISTS [Foo];
CREATE TABLE [Foo] (
[Id] int IDENTITY PRIMARY KEY,
[Bar] int
);", _connection);
await cmd.ExecuteNonQueryAsync();
}
[GlobalSetup(Target = nameof(Merge_with_OutputInto))]
public async Task Setup_Merge_with_OutputInto()
{
await Setup();
_command = new SqlCommand { Connection = _connection };
var builder = new StringBuilder();
builder.AppendLine(@"DECLARE @inserted0 TABLE ([Id] int, [_Position] [int]);
MERGE [Foo] USING (
VALUES");
for (var i = 0; i < Rows; i++)
{
builder.Append($"(@p{i}, {i})");
if (i < Rows - 1)
builder.AppendLine(",");
_command.Parameters.Add(new("p" + i, 8));
}
builder.Append(@") AS i ([Bar], _Position) ON 1=0
WHEN NOT MATCHED THEN
INSERT ([Bar])
VALUES (i.[Bar])
OUTPUT INSERTED.[Id], i._Position
INTO @inserted0;
SELECT [i].[Id] FROM @inserted0 i
ORDER BY [i].[_Position];");
_command.CommandText = builder.ToString();
}
[GlobalSetup(Target = nameof(Merge_with_Output))]
public async Task Setup_Merge_with_Output()
{
await Setup();
_command = new SqlCommand { Connection = _connection };
var builder = new StringBuilder();
builder.AppendLine(@"MERGE [Foo] USING (
VALUES");
for (var i = 0; i < Rows; i++)
{
builder.Append($"(@p{i})");
if (i < Rows - 1)
builder.AppendLine(",");
_command.Parameters.Add(new("p" + i, 8));
}
builder.Append(@") AS i ([Bar]) ON 1=0
WHEN NOT MATCHED THEN
INSERT ([Bar])
VALUES (i.[Bar])
OUTPUT INSERTED.[Id];");
_command.CommandText = builder.ToString();
}
[GlobalSetup(Target = nameof(Merge_with_Output_with_position))]
public async Task Setup_Merge_with_Output_with_position()
{
await Setup();
_command = new SqlCommand { Connection = _connection };
var builder = new StringBuilder();
builder.AppendLine(@"MERGE [Foo] USING (
VALUES");
for (var i = 0; i < Rows; i++)
{
builder.Append($"(@p{i}, {i})");
if (i < Rows - 1)
builder.AppendLine(",");
_command.Parameters.Add(new("p" + i, 8));
}
builder.Append(@") AS i ([Bar], _Position) ON 1=0
WHEN NOT MATCHED THEN
INSERT ([Bar])
VALUES (i.[Bar])
OUTPUT INSERTED.[Id], i._Position;");
_command.CommandText = builder.ToString();
}
[GlobalSetup(Target = nameof(Merge_without_Output))]
public async Task Setup_Merge_without_Output()
{
await Setup();
_command = new SqlCommand { Connection = _connection };
var builder = new StringBuilder();
builder.AppendLine(@"MERGE [Foo] USING (
VALUES");
for (var i = 0; i < Rows; i++)
{
builder.Append($"(@p{i})");
if (i < Rows - 1)
builder.AppendLine(",");
_command.Parameters.Add(new("p" + i, 8));
}
builder.Append(@") AS i ([Bar]) ON 1=0
WHEN NOT MATCHED THEN
INSERT ([Bar])
VALUES (i.[Bar]);");
_command.CommandText = builder.ToString();
}
[GlobalSetup(Target = nameof(Insert_row_values_without_Output))]
public async Task Setup_Insert_row_values_without_Output()
{
await Setup();
_command = new SqlCommand { Connection = _connection };
var builder = new StringBuilder();
builder.Append(@"INSERT INTO [Foo] ([Bar]) VALUES ");
for (var i = 0; i < Rows; i++)
{
builder.Append($"(@p{i})");
if (i < Rows - 1)
builder.Append(",");
_command.Parameters.Add(new("p" + i, 8));
}
_command.CommandText = builder.ToString();
}
[GlobalSetup(Target = nameof(BatchedInserts))]
public async Task Setup_BatchInserts()
{
await Setup();
_command = new SqlCommand { Connection = _connection };
var builder = new StringBuilder();
for (var i = 0; i < Rows; i++)
{
if (i > 0)
builder.Append("; ");
builder.Append($"INSERT INTO [Foo] ([Bar]) VALUES (@p{i})");
_command.Parameters.Add(new("p" + i, 8));
}
_command.CommandText = builder.ToString();
}
[GlobalSetup(Target = nameof(Inserts))]
public async Task Setup_Inserts()
{
await Setup();
_command = new SqlCommand("INSERT INTO [Foo] ([Bar]) VALUES (@p0)", _connection);
_command.Parameters.AddWithValue("p0", 8);
}
[GlobalSetup(Target = nameof(Insert_and_Select_batched))]
public async Task Setup_Insert_and_Select_batched()
{
await Setup();
_command = new SqlCommand { Connection = _connection };
var builder = new StringBuilder();
for (var i = 0; i < Rows; i++)
{
if (i > 0)
builder.Append("; ");
builder.Append(@$"INSERT INTO [Foo] ([Bar]) VALUES (@p{i});
SELECT [Id] FROM [Foo] WHERE @@ROWCOUNT = 1 AND [Id] = scope_identity();");
_command.Parameters.Add(new("p" + i, 8));
}
_command.CommandText = builder.ToString();
}
[GlobalSetup(Target = nameof(Insert_and_Select))]
public async Task Setup_Insert_and_Select()
{
await Setup();
_command = new SqlCommand(@"INSERT INTO [Foo] ([Bar]) VALUES (@p0);
SELECT [Id] FROM [Foo] WHERE @@ROWCOUNT = 1 AND [Id] = scope_identity()", _connection);
_command.Parameters.AddWithValue("p0", 8);
}
[Benchmark(Baseline = true)]
public async Task Merge_with_OutputInto()
{
using var reader = await _command.ExecuteReaderAsync();
while (reader.Read())
_ = reader.GetInt32(0);
}
[Benchmark]
public async Task Merge_with_Output()
{
using var reader = await _command.ExecuteReaderAsync();
while (reader.Read())
_ = reader.GetInt32(0);
}
[Benchmark]
public async Task Merge_with_Output_with_position()
{
using var reader = await _command.ExecuteReaderAsync();
while (reader.Read())
_ = reader.GetInt32(0);
}
[Benchmark]
public async Task Merge_without_Output()
=> await _command.ExecuteNonQueryAsync();
[Benchmark]
public async Task Insert_row_values_without_Output()
=> await _command.ExecuteNonQueryAsync();
[Benchmark]
public async Task BatchedInserts()
=> await _command.ExecuteNonQueryAsync();
[Benchmark]
public async Task Inserts()
{
for (var i = 0; i < Rows; i++)
await _command.ExecuteNonQueryAsync();
}
[Benchmark]
public async Task Insert_and_Select_batched()
{
using var reader = await _command.ExecuteReaderAsync();
while (reader.Read())
_ = reader.GetInt32(0);
}
[Benchmark]
public async Task Insert_and_Select()
{
for (var i = 0; i < Rows; i++)
_ = await _command.ExecuteScalarAsync();
}
[GlobalCleanup]
public ValueTask Cleanup()
=> _connection.DisposeAsync();
} |
@roji You'd need to also take into account the client-side lookup for OUTPUT with position, though I doubt it will be significant. |
@AndriySvyryd you're right. Ideally that would be a single array/list lookup per row, which should be really negligible. But in any case, assuming we move forward with this, I'll do proper benchmarking on the final implementation too. |
Note for triage: clearing milestone since nobody is assigned. |
When I'm trying to save into a table with computed columns, I get this error:
Then I follow the link in the suggested solution (https://aka.ms/efcore-docs-sqlserver-save-changes-and-computed-columns) and the "mitigation" says to use the HasTrigger("SomeTrigger") method. If I do that in my code - even if I actually don't have any trigger on that table - then the save operation completes successfully. But I don't like the "solution/mitigation" because:
Can you, please, give me a better solution? Thank you. |
You should be able to add the HasTrigger statement in a partial OnModelCreatingPartial method in a partial class that will not get overwritten. |
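A minimal sketch of that approach, assuming a scaffolded context named MyDbContext and an already-scaffolded entity class Transaction mapped to the affected table (both names are placeholders):

```csharp
// Hypothetical sketch: a partial class placed next to the scaffolded MyDbContext,
// so the configuration survives re-scaffolding.
using Microsoft.EntityFrameworkCore;

public partial class MyDbContext
{
    partial void OnModelCreatingPartial(ModelBuilder modelBuilder)
    {
        // The trigger name is never used by EF; declaring any trigger makes SaveChanges
        // fall back to the less efficient SQL that is compatible with triggers.
        modelBuilder.Entity<Transaction>()
            .ToTable(tb => tb.HasTrigger("SomeTrigger"));
    }
}
```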
I agree, but that would fix only my second point. What about the first one? I should not "lie" to EF about having a trigger, when I don't have one. I should have something like
|
For the first one, the docs are being updated. |
And do we have an estimation about when they would be ready? |
Next time the docs are published, but you can have a peek here: https://github.com/dotnet/EntityFramework.Docs/pull/4131/files |
Hi, I don't see an example of how to solve this breaking change when I try to do an update to a table with computed columns. |
@gpovedaDev just pretend the table has a trigger. |
I wonder if I need to specify something to indicate that the name I put in HasTrigger comes from a computed column. Let's say I have a table called transaction with a computed column Commission. |
@gpovedaDev no, the name you give the trigger is irrelevant; just telling EF that there's a trigger on the table should automatically make it revert to the less efficient update technique, which is compatible with computed columns. If that's not working for you, please open a new issue with a runnable, minimal repro. |
@roji Is there any way to globally disable the new behavior? I'm asking because I'm using Azure SQL Data Sync, which adds multiple triggers on all tables, so configuring HasTrigger for every table is really time-consuming. |
@ajcvickers Thanks. Sorry for the noise, I missed the section with the |
While still an option, it would be nice to have a simple flag to opt out of these improvements and the underlying breaking changes instead of copy-pasting the
@mayureshs EF 8.0 introduced UseSqlOutputClause() as an alternative to HasTrigger(), see these docs. But aside from that, what's your exact issue with having this in the model snapshot? |
@roji, thanks for replying. Our database does not have any application-level triggers but uses Azure Data Sync. This worked fine with EF6, but it was my understanding that when upgrading to EF8 we needed to add the BlankTriggerConvention. Correct me if I'm wrong, but instead of the BlankTriggerConvention do we now have to specify tb.UseSqlOutputClause(false) for every table in our model? |
@mayureshs I'm not very familiar with Azure Data Sync, but if it doesn't support the SQL OUTPUT clause (like tables with triggers), then you'll have to inform EF to avoid using the OUTPUT clause for those tables. Which mechanism you use to do that - by saying there's a trigger or by using UseSqlOutputClause - doesn't matter, you can use either one. |
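For reference, a minimal sketch of the two equivalent mechanisms side by side (UseSqlOutputClause requires EF Core 8+; Blog and AppDbContext are placeholder names, not anything from this thread):

```csharp
// Hypothetical sketch: two equivalent ways to keep EF from using the OUTPUT clause for a table.
using Microsoft.EntityFrameworkCore;

public class Blog
{
    public int Id { get; set; }
    public string? Name { get; set; }
}

public class AppDbContext : DbContext
{
    public DbSet<Blog> Blogs => Set<Blog>();

    protected override void OnModelCreating(ModelBuilder modelBuilder)
    {
        // EF Core 8+: explicit opt-out from the OUTPUT clause for this table.
        modelBuilder.Entity<Blog>()
            .ToTable(tb => tb.UseSqlOutputClause(false));

        // Equivalent effect: declare a (possibly nonexistent) trigger on the table.
        // modelBuilder.Entity<Blog>().ToTable(tb => tb.HasTrigger("AnyName"));
    }
}
```

If every table needs it (for example with Azure SQL Data Sync), the same call can be repeated per entity or wrapped in a loop over the model's entity types.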
When inserting an entity with database-generated columns, we currently generate the following (as long as there's no IDENTITY column):
This could be simplified into this:
The roundabout through the inserted0 TVP is probably because the OUTPUT clause won't work if there's a trigger defined, unless it's an OUTPUT INTO (??) (@AndriySvyryd is this right, any more context?). Unfortunately, using OUTPUT INTO instead of OUTPUT adds a lot of overhead:
That's over 3ms just for passing through a TVP! Also, mysteriously, the version with no OUTPUT clause at all performs worse...
Remarks:
Use OUTPUT by default, and switch back to OUTPUT INTO if the user tells us the table has triggers (via metadata).
Benchmark code