Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Stop wrapping single changes in transactions where possible #27439

Closed
Tracked by #26797 ...
roji opened this issue Feb 14, 2022 · 19 comments · Fixed by #27500
Closed
Tracked by #26797 ...

Stop wrapping single changes in transactions where possible #27439

roji opened this issue Feb 14, 2022 · 19 comments · Fixed by #27500
Assignees
Labels
area-perf area-save-changes closed-fixed The issue has been fixed and is/will be included in the release indicated by the issue milestone. providers-beware type-enhancement
Milestone

Comments

@roji
Copy link
Member

roji commented Feb 14, 2022

We currently always wrap changes in a transaction - but that isn't necessary when only a single change is involved. Here's a comparison of a single change with and without a transaction:

Method DatabaseType Mean Error StdDev Median Ratio RatioSD
UpdateWithTransaction Postgres 1,171.1 us 20.53 us 19.20 us 1,172.1 us 1.00 0.00
UpdateWithoutTransaction Postgres 998.9 us 19.17 us 16.99 us 999.4 us 0.85 0.02
UpdateWithoutTransaction_SetImplicitTransaction Postgres NA NA NA NA ? ?
Query Postgres 202.0 us 5.40 us 15.93 us 206.9 us 0.18 0.01

So the difference on PostgreSQL is 14.6%. Note that since unneeded roundtrips are eliminated (2 on Npgsql, 3 on SQL Server and most other providers), the perf gain increases as server latency increases (the above 14.6% are against localhost, so it's a minimum figure).

On SQL Server (localhost) the gain is higher, 27.8%.

Note that when an external transaction already exists, we still need to create a savepoint, even if there's only one change, since databases behave quite differently when a failure occurs during transaction, and a rollback to a savepoint is necessary in at least some cases (e.g. PostgreSQL).

Benchmark code
BenchmarkRunner.Run<Benchmark>();

public class Benchmark
{
    private DbConnection _connection;
    private DbCommand _command;

    [Params(DatabaseType.Postgres, DatabaseType.SqlServer)]
    public DatabaseType DatabaseType { get; set; }

    private async Task Setup()
    {
        _connection = DatabaseType == DatabaseType.Postgres
            ? new NpgsqlConnection("Host=localhost;Username=test;Password=test;Max Auto Prepare=10")
            : new SqlConnection("Server=localhost;Database=test;User=SA;Password=Abcd5678;Connect Timeout=60;ConnectRetryCount=0;Trust Server Certificate=true");
        await _connection.OpenAsync();

        using var cmd = _connection.CreateCommand();
        cmd.CommandText = @"
DROP TABLE IF EXISTS data;
CREATE TABLE data (id int PRIMARY KEY, num INT);
INSERT INTO data (id, num) VALUES (1, 1)";
        await cmd.ExecuteNonQueryAsync();
    }

    [GlobalSetup(Targets = new[] { nameof(UpdateWithTransaction), nameof(UpdateWithoutTransaction)})]
    public async Task Setup_Update()
    {
        await Setup();

        _command = _connection.CreateCommand();
        _command.CommandText = "UPDATE data SET num = num + 1";
    }

    [GlobalSetup(Target = nameof(UpdateWithoutTransaction_SetImplicitTransaction))]
    public async Task Setup_UpdateWithoutTransaction_SetImplicitTransaction()
    {
        if (DatabaseType != DatabaseType.SqlServer)
            throw new NotSupportedException("Benchmark relevant only for SQL Server");
        await Setup();

        _command = _connection.CreateCommand();
        _command.CommandText = "SET IMPLICIT_TRANSACTIONS OFF; UPDATE data SET num = num + 1";
    }


    [GlobalSetup(Target = nameof(Query))]
    public async Task Setup_Query()
    {
        await Setup();

        _command = _connection.CreateCommand();
        _command.CommandText = "SELECT num FROM data WHERE id = 1";
    }

    [Benchmark(Baseline = true)]
    public async Task UpdateWithTransaction()
    {
        var tx = _connection.BeginTransaction();
        _command.Transaction = tx;
        await _command.ExecuteNonQueryAsync();
        tx.Commit();
    }

    [Benchmark]
    public async Task UpdateWithoutTransaction()
    {
        await _command.ExecuteNonQueryAsync();
    }

    [Benchmark]
    public async Task UpdateWithoutTransaction_SetImplicitTransaction()
    {
        await _command.ExecuteNonQueryAsync();
    }

    [Benchmark]
    public async Task Query()
    {
        await _command.ExecuteNonQueryAsync();
    }
}

public enum DatabaseType
{
    Postgres,
    SqlServer
}
@roji
Copy link
Member Author

roji commented Feb 14, 2022

Note: I still need to do some due dilligence that dropping single-statement transactions has no effect across databases.

@ajcvickers
Copy link
Member

See #6342.

@roji
Copy link
Member Author

roji commented Feb 14, 2022

@ajcvickers thanks... I updated the issue with the benchmark code - note that this was a pure ADO.NET benchmark (though such gains are very likely to show through with EF too).

@ajcvickers
Copy link
Member

@roji Very possible things have changed since 2016. Also, that was SQL Server only. Also, I thought we added an option to DbContext to opt out of transaction creation, but I can't find it. So either we didn't do it, we removed it, or my long-term memory is corrupted.

@ajcvickers
Copy link
Member

ajcvickers commented Feb 14, 2022

@roji Found it. Issue #6339. PR #6413. API: context.Database.AutoTransactionsEnabled.

@roji
Copy link
Member Author

roji commented Feb 14, 2022

Ah, but that disables all transactions, regardless of the number of changes, right? That's quite different.

BTW am surprised that disabling the transaction speeds things up - usually it's the opposite. Interesting. Maybe removing the transaction was faster... with just one update...

@roji
Copy link
Member Author

roji commented Feb 14, 2022

Note: IMPLICIT_TRANSACTIONS is indeed a problem. Although this is an obscure feature, if it happens to be on, then changes aren't actually committed until the next COMMIT (see code sample below). We could mitigate this by prepending SET IMPLICIT_TRANSACTIONS OFF before each update (should be fine perf-wise); we'd be messing with connection state, though we already do that elsewhere (e.g. with NOCOUNT).

IMPLICIT_TRANSACTION demo
await using var conn = new SqlConnection("Server=localhost;Database=test;User=SA;Password=Abcd5678;Connect Timeout=60;ConnectRetryCount=0;Encrypt=false");
await conn.OpenAsync();

using var cmd = new SqlCommand(@"DROP TABLE IF EXISTS data; CREATE TABLE data (id int PRIMARY KEY, num INT)", conn);
await cmd.ExecuteNonQueryAsync();

// Regular (IMPLICIT_TRANSACTIONS is OFF)
cmd.CommandText = "INSERT INTO data (id, num) VALUES (1, 1)";
await cmd.ExecuteNonQueryAsync();

await conn.CloseAsync();
await conn.OpenAsync();

cmd.CommandText = "SELECT COUNT(*) FROM data";
Console.WriteLine("Rows1: " + await cmd.ExecuteScalarAsync()); // 1

// IMPLICIT_TRANSACTIONS is ON

cmd.CommandText = "SET IMPLICIT_TRANSACTIONS ON";
await cmd.ExecuteNonQueryAsync();

cmd.CommandText = "INSERT INTO data (id, num) VALUES (2, 2)";
await cmd.ExecuteNonQueryAsync();

await conn.CloseAsync();
await conn.OpenAsync();

cmd.CommandText = "SELECT COUNT(*) FROM data";
Console.WriteLine("Rows2: " + await cmd.ExecuteScalarAsync()); // 1

@roji
Copy link
Member Author

roji commented Feb 14, 2022

SQLite does autocommit (the sane thing) by default (docs), and I don't see a way to disable that (good).

MySQL has the same situation as SQL Server (docs): while it does autocommit by default, you can do set autocommit = 0, at which point there's an implicit transaction.

So:

  • We could allow providers to opt into no-transaction-for-single-updates.
  • PostgreSQL and SQLite would just turn this on
  • SQL Server and MySQL can decide if they want to turn this on, and to prepend all batches with a statement to turn autocommit on.

@roji
Copy link
Member Author

roji commented Feb 15, 2022

Note: updated the benchmark above with a scenario where we prepend SET IMPLICIT_TRANSACTIONS OFF to the outgoing batch, to force autocommit mode. The perf degradation is only 20us, or 1%; so perf-wise this technique is OK.

@roji
Copy link
Member Author

roji commented Feb 15, 2022

Note by @AndriySvyryd: this may cause us to prefer a single-statement for insert when returning generated values (#27372). For example, today for single-updates which involve identity columns, we do INSERT+SELECT:

INSERT INTO [Blogs] ([Name])
VALUES (@p0);
SELECT [Id], [Foo]
FROM [Blogs]
WHERE @@ROWCOUNT = 1 AND [Id] = scope_identity();

However, it's not certain that these two statements actually need to be wrapped in a transaction. Because we use read commited isolation (the default), concurrent updates between the INSERT and the SELECT are already possible (see #27446). Since scope_identity() doesn't care if it's in a transaction or not, I'm not sure the transaction is important.

@roji
Copy link
Member Author

roji commented Feb 18, 2022

Design decisions:

  • We will stop wrapping single-updates SaveChanges with transactions.
  • For SQL Server, we will prepend SET IMPLICIT_TRANSACTIONS OFF for SQL Server to ensure autocommit is on. Other providers which have autocommit opt-outs will need to do the same (/cc @lauxjpn for MySQL, where you could prepend SET autocommit=0).
  • For single-update SaveChanges that uses multiple SQL commands (INSERT+SELECT), we'll be stopping to use those because of concurrency issues which can affect the results (#27446) (at the same time, INSERT+SELECT should be fine without a transaction).

@GSPP
Copy link

GSPP commented Feb 22, 2022

Even the documentation for IMPLICIT_TRANSACTIONS says "it's not popular"... Maybe just ignore this? Are there customers who want this? I'd be surprised if there is an ask for that.

@roji
Copy link
Member Author

roji commented Feb 22, 2022

@GSPP the docs do say that, but they also say that when IMPLICIT_TRANSACTIONS is on, that's usually a result of ANSI_DEFAULTS - and it's hard to know how many people use that.

In any case, my logic here is that if we drop the transaction without handling IMPLICIT_TRANSACTIONS, the worst kind of breaking change would occur for those (few) who do turn it on: data will silently stop being committed; at the same time, compensating by setting IMPLICIT_TRANSACTIONS to off at the beginning of the batch has very little perf impact (otherwise I'd have hesitated more). This is very similar to doing SET NOCOUNT ON, which EF Core already adds at the beginning of every SaveChanges batch.

@roji roji changed the title Stop wrapping single changes in transactions Stop wrapping single changes in transactions where possible Feb 23, 2022
roji added a commit to roji/efcore that referenced this issue Feb 23, 2022
roji added a commit to roji/efcore that referenced this issue Feb 23, 2022
roji added a commit to roji/efcore that referenced this issue Mar 1, 2022
roji added a commit to roji/efcore that referenced this issue Mar 1, 2022
roji added a commit to roji/efcore that referenced this issue Mar 1, 2022
roji added a commit to roji/efcore that referenced this issue Mar 2, 2022
roji added a commit to roji/efcore that referenced this issue Mar 2, 2022
roji added a commit that referenced this issue Mar 4, 2022
@roji roji added the closed-fixed The issue has been fixed and is/will be included in the release indicated by the issue milestone. label Mar 4, 2022
@ajcvickers ajcvickers modified the milestones: 7.0.0, 7.0.0-preview3 Mar 15, 2022
@polterguy

This comment was marked as off-topic.

@roji

This comment was marked as off-topic.

@polterguy

This comment was marked as off-topic.

@roji

This comment was marked as off-topic.

@roji

This comment was marked as off-topic.

@polterguy

This comment was marked as off-topic.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area-perf area-save-changes closed-fixed The issue has been fixed and is/will be included in the release indicated by the issue milestone. providers-beware type-enhancement
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants