Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

InvalidOperationException when using MySQLExecutor and Pipelined query #1469

Closed
piotr-wieruszewski opened this issue May 20, 2020 · 8 comments

Comments

@piotr-wieruszewski
Copy link

The above Exception is thrown when executing the ExecuteAsync method, CommandDefinition having flags: CommandFlags.Pipelined and more than 2 rows require inserting. E.g.

await connection.ExecuteAsync(
    new CommandDefinition(
        "INSERT INTO Orders(orderId) VALUES (@orderId)", 
         orders, flags: CommandFlags.Pipelined));

The exception:

System.InvalidOperationException: Cannot set MySqlCommand.CommandText when there is an open DataReader for this command; it must be closed first.
   at MySql.Data.MySqlClient.MySqlCommand.set_CommandText(String value) in /_/src/MySqlConnector/MySql.Data.MySqlClient/MySqlCommand.cs:line 147
   at Dapper.CommandDefinition.SetupCommand(IDbConnection cnn, Action`2 paramReader) in /_/Dapper/CommandDefinition.cs:line 114
   at Dapper.SqlMapper.ExecuteMultiImplAsync(IDbConnection cnn, CommandDefinition command, IEnumerable multiExec) in /_/Dapper/SqlMapper.Async.cs:line 610

Exception is happening for MySqlConnector version 0.66.0 (not tested with previous versions) and Dapper versions 1.50.5-2.0.35.
Application is run on .NET Core 2.1.

When switching to MySQL.Data library the above query works fine.

@mgravell
Copy link
Member

It sounds like your chosen provider doesn't support pipelining,or it isn't enabled. Most simply do not, and even those that do: sometimes (for example SqlClient) require it to be opt-in enabled on the connection string. So: if it isn't enabled: don't specify the pipelined flag. That's why we make it opt-in.

@piotr-wieruszewski
Copy link
Author

Just to clarify - MySQL.Data library with exactly the same connection string (there is not much to opt-in in MySQL in regards to pipelined queries, no MARS support) works and I can see performance improvements - is using this method (with pipelined flag passed): https://github.com/StackExchange/Dapper/blob/d403fa5b5481291ca84aaf11ac4d61a1034d3b72/Dapper/SqlMapper.Async.cs#L547.
However, when switched to MySQLConnector: https://mysqlconnector.net/ - which passes ADO.NET specification tests: https://mysql-net.github.io/AdoNetResults/ is failing with exactly the same connection string when trying to execute the same pipelined query.
Could you confirm that you mean that MySQL doesn't support Dapper pipelined queries and that feature shouldn't be used with MySQL? Or, something wrong with that connector (MySQLConnector )when trying to execute pipelined queries?

@mgravell
Copy link
Member

No, I absolutely cannot "confirm" anything about either of those two providers. They're not my providers. But anecdotally, it would seem a reasonable conclusion based on the available data. It is a niche feature that very few providers support. It wouldn't surprise me to find another that doesn't.

@bgrainger
Copy link
Contributor

I'm not familiar with Dapper's pipelining feature (it's not mentioned at https://github.com/StackExchange/Dapper/blob/master/Readme.md).

This may be a MySqlConnector bug (over-zealous enforcement of preconditions), or it may not actually be supported. If it does work with MySql.Data, it could be because none of its async methods are actually asynchronous (https://bugs.mysql.com/bug.php?id=70111) so async methods might just be executing in a synchronous loop.

I've opened a case to investigate: mysql-net/MySqlConnector#823

This issue can be closed, because the problem is highly unlikely to be on Dapper's end.

@bgrainger
Copy link
Contributor

(Commenting here rather than on the MySqlConnector issue so Marc can set me straight if I'm misunderstanding.)

I've looked over the pipelining code in SqlMapper.Async.cs line 547ff: https://github.com/StackExchange/Dapper/blob/d403fa5b5481291ca84aaf11ac4d61a1034d3b72/Dapper/SqlMapper.Async.cs#L547

It appears to create a queue of up to 100 MySqlCommand objects and execute them all on the same MySqlConnection "simultaneously" (or at least not waiting for each one to complete before initiating the next one). This is definitely not officially supported by MySqlConnector: https://mysqlconnector.net/troubleshooting/connection-reuse/

(Also, my guess above was right, because MySql.Data will always return an already-completed Task from this line of code: https://github.com/StackExchange/Dapper/blob/d403fa5b5481291ca84aaf11ac4d61a1034d3b72/Dapper/SqlMapper.Async.cs#L589)

The simple answer is: this isn't supported.

The real answer is: it's complicated. Some MySQL servers (e.g., MySQL, MariaDB) do support pipelining. Others (e.g., Aurora) will fail hard if this is attempted. However, this isn't exposed through the MySqlConnector ADO.NET API at all. For "safety" reasons (most attempts to do this are incorrect usage of the API that will generate incorrect results), and to keep the code (much) simpler (from not having to track multiple in-flight operations with thread-safe code and align the results to the MySqlCommand that generated them), I don't have any plans to implement this.

If you want a more efficient way of executing multiple commands, you could bypass Dapper and use MySqlDbBatch (see https://mysqlconnector.net/api/mysql-batch/; background at mysql-net/MySqlConnector#650).

@piotr-wieruszewski
Copy link
Author

I am using Piplined Dapper queries with MySQL.Data and Aurora RDS for MySQL (also tested with Aurora Serverless for MySQL) and it works well.

(Also, my guess above was right, because MySql.Data will always return an already-completed Task from this line of code:

Dapper/Dapper/SqlMapper.Async.cs

Line 589 in d403fa5

var task = cmd.ExecuteNonQueryAsync(command.CancellationToken);
)

I am not so sure about it (that the task is already completed when using MySQL.Data) as based on my observation Dapper Pipelined query improves the query speed by around 30% (vs query without Pipeline flag) for a query with large enough list of records to update/insert.
It seems that Pipelined query is a Dapper concept - it's a form of a Pipeline pattern: https://michaelscodingspot.com/pipeline-pattern-implementations-csharp/ - as @bgrainger explained - there is a queue with max of 100 unfinished tasks. In case of MySQL, by Pipelined query support we mean that it allows to execute more than one query using the same MySqlCommand (just the text of the command is modified).
As you said, it seems that it doesn't work with MySQLConnector because that provider enforces one DataReader object per connection: https://github.com/mysql/mysql-connector-net/blob/68c54371821c87ff40a773acc127ce357b46a5ae/Source/MySql.Data/command.cs#L321
I am not sure what is the reason for that, but potentially Dapper could use unique MySqlCommand object per Task?
Or, I could potentially move that Pipeline pattern to my code as I don't want to use MySQL batch feature in MySQLConnector as it is experimental feature.

@bgrainger
Copy link
Contributor

I am using Piplined Dapper queries with MySQL.Data and Aurora RDS for MySQL (also tested with Aurora Serverless for MySQL) and it works well.

I can guarantee you it's not pipelining them. The only occurrence of ExecuteNonQueryAsync in the MySql.Data library is in this code, where it clearly returns a completed task: https://github.com/mysql/mysql-connector-net/blob/3b8c67a21eda57eaecbef38d291fc7d026cd184e/MySQL.Data/src/MySqlHelper.cs#L370-L390

(The method you're calling is actually in the base DbCommand class which does exactly the same thing.)

The commands are being executed synchronously one after the other.

based on my observation Dapper Pipelined query improves the query speed by around 30% (vs query without Pipeline flag) for a query with large enough list of records to update/insert.

I can't see an obvious reason for that in the Dapper code. (Unless reusing a MySqlCommand in MySql.Data is somehow much worse performance-wise than creating 100 distinct MySqlCommand objects?) Anything more is just speculation at this point; we'd need to see a comparison of performance traces captured with dotTrace or similar to understand why Pipelined is faster for MySql.Data.

I am not sure what is the reason for that, but potentially Dapper could use unique MySqlCommand object per Task?

In MySqlConnector, a MySqlConnection can only be used by one thread/command at a time. The same is actually true for MySql.Data, but you don't experience the problem with that library because all its async I/O is actually synchronous; you'd have to explicitly use multiple threads to try to implement pipelining to get it to happen.

@piotr-wieruszewski
Copy link
Author

Closing this issue as it's explained well here - pipelined queries are actually not supported by MySQL.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants