Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

mysqlnd: support ER_CLIENT_INTERACTION_TIMEOUT #13618

Merged

Conversation

Appla
Copy link
Contributor

@Appla Appla commented Mar 7, 2024

MySQL introduced a new error type for wait_timeout in (mysql/mysql-server@14508bb )
which causes mysqlnd report warning Packets out of order. Expected 1 received 0. Packet size=145 see (https://bugs.php.net/bug.php?id=81335)

This MR supports this new error type in a conservative way (and only handles this type of error).

@kamil-tekiela
Copy link
Member

Could you provide a unit test for this?

If I remember correctly there was a bug report about this. Do you know the number?

ext/mysqlnd/mysqlnd_wireprotocol.c Outdated Show resolved Hide resolved
@@ -292,9 +310,12 @@ mysqlnd_read_packet_header_and_body(MYSQLND_PACKET_HEADER * packet_header,
{
DBG_ENTER("mysqlnd_read_packet_header_and_body");
DBG_INF_FMT("buf=%p size=%zu", buf, buf_size);
assert(error_info->error_no == 0);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is this for? Is this necessary?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The following part required error_info->error_no are 0 here, is this always true in here?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To be honest, I don't know if this is guaranteed, but it seems to me like it should be. Either way, I don't think adding an assert is correct here. Can you remove it and see if it still works?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removing this assert has no effect. I originally considered keeping it in debug mode to catch the impact of other changes in time.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also think this assert is unnecessary.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, wait a minute. The problem doesn't seem to be here.
It seems like the JIT test is failing, so I'll do some research.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

MYSQLND_ERROR_INFO * error_info can be shared when using persistent connections.
In other words, depending on the processing timing of the processes running in parallel, there is a slight possibility that a value other than 0 may be entered here.

I don't know why such a situation occurred in CI testing.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

However, it does not reproduce locally even if I run it in parallel. If it's thread-safe, this probably won't happen...

Copy link
Member

@SakiTakamachi SakiTakamachi Mar 10, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I found that there are routes such as mysqlnd_handle_local_infile that call mysqlnd_read_packet_header_and_body with an error set. Therefore, assert should not be done.

(I'm probably wrong about parallel execution and the value after closing. Please forget it.)

FYI

  1. SET_CLIENT_ERROR(conn->error_info, CR_LOAD_DATA_LOCAL_INFILE_REJECTED, UNKNOWN_SQLSTATE,
  2. if (FAIL == conn->payload_decoder_factory->m.send_command_handle_response(
  3. php_mysqlnd_ok_read(MYSQLND_CONN_DATA * conn, void * _packet)
    (PROT_OK_PACKET)
  4. if (FAIL == mysqlnd_read_packet_header_and_body(&(packet->header), pfc, vio, stats, error_info, connection_state, buf, buf_len, "OK", PROT_OK_PACKET)) {

Copy link
Contributor Author

@Appla Appla Mar 11, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the deeply analysis.

ext/mysqlnd/mysqlnd_enum_n_def.h Outdated Show resolved Hide resolved
ext/mysqlnd/mysqlnd_wireprotocol.c Outdated Show resolved Hide resolved
ext/mysqlnd/mysqlnd_wireprotocol.c Outdated Show resolved Hide resolved
@Appla
Copy link
Contributor Author

Appla commented Mar 8, 2024

Could you provide a unit test for this?
I will add a test

If I remember correctly there was a bug report about this. Do you know the number?

Is this one? 81335

<?php
if (getenv("SKIP_SLOW_TESTS")) die("skip slow test");

if (!defined('MYSQLI_STORE_RESULT_COPY_DATA')) die('skip requires mysqlnd');
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are making this change against master branch, so mysqlnd is the only option available. This check is redundant and will possibly break in PHP 9.0 once the constant is removed.

The CI jobs are failing. Could you please check why?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should I make this MR for 8.3?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will check the reason for the CI failure on windows later.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this could even go onto 8.2.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, I will update later.

Copy link
Contributor Author

@Appla Appla Mar 11, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The CI jobs are failing. Could you please check why?

This looks like windows or mysql-server for handle half-close TCP connection differently.

---- update
When a client-timeout occurs, the MySQL server sends an error 4021 and a TCP-FIN packet, and the TCP state changes to FIN_WAIT_2. At this time, the TCP state of the client is CLOSE_WAIT.

MySQL-Server side: when the server TCP state is FIN_WAIT_2, the server will send a TCP RST packet after client sending a query.
If the client is on Linux, it can still read the error message packet sent by the server.
If the client is on Windows, an error (10053) will occur when executing receive.

I'm not sure if this is due to the TCP implementation or something else.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this could even go onto 8.2.

The MR target 8.2 now.

echo "\n";
echo $e->getCode();
}
$mysqli->close();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This line is not part of the test, right? Can you remove it please?

Copy link
Contributor Author

@Appla Appla Mar 9, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This line is not part of the test, right? Can you remove it please?

$mysqli->close();
Is this one?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, you can remove it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got it (●'◡'●)


require_once 'connect.inc';
if (!$link = @my_mysqli_connect($host, $user, $passwd, $db, $port, $socket)) die("skip cannot connect");
if (mysqli_get_server_version($link) < 80024) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is an error message that does not occur with MariaDB, so please skip it if using MariaDB.

Suggested change
if (mysqli_get_server_version($link) < 80024) {
if (mysqli_get_server_version($link) < 80024 || str_contains(mysqli_get_server_info($link), 'MariaDB')) {

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks

Comment on lines +311 to +312
if (error_info->error_no == 0) {
SET_CLIENT_ERROR(error_info, CR_SERVER_GONE_ERROR, UNKNOWN_SQLSTATE, mysqlnd_server_gone);
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the same reason, I don't think this check should be done either, but is there any reason why error_no has to be 0?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I consider 0 to mean that no error occurred, but this cannot be guaranteed. so may be compare with CLIENT_INTERACTION_TIMEOUT here?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried various things.
There may already be a meaningful error set here, so checking if it's 0 makes sense to me.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got it.

@Appla Appla force-pushed the mysqlnd_support_ER_CLIENT_INTERACTION_TIMEOUT branch from 67ef59b to c4ee2ee Compare March 11, 2024 02:19
@Appla Appla changed the base branch from master to PHP-8.2 March 11, 2024 02:20
@Appla Appla force-pushed the mysqlnd_support_ER_CLIENT_INTERACTION_TIMEOUT branch from cae3483 to c3fa03b Compare April 10, 2024 08:18
@Appla Appla force-pushed the mysqlnd_support_ER_CLIENT_INTERACTION_TIMEOUT branch from c3fa03b to 92eccc3 Compare April 10, 2024 08:19
@mbeccati
Copy link
Contributor

The fix is currently breaking several DBAL tests. I don't think it is safe for 8.2 and 8.3 and/or it is not working properly.

See: https://github.com/mbeccati/php-latest-builds/runs/23707742797

@bukka
Copy link
Member

bukka commented Apr 11, 2024

Agreed I think we need to revert this

@kamil-tekiela
Copy link
Member

Sure, we can revert it, but the test failures seem to be because DBAL is not handling error code 4031

https://github.com/doctrine/dbal/blob/c70bae9e69e0e543044d5e282a624d9b3cbea6fb/src/Driver/API/MySQL/ExceptionConverter.php#L34

I am pretty sure that they need to update their code. Am I missing something here?

@bukka
Copy link
Member

bukka commented Apr 11, 2024

Yeah looking at it, it seems like it needs to be updated. It has got some specific expectation in some tests like that connection lost is triggered on connection timeout (e.g. ConnectionLostTest expecting ConnectionLost on timeout). The way how I see it is that this is essentially a breaking fix and as such it can happen only in master and it should be propertly noted in UPGRADING.

@mbeccati
Copy link
Contributor

Not a DBAL deveoper, but I agree with @bukka that this kind of behaviour change can only happen on master.

However, it's not just the ConnectionLost test that fails, so there might be other edge cases that need to be looked at.

@bukka
Copy link
Member

bukka commented Apr 11, 2024

Yeah I think it should be reverted and then properly tested with DBAL and make sure that each failure is expected and also ideally document what needs to be done in DBAL to address all failures (even more ideally provide a PR for them).

@bukka
Copy link
Member

bukka commented Apr 11, 2024

@kamil-tekiela are you happy to revert it or do you want me to do that?

@kamil-tekiela
Copy link
Member

@bukka Can you please do it? Thanks.

@alexdowad
Copy link
Contributor

Hmm, I'm wondering why my review was requested on this PR.

@iluuu1994
Copy link
Member

@alexdowad It's because of a rebase before adjusting the target branch.

@Appla
Copy link
Contributor Author

Appla commented Apr 22, 2024

Yeah looking at it, it seems like it needs to be updated. It has got some specific expectation in some tests like that connection lost is triggered on connection timeout (e.g. ConnectionLostTest expecting ConnectionLost on timeout). The way how I see it is that this is essentially a breaking fix and as such it can happen only in master and it should be propertly noted in UPGRADING.

The merge target is master now. Anything else?

@mbeccati
Copy link
Contributor

@Appla I can try to give some guidance if needed

@mbeccati
Copy link
Contributor

  1. I've tested DBAL (both mysqli and pdo_mysql) with your branch and once 4031 is properly handled, all failures go away - it was simply a matter of consecutive tests not being happy to start without a functioning connection
  2. There should be a mention in the UPGRADING file that mysqlnd now could also raise error 4031 when the connection is lost due to timeouts.
  3. I can take care of making a PR do DBAL so that they are aware of the incoming change

Perhaps other projects could benefit from having better visibility about this. Not sure how best to proceed with that.

mbeccati added a commit to mbeccati/dbal that referenced this pull request Apr 22, 2024
mbeccati added a commit to mbeccati/dbal that referenced this pull request Apr 22, 2024
@Appla
Copy link
Contributor Author

Appla commented Apr 24, 2024

  1. I've tested DBAL (both mysqli and pdo_mysql) with your branch and once 4031 is properly handled, all failures go away - it was simply a matter of consecutive tests not being happy to start without a functioning connection
  2. There should be a mention in the UPGRADING file that mysqlnd now could also raise error 4031 when the connection is lost due to timeouts.
  3. I can take care of making a PR do DBAL so that they are aware of the incoming change

Perhaps other projects could benefit from having better visibility about this. Not sure how best to proceed with that.

Thanks

@Appla
Copy link
Contributor Author

Appla commented Jun 17, 2024

Yeah looking at it, it seems like it needs to be updated. It has got some specific expectation in some tests like that connection lost is triggered on connection timeout (e.g. ConnectionLostTest expecting ConnectionLost on timeout). The way how I see it is that this is essentially a breaking fix and as such it can happen only in master and it should be propertly noted in UPGRADING.

Any updates?

derrabus pushed a commit to doctrine/dbal that referenced this pull request Jul 22, 2024
|      Q       |   A
|------------- | -----------
| Type         | improvement
| Fixed issues | Prevents future issues ;-)

#### Summary

PHP 8.4 will support a new error code 4031 in mysqlnd when the
connection is dropped due to timeouts.

It has been introduced in:

mysql/mysql-server@14508bb

And PHP 8.4 will support it:
php/php-src#13618

The PR gets the test suite green again (mysqli + pdo_mysql). I have used
4.0.x as base, but feel free to change according to your preferences.
@mbeccati mbeccati self-requested a review July 23, 2024 21:29
Copy link
Contributor

@mbeccati mbeccati left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@bukka DBAL has applied the change. Anything else preventing us from merging this?

@mconkin
Copy link

mconkin commented Aug 7, 2024

Correct me if I'm wrong, but isn't there an underlying problem within this bug that eventually propagates into num_affected_rows misreporting?

There was a merge request for it around 10-8 years ago, I think someone named ma11oc? IDK, I read it in passing and they just closed it and deleted their account.

Pretty unprofessional, if I do say so myself.

I digress. This looks incredibly close to the answer. However, by just glancing at this thread it seems like something's off. Might want to recheck ->mysqli_num_affected_rows and see if the root cause is there...

Sincerest apologies for jumping into this so close to release.

I don't have much PHP or C experience (as you can see my account is quite new (and I'm more familiar with Python for web develop so I don't know how useful or necessary this comment is ))

Anyways, to risc embarrassing myself further, quite similar to ma11oc, the original discoverer of this alleged bug, I'll finish with a final statement.

@mbeccati
Copy link
Contributor

mbeccati commented Aug 7, 2024

@mconkin do you have any reference? To the best of my knowledge this PR is meant to support new functionality introduced in recent MySQL versions.

@mconkin
Copy link

mconkin commented Aug 7, 2024

@Appla sure, give me a day or two. Currently limited to a 2017 MacBook Pro. Xcode tools won't install and I refuse to use Homebrew. Let me provide some code lines once I download the zip.

@mbeccati
Copy link
Contributor

@SakiTakamachi do you think we can push this a little bit and merge it? I have championed for this and the required changes have been implemented in DBAL. It would be nice to keep the promise and have it in 8.4.

Cheers

@bukka
Copy link
Member

bukka commented Aug 18, 2024

I think it would make sense for PHP 8.4. Pinging @SakiTakamachi for approval.

Copy link
Member

@SakiTakamachi SakiTakamachi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To me, that makes sense :)

@SakiTakamachi
Copy link
Member

I'll merge this tonight if no one else has any concerns

@Appla Appla force-pushed the mysqlnd_support_ER_CLIENT_INTERACTION_TIMEOUT branch from 292530d to eee6137 Compare August 22, 2024 10:19
@Appla Appla force-pushed the mysqlnd_support_ER_CLIENT_INTERACTION_TIMEOUT branch from eee6137 to 62926ec Compare August 22, 2024 10:21
@SakiTakamachi SakiTakamachi merged commit 555b603 into php:master Aug 26, 2024
7 of 10 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants