Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update echo.c #8

Closed
wants to merge 2 commits into from
Closed

Conversation

dataScientist420
Copy link

Modifications:

  • adding an empty space for the first fprintf
  • removing the if statement

Purpose: eliminate argc executions of the if statement and argc-1 calls of fprintf

Improvement: the produced machine code will have less instructions and less function calls (overhead)

Adding an empty space for the first fprintf and removing the if statement. The purpose is to eliminate argc executions of the if statement and argc-1 calls of fprintf. The echo result will be the same as before but the produced machine code will have less instructions and less function calls for more performance.
@mysql-oca-bot
Copy link

Hi, thank you for submitting this pull request. In order to consider your code we need you to sign the Oracle Contribution Agreement (OCA). Please review the details and follow the instructions at http://www.oracle.com/technetwork/community/oca-486395.html
Please make sure to include your MySQL bug system user (email) in the returned form.
Thanks

@mysql-oca-bot
Copy link

Hi, thank you for your contribution. Please confirm this code is submitted under the terms of the OCA (Oracle's Contribution Agreement) you have previously signed by cutting and pasting the following text as a comment:
"I confirm the code being submitted is offered under the terms of the OCA, and that I am authorized to contribute it."
Thanks

@dataScientist420
Copy link
Author

Hi,
I confirm the code being submitted is offered under the terms of the OCA, and that I am authorized to contribute it.
Thanks

@mysql-oca-bot
Copy link

Hi, thank you for your contribution. Your code has been assigned to an internal queue. Please follow
bug http://bugs.mysql.com/bug.php?id=77229 for updates.
Thanks

bkandasa pushed a commit that referenced this pull request Aug 4, 2015
…SACTION_INFO.H:138

The assert is hit when XA transaction updated only a non-transactional table and
went to prepare stage.
At that time MYSQL_BIN_LOG::write_binlog_and_commit_engine() is invoked where
the trx should not attempt any committing. But that's happened.
The binlog "engine" managed to commit_low() in conditions of the reported case
which led to the assert few instructions later

#7  0x00007fad151dbe42 in __GI___assert_fail (assertion=0x1e37d16 "is_started()", file=0x1e37c68
#8  0x0000000000fafcf7 in Ha_trx_info::next (this=0x7fac8c002028) at
#9  0x0000000000f9dfa9 in ha_prepare (thd=0x7fac8c000bb0)

More analysis proved there's an "inverted" related issue in the rollback branch.
When the pure non-transactional engine xa transaction is rolled back, this time, it
misses to execute rollback_low() method to leave a screwed state which
the following query discovers hiting against an assert.

   bool trans_commit_stmt(THD*): Assertion
   `thd->in_active_multi_stmt_transaction() || thd->m_transaction_psi
   == __null' failed.

And finally, testing revealed a case of no test coverage so far in combination
of XA ROLLBACK, no transactional tables involved and no XA PREPARE.
In such case logging was just incorrect mixing XA START and ROLLBACK.

The original issue is fixed with making
MYSQL_BIN_LOG::write_binlog_and_commit_engine() to compute a local boolean flag `skip-commit'
correctly based on the value of the XA state.
Commit is disallowed when the state is Prepared.
To satisfy to the ONE-phase XA, the committing XA is also made
to receive XA_PREPARED status, as intermediate, right after the prepare phase is done.
in a general commit handler of ha_commit_trans().

The second issue of the rollback part is fixed with relocating an existing explicit
rollback_low() for xa-rollback to a safer point.

And the final third issue is fixed with augmentment of
ending_trans()'s the trans_cannot_safely_rollback(thd) branch to
compose an appropriate Query-log-event.
Logics of preventing second time do_binlog_xa_commit_rollback()
invocation that is actual to the "externally" committing XA is
simplified.
The former idea was in that the first invocation of
do_binlog_xa_commit_rollback() in the "external" XA commit branch
would necessarily turn the cache from empty, asserted, to not empty.
On the other hand, at running do_binlog_xa_commit_rollback() for the
local xa the cache must be empty (because it should've been flushed at
prepare), asserted in the rollback case too.
Hence the state of the cache checking was correct: (the local xa go
through, the external xa goes through once which is first time).  Now
instead of the above deduction the 1st invocation is just gets
explicitly flagged. And because we would like to preserve signature of
MYSQL_BIN_LOG class methods the flag is made to pass as a new member
of `bool binlog_cache_mngr::has_logged_xid'

Mixing transactional and non-transactional tables in
rpl.rpl_xa_survive_disconnect_mixed_engines reveal one issue in MTS
grouping. An XA transaction "prepare" group can be closed
with XA-ROLLBACK query which was previously missed to capture.
A use case for that is mixed transactional and non-transactional updates.
It's been corrected now.

As a side effect is_loggable_xa_prepare() had to be refined to satisfy
@c simulate_commit_failure.
bjornmu pushed a commit that referenced this pull request Feb 5, 2016
…QL_CMD)-> GET_XA_OPT()

Bug#22273964 	INNODB: FAILING ASSERTION: TOTAL_TRX >= TRX_SYS->N_PREPARED_TRX

**Problem description**

An assertion of Bug#21942487

static_cast<Sql_cmd_xa_commit*>(thd->lex->m_sql_cmd)->
                              get_xa_opt() == XA_ONE_PHASE

#8  0x0000000000e3a5e1 in ha_commit_low
#9  0x00000000015158e6 in TC_LOG_DUMMY::commit
#10 0x0000000001529bc3 in Sql_cmd_xa_commit::trans_xa_commit

was caused by incorrect assumption by XA binlogging of wl6860 in
that @@session.pseudo_slave_mode can be only set 1 through binlog.
If fact the var can be set so manually.

Another assert in Bug#22273964 takes place for the very same reason.

**Solution**

The most reliable way to identify that the executing thread
is a binlog applier must include both the checking of rli_fake
and @@session.pseudo_slave_mode.
The former merely checking of the session variable as atttemp to
identify the binlog applier is replaced by checking the conjuction
of the two properties.

Notice that load containting SET @@session.pseudo_slave_mode=1 and
BINLOG '' pseudo-queries is not necessary authentic to mysqlbinlog output.
Nevertheless it must be processable when it's manually engineered such way.
A test is included to prove that as well.

**Note**
Beware of Bug #19502202 SERVER SHUTDOWN HANG SEEN IN SOME INNODB & RPL TESTS
at testing with binlog.binlog_xa_prepared_disconnect that still fails sporadically.
bjornmu pushed a commit that referenced this pull request Apr 10, 2017
Patch #8: Fix -Wunused-parameter warnings in sys_vars
bjornmu pushed a commit that referenced this pull request Apr 10, 2017
…FOR DATABASE

Patch #8

To repeat:
./mtr --charset-for-testdb=utf8mb4 --defaults-file=include/utf8mb4_my.cnf

group_min_max_innodb
heap
heap_btree
heap_hash
join_outer
join_outer_bka
join_outer_bka_nixbnl
join_outer_innodb
sp_notembedded

Change-Id: I7f1eb96061a1dfba7bb91afcafa47326c9d079e7
pobrzut pushed a commit that referenced this pull request May 8, 2017
bjornmu pushed a commit that referenced this pull request Sep 21, 2017
Patch #8:

seek_no_dup_elimination() is only called on documents that are on
binary format, but it accesses the document through the Json_wrapper
interface. This patch makes it use the json_binary interface instead
to avoid the unnecessary indirection via Json_wrapper.

Microbenchmarks (64-bit, Intel Core i7-4770 3.4 GHz, GCC 6.3):

BM_JsonBinarySearchEllipsis          20462 [+213.1%]
BM_JsonBinarySearchEllipsis_OnlyOne     75 [ +18.7%]
BM_JsonBinarySearchKey                  67 [ +14.9%]

Change-Id: Idb74afe8cbac53eb45dab62a677282df9465ae6b
bjornmu pushed a commit that referenced this pull request Jul 27, 2018
…ENERATED_READ_FIELDS

It's a SELECT with WHERE "(-1) minus 0x4d".
this operation has a result type of "unsigned" (because 0x4d is unsigned
integer) and the result (-78) doesn't fit int an unsigned type.
This WHERE is evaluated by InnoDB in index condition pushdown:
#0 my_error
#1 Item_func::raise_numeric_overflow
...
#7 Item_cond_and::val_int
#8 innobase_index_cond
...
#12 handler::index_read_map
...
#15 handler::multi_range_read_next
...
#20 rr_quick
#21 join_init_read_record
As val_int() has no "error" return code, the execution continues until
frame #12; there we call update_generated_read_fields(), which has
an assertion about thd->is_error() which fails.

Fix: it would be nice to detect error as soon as it happens, i.e. in innodb
code right after it calls val_bool(). But innodb's index condition
pushdown functions only have found / not found return codes so they cannot
signal "error" to the upper layers. Same is true for MyISAM. Moreover,
"thd" isn't easily accessible there.
Adding a detection a bit above in the stack (handler::* functions which
do index reads) is also possible but would require fixing ~20
functions.
The chosen fix here is to change update_generated_*_fields()
to return error if thd->is_error() is true.
Note that the removed assertion was already one cause of
bug 27041382.
surbhat1595 pushed a commit that referenced this pull request Apr 25, 2019
…E TO A SERVER

Problem
========================================================================
Running the GCS tests with ASAN seldomly reports a user-after-free of
the server reference that the acceptor_learner_task uses.

Here is an excerpt of ASAN's output:

==43936==ERROR: AddressSanitizer: heap-use-after-free on address 0x63100021c840 at pc 0x000000530ff8 bp 0x7fc0427e8530 sp 0x7fc0427e8520
WRITE of size 8 at 0x63100021c840 thread T3
    #0 0x530ff7 in server_detected /home/tvale/mysql/plugin/group_replication/libmysqlgcs/src/bindings/xcom/xcom/xcom_transport.c:962
    #1 0x533814 in buffered_read_bytes /home/tvale/mysql/plugin/group_replication/libmysqlgcs/src/bindings/xcom/xcom/xcom_transport.c:1249
    #2 0x5481af in buffered_read_msg /home/tvale/mysql/plugin/group_replication/libmysqlgcs/src/bindings/xcom/xcom/xcom_transport.c:1399
    #3 0x51e171 in acceptor_learner_task /home/tvale/mysql/plugin/group_replication/libmysqlgcs/src/bindings/xcom/xcom/xcom_base.c:4690
    #4 0x562357 in task_loop /home/tvale/mysql/plugin/group_replication/libmysqlgcs/src/bindings/xcom/xcom/task.c:1140
    #5 0x5003b2 in xcom_taskmain2 /home/tvale/mysql/plugin/group_replication/libmysqlgcs/src/bindings/xcom/xcom/xcom_base.c:1324
    #6 0x6a278a in Gcs_xcom_proxy_impl::xcom_init(unsigned short, node_address*) /home/tvale/mysql/plugin/group_replication/libmysqlgcs/src/bindings/xcom/gcs_xcom_proxy.cc:164
    #7 0x59b3c1 in xcom_taskmain_startup /home/tvale/mysql/plugin/group_replication/libmysqlgcs/src/bindings/xcom/gcs_xcom_control_interface.cc:107
    #8 0x7fc04a2e4dd4 in start_thread (/lib64/libpthread.so.0+0x7dd4)
    #9 0x7fc047ff2bfc in __clone (/lib64/libc.so.6+0xfebfc)

0x63100021c840 is located 64 bytes inside of 65688-byte region [0x63100021c800,0x63100022c898)
freed by thread T3 here:
    #0 0x7fc04a5d7508 in __interceptor_free (/lib64/libasan.so.4+0xde508)
    #1 0x52cf86 in freesrv /home/tvale/mysql/plugin/group_replication/libmysqlgcs/src/bindings/xcom/xcom/xcom_transport.c:836
    #2 0x52ea78 in srv_unref /home/tvale/mysql/plugin/group_replication/libmysqlgcs/src/bindings/xcom/xcom/xcom_transport.c:868
    #3 0x524c30 in reply_handler_task /home/tvale/mysql/plugin/group_replication/libmysqlgcs/src/bindings/xcom/xcom/xcom_base.c:4914
    #4 0x562357 in task_loop /home/tvale/mysql/plugin/group_replication/libmysqlgcs/src/bindings/xcom/xcom/task.c:1140
    #5 0x5003b2 in xcom_taskmain2 /home/tvale/mysql/plugin/group_replication/libmysqlgcs/src/bindings/xcom/xcom/xcom_base.c:1324
    #6 0x6a278a in Gcs_xcom_proxy_impl::xcom_init(unsigned short, node_address*) /home/tvale/mysql/plugin/group_replication/libmysqlgcs/src/bindings/xcom/gcs_xcom_proxy.cc:164
    #7 0x59b3c1 in xcom_taskmain_startup /home/tvale/mysql/plugin/group_replication/libmysqlgcs/src/bindings/xcom/gcs_xcom_control_interface.cc:107
    #8 0x7fc04a2e4dd4 in start_thread (/lib64/libpthread.so.0+0x7dd4)

previously allocated by thread T3 here:
    #0 0x7fc04a5d7a88 in __interceptor_calloc (/lib64/libasan.so.4+0xdea88)
    #1 0x543604 in mksrv /home/tvale/mysql/plugin/group_replication/libmysqlgcs/src/bindings/xcom/xcom/xcom_transport.c:721
    #2 0x543b4c in addsrv /home/tvale/mysql/plugin/group_replication/libmysqlgcs/src/bindings/xcom/xcom/xcom_transport.c:755
    #3 0x54af61 in update_servers /home/tvale/mysql/plugin/group_replication/libmysqlgcs/src/bindings/xcom/xcom/xcom_transport.c:1747
    #4 0x501082 in site_install_action /home/tvale/mysql/plugin/group_replication/libmysqlgcs/src/bindings/xcom/xcom/xcom_base.c:1572
    #5 0x55447c in import_config /home/tvale/mysql/plugin/group_replication/libmysqlgcs/src/bindings/xcom/xcom/site_def.c:486
    #6 0x506dfc in handle_x_snapshot /home/tvale/mysql/plugin/group_replication/libmysqlgcs/src/bindings/xcom/xcom/xcom_base.c:5257
    #7 0x50c444 in xcom_fsm /home/tvale/mysql/plugin/group_replication/libmysqlgcs/src/bindings/xcom/xcom/xcom_base.c:5325
    #8 0x516c36 in dispatch_op /home/tvale/mysql/plugin/group_replication/libmysqlgcs/src/bindings/xcom/xcom/xcom_base.c:4510
    #9 0x521997 in acceptor_learner_task /home/tvale/mysql/plugin/group_replication/libmysqlgcs/src/bindings/xcom/xcom/xcom_base.c:4772
    #10 0x562357 in task_loop /home/tvale/mysql/plugin/group_replication/libmysqlgcs/src/bindings/xcom/xcom/task.c:1140
    #11 0x5003b2 in xcom_taskmain2 /home/tvale/mysql/plugin/group_replication/libmysqlgcs/src/bindings/xcom/xcom/xcom_base.c:1324
    #12 0x6a278a in Gcs_xcom_proxy_impl::xcom_init(unsigned short, node_address*) /home/tvale/mysql/plugin/group_replication/libmysqlgcs/src/bindings/xcom/gcs_xcom_proxy.cc:164
    #13 0x59b3c1 in xcom_taskmain_startup /home/tvale/mysql/plugin/group_replication/libmysqlgcs/src/bindings/xcom/gcs_xcom_control_interface.cc:107
    #14 0x7fc04a2e4dd4 in start_thread (/lib64/libpthread.so.0+0x7dd4)

Analysis
========================================================================
The server structure is reference counted by the associated sender_task
and reply_handler_task.
When they finish, they unreference the server, which leads to its memory
being freed.

However, the acceptor_learner_task keeps a "naked" reference to the
server structure.
Under the right ordering of operations, i.e. the sender_task and
reply_handler_task terminating after the acceptor_learner_task acquires,
but before it uses, the reference to the server structure, leads to the
acceptor_learner_task accessing the server structure after it has been
freed.

Solution
========================================================================
Let the acceptor_learner_task also reference count the server structure
so it is not freed while still in use.

Reviewed-by: André Negrão <[email protected]>
Reviewed-by: Venkatesh Venugopal <[email protected]>
RB: 21209
16yuki0702 pushed a commit to 16yuki0702/mysql-server that referenced this pull request Jun 27, 2021
…pi [mysql#8] [noclose]

SEVERAL COMPILE WARNINGS FOR MYSQL CLUSTER ON WINDOWS WITH VS 2019

storage\ndb\src\ndbapi\Ndb.cpp: warning C4302: 'type cast': truncation from 'NdbTransaction *' to 'long'
storage\ndb\src\ndbapi\Ndb.cpp: warning C4311: 'type cast': pointer truncation from 'NdbTransaction *' to 'long'

Change-Id: Id80ee09ff2b44b571242b987c845f17990a3d234
pull bot pushed a commit to Mu-L/mysql-server that referenced this pull request Apr 27, 2023
  # This is the 1st commit message:

  WL#15280: HEATWAVE SUPPORT FOR MDS HA

  Problem Statement
  -----------------
  Currently customers cannot enable heatwave analytics service to their
  HA DBSystem or enable HA if they are using Heatwave enabled DBSystem.
  In this change, we attempt to remove this limitation and provide
  failover support of heatwave in an HA enabled DBSystem.

  High Level Overview
  -------------------
  To support heatwave with HA, we extended the existing feature of auto-
  reloading of tables to heatwave on MySQL server restart (WL-14396). To
  provide seamless failover functionality to tables loaded to heatwave,
  each node in the HA cluster (group replication) must have the latest
  view of tables which are currently loaded to heatwave cluster attached
  to the primary, i.e., the secondary_load flag should be in-sync always.

  To achieve this, we made following changes -
    1. replicate secondary load/unload DDL statements to all the active
       secondary nodes by writing the DDL into the binlog, and
    2. Control how secondary load/unload is executed when heatwave cluster
       is not attached to node executing the command

  Implementation Details
  ----------------------
  Current implementation depends on two key assumptions -
   1. All MDS DBSystems will have RAPID plugin installed.
   2. No non-MDS system will have the RAPID plugin installed.

  Based on these assumptions, we made certain changes w.r.t. how server
  handles execution of secondary load/unload statements.
   1. If secondary load/unload command is executed from a mysql client
      session on a system without RAPID plugin installed (i.e., non-MDS),
      instead of an error, a warning message will be shown to the user,
      and the DDL is allowed to commit.
   2. If secondary load/unload command is executed from a replication
      connection on an MDS system without heatwave cluster attached,
      instead of throwing an error, the DDL is allowed to commit.
   3. If no error is thrown from secondary engine, then the DDL will
      update the secondary_load metadata and write a binlog entry.

  Writing to binlog implies that all the consumer of binlog now need to
  handle this DDL gracefully. This has an adverse effect on Point-in-time
  Recovery. If the PITR backup is taken from a DBSystem with heatwave, it
  may contain traces of secondary load/unload statements in its binlog.
  If such a backup is used to restore a new DBSystem, it will cause failure
  while trying to execute statements from its binlog because
   a) DBSystem will not heatwave cluster attached at this time, and
   b) Statements from binlog are executed from standard mysql client
      connection, thus making them indistinguishable from user executed
      command.
  Customers will be prevented (by control plane) from using PITR functionality
  on a heatwave enabled DBSystem until there is a solution for this.

  Testing
  -------
  This commit changes the behavior of secondary load/unload statements, so it
   - adjusts existing tests' expectations, and
   - adds a new test validating new DDL behavior under different scenarios

  Change-Id: Ief7e9b3d4878748b832c366da02892917dc47d83

  # This is the commit message #2:

  WL#15280: HEATWAVE SUPPORT FOR MDS HA (PITR SUPPORT)

  Problem
  -------
  A PITR backup taken from a heatwave enabled system could have traces
  of secondary load or unload statements in binlog. When such a backup
  is used to restore another system, it can cause failure because of
  following two reasons:

  1. Currently, even if the target system is heatwave enabled, heatwave
  cluster is attached only after PITR restore phase completes.
  2. When entries from binlogs are applied, a standard mysql client
  connection is used. This makes it indistinguishable from other user
  session.

  Since secondary load (or unload) statements are meant to throw error
  when they are executed by user in the absence of a healthy heatwave
  cluster, PITR restore workflow will fail if binlogs from the backup
  have any secondary load (or unload) statements in them.

  Solution
  --------
  To avoid PITR failure, we are introducing a new system variable
  rapid_enable_delayed_secondary_ops. It controls how load or unload
  commands are to be processed by rapid plugin.

    - When turned ON, the plugin silently skips the secondary engine
      operation (load/unload) and returns success to the caller. This
      allows secondary load (or unload) statements to be executed by the
      server in the absence of any heatwave cluster.
    - When turned OFF, it follows the existing behavior.
    - The default value is OFF.
    - The value can only be changed when rapid_bootstrap is IDLE or OFF.
    - This variable cannot be persisted.

  In PITR workflow, Control Plane would set the variable at the start of
  PITR restore and then reset it at the end of workflow. This allows the
  workflow to complete without failure even when heatwave cluster is not
  attached. Since metadata is always updated when secondary load/unload
  DDLs are executed, when heatwave cluster is attached at a later point
  in time, the respective tables get reloaded to heatwave automatically.

  Change-Id: I42e984910da23a0e416edb09d3949989159ef707

  # This is the commit message #3:

  WL#15280: HEATWAVE SUPPORT FOR MDS HA (TEST CHANGES)

  This commit adds new functional tests for the MDS HA + HW integration.

  Change-Id: Ic818331a4ca04b16998155efd77ac95da08deaa1

  # This is the commit message #4:

  WL#15280: HEATWAVE SUPPORT FOR MDS HA
  BUG#34776485: RESTRICT DEFAULT VALUE FOR rapid_enable_delayed_secondary_ops

  This commit does two things:
  1. Add a basic test for newly introduced system variable
  rapid_enable_delayed_secondary_ops, which controls the behavior of
  alter table secondary load/unload ddl statements when rapid cluster
  is not available.

  2. It also restricts the DEFAULT value setting for the system variable
  So, following is not allowed:
  SET GLOBAL rapid_enable_delayed_secondary_ops = default
  This variable is to be used in restricted scenarios and control plane
  only sets it to ON/OFF before and after PITR apply. Allowing set to
  default has no practical use.

  Change-Id: I85c84dfaa0f868dbfc7b1a88792a89ffd2e81da2

  # This is the commit message #5:

  Bug#34726490: ADD DIAGNOSTICS FOR SECONDARY LOAD / UNLOAD DDL

  Problem:
  --------
  If secondary load or unload DDL gets rolled back due to some error after
  it had loaded / unloaded the table in heatwave cluster, there is no undo
  of the secondary engine action. Only secondary_load flag update is
  reverted and binlog is not written. From User's perspective, the table
  is loaded and can be seen on performance_schema. There are also no
  error messages printed to notify that the ddl didn't commit. This
  creates a problem to debug any issue in this area.

  Solution:
  ---------
  The partial undo of secondary load/unload ddl will be handled in
  bug#34592922. In this commit, we add diagnostics to reveal if the ddl
  failed to commit, and from what stage.

  Change-Id: I46c04dd5dbc07fc17beb8aa2a8d0b15ddfa171af

  # This is the commit message #6:

  WL#15280: HEATWAVE SUPPORT FOR MDS HA (TEST FIX)

  Since ALTER TABLE SECONDARY LOAD / UNLOAD DDL statements now write
  to binlog, from Heatwave's perspective, SCN is bumped up.

  In this commit, we are adjusting expected SCN values in certain
  tests which does secondary load/unload and expects SCN to match.

  Change-Id: I9635b3cd588d01148d763d703c72cf50a0c0bb98

  # This is the commit message mysql#7:

  Adding MTR tests for ML in rapid group_replication suite

  Added MTR tests with Heatwave ML queries with in
  an HA setup.

  Change-Id: I386a3530b5bbe6aea551610b6e739ab1cf366439

  # This is the commit message mysql#8:

  WL#15280: HEATWAVE SUPPORT FOR MDS HA (MTR TEST ADJUSTMENT)

  In this commit we have adjusted the existing test to work with the
  new MTR test infrastructure which extends the functionalities to
  HA landscape. With this change, a lot of mannual settings have now
  become redundant and thus removed in this commit.

  Change-Id: Ie1f4fcfdf047bfe8638feaa9f54313d509cbad7e

  # This is the commit message mysql#9:

  WL#15280: HEATWAVE SUPPORT FOR MDS HA (CLANG-TIDY FIX)

  Fix clang-tidy warnings found in previous change#16530, patch#20

  Change-Id: I15d25df135694c2f6a3a9146feebe2b981637662

Change-Id: I3f3223a85bb52343a4619b0c2387856b09438265
dveeden pushed a commit to dveeden/mysql-server that referenced this pull request May 5, 2023
Change-Id: I813f12061158da5949200fa40f35b8a01db4a949
bjornmu pushed a commit that referenced this pull request Oct 25, 2023
In the TCP Transporter, request a TLS key rotation after each
2^32 bytes are sent.

Note that there is no visibility into whether this has occured.

Change-Id: I70e1fff8b20305d565efc5a44e7ecb827da22dca
bjornmu pushed a commit that referenced this pull request Oct 25, 2023
Support the standard --ndb-tls-search-path in ndb_mgm.

Also support the --ndb-mgm-tls option, but in ndb_mgm, this enum
option can take a third value, "deferred", that is not supported
by other programs.

Add a --test-tls option, which makes a single, verbose attempt to
connect using TLS, then exits with status 0 on success or 1 on error.

Add a new interactive command, "START TLS". This can perform the
TLS negotiation that was deferred with --ndb-mgm-tls=deferred.

Change-Id: I711245b723183ffe97f87dd0d7d15d2349730c77
venkatesh-prasad-v added a commit to venkatesh-prasad-v/mysql-server that referenced this pull request Jan 26, 2024
…and a local

             DDL executed

https://bugs.mysql.com/bug.php?id=113727

Problem
-------
In high concurrency scenarios, MySQL replica can enter into a deadlock due to a
race condition between the replica applier thread and the client thread
performing a binlog group commit.

Analysis
--------
It needs at least 3 threads for this deadlock to happen

1. One client thread
2. Two replica applier threads

How this deadlock happens?
--------------------------
0. Binlog is enabled on replica, but log_replica_updates is disabled.

1. Initially, both "Commit Order" and "Binlog Flush" queues are empty.

2. Replica applier thread 1 enters the group commit pipeline to register in the
   "Commit Order" queue since `log-replica-updates` is disabled on the replica
   node.

3. Since both "Commit Order" and "Binlog Flush" queues are empty, the applier
   thread 1

   3.1. Becomes leader (In Commit_stage_manager::enroll_for()).

   3.2. Registers in the commit order queue.

   3.3. Acquires the lock MYSQL_BIN_LOG::LOCK_log.

   3.4. Commit Order queue is emptied, but the lock MYSQL_BIN_LOG::LOCK_log is
        not yet released.

   NOTE: SE commit for applier thread is already done by the time it reaches
         here.

4. Replica applier thread 2 enters the group commit pipeline to register in the
   "Commit Order" queue since `log-replica-updates` is disabled on the replica
   node.

5. Since the "Commit Order" queue is empty (emptied by applier thread 1 in 3.4), the
   applier thread 2

   5.1. Becomes leader (In Commit_stage_manager::enroll_for())

   5.2. Registers in the commit order queue.

   5.3. Tries to acquire the lock MYSQL_BIN_LOG::LOCK_log. Since it is held by applier
        thread 1 it will wait until the lock is released.

6. Client thread enters the group commit pipeline to register in the
   "Binlog Flush" queue.

7. Since "Commit Order" queue is not empty (there is applier thread 2 in the
   queue), it enters the conditional wait `m_stage_cond_leader` with an
   intention to become the leader for both the "Binlog Flush" and
   "Commit Order" queues.

8. Applier thread 1 releases the lock MYSQL_BIN_LOG::LOCK_log and proceeds to update
   the GTID by calling gtid_state->update_commit_group() from
   Commit_order_manager::flush_engine_and_signal_threads().

9. Applier thread 2 acquires the lock MYSQL_BIN_LOG::LOCK_log.

   9.1. It checks if there is any thread waiting in the "Binlog Flush" queue
        to become the leader. Here it finds the client thread waiting to be
        the leader.

   9.2. It releases the lock MYSQL_BIN_LOG::LOCK_log and signals on the
        cond_var `m_stage_cond_leader` and enters a conditional wait until the
        thread's `tx_commit_pending` is set to false by the client thread
       (will be done in the
       Commit_stage_manager::process_final_stage_for_ordered_commit_group()
       called by client thread from fetch_and_process_flush_stage_queue()).

10. The client thread wakes up from the cond_var `m_stage_cond_leader`.  The
    thread has now become a leader and it is its responsibility to update GTID
    of applier thread 2.

    10.1. It acquires the lock MYSQL_BIN_LOG::LOCK_log.

    10.2. Returns from `enroll_for()` and proceeds to process the
          "Commit Order" and "Binlog Flush" queues.

    10.3. Fetches the "Commit Order" and "Binlog Flush" queues.

    10.4. Performs the storage engine flush by calling ha_flush_logs() from
          fetch_and_process_flush_stage_queue().

    10.5. Proceeds to update the GTID of threads in "Commit Order" queue by
          calling gtid_state->update_commit_group() from
          Commit_stage_manager::process_final_stage_for_ordered_commit_group().

11. At this point, we will have

    - Client thread performing GTID update on behalf if applier thread 2 (from step 10.5), and
    - Applier thread 1 performing GTID update for itself (from step 8).

    Due to the lack of proper synchronization between the above two threads,
    there exists a time window where both threads can call
    gtid_state->update_commit_group() concurrently.

    In subsequent steps, both threads simultaneously try to modify the contents
    of the array `commit_group_sidnos` which is used to track the lock status of
    sidnos. This concurrent access to `update_commit_group()` can cause a
    lock-leak resulting in one thread acquiring the sidno lock and not
    releasing at all.

-----------------------------------------------------------------------------------------------------------
Client thread                                           Applier Thread 1
-----------------------------------------------------------------------------------------------------------
update_commit_group() => global_sid_lock->rdlock();     update_commit_group() => global_sid_lock->rdlock();

calls update_gtids_impl_lock_sidnos()                   calls update_gtids_impl_lock_sidnos()

set commit_group_sidno[2] = true                        set commit_group_sidno[2] = true

                                                        lock_sidno(2) -> successful

lock_sidno(2) -> waits

                                                        update_gtids_impl_own_gtid() -> Add the thd->owned_gtid in `executed_gtids()`

                                                        if (commit_group_sidnos[2]) {
                                                          unlock_sidno(2);
                                                          commit_group_sidnos[2] = false;
                                                        }

                                                        Applier thread continues..

lock_sidno(2) -> successful

update_gtids_impl_own_gtid() -> Add the thd->owned_gtid in `executed_gtids()`

if (commit_group_sidnos[2]) { <=== this check fails and lock is not released.
  unlock_sidno(2);
  commit_group_sidnos[2] = false;
}

Client thread continues without releasing the lock
-----------------------------------------------------------------------------------------------------------

12. As the above lock-leak can also happen the other way i.e, the applier
    thread fails to unlock, there can be different consequences hereafter.

13. If the client thread continues without releasing the lock, then at a later
    stage, it can enter into a deadlock with the applier thread performing a
    GTID update with stack trace.

    Client_thread
    -------------
    mysql#1  __GI___lll_lock_wait
    mysql#2  ___pthread_mutex_lock
    mysql#3  native_mutex_lock                                       <= waits for commit lock while holding sidno lock
    mysql#4  Commit_stage_manager::enroll_for
    mysql#5  MYSQL_BIN_LOG::change_stage
    mysql#6  MYSQL_BIN_LOG::ordered_commit
    mysql#7  MYSQL_BIN_LOG::commit
    mysql#8  ha_commit_trans
    mysql#9  trans_commit_implicit
    mysql#10 mysql_create_like_table
    mysql#11 Sql_cmd_create_table::execute
    mysql#12 mysql_execute_command
    mysql#13 dispatch_sql_command

    Applier thread
    --------------
    mysql#1  ___pthread_mutex_lock
    mysql#2  native_mutex_lock
    mysql#3  safe_mutex_lock
    mysql#4  Gtid_state::update_gtids_impl_lock_sidnos               <= waits for sidno lock
    mysql#5  Gtid_state::update_commit_group
    mysql#6  Commit_order_manager::flush_engine_and_signal_threads   <= acquires commit lock here
    mysql#7  Commit_order_manager::finish
    mysql#8  Commit_order_manager::wait_and_finish
    mysql#9  ha_commit_low
    mysql#10 trx_coordinator::commit_in_engines
    mysql#11 MYSQL_BIN_LOG::commit
    mysql#12 ha_commit_trans
    mysql#13 trans_commit
    mysql#14 Xid_log_event::do_commit
    mysql#15 Xid_apply_log_event::do_apply_event_worker
    mysql#16 Slave_worker::slave_worker_exec_event
    mysql#17 slave_worker_exec_job_group
    mysql#18 handle_slave_worker

14. If the applier thread continues without releasing the lock, then at a later
    stage, it can perform recursive locking while setting the GTID for the next
    transaction (in set_gtid_next()).

    In debug builds the above case hits the assertion
    `safe_mutex_assert_not_owner()` meaning the lock is already acquired by the
    replica applier thread when it tries to re-acquire the lock.

Solution
--------
In the above problematic example, when seen from each thread
individually, we can conclude that there is no problem in the order of lock
acquisition, thus there is no need to change the lock order.

However, the root cause for this problem is that multiple threads can
concurrently access to the array `Gtid_state::commit_group_sidnos`.

In its initial implementation, it was expected that threads should
hold the `MYSQL_BIN_LOG::LOCK_commit` before modifying its contents. But it
was not considered when upstream implemented WL#7846 (MTS:
slave-preserve-commit-order when log-slave-updates/binlog is disabled).

With this patch, we now ensure that `MYSQL_BIN_LOG::LOCK_commit` is acquired
when the client thread (binlog flush leader) when it tries to perform GTID
update on behalf of threads waiting in "Commit Order" queue, thus providing a
guarantee that `Gtid_state::commit_group_sidnos` array is never accessed
without the protection of `MYSQL_BIN_LOG::LOCK_commit`.
bjornmu pushed a commit that referenced this pull request Apr 30, 2024
This reverts commit 0a661dc0d6dd8b498b0ed9f59207b9814d380fbe.

Some mtr tests have leaks originating from
    #7 0x7f579a2a28cc  (/lib/x86_64-linux-gnu/libprotobuf-lite.so.23+0x4d8cc) (BuildId: 1af51cdba58393c136bc541bbfc26ee7568c8c78)
    #8 0x7f579a2a44cc in google::protobuf::internal::InitSCCImpl(google::protobuf::internal::SCCInfoBase*) (/lib/x86_64-linux-gnu/libprotobuf-lite.so.23+0x4f4cc) (BuildId: 1af51cdba58393c136bc541bbfc26ee7568c8c78)
    #9 0x56352824ad0d in google::protobuf::internal::InitSCC(google::protobuf::internal::SCCInfoBase*) /usr/include/google/protobuf/generated_message_util.h:240:5

Change-Id: I52ea02cee282a9521a2b153b4d883314ed15eb5c
bjornmu pushed a commit that referenced this pull request Apr 30, 2024
Problem:
Starting ´ndb_mgmd --bind-address´ may potentially cause abnormal
program termination in MgmtSrvr destructor when ndb_mgmd restart itself.

  Core was generated by `ndb_mgmd --defa'.
  Program terminated with signal SIGABRT,   Aborted.
  #0  0x00007f8ce4066b8f in raise () from /lib64/libc.so.6
  #1  0x00007f8ce4039ea5 in abort () from /lib64/libc.so.6
  #2  0x00007f8ce40a7d97 in __libc_message () from /lib64/libc.so.6
  #3  0x00007f8ce40af08c in malloc_printerr () from /lib64/libc.so.6
  #4  0x00007f8ce40b132d in _int_free () from /lib64/libc.so.6
  #5  0x00000000006e9ffe in MgmtSrvr::~MgmtSrvr (this=0x28de4b0) at
mysql/8.0/storage/ndb/src/mgmsrv/MgmtSrvr.cpp:
890
  #6  0x00000000006ea09e in MgmtSrvr::~MgmtSrvr (this=0x2) at mysql/8.0/
storage/ndb/src/mgmsrv/MgmtSrvr.cpp:849
  #7  0x0000000000700d94 in mgmd_run () at
mysql/8.0/storage/ndb/src/mgmsrv/main.cpp:260
  #8  0x0000000000700775 in mgmd_main (argc=<optimized out>,
argv=0x28041d0) at mysql/8.0/storage/ndb/src/
mgmsrv/main.cpp:479

Analysis:
While starting up, the ndb_mgmd will allocate memory for bind_address in
order to potentially rewrite the parameter. When ndb_mgmd restart itself
the memory will be released and dangling pointer causing double free.

Fix:
Drop support for bind_address=[::], it is not documented anywhere, is
not useful and doesn't work.
This means the need to rewrite bind_address is gone and bind_address
argument need neither alloc or free.

Change-Id: I7797109b9d8391394587188d64d4b1f398887e94
bjornmu pushed a commit that referenced this pull request Jul 1, 2024
… for connection xxx'.

The new iterator based explains are not impacted.

The issue here is a race condition. More than one thread is using the
query term iterator at the same time (whoch is neithe threas safe nor
reantrant), and part of its state is in the query terms being visited
which leads to interference/race conditions.

a) the explain thread

uses an iterator here:

   Sql_cmd_explain_other_thread::execute

is inspecting the Query_expression of the running query
calling master_query_expression()->find_blocks_query_term which uses
an iterator over the query terms in the query expression:

   for (auto qt : query_terms<>()) {
       if (qt->query_block() == qb) {
           return qt;
       }
   }

the above search fails to find qb due to the interference of the
thread b), see below, and then tries to access a nullpointer:

    * thread #36, name = ‘connection’, stop reason = EXC_BAD_ACCESS (code=1, address=0x0)
  frame #0: 0x000000010bb3cf0d mysqld`Query_block::type(this=0x00007f8f82719088) const at sql_lex.cc:4441:11
  frame #1: 0x000000010b83763e mysqld`(anonymous namespace)::Explain::explain_select_type(this=0x00007000020611b8) at opt_explain.cc:792:50
  frame #2: 0x000000010b83cc4d mysqld`(anonymous namespace)::Explain_join::explain_select_type(this=0x00007000020611b8) at opt_explain.cc:1487:21
  frame #3: 0x000000010b837c34 mysqld`(anonymous namespace)::Explain::prepare_columns(this=0x00007000020611b8) at opt_explain.cc:744:26
  frame #4: 0x000000010b83ea0e mysqld`(anonymous namespace)::Explain_join::explain_qep_tab(this=0x00007000020611b8, tabnum=0) at opt_explain.cc:1415:32
  frame #5: 0x000000010b83ca0a mysqld`(anonymous namespace)::Explain_join::shallow_explain(this=0x00007000020611b8) at opt_explain.cc:1364:9
  frame #6: 0x000000010b83379b mysqld`(anonymous namespace)::Explain::send(this=0x00007000020611b8) at opt_explain.cc:770:14
  frame #7: 0x000000010b834147 mysqld`explain_query_specification(explain_thd=0x00007f8fbb111e00, query_thd=0x00007f8fbb919c00, query_term=0x00007f8f82719088, ctx=CTX_JOIN) at opt_explain.cc:2088:20
  frame #8: 0x000000010bd36b91 mysqld`Query_expression::explain_query_term(this=0x00007f8f7a090360, explain_thd=0x00007f8fbb111e00, query_thd=0x00007f8fbb919c00, qt=0x00007f8f82719088) at sql_union.cc:1519:11
  frame #9: 0x000000010bd36c68 mysqld`Query_expression::explain_query_term(this=0x00007f8f7a090360, explain_thd=0x00007f8fbb111e00, query_thd=0x00007f8fbb919c00, qt=0x00007f8f8271d748) at sql_union.cc:1526:13
  frame #10: 0x000000010bd373f7 mysqld`Query_expression::explain(this=0x00007f8f7a090360, explain_thd=0x00007f8fbb111e00, query_thd=0x00007f8fbb919c00) at sql_union.cc:1591:7
  frame #11: 0x000000010b835820 mysqld`mysql_explain_query_expression(explain_thd=0x00007f8fbb111e00, query_thd=0x00007f8fbb919c00, unit=0x00007f8f7a090360) at opt_explain.cc:2392:17
  frame #12: 0x000000010b835400 mysqld`explain_query(explain_thd=0x00007f8fbb111e00, query_thd=0x00007f8fbb919c00, unit=0x00007f8f7a090360) at opt_explain.cc:2353:13
 * frame #13: 0x000000010b8363e4 mysqld`Sql_cmd_explain_other_thread::execute(this=0x00007f8fba585b68, thd=0x00007f8fbb111e00) at opt_explain.cc:2531:11
  frame #14: 0x000000010bba7d8b mysqld`mysql_execute_command(thd=0x00007f8fbb111e00, first_level=true) at sql_parse.cc:4648:29
  frame #15: 0x000000010bb9e230 mysqld`dispatch_sql_command(thd=0x00007f8fbb111e00, parser_state=0x0000700002065de8) at sql_parse.cc:5303:19
  frame #16: 0x000000010bb9a4cb mysqld`dispatch_command(thd=0x00007f8fbb111e00, com_data=0x0000700002066e38, command=COM_QUERY) at sql_parse.cc:2135:7
  frame #17: 0x000000010bb9c846 mysqld`do_command(thd=0x00007f8fbb111e00) at sql_parse.cc:1464:18
  frame #18: 0x000000010b2f2574 mysqld`handle_connection(arg=0x0000600000e34200) at connection_handler_per_thread.cc:304:13
  frame #19: 0x000000010e072fc4 mysqld`pfs_spawn_thread(arg=0x00007f8fba8160b0) at pfs.cc:3051:3
  frame #20: 0x00007ff806c2b202 libsystem_pthread.dylib`_pthread_start + 99
  frame #21: 0x00007ff806c26bab libsystem_pthread.dylib`thread_start + 15

b) the query thread being explained is itself performing LEX::cleanup
and as part of the iterates over the query terms, but still allows
EXPLAIN of the query plan since

   thd->query_plan.set_query_plan(SQLCOM_END, ...)

hasn't been called yet.

     20:frame: Query_terms<(Visit_order)1, (Visit_leaves)0>::Query_term_iterator::operator++() (in mysqld) (query_term.h:613)
     21:frame: Query_expression::cleanup(bool) (in mysqld) (sql_union.cc:1861)
     22:frame: LEX::cleanup(bool) (in mysqld) (sql_lex.h:4286)
     30:frame: Sql_cmd_dml::execute(THD*) (in mysqld) (sql_select.cc:799)
     31:frame: mysql_execute_command(THD*, bool) (in mysqld) (sql_parse.cc:4648)
     32:frame: dispatch_sql_command(THD*, Parser_state*) (in mysqld) (sql_parse.cc:5303)
     33:frame: dispatch_command(THD*, COM_DATA const*, enum_server_command) (in mysqld) (sql_parse.cc:2135)
     34:frame: do_command(THD*) (in mysqld) (sql_parse.cc:1464)
     57:frame: handle_connection(void*) (in mysqld) (connection_handler_per_thread.cc:304)
     58:frame: pfs_spawn_thread(void*) (in mysqld) (pfs.cc:3053)
     65:frame: _pthread_start (in libsystem_pthread.dylib) + 99
     66:frame: thread_start (in libsystem_pthread.dylib) + 15

Solution:

This patch solves the issue by removing iterator state from
Query_term, making the query_term iterators thread safe. This solution
labels every child query_term with its index in its parent's
m_children vector.  The iterator can therefore easily compute the next
child to visit based on Query_term::m_sibling_idx.

A unit test case is added to check reentrancy.

One can also manually verify that we have no remaining race condition
by running two client connections files (with \. <file>) with a big
number of copies of the repro query in one connection and a big number
of EXPLAIN format=json FOR <connection>, e.g.

    EXPLAIN FORMAT=json FOR CONNECTION 8\G

in the other. The actual connection number would need to verified
in connection one, of course.

Change-Id: Ie7d56610914738ccbbecf399ccc4f465f7d26ea7
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants