Galera node wrongly purged after placed 'OFFLINE HARD' in the 'offline_hostgroup' #3216

Closed
5 tasks done
JavierJF opened this issue Dec 23, 2020 · 0 comments · Fixed by #3217

  • A clear description of the issue

Under certain circumstances a node can be wrongly placed in the 'offline_hostgroup' with status 'OFFLINE_HARD'. If the 'runtime_mysql_servers' table is subsequently cleaned up, the node may then be purged from it.

  • ProxySQL version

v2.0.16
v2.1.0

  • OS version

NAME="Arch Linux"
PRETTY_NAME="Arch Linux"
ID=arch
BUILD_ID=rolling

  • The steps to reproduce the issue

One sequence that triggers this issue is the following:

  1. Start ProxySQL with a three-node Galera cluster and the following config:
     mysql> select * from runtime_mysql_galera_hostgroups;
     +------------------+-------------------------+------------------+-------------------+--------+-------------+-----------------------+-------------------------+---------+
     | writer_hostgroup | backup_writer_hostgroup | reader_hostgroup | offline_hostgroup | active | max_writers | writer_is_also_reader | max_transactions_behind | comment |
     +------------------+-------------------------+------------------+-------------------+--------+-------------+-----------------------+-------------------------+---------+
     | 20               | 30                      | 10               | 40                | 1      | 1           | 0                     | 100                     | NULL    |
     +------------------+-------------------------+------------------+-------------------+--------+-------------+-----------------------+-------------------------+---------+
     1 row in set (0.00 sec)
    
  2. Make ProxySQL move one particular server of the cluster to the 'offline_hostgroup' twice.
  3. The server should now be present only in the 'offline_hostgroup', with status 'OFFLINE_HARD'.
  4. Query runtime_mysql_galera_hostgroups to trigger a cleanup of the runtime_mysql_servers table.
  5. The offline server has been purged and has disappeared from the table:
    mysql> select * from runtime_mysql_servers;
    +--------------+-------------+------+-----------+--------+--------+-------------+-----------------+---------------------+---------+----------------+---------+
    | hostgroup_id | hostname    | port | gtid_port | status | weight | compression | max_connections | max_replication_lag | use_ssl | max_latency_ms | comment |
    +--------------+-------------+------+-----------+--------+--------+-------------+-----------------+---------------------+---------+----------------+---------+
    | 10           | 172.18.1.11 | 3306 | 0         | ONLINE | 1      | 0           | 1000            | 0                   | 0       | 0              |         |
    | 10           | 172.18.1.12 | 3306 | 0         | ONLINE | 1      | 0           | 1000            | 0                   | 0       | 0              |         |
    +--------------+-------------+------+-----------+--------+--------+-------------+-----------------+---------------------+---------+----------------+---------+
    2 rows in set (0.00 sec)
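
The symptom in step 5 can be checked with a query that lists servers present in `mysql_servers` but missing from every hostgroup in `runtime_mysql_servers`. The following is a minimal sketch that runs that query against an in-memory SQLite database using a cut-down, assumed schema (only four of the real columns). The helper names `find_purged_servers` and `build_example`, the third hostname `node3.example`, and the contents of `mysql_servers` are hypothetical and are not taken from the attached configuration.

```python
import sqlite3


def find_purged_servers(conn):
    """Return (hostname, port) pairs that are configured in mysql_servers
    but absent from every hostgroup of runtime_mysql_servers."""
    return conn.execute(
        """
        SELECT DISTINCT ms.hostname, ms.port
        FROM mysql_servers AS ms
        LEFT JOIN runtime_mysql_servers AS r
               ON r.hostname = ms.hostname AND r.port = ms.port
        WHERE r.hostname IS NULL
        ORDER BY ms.hostname, ms.port
        """
    ).fetchall()


def build_example():
    """Build a toy database mirroring the final state shown in step 5.

    The schema is reduced to four columns, and the third node's hostname
    is a placeholder: the real one is not shown in the report.
    """
    conn = sqlite3.connect(":memory:")
    for table in ("mysql_servers", "runtime_mysql_servers"):
        conn.execute(
            f"CREATE TABLE {table} "
            "(hostgroup_id INT, hostname TEXT, port INT, status TEXT)"
        )
    configured = [
        (10, "172.18.1.11", 3306, "ONLINE"),
        (10, "172.18.1.12", 3306, "ONLINE"),
        (10, "node3.example", 3306, "ONLINE"),  # hypothetical third node
    ]
    conn.executemany("INSERT INTO mysql_servers VALUES (?, ?, ?, ?)", configured)
    # After the wrong purge, only the first two nodes remain at runtime.
    conn.executemany(
        "INSERT INTO runtime_mysql_servers VALUES (?, ?, ?, ?)", configured[:2]
    )
    return conn


if __name__ == "__main__":
    print(find_purged_servers(build_example()))
```

The `SELECT` statement on its own should also be usable from the ProxySQL admin interface, since that interface is backed by SQLite, but this has not been verified against the affected versions.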
    
  • The full ProxySQL error log (default location: /var/lib/proxysql/proxysql.log)

The attached archive contains:

  • Full ProxySQL error log during the issue reproduction.
  • mysql_servers table configuration.
  • galera_hostgroups table configuration.
  • Actions performed against the offline server.
  • Final runtime_mysql_servers state.

node_missing_wiar_zero.tar.gz

renecannao added a commit that referenced this issue Dec 27, 2020
Closes #3216: Galera node wrongly purged after placed 'OFFLINE HARD' in the 'offline_hostgroup'