ProxySQL hanged due to Watchdog missed heartbeat #2217

Open
selvabalaji15 opened this issue Aug 27, 2019 · 7 comments

Comments

@selvabalaji15

ProxySQL version : ProxySQL version 2.0.4-116-g7d371cf, codename Truls
OS version : CentOS Linux release 7.6.1810 (Core)

We sometimes have network latency between the MySQL and ProxySQL servers. When that happens, the watchdog misses heartbeats and ProxySQL restarts.

So I set the restart_on_missing_heartbeats variable to zero to avoid the restart when the watchdog misses a heartbeat.

But since then, if the watchdog misses heartbeats for a while, ProxySQL hangs, so we restart ProxySQL manually whenever it hangs.

Is there any timeout setting or configuration we can apply so that heartbeats are not missed? Please suggest how to overcome this issue.

ProxySQL error log:

2019-08-26 17:53:51 main.cpp:1634:main(): [ERROR] Watchdog: 1 threads missed a heartbeat
2019-08-26 17:53:57 main.cpp:1634:main(): [ERROR] Watchdog: 1 threads missed a heartbeat
2019-08-26 17:54:03 main.cpp:1634:main(): [ERROR] Watchdog: 1 threads missed a heartbeat
2019-08-26 17:54:09 main.cpp:1634:main(): [ERROR] Watchdog: 1 threads missed a heartbeat
2019-08-26 17:54:15 main.cpp:1634:main(): [ERROR] Watchdog: 1 threads missed a heartbeat
2019-08-26 17:54:21 main.cpp:1634:main(): [ERROR] Watchdog: 1 threads missed a heartbeat
2019-08-26 17:54:27 main.cpp:1634:main(): [ERROR] Watchdog: 1 threads missed a heartbeat
2019-08-26 17:54:33 main.cpp:1634:main(): [ERROR] Watchdog: 1 threads missed a heartbeat
2019-08-26 17:54:39 main.cpp:1634:main(): [ERROR] Watchdog: 1 threads missed a heartbeat
2019-08-26 17:54:45 main.cpp:1634:main(): [ERROR] Watchdog: 1 threads missed a heartbeat
2019-08-26 17:54:51 main.cpp:1634:main(): [ERROR] Watchdog: 1 threads missed a heartbeat
2019-08-26 17:54:57 main.cpp:1634:main(): [ERROR] Watchdog: 1 threads missed a heartbeat
2019-08-26 17:55:03 main.cpp:1634:main(): [ERROR] Watchdog: 1 threads missed a heartbeat
2019-08-26 17:55:09 main.cpp:1634:main(): [ERROR] Watchdog: 1 threads missed a heartbeat
2019-08-26 17:55:15 main.cpp:1634:main(): [ERROR] Watchdog: 1 threads missed a heartbeat
2019-08-26 17:55:21 main.cpp:1634:main(): [ERROR] Watchdog: 1 threads missed a heartbeat
2019-08-26 17:55:27 main.cpp:1634:main(): [ERROR] Watchdog: 1 threads missed a heartbeat
2019-08-26 17:55:33 main.cpp:1634:main(): [ERROR] Watchdog: 1 threads missed a heartbeat
2019-08-26 17:55:39 main.cpp:1634:main(): [ERROR] Watchdog: 1 threads missed a heartbeat
2019-08-26 17:55:45 main.cpp:1634:main(): [ERROR] Watchdog: 1 threads missed a heartbeat
2019-08-26 17:55:51 main.cpp:1634:main(): [ERROR] Watchdog: 1 threads missed a heartbeat
2019-08-26 17:55:57 main.cpp:1634:main(): [ERROR] Watchdog: 1 threads missed a heartbeat
2019-08-26 17:56:03 main.cpp:1634:main(): [ERROR] Watchdog: 1 threads missed a heartbeat
2019-08-26 17:56:09 main.cpp:1634:main(): [ERROR] Watchdog: 1 threads missed a heartbeat
2019-08-26 17:56:15 main.cpp:1634:main(): [ERROR] Watchdog: 1 threads missed a heartbeat
2019-08-26 17:56:21 main.cpp:1634:main(): [ERROR] Watchdog: 1 threads missed a heartbeat
2019-08-26 17:56:27 main.cpp:1634:main(): [ERROR] Watchdog: 1 threads missed a heartbeat
2019-08-26 17:56:33 main.cpp:1634:main(): [ERROR] Watchdog: 1 threads missed a heartbeat
2019-08-26 17:56:39 main.cpp:1634:main(): [ERROR] Watchdog: 1 threads missed a heartbeat
2019-08-26 17:56:45 main.cpp:1634:main(): [ERROR] Watchdog: 1 threads missed a heartbeat
2019-08-26 17:56:51 main.cpp:1634:main(): [ERROR] Watchdog: 1 threads missed a heartbeat
2019-08-26 17:56:57 main.cpp:1634:main(): [ERROR] Watchdog: 1 threads missed a heartbeat
2019-08-26 17:57:03 main.cpp:1634:main(): [ERROR] Watchdog: 1 threads missed a heartbeat
2019-08-26 17:57:09 main.cpp:1634:main(): [ERROR] Watchdog: 1 threads missed a heartbeat
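
In the log above, the same watchdog error repeats every 6 seconds, 34 times, from 17:53:51 to 17:57:09. A minimal sketch of how such a log could be summarized is shown below. The function name `heartbeat_gaps` is hypothetical, and the parser assumes every line starts with a `YYYY-MM-DD HH:MM:SS` timestamp as in the excerpt.

```python
from datetime import datetime


def heartbeat_gaps(log_text):
    """Return (count, gaps_in_seconds) for watchdog heartbeat errors in a log excerpt."""
    stamps = []
    for line in log_text.splitlines():
        line = line.strip()
        if "Watchdog:" in line and "missed a heartbeat" in line:
            # Assumption: the first 19 characters hold the timestamp,
            # e.g. "2019-08-26 17:53:51".
            stamps.append(datetime.strptime(line[:19], "%Y-%m-%d %H:%M:%S"))
    # Seconds between each pair of consecutive watchdog errors.
    gaps = [int((b - a).total_seconds()) for a, b in zip(stamps, stamps[1:])]
    return len(stamps), gaps


sample = """\
2019-08-26 17:53:51 main.cpp:1634:main(): [ERROR] Watchdog: 1 threads missed a heartbeat
2019-08-26 17:53:57 main.cpp:1634:main(): [ERROR] Watchdog: 1 threads missed a heartbeat
2019-08-26 17:54:03 main.cpp:1634:main(): [ERROR] Watchdog: 1 threads missed a heartbeat
"""

count, gaps = heartbeat_gaps(sample)
print(count, gaps)  # prints: 3 [6, 6]
```

A constant gap with a constant thread count suggests one thread stayed stuck for the whole period rather than recovering between checks.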

@renecannao
Contributor

I am not really sure what you are asking.
ProxySQL has a mechanism to automatically restart if it hangs. But you disabled it and now you are restarting proxysql when it hangs.
Why don't you re-enable the automatic restart?

Also, when proxysql automatically restarts it will also generate a core dump that, once analyzed, may help identify why it hangs, so that perhaps a fix can be implemented.
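
To illustrate the restart mechanism described above, here is a toy model of a heartbeat watchdog. It is a sketch, not ProxySQL's implementation: the class `Watchdog`, the `stale_after` parameter, and the `simulate` helper are hypothetical. Two behaviours are assumptions: that a threshold of 0 disables the abort (which matches what the reporter observed after setting `restart_on_missing_heartbeats` to zero), and that the counter resets after a healthy pass.

```python
class Watchdog:
    """Toy watchdog: counts consecutive checks in which any worker heartbeat is stale."""

    def __init__(self, stale_after, restart_on_missing_heartbeats):
        self.stale_after = stale_after  # seconds before a heartbeat counts as missed
        self.threshold = restart_on_missing_heartbeats  # 0 disables the abort (assumed)
        self.last_beat = {}  # worker name -> time of its last heartbeat
        self.missed = 0  # consecutive checks that found a stale worker

    def beat(self, worker, now):
        self.last_beat[worker] = now

    def check(self, now):
        """Run one watchdog pass and return 'ok', 'missed', or 'abort'."""
        stale = [w for w, t in self.last_beat.items() if now - t > self.stale_after]
        if not stale:
            self.missed = 0  # assumed: a healthy pass resets the counter
            return "ok"
        self.missed += 1
        print(f"Watchdog: {len(stale)} threads missed a heartbeat")
        if self.threshold and self.missed >= self.threshold:
            return "abort"
        return "missed"


def simulate(threshold, passes=5):
    """Run the watchdog every 6 seconds while one of two workers is stuck."""
    wd = Watchdog(stale_after=3, restart_on_missing_heartbeats=threshold)
    wd.beat("worker-0", now=0)
    wd.beat("worker-1", now=0)
    results = []
    for i in range(1, passes + 1):
        now = i * 6
        wd.beat("worker-0", now)  # worker-1 is stuck and never beats again
        results.append(wd.check(now))
        if results[-1] == "abort":
            break
    return results


print(simulate(threshold=3))  # ['missed', 'missed', 'abort']
print(simulate(threshold=0))  # ['missed', 'missed', 'missed', 'missed', 'missed']
```

In this model, a threshold of 0 only silences the abort. The stuck worker stays stuck, which is consistent with the hang reported after the automatic restart was disabled.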

@selvabalaji15
Author

selvabalaji15 commented Aug 29, 2019

Thanks for your update.
Please find the core dump output.

#0 0x00007f8194dc0207 in raise () from /lib64/libc.so.6
No symbol table info available.
#1 0x00007f8194dc18f8 in abort () from /lib64/libc.so.6
No symbol table info available.
#2 0x00007f8194db9026 in __assert_fail_base () from /lib64/libc.so.6
No symbol table info available.
#3 0x00007f8194db90d2 in __assert_fail () from /lib64/libc.so.6
No symbol table info available.
#4 0x00000000004e56c6 in MySQL_Thread::run (this=this@entry=0x7f8190200000) at MySQL_Thread.cpp:3363
_myds = 0x7f81902d8100
rc =
myds = 0x7f81902d8100
num_idles =
ttw =
maintenance_interval =
idle_maintenance_thread =
__PRETTY_FUNCTION__ = "void MySQL_Thread::run()"
__func__ = "run"
n = 102
rc =
#5 0x000000000049eb2c in mysql_worker_thread_func (arg=0x7f819482bcf0) at main.cpp:627
thread_attr = {__size = '\000' <repeats 17 times>, "\020", '\000' <repeats 37 times>, __align = 0}
tmp_stack_size = 8388608
mysql_thread = 0x7f819482bcf0
worker = 0x7f8190200000
#6 0x00007f8195f9fdd5 in start_thread () from /lib64/libpthread.so.0
No symbol table info available.
#7 0x00007f8194e87ead in clone () from /lib64/libc.so.6
No symbol table info available.

We suspect network latency between the proxy and the database, which is unavoidable in our case.
Is it possible to increase the MySQL thread polling time?

@renecannao
Contributor

Hi @selvabalaji15, this is the backtrace of the main thread, which is the one that aborts in case of missing heartbeats.
The core dump needs more extensive troubleshooting to identify the thread that is not generating heartbeats, and why.
If you can share the core dump, we can troubleshoot it in the future.
Thanks
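
One common way to look for the thread that stopped producing heartbeats is to dump a backtrace for every thread in the core file, not only the one that aborted. The sketch below builds such a gdb invocation using gdb's `-batch` and `-ex` options and its `thread apply all bt full` command. The helper name `gdb_all_threads_command` and both paths are hypothetical. The binary must be the exact build that produced the core file.

```python
import shlex


def gdb_all_threads_command(binary, core):
    """Build a gdb batch invocation that prints a full backtrace for every thread."""
    return [
        "gdb", "-batch",
        "-ex", "thread apply all bt full",
        binary, core,
    ]


# Hypothetical paths: substitute the binary that produced the core file.
cmd = gdb_all_threads_command("/usr/bin/proxysql", "core.4503")
print(" ".join(shlex.quote(part) for part in cmd))
```

The resulting list can be passed to `subprocess.run` on a host where gdb and the matching binary are available.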

@selvabalaji15
Author

Hi Renecannao,
Please find the attached core dump.
core.4503.tar.gz

@renecannao
Contributor

Hi @selvabalaji15.
We are unable to analyze the core dump correctly.
Can you please send us a link to where you installed the proxysql package from? Or could you attach a compressed copy of the proxysql binary here?

Thanks

@selvabalaji15
Author

Hi Renecannao,
Please find the attached proxysql binary.
proxysql.tar.gz

@Barbery

Barbery commented Dec 30, 2020

I hit this problem too. My version is ProxySQL version 2.0.10-27-g5b31997_DEBUG, codename Truls.

Here is some of the log:

2020-12-30 11:00:59 MySQL_Thread.cpp:4583:process_all_sessions(): [WARNING] Closing unhealthy client connection *****:33376
<jemalloc>: src/arena.c:334: Failed assertion: "((uintptr_t)ptr - (uintptr_t)extent_addr_get(slab)) % (uintptr_t)bin_infos[binind].reg_size == 0"
<jemalloc>: Should own 0 locks of rank >= 1: bin(4294967295)
2020-12-30 11:01:27 main.cpp:1819:main(): [ERROR] Watchdog: 3 threads missed a heartbeat
