Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Crash on REST API script invocation #4001

Closed
8 tasks done
borissavelev opened this issue Oct 21, 2022 · 6 comments
Closed
8 tasks done

Crash on REST API script invocation #4001

borissavelev opened this issue Oct 21, 2022 · 6 comments

Comments

@borissavelev
Copy link

borissavelev commented Oct 21, 2022

It is a reproducible bug
Log:

2022-10-21 15:09:40 ProxySQL_RESTAPI_Server.cpp:168:process_request(): [ERROR] Internal error while executing script '/var/lib/proxysql/scripts/ping.sh'. 'resource exhaustion (maxfd >= FD_SETSIZE)' syscall failed with error code: '9'.
2022-10-21 15:09:40 MySQL_Thread.cpp:5577:check_for_invalid_fd(): [ERROR] revents==POLLNVAL for FD=1206, events=1, MyDSFD=1206, MyConnFD=1206
proxysql: MySQL_Thread.cpp:5578: void MySQL_Thread::check_for_invalid_fd(unsigned int): Assertion `mypolls.fds[n].revents!=POLLNVAL' failed.
Error: signal 6:
/usr/bin/proxysql(_Z13crash_handleri+0x48)[0xaaaae550a198]
linux-vdso.so.1(__kernel_rt_sigreturn+0x0)[0xffff93a9078c]
/lib/aarch64-linux-gnu/libc.so.6(gsignal+0xe0)[0xffff9341ed78]
/lib/aarch64-linux-gnu/libc.so.6(abort+0x114)[0xffff9340baac]
/lib/aarch64-linux-gnu/libc.so.6(+0x2d490)[0xffff93418490]
/lib/aarch64-linux-gnu/libc.so.6(+0x2d4f4)[0xffff934184f4]
/usr/bin/proxysql(+0x330934)[0xaaaae5560934]
/usr/bin/proxysql(_ZN12MySQL_Thread24ProcessAllMyDS_AfterPollEv+0x68)[0xaaaae556c068]
/usr/bin/proxysql(_ZN12MySQL_Thread3runEv+0x550)[0xaaaae556d7b0]
/usr/bin/proxysql(_Z24mysql_worker_thread_funcPv+0x90)[0xaaaae5501ef0]
/lib/aarch64-linux-gnu/libpthread.so.0(+0x7624)[0xffff93844624]
/lib/aarch64-linux-gnu/libc.so.6(+0xd149c)[0xffff934bc49c]
 ---- /usr/bin/proxysql(_Z13crash_handleri+0x48) [0xaaaae550a198] : crash_handler(int)
 ---- /usr/bin/proxysql(_ZN12MySQL_Thread24ProcessAllMyDS_AfterPollEv+0x68) [0xaaaae556c068] : MySQL_Thread::ProcessAllMyDS_AfterPoll()
 ---- /usr/bin/proxysql(_ZN12MySQL_Thread3runEv+0x550) [0xaaaae556d7b0] : MySQL_Thread::run()
 ---- /usr/bin/proxysql(_Z24mysql_worker_thread_funcPv+0x90) [0xaaaae5501ef0] : mysql_worker_thread_func(void*)
To report a crashing bug visit: https://github.com/sysown/proxysql/issues
For support visit: https://proxysql.com/services/support/
2022-10-21 15:09:40 main.cpp:1296:ProxySQL_daemonize_phase3(): [ERROR] ProxySQL crashed. Restarting!
2022-10-21 15:09:40 [INFO] ProxySQL version 2.4.4-41-g83ffb72
2022-10-21 15:09:40 [INFO] ProxySQL SHA1 checksum: 10c33e190cef903bb5fe1c48a7f84aaa64396c90
2022-10-21 15:09:40 [INFO] Angel process started ProxySQL process 732631
2022-10-21 15:09:40 [INFO] Loaded built-in SQLite3
Standard ProxySQL MySQL Logger rev. 2.0.0714 -- MySQL_Logger.cpp -- Wed Sep 14 08:24:46 2022
Standard ProxySQL Cluster rev. 0.4.0906 -- ProxySQL_Cluster.cpp -- Wed Sep 14 08:24:46 2022
Standard ProxySQL Statistics rev. 1.4.1027 -- ProxySQL_Statistics.cpp -- Wed Sep 14 08:24:46 2022
Standard ProxySQL HTTP Server Handler rev. 1.4.1031 -- ProxySQL_HTTP_Server.cpp -- Wed Sep 14 08:24:46 2022
uname -a
Linux prelive-proxysql-1 5.15.0-1020-aws #24~20.04.1-Ubuntu SMP Fri Sep 2 15:32:23 UTC 2022 aarch64 aarch64 aarch64 GNU/Linux
mysql
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 3146
Server version: 5.7.28 (ProxySQL Admin Module)

Copyright (c) 2000, 2022, Oracle and/or its affiliates.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql> select * from runtime_restapi_routes;
+----+--------+------------+--------+------+-----------------------------------+---------+
| id | active | timeout_ms | method | uri  | script                            | comment |
+----+--------+------------+--------+------+-----------------------------------+---------+
| 4  | 1      | 3000       | GET    | ping | /var/lib/proxysql/scripts/ping.sh | ping    |
+----+--------+------------+--------+------+-----------------------------------+---------+
1 row in set (0.00 sec)
cat /var/lib/proxysql/scripts/ping.sh
#!/bin/sh

echo '{"ping": "ok"}'
dpkg -l | grep proxysql
ii  proxysql                       2.4.4                             arm64        High performance MySQL proxy
gdb /usr/bin/proxysql /var/lib/proxysql/core
GNU gdb (Ubuntu 9.2-0ubuntu1~20.04.1) 9.2
Copyright (C) 2020 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "aarch64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
    <http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from /usr/bin/proxysql...
[New LWP 732604]
[New LWP 732594]
[New LWP 732601]
[New LWP 732597]
[New LWP 732607]
[New LWP 732611]
[New LWP 732603]
[New LWP 732599]
[New LWP 732613]
[New LWP 732608]
[New LWP 732605]
[New LWP 732614]
[New LWP 732609]
[New LWP 732612]
[New LWP 732610]
[New LWP 732615]
[New LWP 732618]
[New LWP 732596]
[New LWP 732622]
[New LWP 732606]
[New LWP 732600]
[New LWP 732595]
[New LWP 732598]
[New LWP 732616]
[New LWP 732617]
[New LWP 732621]
[New LWP 732602]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/aarch64-linux-gnu/libthread_db.so.1".
Core was generated by `/usr/bin/proxysql --idle-threads -c /etc/proxysql.cnf'.
Program terminated with signal SIGABRT, Aborted.
#0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
50	../sysdeps/unix/sysv/linux/raise.c: No such file or directory.
[Current thread is 1 (Thread 0xffff8e17a250 (LWP 732604))]
(gdb) bt full
#0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
        set = {__val = {0, 281473065654400, 281473065654256, 281473065654096, 281473065654096, 281473065654048, 12228749664433029632, 281473065653824, 187651343613952, 281473065653712, 281473152507448, 281473153606816, 281473065653840, 281473065653760,
            281473152236112, 5578}}
        pid = <optimized out>
        tid = <optimized out>
        ret = <optimized out>
#1  0x0000ffff9340baac in __GI_abort () at abort.c:79
        save_stage = 1
        act = {__sigaction_handler = {sa_handler = 0x0, sa_sigaction = 0x0}, sa_mask = {__val = {8386658464824651349, 8245935278387127397, 2606, 18446744073709551615, 0, 0, 18446744073709486080, 281473065654048, 281473152509840, 281473153606816,
              281473153590832, 4616194021471028225, 4616194021471028225, 281473065654112, 281473152287760, 187650975185392}}, sa_flags = -440944088, sa_restorer = 0xffff93513738}
        sigs = {__val = {32, 0 <repeats 15 times>}}
#2  0x0000ffff93418490 in __assert_fail_base (fmt=0xffff93513738 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", assertion=assertion@entry=0xaaaae5b7ba28 "mypolls.fds[n].revents!=POLLNVAL", file=file@entry=0xaaaae5b7acb0 "MySQL_Thread.cpp",
    line=line@entry=5578, function=function@entry=0xaaaae5b7b9f0 "void MySQL_Thread::check_for_invalid_fd(unsigned int)") at assert.c:92
        str = 0xffff8440ff20 "proxysql: MySQL_Thread.cpp:5578: void MySQL_Thread::check_for_invalid_fd(unsigned int): Assertion `mypolls.fds[n].revents!=POLLNVAL' failed.\n"
        total = 4096
#3  0x0000ffff934184f4 in __GI___assert_fail (assertion=assertion@entry=0xaaaae5b7ba28 "mypolls.fds[n].revents!=POLLNVAL", file=file@entry=0xaaaae5b7acb0 "MySQL_Thread.cpp", line=line@entry=5578,
    function=function@entry=0xaaaae5b7b9f0 "void MySQL_Thread::check_for_invalid_fd(unsigned int)") at assert.c:101
No locals.
#4  0x0000aaaae5560934 in MySQL_Thread::check_for_invalid_fd (this=this@entry=0xffff8c8ff000, n=n@entry=32) at MySQL_Thread.cpp:5583
        _myds = 0xffff873c4500
        __func__ = "check_for_invalid_fd"
        __PRETTY_FUNCTION__ = "void MySQL_Thread::check_for_invalid_fd(unsigned int)"
#5  0x0000aaaae556c068 in MySQL_Thread::ProcessAllMyDS_AfterPoll (this=this@entry=0xffff8c8ff000) at MySQL_Thread.cpp:3053
        rc = <optimized out>
        myds = 0xffff873c4500
        n = 32
#6  0x0000aaaae556d7b0 in MySQL_Thread::run (this=0xffff8c8ff000) at MySQL_Thread.cpp:3323
        num_idles = 0
        ttw = <optimized out>
        maintenance_interval = <optimized out>
        n = <optimized out>
        rc = 2
        idle_maintenance_thread = false
        __PRETTY_FUNCTION__ = "void MySQL_Thread::run()"
#7  0x0000aaaae5501ef0 in mysql_worker_thread_func (arg=0xffff92ac6fe0) at main.cpp:418
        thread_attr = {__size = '\000' <repeats 17 times>, "\020", '\000' <repeats 45 times>, __align = 0}
        tmp_stack_size = 8388608
        mysql_thread = 0xffff92ac6fe0
        worker = 0xffff8c8ff000
#8  0x0000ffff93844624 in start_thread (arg=0xaaaae5501e60 <mysql_worker_thread_func(void*)>) at pthread_create.c:477
        ret = <optimized out>
        pd = <optimized out>
        unwind_buf = {cancel_jmp_buf = {{jmp_buf = {281473065657964, 281473142701584, 281474186384926, 281473156808704, 281474186384927, 187650968395360, 281473065658704, 281473065656912, 281473156812800, 281473065656912, 281473065654832,
                17637164524599477294, 0, 17637164524103293698, 0, 0, 0, 0, 0, 0, 0, 0}, mask_was_saved = 0}}, priv = {pad = {0x0, 0x0, 0x0, 0x0}, data = {prev = 0x0, cleanup = 0x0, canceltype = 0}}}
        not_first_call = <optimized out>
#9  0x0000ffff934bc49c in thread_start () at ../sysdeps/unix/sysv/linux/aarch64/clone.S:78
No locals.
curl -v localhost:6070/sync/ping
*   Trying 127.0.0.1:6070...
* TCP_NODELAY set
* Connected to localhost (127.0.0.1) port 6070 (#0)
> GET /sync/ping HTTP/1.1
> Host: localhost:6070
> User-Agent: curl/7.68.0
> Accept: */*
>
* Mark bundle as not supporting multiuse
< HTTP/1.1 500 Internal Server Error
< Connection: Keep-Alive
< Content-Length: 141
< Access-Control-Allow-Origin: *
< Content-Type: application/json
< Date: Fri, 21 Oct 2022 15:32:08 GMT
<
* Connection #0 to host localhost left intact
{"type":"out","error":"Internal error while executing script, 'resource exhaustion (maxfd >= FD_SETSIZE)' syscall failed.", "error_code":"9"}

If you are submitting a reproducible bug report, please provide:

  • A clear description of the issue
  • ProxySQL version
  • OS version
  • The steps to reproduce the issue
  • The full ProxySQL error log (default location: /var/lib/proxysql/proxysql.log)

If this is a crashing bug, please also include:

  • The package used to install ProxySQL
  • The compressed proxysql binary
  • The compressed core dump (Note: if you're worried it may contain sensitive data, please contact us for information on sharing it securely: https://proxysql.com/contact-us/)

If the above information is not provided, this issue is likely to be closed.

Please use markdown to format any code or SQL: https://guides.github.com/features/mastering-markdown/

Thank you!

@renecannao
Copy link
Contributor

renecannao commented Oct 21, 2022

Thank you for the report.

There is no relevant change between 2.4.3 and 2.4.4 .
What it is changing between when you managed to crash it, ans when not, is your workload.
The error is clear in the first line of your report:

2022-10-21 15:09:40 ProxySQL_RESTAPI_Server.cpp:168:process_request(): [ERROR] Internal error while executing script '/var/lib/proxysql/scripts/ping.sh'. 'resource exhaustion (maxfd >= FD_SETSIZE)' syscall failed with error code: '9'.

Basically: 'resource exhaustion (maxfd >= FD_SETSIZE)' , you ran out of file descriptors that the http server can use.
In a workload you run out of file descriptors, in another you didn't (maybe because there were less clients connected).

We will troubleshoot this more, and specifically we need to:

  • evaluate how to handle more file descriptors in the REST API
  • better error handling, and avoid the assert() to happen

@borissavelev
Copy link
Author

yes, later I got the same error on 2.4.3 too. Sorry about misleading you.

I guess the reason is here
https://man7.org/linux/man-pages/man2/select.2.html

WARNING: select() can monitor only file descriptors numbers that
are less than FD_SETSIZE (1024)—an unreasonably low limit for
many modern applications—and this limitation will not change.
All modern applications should instead use poll(2) or epoll(7),
which do not suffer this limitation.

https://github.com/sysown/proxysql/blob/f0799a43d8145ae3d138d7bc512581b2433df732/include/proxysql_utils.h

*   - '-4' in case of resource exhaustion, file descriptors being used exceeds 'FD_SETSIZE'.
*   - '-5' in case 'select()' call failed.

so right after I have more than 1024 connections (opened FD) all REST API calls will fail

@renecannao
Copy link
Contributor

@borissavelev : while reviewing this issue again we noticed that the crash has nothing to do with the REST API.
The error on the REST API was simply one line above the real cause of the error, an assert() in MySQL_Thread.cpp, thus completely unrelated to REST API.
Your proxysql instance is running out of file descriptors: this is the real problem, and what you should be fixing.
Please increase the number of file descriptors.
If you run proxysql with systemd, that happens automatically: https://github.com/sysown/proxysql/blob/v2.x/systemd/system/proxysql.service#L20

Thanks

@borissavelev
Copy link
Author

thus completely unrelated to REST API.

I may not agree with you) REST API server is running in a separate thread, so it's related.

Proxysql instance is not running out of file descriptors

cat /proc/$(pidof proxysql|awk '{print $1}')/limits|grep open
Max open files            102400               102400               files
Oct 21 20:56:52 prelive-proxysql-1 proxysql[1349294]: 2022-10-21 20:56:52 [INFO] Current RLIMIT_NOFILE: 102400

int wexecvp is using only in REST API and this check doesn't look good enough

printf '# include <sys/select.h>\n#pragma message FD_SETSIZE'| cpp |grep pragma
#pragma message 1024

@renecannao
Copy link
Contributor

Reopening, maybe you are onto something

@renecannao renecannao reopened this Oct 24, 2022
JavierJF added a commit that referenced this issue Dec 8, 2022
This commit addresses several issues with 'wexecvp':
  - Fix invalid double call to 'close' in case fd was higher than
    'FD_SETSIZE'. This could lead to the invalidation of an unrelated
    fd resulting in asserts, as in issue #4001 and other instabilities.
  - Fix previous limitations of the legacy 'select 'impl that rendered
    the RESTAPI unusable when ProxySQL had more then 'FD_SETSIZE' fds
    opened.
  - Other minor improvements in function logic and interface.
JavierJF added a commit that referenced this issue Dec 8, 2022
This commit pack a couple of fixes and improvements for the RESTAPI:
  - Homogenization of GET/POST endpoint responses by ProxySQL. All
    responses should now be guaranteed to be valid JSON.
  - Fix JSON construction from parameters supplied to GET endpoint.
  - Add two new fields 'script_stdout' and 'script_stderr' to the
    JSON response when the target script fails to be executed. This is,
    when it exists with an error code other than zero. This makes the
    response homogeneous to when the scripts fails to produce a valid
    JSON output, and add extra information for debugging on client side.
  - Add a new debugging module 'PROXY_DEBUG_RESTAPI', to help tracing in
    debugging builds the requests to the endpoints.
JavierJF added a commit that referenced this issue Dec 8, 2022
- Fix several utility functions and add new ones.
- Improve current tests targeting RESTAPI.
- Add new regression test for issue #4001.
- Add several scripts used for RESTAPI tests rework.
- Fix compilation of several tests after 'wexcevp' interface change.
@JavierJF
Copy link
Collaborator

JavierJF commented Dec 8, 2022

Hi @borissavelev,

first, thank you for the report. We were aware of the previous limitation of the RESTAPI for FD_SETSIZE, as you could see in the documentation you pointed out. The check you also pointed out here was sufficient for avoiding trying to use those file descriptors that were bigger than FD_SETSIZE by gracefully failing. The problem was that there was a double call to close over the same file descriptor, potentially invalidating an unrelated file descriptor and leading to the crash you experienced. These issues, including the previous limitation should be fixed by PR #4046.

Thank you, regards,
Javier

JavierJF added a commit that referenced this issue Dec 9, 2022
This commit addresses several issues with 'wexecvp':
  - Fix invalid double call to 'close' in case fd was higher than
    'FD_SETSIZE'. This could lead to the invalidation of an unrelated
    fd resulting in asserts, as in issue #4001 and other instabilities.
  - Fix previous limitations of the legacy 'select 'impl that rendered
    the RESTAPI unusable when ProxySQL had more then 'FD_SETSIZE' fds
    opened.
  - Other minor improvements in function logic and interface.
JavierJF added a commit that referenced this issue Dec 9, 2022
This commit pack a couple of fixes and improvements for the RESTAPI:
  - Homogenization of GET/POST endpoint responses by ProxySQL. All
    responses should now be guaranteed to be valid JSON.
  - Fix JSON construction from parameters supplied to GET endpoint.
  - Add two new fields 'script_stdout' and 'script_stderr' to the
    JSON response when the target script fails to be executed. This is,
    when it exists with an error code other than zero. This makes the
    response homogeneous to when the scripts fails to produce a valid
    JSON output, and add extra information for debugging on client side.
  - Add a new debugging module 'PROXY_DEBUG_RESTAPI', to help tracing in
    debugging builds the requests to the endpoints.
JavierJF added a commit that referenced this issue Dec 9, 2022
- Fix several utility functions and add new ones.
- Improve current tests targeting RESTAPI.
- Add new regression test for issue #4001.
- Add several scripts used for RESTAPI tests rework.
- Fix compilation of several tests after 'wexcevp' interface change.
JavierJF added a commit that referenced this issue Dec 9, 2022
This commit addresses several issues with 'wexecvp':
  - Fix invalid double call to 'close' in case fd was higher than
    'FD_SETSIZE'. This could lead to the invalidation of an unrelated
    fd resulting in asserts, as in issue #4001 and other instabilities.
  - Fix previous limitations of the legacy 'select 'impl that rendered
    the RESTAPI unusable when ProxySQL had more then 'FD_SETSIZE' fds
    opened.
  - Other minor improvements in function logic and interface.
JavierJF added a commit that referenced this issue Dec 9, 2022
This commit pack a couple of fixes and improvements for the RESTAPI:
  - Homogenization of GET/POST endpoint responses by ProxySQL. All
    responses should now be guaranteed to be valid JSON.
  - Fix JSON construction from parameters supplied to GET endpoint.
  - Add two new fields 'script_stdout' and 'script_stderr' to the
    JSON response when the target script fails to be executed. This is,
    when it exists with an error code other than zero. This makes the
    response homogeneous to when the scripts fails to produce a valid
    JSON output, and add extra information for debugging on client side.
  - Add a new debugging module 'PROXY_DEBUG_RESTAPI', to help tracing in
    debugging builds the requests to the endpoints.
JavierJF added a commit that referenced this issue Dec 9, 2022
- Fix several utility functions and add new ones.
- Improve current tests targeting RESTAPI.
- Add new regression test for issue #4001.
- Add several scripts used for RESTAPI tests rework.
- Fix compilation of several tests after 'wexcevp' interface change.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants