Elasticsearch 2.2 fails to start with jna tmp dir configured (core dump) #18272
Comments
I have the same issue |
This looks like a JNA problem. See http://mail-archives.apache.org/mod_mbox/cassandra-user/201405.mbox/%[email protected]%3E Try just setting the jna tmpdir to a tmp folder without noexec. |
@clintongormley if you read this ticket and mine again you can see that we tried to point the ES config to a new directory with appropriate rights and it still doesn't work. ls -alZ /var/lib/elasticsearch/ |
Related to #14372 |
I've tried to reproduce this on CentOS 6.7 and Fedora 23. It does not reproduce for me. I notice that both error reports show death trying to register the C library. This is the first step before we can even make any native calls. Thus, we are immediately dying here and it's not that we are interacting poorly with other native calls (e.g., Is there additional configuration at play here that might be relevant? What else can you share about the system? What kernel do you have? I tested on |
Kernel: 2.6.32-573.7.1.el6.x86_64
Adding
|
CentOS Linux release 7.2.1511 (Core)
My tmp dir is on an LVM disk but it doesn't matter if the tmp dir is a symlink or not.
|
@fxh @bubo77 Thanks. Nothing alarming there. Can you add these system properties to
and start Elasticsearch (only the first two should be new). This should produce output like:
Can you share that and any additional JNA debugging output? |
Also, can you share your SELinux configuration via |
sestatus:
getsebool
systemctl dump:
|
I've personally tried with setenforce 0 but nothing. |
yields
furthermore as requested:
I'll disable SELinux and reboot later tonight and let you know the outcome. Thanks for your investigations. |
Poking around some more, the segmentation fault is happening in |
same result with jna-4.2.2.jar:
regarding SELinux, I disabled it (via /etc/selinux/config and reboot):
and this seems to work:
|
Ah, so if I'm reading you correctly, it appears my hunch that it is SELinux is correct! Okay, I'll see if I can take your SELinux configuration somewhere then. Thanks for reporting back. |
@fxh Can you share the output of |
and
|
I have noticed I am able to install and successfully run ES 2.1.1 on a CoS 7.2 system with SELinux enabled and noexec on /tmp (without using java opts -Djna.tmpdir= nor -Djava.io.tmpdir=). Versions 2.1.2 and above (tested on the latest 2.3.3) crash with a dump similar to the one mentioned by bubo77, both with and without the tmpdirs specified. Also using: jdk1.8.0_45-1.8.0_45-fcs.x86_64 |
@olenm And if you disable SELinux, does Elasticsearch startup successfully? |
with SELinux disabled it does not startup successfully (just hit the reboot). forgot to mention in my first post (and more importantly) I have noexec on /tmp EDIT: and I have verified removing the noexec on /tmp does NOT resolve the issue for 2.3.3 (if needed I can test this again on 2.1.2) - it takes a while longer to fail |
@olenm It's never going to work if |
ran a few tests: for ES 2.1.1 with noexec on /tmp (selinux still disabled): using no with ES 2.3.3 with noexec on /tmp (selinux disabled):
I am seeing an entry through journalctl that I have not seen before: plugin cloud-aws is the culprit; yanking cloud-aws and removing its section from the config did yield a successful start of ES 2.3.3 |
@olenm Can you share the log message? |
(and before it appears to be overlooked with the combination the tmp noexec) |
Thanks, that's unrelated to the issue here. The plugins are not loaded until node during startup, which happens after bootstrap where JNA is loaded. |
@olenm Right, crashing is not the expected behavior with Both of your experiments seem to continue to support the earlier hypothesis that there is a poor interaction between JNA and SELinux occurring here. |
@jasontedor after my "success" last night, I could not reproduce the "successful" case for ES 2.3.3 this morning on a fresh system, so something contaminated my instance (again!). From now on I will not be messing around with anything older than 2.3.3. Using different CoS7 systems (built from the same template and running side by side), I had success getting ES 2.3.3 to start on the first 2 below and failed on the 3rd:
SYS1:
SYS2:
SYS3:
Couldn't tell you why I did not post the memory violation earlier (saw it last week even) - any other info that I can pull from the system? |
I feel pretty confident that the issues here are due to poor interaction between SELinux and JNA. The exact circumstances are unclear to me, but I do not think that there is an Elasticsearch issue here to resolve. Please feel free to reopen if you have evidence to the contrary. |
In #18406 it was said that so far this wasn't really reproducible. |
If someone else is facing the same .. I got the same error as OP with:
The reason for the failure was that the system user under which ES was running did not have an existing home directory. Once I created the homedir, it started as expected. More about this file - https://community.oracle.com/thread/3783686. |
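For anyone who wants to rule out the missing-home-directory cause quickly, here is a minimal sketch in Python. It only looks up the account's configured home directory and checks whether that directory exists. The function name `home_directory_status` is made up for illustration, and the user name to pass in is an assumption (often `elasticsearch` for package installs, but check your own setup).

```python
import os
import pwd  # Unix-only account database


def home_directory_status(username):
    """Return (home_path, exists) for a local account, or None if the account is unknown.

    Hypothetical helper for illustration; it does not inspect Elasticsearch itself.
    """
    try:
        entry = pwd.getpwnam(username)
    except KeyError:
        # No such account in the passwd database.
        return None
    return entry.pw_dir, os.path.isdir(entry.pw_dir)
```

If `home_directory_status("elasticsearch")` returns a path together with `False`, the account points at a home directory that does not exist, which matches the situation described above.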
`ffi_closure_alloc` may fail and return `NULL` if, for instance, we're running in a locked-down operating system that forbids FFI from allocating executable pages of memory in any of the ways that it tries. Today we pass this `NULL` on to `ffi_prep_closure_loc` which triggers a segmentation fault that takes down the whole JVM. With this change we check for a failure in this call and turn it into an `UnsupportedOperationException` so that the caller can handle it more gracefully. Relates elastic/elasticsearch#73309 Relates elastic/elasticsearch#18272
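The following is a rough Python analogy of the "check for failure and raise instead of crashing" pattern described in this commit message. It is not JNA's or libffi's code. It assumes a Unix-like system, tries to create an anonymous mapping that is both writable and executable, and turns a refusal into an ordinary exception. The function names are hypothetical.

```python
import mmap


def can_map_executable_memory(size=mmap.PAGESIZE):
    """Return True if an anonymous read/write/execute mapping can be created.

    A locked-down system (for example, one whose policy forbids executable
    anonymous memory) is expected to refuse this with an OSError.
    """
    prot = mmap.PROT_READ | mmap.PROT_WRITE | mmap.PROT_EXEC
    try:
        region = mmap.mmap(-1, size,
                           flags=mmap.MAP_PRIVATE | mmap.MAP_ANONYMOUS,
                           prot=prot)
    except OSError:
        return False
    region.close()
    return True


def require_executable_memory():
    """Fail with an exception the caller can handle, rather than proceeding."""
    if not can_map_executable_memory():
        raise RuntimeError("executable memory is not available on this system")
```

A caller can catch the `RuntimeError` and fall back to running without native features, which is the kind of graceful handling the change aims for.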
Today if `libffi` cannot allocate pages of memory which are both writeable and executable then it will attempt to write code to a temporary file. Elasticsearch configures itself a suitable temporary directory for use by JNA but by default `libffi` won't find this directory and will try various other places. In certain configurations, none of the other places that `libffi` tries are suitable. With older versions of JNA this would result in a `SIGSEGV`; since elastic#80617 the JVM will exit with an exception. With this commit we use the `LIBFFI_TMPDIR` environment variable to configure `libffi` to use the same directory as JNA for its temporary files if they are needed. Closes elastic#18272 Closes elastic#73309 Closes elastic#74545 Closes elastic#77014 Closes elastic#77053 Relates elastic#77285 Co-authored-by: Rory Hunter <[email protected]>
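On versions without this change, the same idea can be approximated by hand: point `LIBFFI_TMPDIR` at the directory already used for JNA's temporary files. The sketch below only builds the settings and does not start anything. The helper name `build_launch_settings` is hypothetical, and how the resulting values reach the JVM (service file, `/etc/sysconfig/elasticsearch`, wrapper script) depends on the installation.

```python
import os


def build_launch_settings(tmpdir, base_env=None):
    """Return (env, java_opts) that point libffi and JNA at the same directory.

    Hypothetical helper: `tmpdir` should be a directory the Elasticsearch
    user can write to and that is not mounted noexec.
    """
    env = dict(os.environ if base_env is None else base_env)
    # Environment variable named in the commit message above.
    env["LIBFFI_TMPDIR"] = tmpdir
    # System property mentioned earlier in this thread.
    java_opts = ["-Djna.tmpdir=" + tmpdir]
    return env, java_opts
```

For example, `build_launch_settings("/usr/share/elasticsearch/tmp")` returns a copy of the current environment with `LIBFFI_TMPDIR` added, plus the matching `-Djna.tmpdir=` option.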
* Set LIBFFI_TMPDIR at startup (#80651) Today if `libffi` cannot allocate pages of memory which are both writeable and executable then it will attempt to write code to a temporary file. Elasticsearch configures itself a suitable temporary directory for use by JNA but by default `libffi` won't find this directory and will try various other places. In certain configurations, none of the other places that `libffi` tries are suitable. With older versions of JNA this would result in a `SIGSEGV`; since #80617 the JVM will exit with an exception. With this commit we use the `LIBFFI_TMPDIR` environment variable to configure `libffi` to use the same directory as JNA for its temporary files if they are needed. Closes #18272 Closes #73309 Closes #74545 Closes #77014 Closes #77053 Relates #77285 Co-authored-by: Rory Hunter <[email protected]> * Fix incorrect SSL usage Co-authored-by: Rory Hunter <[email protected]>
Elasticsearch version 2.2
JVM version: 1.7.0_75
OS version: CentOS release 6.7
Description of the problem including expected versus actual behavior:
Elasticsearch fails to start if the following is added to the /etc/sysconfig/elasticsearch file:
With this line commented out, ES starts but JNA is disabled as the default /tmp/ directory is mounted with noexec. If /tmp/ is mounted without noexec all works as expected.
/usr/share/elasticsearch/tmp exists, belongs to the right user/group and has permissions 755.
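The directory checks described above (existence, ownership, permissions, and whether the filesystem is mounted with noexec) can be gathered in one place. This is a minimal sketch, assuming Linux and Python 3. The helper name `describe_tmpdir` is hypothetical, and it does not cover SELinux labels or policy.

```python
import os
import stat


def describe_tmpdir(path):
    """Collect basic facts about a candidate temporary directory."""
    st = os.stat(path)
    vfs = os.statvfs(path)
    return {
        "is_dir": stat.S_ISDIR(st.st_mode),
        "mode": oct(stat.S_IMODE(st.st_mode)),  # e.g. '0o755'
        "owner_uid": st.st_uid,
        "group_gid": st.st_gid,
        # True when the filesystem holding `path` is mounted noexec.
        "noexec": bool(vfs.f_flag & os.ST_NOEXEC),
    }
```

Running `describe_tmpdir("/usr/share/elasticsearch/tmp")` and `describe_tmpdir("/tmp")` side by side shows whether the two directories differ in the `noexec` flag.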
Steps to reproduce:
Provide logs (if relevant):