PortDumpTest fails if kernel.core_pattern forces the core to be redirected #6300

Open

davidjmccann opened this issue Jan 13, 2022 · 15 comments

@davidjmccann (Contributor)

Running omrporttest with kernel.core_pattern=core.%p (for example), PortDumpTest passes.

If it's something like "|/usr/share/apport/apport %p %s %c %d %P %E" then the core doesn't necessarily get created in the current working directory, and the test fails.

This then raises the point that, for this code to work, the user must disable core dump redirection, which is a privileged operation. Even worse, there is no way of setting it within a container unless the container is privileged, meaning that the host has to be changed to get a core dump generated within the container with the specified name. This is described here:
containers/podman#6528

Could the core dump code for Linux be changed to generate the file directly, similar to OSX, so it wouldn't be affected by core_pattern at all?
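
For reference, the piped-handler case can be detected by inspecting /proc/sys/kernel/core_pattern: a value beginning with '|' means the kernel hands the core to a user-space handler instead of writing a file. A minimal sketch, assuming nothing about OMR itself (the helper name is hypothetical):

/*
 * Hypothetical helper: reports whether kernel.core_pattern pipes cores
 * to a handler such as apport or systemd-coredump, in which case no
 * "core" file will appear in the current working directory.
 */
#include <stdio.h>

static int
core_pattern_is_piped(void)
{
	char pattern[256];
	int result = -1; /* unable to determine */
	FILE *fp = fopen("/proc/sys/kernel/core_pattern", "r");
	if (NULL != fp) {
		if (NULL != fgets(pattern, sizeof(pattern), fp)) {
			/* a leading '|' means the core is piped to a user-space handler */
			result = ('|' == pattern[0]) ? 1 : 0;
		}
		fclose(fp);
	}
	return result;
}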

@davidjmccann (Contributor, Author)

Google coredumper (https://code.google.com/archive/p/google-coredumper/) did attempt this kind of thing, but it's very old. https://github.com/Percona-Lab/coredumper was an attempt at updating it, but it uses various bits of assembler that no longer seem to build.

@davidjmccann (Contributor, Author)

The basic issue I'm trying to solve is... in a Kubernetes environment, how can I get a core dump out of a process:

  1. Without requiring elevated privileges in the container
  2. Without requiring root access on the host node to set specific core settings
  3. With the core dump going to the persistent volume of choice rather than storage on the host.

@babsingh (Contributor) commented Mar 3, 2022

With apport enabled, is a core file generated in any of the below locations?

  1. /var/lib/apport/coredump/
  2. /var/crash/

@babsingh (Contributor) commented Mar 3, 2022

# apport disabled
systemctl stop apport

./fvtest/porttest/omrporttest --verbose --gtest_filter=PortDumpTest.dump_test_create_dump_with_NO_name
Note: Google Test filter = PortDumpTest.dump_test_create_dump_with_NO_name
[==========] Running 1 test from 1 test case.
[----------] 1 test from PortDumpTest
[----------] 1 test from PortDumpTest (18 ms total)

[==========] 1 test from 1 test case ran. (18 ms total)
[  PASSED  ] 1 test.
[  ALL TESTS PASSED  ]
# apport enabled
systemctl start apport

./fvtest/porttest/omrporttest --verbose --gtest_filter=PortDumpTest.dump_test_create_dump_with_NO_name
Note: Google Test filter = PortDumpTest.dump_test_create_dump_with_NO_name
[==========] Running 1 test from 1 test case.
[----------] 1 test from PortDumpTest
JVMPORT030W
/root/openj9-openjdk-jdk18/omr/fvtest/porttest/omrdumpTest.cpp line  173: omrdump_test_create_dump_with_NO_name omrdump_create returned: 1, with filename: The core file created by child process with pid = 21208 was not found. Expected to find core file with name "core"
		LastErrorNumber: 0
		LastErrorMessage:

/root/openj9-openjdk-jdk18/omr/fvtest/porttest/testHelpers.cpp:109: Failure
Value of: 0 == numberFailedTestsInComponent
  Actual: false
Expected: true
Test failed!
[  FAILED  ] PortDumpTest.dump_test_create_dump_with_NO_name (5376 ms)
[----------] 1 test from PortDumpTest (5376 ms total)

[==========] 1 test from 1 test case ran. (5377 ms total)
[  PASSED  ] 0 tests.
[  FAILED  ] 1 test, listed below:
[  FAILED  ] PortDumpTest.dump_test_create_dump_with_NO_name

 1 FAILED TEST

@babsingh (Contributor) commented Mar 3, 2022

My OS:

NAME="Ubuntu"
VERSION="18.04.5 LTS (Bionic Beaver)"

With apport enabled, the core file gets generated in /var/lib/apport/coredump/core._root_openj9-openjdk-jdk18_omr_build_fvtest_porttest_omrporttest.0.d4015845-7c5d-441f-90c2-82548c7c33d0.21208.706762115.

With apport disabled, the core file gets generated in the current working directory.

omrdump_create(struct OMRPortLibrary *portLibrary, char *filename, char *dumpType, void *userData)

renameDump(struct OMRPortLibrary *portLibrary, char *filename, pid_t pid, int signalNumber)

The above functions work correctly with apport disabled, since they expect the core file to be generated in the current working directory. We can look into having the above functions work with apport. Will there be value in this, given that Apport is not enabled by default in stable releases, even if it is installed? The reasons are specified in https://wiki.ubuntu.com/Apport:

  1. Apport collects potentially sensitive data, such as core dumps, stack traces, and log files. They can contain passwords, credit card numbers, serial numbers, and other private material.
  2. During the development release we already collect thousands of crash reports, much more than we can ever fix. Continuing to collect those for stable releases is not really useful, since ...
  3. Data collection from apport takes a nontrivial amount of CPU and I/O resources, which slow down the computer and don't allow you to restart the crashed program for several seconds.
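
For context, here is a simplified sketch of the working-directory flow those two functions rely on, assuming the fork-a-child-and-abort approach implied by the test output above; names and details are illustrative rather than the exact OMR code:

/*
 * Illustrative only: fork a child, let it abort() so the kernel writes a
 * core file for the child, then rename the result to the requested name.
 * The rename is exactly the step that breaks when core_pattern pipes the
 * core to apport or systemd-coredump instead of writing it to the CWD.
 */
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static int
create_dump_via_child(const char *filename)
{
	char defaultName[32];
	pid_t pid = fork();
	if (0 == pid) {
		abort(); /* child: kernel writes "core" or "core.<pid>" per core_pattern */
	} else if (pid < 0) {
		return -1;
	}
	waitpid(pid, NULL, 0);
	/* the kernel appends the pid if core_pattern is e.g. core.%p */
	snprintf(defaultName, sizeof(defaultName), "core.%d", (int)pid);
	if (0 == rename(defaultName, filename)) {
		return 0;
	}
	/* fails if the core was redirected away from the CWD */
	return rename("core", filename);
}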

@davidjmccann (Contributor, Author)

In an OpenShift environment we're more likely to be looking at systemd-coredump rather than Apport.

@keithc-ca (Member)

This seems a reasonable feature to add, but I suggest it should be opt-in: an application must explicitly indicate that the new mechanism should be used to generate core files (as opposed to allowing the Linux kernel to do that).

@babsingh (Contributor) commented Mar 3, 2022

More details on the reasonable feature:

  • With apport or systemd-coredump, core files generated may be placed in a centralized location which can be inaccessible from the container environment.
  • Instead of relying upon the Linux OS to generate and redirect the core file to an inaccessible location, the new feature will generate and write the core file at the desired location. This approach is used on OSX:
    coredump_to_file(mach_port_t task_port, pid_t pid)
  • Due to differences in system calls between Linux and OSX, the OSX approach cannot be used as-is on Linux. It will need to be re-implemented for Linux.
  • Opt-in methods can be either compile time via a flag or runtime via an environment variable or command line option.

@mikezhang1234567890 While implementing #6014, did you find any resources which will allow us to extend the core dump tool to Linux?

@mikezhang1234567890 (Contributor) commented Mar 3, 2022

If we're looking to implement a user-space core dump tool, the basic approach can roughly be the same: dump the memory (of a copy or of the original process) and dump the thread state.

Most of the information needed for an implementation concerns the binary format, which is ELF on Linux, and documentation for ELF is plentiful. https://www.gabriel.urdhr.fr/2015/05/29/core-file/ is a good read on the basic structure of a core file generated by the kernel or GDB.

I don't have anything for challenges or issues specific to Linux unfortunately.
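
To make that structure concrete, here is a minimal sketch of the start of such a writer, assuming x86-64 Linux and glibc's <sys/procfs.h>; the function name is hypothetical, and a real core needs one NT_PRSTATUS note per thread plus PT_LOAD segments for each dumped mapping:

/*
 * Skeleton of a user-space ELF core writer (x86-64 assumed). Emits the
 * ELF header and a PT_NOTE segment carrying one NT_PRSTATUS note; the
 * PT_LOAD headers and memory contents (walked via /proc/<pid>/maps and
 * read with process_vm_readv or ptrace) would follow. Layout follows
 * the structure described in the gabriel.urdhr.fr article above.
 */
#include <elf.h>
#include <string.h>
#include <sys/procfs.h> /* struct elf_prstatus */
#include <unistd.h>

static size_t
align4(size_t n)
{
	return (n + 3) & ~(size_t)3;
}

int
write_minimal_core(int fd, const struct elf_prstatus *prstatus)
{
	Elf64_Ehdr ehdr;
	Elf64_Phdr notePhdr;
	Elf64_Nhdr nhdr;
	const char name[8] = "CORE"; /* note name, zero-padded to 4 bytes */

	memset(&ehdr, 0, sizeof(ehdr));
	memcpy(ehdr.e_ident, ELFMAG, SELFMAG);
	ehdr.e_ident[EI_CLASS] = ELFCLASS64;
	ehdr.e_ident[EI_DATA] = ELFDATA2LSB;
	ehdr.e_ident[EI_VERSION] = EV_CURRENT;
	ehdr.e_type = ET_CORE;
	ehdr.e_machine = EM_X86_64;
	ehdr.e_version = EV_CURRENT;
	ehdr.e_phoff = sizeof(ehdr);
	ehdr.e_ehsize = sizeof(ehdr);
	ehdr.e_phentsize = sizeof(Elf64_Phdr);
	ehdr.e_phnum = 1; /* + one PT_LOAD per dumped memory region */

	memset(&notePhdr, 0, sizeof(notePhdr));
	notePhdr.p_type = PT_NOTE;
	notePhdr.p_offset = sizeof(ehdr) + (sizeof(Elf64_Phdr) * ehdr.e_phnum);
	notePhdr.p_filesz = sizeof(nhdr) + align4(5) + sizeof(*prstatus);

	nhdr.n_namesz = 5; /* "CORE" + NUL */
	nhdr.n_descsz = sizeof(*prstatus);
	nhdr.n_type = NT_PRSTATUS; /* one per thread in a real core */

	if (write(fd, &ehdr, sizeof(ehdr)) < 0) return -1;
	if (write(fd, &notePhdr, sizeof(notePhdr)) < 0) return -1;
	if (write(fd, &nhdr, sizeof(nhdr)) < 0) return -1;
	if (write(fd, name, align4(5)) < 0) return -1;
	if (write(fd, prstatus, sizeof(*prstatus)) < 0) return -1;
	/* next: NT_PRPSINFO, NT_FPREGSET and NT_FILE notes, then the PT_LOAD
	 * headers and the bytes of each readable mapping */
	return 0;
}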

@davidjmccann (Contributor, Author)

Particular care would probably need to be taken for setuid/setgid executables. The file permissions would probably need to restrict reading to just the owning user of the process?
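
One possible shape for that check, as an illustrative sketch only: mirror the kernel's own policy via PR_GET_DUMPABLE and create the file readable by the owner alone (the function name is hypothetical):

/*
 * Illustrative only: refuse to dump a setuid/setgid (non-dumpable)
 * process, as the kernel itself would, and create the core file with
 * owner-only permissions.
 */
#include <fcntl.h>
#include <sys/prctl.h>
#include <sys/stat.h>

static int
open_core_file_safely(const char *filename)
{
	if (1 != prctl(PR_GET_DUMPABLE, 0, 0, 0, 0)) {
		/* setuid/setgid or otherwise protected: do not dump */
		return -1;
	}
	/* 0600: owner read/write only; O_EXCL refuses a pre-planted file */
	return open(filename, O_CREAT | O_EXCL | O_WRONLY, S_IRUSR | S_IWUSR);
}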

@kgibm (Contributor) commented Mar 7, 2022

The basic issue I'm trying to solve is... in a Kubernetes environment, how can I get a core dump out of a process:

  1. Without requiring elevated privileges in the container
  2. Without requiring root access on the host node to set specific core settings
  3. With the core dump going to the persistent volume of choice rather than storage on the host.

Note that there are two common solutions to this:

  1. Install gdb in the image so that the gcore command is available and then run gcore ${PID} which attaches gdb to the process, writes the core to the current directory, and then detaches:
    % podman exec -it $(podman ps -q) sh -c 'gcore "$(cat /opt/IBM/WebSphere/AppServer/profiles/AppSrv01/logs/server1/server1.pid)" && ls -l core*'
    [...]
    -rw-r--r--. 1 root root 5468347904 Mar  7 18:35 core.1306
    
    It doesn't write out all the same VMAs as the kernel does when it creates a core, but it's close. The one downside is that if you happen to take the core while the JVM is in a sensitive operation, like a garbage collection, then pointers might be in flight and the core might be useless for some forms of common Java heap dump analysis. There are ways around this, such as injecting an exception into the process and using an -Xdump handler with request=exclusive+prepwalk, filtered to that exception, to exec a tool script that calls gcore, but that's complicated.
  2. For the most common core_pattern of |/usr/lib/systemd/systemd-coredump, set ProcessSizeMax and ExternalSizeMax (both default to 2GB, which may cause truncation, until systemd-coredump v250, where the default becomes 32GB) in /etc/systemd/coredump.conf on the worker node, run systemctl daemon-reload, and then, after the container produces a core, find it on the worker node with coredumpctl; see the example configuration below.
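
For reference, the settings from item 2 go in /etc/systemd/coredump.conf like this (32G simply mirrors the v250 default mentioned above; size the limits for your largest expected process):

[Coredump]
ProcessSizeMax=32G
ExternalSizeMax=32G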

With that, it would be very nice to have a core-dumper in J9 like on macOS, since 1) most containers don't have gdb and rebuilding their images is a complex process for most customers, and 2) worker node operations are privileged and complicated at many customers.

@davidjmccann (Contributor, Author)

Thanks @kgibm

  1. Doesn't that still require extra privileges in the container, though? Adding gdb also adds quite a few MB to the image size. And running gcore requires you to run the command after an event has occurred, rather than taking a core at the moment a problem occurs - e.g. a SIGSEGV.
  2. This just highlights how awkward getting core dumps can be!

@kgibm (Contributor) commented Mar 8, 2022

  1. Doesn't that still require extra privileges in the container though?

Yes, you're right, I just tried this and it requires --cap-add SYS_PTRACE.

rather than being able to take a core when a problem occurs - e.g. a SIGSEGV.

One could create an -Xdump handler for, e.g., gpf that uses the tool option to exec out to gcore, but yeah, that's cumbersome, and made largely moot by the above privilege point.

  2. This just highlights how awkward getting core dumps can be!

Agreed, and from my experience, most customers have the default systemd-coredump configuration which truncates many cores.

@mikezhang1234567890 (Contributor) commented Apr 14, 2022

I can have a try at this. I will initially try to do this with an environment variable or Java option to toggle the behaviour.
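
As a sketch of what the runtime toggle could look like (the variable name OMR_USERSPACE_COREDUMP is hypothetical, not an existing option):

#include <stdlib.h>
#include <string.h>

/* default to the existing kernel/core_pattern path unless explicitly opted in */
static int
userspace_coredump_requested(void)
{
	const char *value = getenv("OMR_USERSPACE_COREDUMP");
	return (NULL != value) && (0 == strcmp("1", value));
}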

@roolebo commented Oct 16, 2023

When apport is installed as the default coredump handler, if you listen on /run/apport.socket inside the container, you can accept the core dump from within the container. I have not found any way to do something similar with systemd.
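
A minimal sketch of the listening side, assuming a plain Unix-domain stream socket at that path; apport's actual forwarding protocol (the metadata and core data it sends over the connection) still has to be handled by the reader, and error handling is trimmed for brevity:

#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

int
main(void)
{
	struct sockaddr_un addr;
	int listenFd = socket(AF_UNIX, SOCK_STREAM, 0);

	memset(&addr, 0, sizeof(addr));
	addr.sun_family = AF_UNIX;
	strncpy(addr.sun_path, "/run/apport.socket", sizeof(addr.sun_path) - 1);
	unlink(addr.sun_path); /* remove any stale socket from a previous run */

	if ((bind(listenFd, (struct sockaddr *)&addr, sizeof(addr)) < 0)
		|| (listen(listenFd, 1) < 0)
	) {
		perror("apport socket");
		return 1;
	}
	for (;;) {
		int connFd = accept(listenFd, NULL, NULL);
		if (connFd < 0) {
			break;
		}
		/* read the forwarded crash data here and persist it to the volume */
		close(connFd);
	}
	close(listenFd);
	return 0;
}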
