Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add option for user space core dump on Linux #6688

Closed

Conversation

mikezhang1234567890
Copy link
Contributor

Add an environment variable toggle to switch to creating a system core dump
manually in the current directory rather than rely on the kernel to create a
dump automatically from a signal and storing it where the OS dictates.

Fixes #6300 by providing an alternative to the kernel made system dump.

Add an environment variable toggle to switch to creating a system core dump
manually in the current directory rather than rely on the kernel to create a
dump automatically from a signal and storing it where the OS dictates.

Signed-off-by: Mike Zhang <[email protected]>
@mikezhang1234567890
Copy link
Contributor Author

After discussion with @babsingh, we determined the following changes to improve this PR:

  • Update comments to doxygen format
  • use camel case for naming variables
  • ensure all variables are initialized to a value
  • use macros instead of integer constants
  • update PR description with a high level overview describing how the core dump works
  • use port library IO instead of native
  • use flag to determine whether user space core dump should be used, instead of checking the environment
  • set up testing, here or through OpenJ9 consuming this component

@babsingh
Copy link
Contributor

babsingh commented Sep 13, 2022

re #6688 (comment):

The following points were missed:

All the above feedback was verbally given on 7 Sept 22.

@babsingh babsingh marked this pull request as draft September 13, 2022 17:51
@babsingh
Copy link
Contributor

babsingh commented Sep 13, 2022

@mikezhang1234567890 Remove the draft state once the PR is ready for review.

@keithc-ca Can you help with reviewing the PR once it is GTG?

pthread_kill(pthread_self(), J9_DUMP_SIGNAL);
#endif /* defined(LINUX) || defined(OSX) */
#if defined(LINUX)
if (getenv("J9_USER_DUMP")) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there should be a programmatic way of requesting a user-space dump rather than using the environment, but this should compare with NULL:

		if (NULL != getenv("J9_USER_DUMP")) {

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

J9_USER_DUMP might be a name that suits openj9, but other downstream projects would probably disagree.

#if defined(LINUX)
if (getenv("J9_USER_DUMP")) {
if (0 != userspace_dump_create(portLibrary, filename)) {
fprintf(stderr, "core dump creation failed\n");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The failure message should be written to filename as is the documented behavior of this function.

				portLibrary->str_printf(portLibrary, filename, EsMaxPath, "core dump creation failed");

@@ -40,6 +40,5 @@ typedef struct MarkAllPagesWritableHeader {
uintptr_t renameDump(struct OMRPortLibrary *portLibrary, char *filename, pid_t pid, int signalNumber);
char *markAllPagesWritable(struct OMRPortLibrary *portLibrary);



uintptr_t userspace_dump_create(struct OMRPortLibrary *portLibrary, char *filename);

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you please remove this extra blank line.

@@ -146,6 +149,10 @@ renameDump(struct OMRPortLibrary *portLibrary, char *filename, pid_t pid, int si

memset(corePatternFormat, 0, sizeof(corePatternFormat));

if (getenv("J9_USER_DUMP")) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comparison with NULL should be explicit.

@@ -146,6 +149,10 @@ renameDump(struct OMRPortLibrary *portLibrary, char *filename, pid_t pid, int si

memset(corePatternFormat, 0, sizeof(corePatternFormat));

if (getenv("J9_USER_DUMP")) {
return stat(filename, &attrBuf);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this fails, filename is supposed to say why.

size_t bytes_read = 0;
char read_buffer[EsMaxPath];

snprintf(proc_filename + 6, 32 - 6, "%d/stat", ppid);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same /proc/ comment as before.

/* fill out prpsinfo and general process prstatus info */
prpsinfo->pr_uid = getuid();
prpsinfo->pr_gid = getgid();
if (read_word_from_stat(proc_fd, read_buffer)) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Explicit comparisons with zero.


static int
write_thread_notes(int addrsize, int outfile_fd, off_t *outfile_off, prstatus_t *global_prstatus) {
char corename[8] = "CORE"; /* size 8 for alignment */
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No alignment guarantee here.

}

/* skip first 13 entries in stat file */
for (i=0; i < 13; i++) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

space both sides of =

}
file_off += sizeof corename;

if(!write_note_auxv_data(outfile_fd, &file_off, &note_hdr.n_descsz)) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

space after if

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

PortDumpTest fails if kernel.core_pattern forces the core to be redirected
3 participants