Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor interrupting blocking threads #1948

Conversation

wenyongh
Copy link
Contributor

@wenyongh wenyongh commented Feb 9, 2023

  • Disable the feature by default, use cmake -DWAMR_BUILD_INTERRUPT_BLOCK_INSN to enable it
  • Reuse the original signal handler
  • Move kill threads to be after notifying atomic waiting threads
  • Fix some issues and refine the code

wasm_runtime_set_exec_env_tls(exec_env);
wasm_exec_env_push_jmpbuf(exec_env, &jmpbuf_node);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we handle the case in which the jmpbuf is pushed here but then we receive a signal before the os_setjmp in the next line? In that case the signal handler would use an uninitialized jmpbuf

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that's an issue, some operations should not be interrupted by the signal, here we had better disable signal before pushing jmpbuf and enable again after setjmp.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that's why I moved wasm_exec_env_push_jmpbuf after os_setjmp in my previous implementation;
the approach described here https://notes.shichao.io/apue/ch10/ with static volatile sig_atomic_t canjump; can be a possible solution too

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, seems the static volatile sig_atomic_t canjump is better: moving wasm_exec_env_push_jmpbuf after os_setjmp also cannot protect the operations of os_setjmp and pushing jmpbuf.
How about adding field volatile sig_atomic_t canjump in exec_env, and changing the process to:

    exec_env->canjump = 0;
    wasm_exec_env_push_jmpbuf(exec_env, &jmpbuf_node);
    wasm_runtime_set_exec_env_tls(exec_env);

    if (os_setjmp(jmpbuf_node.jmpbuf) == 0) {
        exec_env->canjump = 1;
        ret = invoke_native_block_insn_interrupt(exec_env, func_ptr, func_type,
                                                 signature, attachment, argv,
                                                 argc, argv_ret);
    }
    else {
        /* Exception has been set in signal handler before calling longjmp */
        ret = false;
    }

    exec_env->canjump = 0;
    jmpbuf_node_pop = wasm_exec_env_pop_jmpbuf(exec_env);
    exec_env->canjump = 1;

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that should fix the problem

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. BTW, any other comments? Shall we merge it after some basic tests? So that we can start to fix the memory allocation issue.

Copy link
Contributor

@eloparco eloparco Feb 10, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's still missing is to use the canjump variable inside the signal handler interrupt_block_insn_sig_handler to decide if we can jump or not, right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, forgot to do it, done.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried the changes and it seems to be working.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The canjump flag only controls whether it is allowed to jump to the place of setjmp inside signal handler of SIGUSR1, but another issue is that: whether it is allowed to receive the signal SIGUSR1 in a thread to trigger its signal handler? Since the signal handler may triggered in each place (each machine instruction) of the thread, it very dangerous, I tend to limit it only in some special places. I uploaded a patch set to enable it only when calling the callback API of pthread_create, maybe we can narrow it more, e.g. enable it only when calling native APIs.

@wenyongh wenyongh changed the title [WIP] Refactor interrupting blocking threads Refactor interrupting blocking threads Feb 10, 2023
Copy link
Contributor

@eloparco eloparco left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@wenyongh wenyongh merged commit ed98d88 into bytecodealliance:dev/interrupt_block_insn Feb 13, 2023
@wenyongh wenyongh deleted the dev/interrupt_block_insn branch February 16, 2023 09:07
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants