DynamoRIO fails to run trivial "clone" example on ARM #1936

Open
egrimley opened this issue May 12, 2016 · 6 comments

@egrimley
Contributor

Here's the program. It seems to work natively on several Linux architectures, and under DynamoRIO on Intel, but not under DynamoRIO on ARM, where I got a segfault in is_thread_tls_initialized.

The program puts a garbage value in the child's TLS pointer, which is hardly acceptable in a C program, so imagine this program written in assembler, if you prefer. If you're not using the standard libraries you're presumably free to use the TLS pointer however you like.

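#define _GNU_SOURCE 1
#include <sched.h>
#include <stdlib.h>
#include <unistd.h>

/* Thread entry point: exits immediately and never touches TLS. */
int child(void *arg)
{
    _exit(0);
}

int main()
{
    char *stack = malloc(4096);
    /* CLONE_SETTLS makes the kernel install `tls` as the child's TLS pointer. */
    int flags = (CLONE_VM | CLONE_FS | CLONE_FILES | CLONE_SIGHAND |
                 CLONE_THREAD | CLONE_SYSVSEM | CLONE_SETTLS |
                 CLONE_PARENT_SETTID | CLONE_CHILD_CLEARTID);
    void *arg = 0;
    pid_t parent_tid = 0;
    void *tls = (void *)0x123; /* deliberately bogus TLS pointer for the child */
    pid_t child_tid = 0;

    /* The stack grows down, so pass the top of the allocation. */
    clone(child, stack + 4096, flags, arg, &parent_tid, tls, &child_tid);
    _exit(0);
}
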
@derekbruening
Contributor

Xref #2089 where I'm putting in a safe_read TLS solution for x86. We may want something similar here: a special safe_read that does not need a dcontext (as TLS init queries are done on new threads). See safe_read_tls_magic().
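
To illustrate the general idea only (this is not DR's safe_read_tls_magic(), whose implementation is not shown in this thread), here is a minimal standalone sketch of a fault-tolerant read that needs no per-thread context. The helper name try_read_word and the use of sigsetjmp/siglongjmp are assumptions made for the example; the single static jump buffer means it is not thread-safe as written.

```c
#define _GNU_SOURCE 1
#include <setjmp.h>
#include <signal.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Single global jump buffer: fine for a sketch, not safe with threads. */
static sigjmp_buf fault_env;

/* Fault handler: unwind back to the sigsetjmp() point in try_read_word(). */
static void on_fault(int sig)
{
    (void)sig;
    siglongjmp(fault_env, 1);
}

/*
 * Hypothetical helper: copy one pointer-sized word from addr into *out.
 * Returns true on success, false if the read faulted (in which case the
 * contents of *out are unspecified).
 */
static bool try_read_word(const void *addr, uintptr_t *out)
{
    struct sigaction sa, old_segv, old_bus;
    volatile bool ok = false;

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = on_fault;
    sigemptyset(&sa.sa_mask);
    sigaction(SIGSEGV, &sa, &old_segv);
    sigaction(SIGBUS, &sa, &old_bus);

    /* Saving the signal mask (second argument 1) lets siglongjmp restore it. */
    if (sigsetjmp(fault_env, 1) == 0) {
        /* Byte-wise volatile reads avoid any alignment assumption on addr. */
        const volatile unsigned char *src = addr;
        unsigned char *dst = (unsigned char *)out;
        for (size_t i = 0; i < sizeof(*out); i++)
            dst[i] = src[i];
        ok = true;
    }

    /* Put the previous handlers back before returning. */
    sigaction(SIGSEGV, &old_segv, NULL);
    sigaction(SIGBUS, &old_bus, NULL);
    return ok;
}
```

With a garbage TLS pointer like the 0x123 in the example program, a read through try_read_word would report failure rather than crash. The cost is that every such miss takes a real fault.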

@egrimley
Contributor Author

It's been a long time since I looked at this, but I'm not sure that safe_read is the right solution here. It would be better to avoid abusing the app's TLS altogether, which I think ought to be possible, since on ARM, unlike on Intel, we have the stolen register.

@derekbruening
Contributor

derekbruening commented Feb 20, 2017

But you don't have a stolen register for native threads. That's the main issue, supporting mixed-control models including attach/detach, the start/stop API, native_exec, etc.: you have to run the same code in a native thread that was never under DR control, a thread that used to be but was given free rein to run in a native context, and a thread now under DR control. You have to give up the stolen register for the 2nd case, and you never had it for the 1st.

@egrimley
Contributor Author

I wonder whether for robustness and long-term simplicity we shouldn't just use gettid() and a hash table rather than make fragile and non-portable assumptions about the app's TLS. (Or is that a solution for a different problem?)
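
A rough sketch of what I have in mind, assuming Linux and a fixed-size open-addressing table keyed by thread ID. All names here (current_tid, table_set, table_get, TABLE_SLOTS) are hypothetical, and the table has no locking or removal, so it only shows the lookup shape, not a usable implementation.

```c
#define _GNU_SOURCE 1
#include <stddef.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

#define TABLE_SLOTS 1024 /* must be a power of two */

/* tid == 0 marks an empty slot; real Linux thread IDs are positive. */
struct slot {
    pid_t tid;
    void *data;
};

static struct slot table[TABLE_SLOTS];

/* Ask the kernel for the calling thread's ID (one system call per query). */
static pid_t current_tid(void)
{
    return (pid_t)syscall(SYS_gettid);
}

/* Insert or update the entry for tid. Returns 0, or -1 if the table is full. */
static int table_set(pid_t tid, void *data)
{
    for (size_t i = 0; i < TABLE_SLOTS; i++) {
        struct slot *s = &table[((size_t)tid + i) & (TABLE_SLOTS - 1)];
        if (s->tid == 0 || s->tid == tid) {
            s->tid = tid;
            s->data = data;
            return 0;
        }
    }
    return -1;
}

/* Look up per-thread data by linear probing. NULL means "not set up". */
static void *table_get(pid_t tid)
{
    for (size_t i = 0; i < TABLE_SLOTS; i++) {
        const struct slot *s = &table[((size_t)tid + i) & (TABLE_SLOTS - 1)];
        if (s->tid == tid)
            return s->data;
        if (s->tid == 0)
            return NULL;
    }
    return NULL;
}
```

The point is that table_get(current_tid()) never dereferences anything the app controls, at the price of a system call on every query.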

@derekbruening
Contributor

Whether a thread's DR TLS is initialized is queried in many places, and having a system call there is undesirable for performance reasons. Originally there was no system call, but as various complexities crept in one was added. Removing it in favor of the safe_read resulted in a 25% (yes, 25%) speedup in bb-building-bound apps, an 80% speedup for debug build -checklevel 0, and a 6x speedup in dr_get_current_drcontext() (and similarly for the DR-internal analogue); see #2089 for details. Getting the current dcontext is a very common operation, so we're talking about significant performance impact. Re-architecting how a lot of code works could perhaps change the situation, but right now querying whether TLS is set up is a performance-critical point. There are a number of downsides to the safe-read approach (including a bunch of faults in every single thread on delayed attach, #2270, and others) and I'm hoping there's a better solution, but it is not a simple problem.

@egrimley
Contributor Author

Should it in principle be possible to know from the context whether TLS is set up, without having to query, except at the start of a signal handler in a process which has at least one native thread? (A global flag could alert you to the possible presence of native threads.)

By "context" I mean the function you're in and the arguments that were given to it, but I suppose you could look at the stack backtrace (which would be horrible). By "in principle" I mean with changes to the API, if necessary.
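
As a sketch of the global-flag part only, assuming C11 atomics and a counter rather than a boolean so that threads can come and go. The names native_thread_enter, native_thread_exit and must_query_tls are made up for illustration and do not correspond to anything in DR.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Number of threads currently running natively (hypothetical global). */
static atomic_int native_threads;

/* Call when a thread starts running natively or is released to native code. */
static void native_thread_enter(void)
{
    atomic_fetch_add(&native_threads, 1);
}

/* Call when such a thread exits or comes back under control. */
static void native_thread_exit(void)
{
    atomic_fetch_sub(&native_threads, 1);
}

/*
 * The expensive "is TLS set up?" query is needed only at signal-handler
 * entry, and only while at least one native thread may exist. Everywhere
 * else the caller is assumed to know the answer from its context.
 */
static bool must_query_tls(bool at_signal_entry)
{
    return at_signal_entry && atomic_load(&native_threads) > 0;
}
```

In a process with no native threads, must_query_tls(true) is a single atomic load and never reads the app's TLS.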
