i#1369: Use synch flush callback to enable drcachesim tracing. #4491

Merged: 23 commits, Oct 30, 2020
Changes from 11 commits

Commits
55f425e
i#1369: Use synch flush callback to enable drcachesim tracing.
abhinav92003 Oct 20, 2020
e2e89bb
Fix args to dr_flush_region call-site in events.dll.c test
abhinav92003 Oct 20, 2020
5342bf6
Load stolen reg with TLS base address before entering reset exit stub.
abhinav92003 Oct 20, 2020
7561b94
Add callback function to dr_flush_region call in existing test for fl…
abhinav92003 Oct 20, 2020
3d3f318
Fix args to dr_flush_region call in cbr.dll.c test.
abhinav92003 Oct 21, 2020
4979965
Document dr_flush_region signature change in release doc.
abhinav92003 Oct 21, 2020
6df5dbf
Update issue number for inlining instr counter increment in drcachesim.
abhinav92003 Oct 21, 2020
333da5b
Revert change made in TRY_EXCEPT in clean_call_opt_shared.c
abhinav92003 Oct 21, 2020
71f273f
Add missing callback invocation for early return cases of dr_flush_re…
abhinav92003 Oct 22, 2020
69c3f89
Fix reset tests so that they actually test the complete reset path.
abhinav92003 Oct 22, 2020
2889982
Add back reset_at_fragment_count option that was removed earlier.
abhinav92003 Oct 23, 2020
8b2f7b9
Preserve source compatibility by creating new API for flush with call…
abhinav92003 Oct 27, 2020
a2514e7
Rename new option to reset_at_created_thread_count.
abhinav92003 Oct 27, 2020
0b375a7
Minor style fixes.
abhinav92003 Oct 27, 2020
f9d64ac
Add user_data arg to synch flush callback function.
abhinav92003 Oct 28, 2020
8459b6c
Optimise mcontext initialisation by avoiding excessive zeroing.
abhinav92003 Oct 28, 2020
32fd389
Add XXX comments to describe future work issues.
abhinav92003 Oct 28, 2020
9af7c48
Fix signature of defined dr_flush_region_ex.
abhinav92003 Oct 28, 2020
47c8172
Add missing user_data to early-return invocations of callback.
abhinav92003 Oct 28, 2020
b2ebbea
Fix formatting and a callsite with missing arg.
abhinav92003 Oct 28, 2020
f421141
Revert changes related to i#4497 that were separated into a different…
abhinav92003 Oct 30, 2020
be2e3aa
Comment, TODO and documentation changes.
abhinav92003 Oct 30, 2020
b6ac919
Merge branch 'master' into i1369-drcachesim-synch-flush-callback
abhinav92003 Oct 30, 2020
7 changes: 7 additions & 0 deletions api/docs/release.dox
@@ -141,6 +141,10 @@ changes:
member fields of classes are now following a consistent style with
an underscore suffix. References to renamed fields will need to be
updated.
- Changed the signature of the dr_flush_region() instrument API to support a
callback function that is executed after the flush completes but before
threads are resumed. Existing client source will need to be modified to
pass a NULL value for the flush_completion_callback arg.

The changes between version \DR_VERSION and 8.0.0 include the following minor
compatibility changes:
@@ -173,6 +177,9 @@ Further non-compatibility-affecting changes include:
- Added opnd_create_immed_double(), opnd_get_immed_double() and
opnd_is_immed_double() to enable the creation and handling of double
precision floating-point operands.
- Added -reset_at_thread_count option to allow triggering reset when created thread
count reaches the given value. This option is currently used to make Linux reset
tests more robust.

**************************************************
<hr>
4 changes: 2 additions & 2 deletions api/samples/cbr.c
@@ -251,7 +251,7 @@ at_taken(app_pc src, app_pc targ)
/* Since the flush will remove the fragment we're already in,
* redirect execution to the target address.
*/
dr_flush_region(src, 1);
dr_flush_region(src, 1, NULL);
dr_get_mcontext(drcontext, &mcontext);
mcontext.pc = targ;
dr_redirect_execution(&mcontext);
@@ -280,7 +280,7 @@ at_not_taken(app_pc src, app_pc fall)
/* Since the flush will remove the fragment we're already in,
* redirect execution to the fallthrough address.
*/
dr_flush_region(src, 1);
dr_flush_region(src, 1, NULL);
dr_get_mcontext(drcontext, &mcontext);
mcontext.pc = fall;
dr_redirect_execution(&mcontext);
57 changes: 39 additions & 18 deletions clients/drcachesim/tracer/tracer.cpp
@@ -1316,13 +1316,14 @@ event_kernel_xfer(void *drcontext, const dr_kernel_xfer_info_t *info)
*/

static uint64 instr_count;
static volatile bool tracing_enabled;
static void *enable_tracing_lock;
static bool tracing_enabled;
static volatile bool tracing_scheduled;
static void *schedule_tracing_lock;

#ifdef X86_64
# define DELAYED_CHECK_INLINED 1
#else
// XXX: we don't have the inlining implemented yet.
// XXX, i#4487: we don't have the inlining implemented yet.
#endif

static dr_emit_flags_t
@@ -1345,7 +1346,7 @@ enable_delay_instrumentation()
if (!drmgr_register_bb_instrumentation_event(
event_delay_bb_analysis, event_delay_app_instruction, &memtrace_pri))
DR_ASSERT(false);
enable_tracing_lock = dr_mutex_create();
schedule_tracing_lock = dr_mutex_create();
}

static void
@@ -1358,8 +1359,8 @@ disable_delay_instrumentation()
static void
exit_delay_instrumentation()
{
if (enable_tracing_lock != NULL)
dr_mutex_destroy(enable_tracing_lock);
if (schedule_tracing_lock != NULL)
dr_mutex_destroy(schedule_tracing_lock);
#ifdef DELAYED_CHECK_INLINED
drx_exit();
#endif
@@ -1379,28 +1380,46 @@ enable_tracing_instrumentation()
}

static void
hit_instr_count_threshold()
change_instrumentation_callback()
{
NOTIFY(0, "Hit delay threshold: enabling tracing.\n");
disable_delay_instrumentation();
enable_tracing_instrumentation();
}

static void
hit_instr_count_threshold(app_pc next_pc)
{
bool do_flush = false;
dr_mutex_lock(enable_tracing_lock);
if (!tracing_enabled) { // Already came here?
NOTIFY(0, "Hit delay threshold: enabling tracing.\n");
disable_delay_instrumentation();
enable_tracing_instrumentation();
dr_mutex_lock(schedule_tracing_lock);
if (!tracing_scheduled) {
do_flush = true;
tracing_scheduled = true;
}
dr_mutex_unlock(enable_tracing_lock);
if (do_flush && !dr_unlink_flush_region(NULL, ~0UL))
dr_mutex_unlock(schedule_tracing_lock);

if (do_flush) {
if (!dr_flush_region(NULL, ~0UL, change_instrumentation_callback))
DR_ASSERT(false);

dr_mcontext_t mcontext = {
sizeof(mcontext),
DR_MC_ALL,
};
dr_get_mcontext(dr_get_current_drcontext(), &mcontext);
mcontext.pc = next_pc;
dr_redirect_execution(&mcontext);
DR_ASSERT(false);
}
}

#ifndef DELAYED_CHECK_INLINED
static void
check_instr_count_threshold(uint incby)
check_instr_count_threshold(uint incby, app_pc next_pc)
{
instr_count += incby;
if (instr_count > op_trace_after_instrs.get_value())
hit_instr_count_threshold();
hit_instr_count_threshold(next_pc);
}
#endif

@@ -1450,7 +1469,8 @@ event_delay_app_instruction(void *drcontext, void *tag, instrlist_t *bb, instr_t
}
MINSERT(bb, instr, INSTR_CREATE_jcc(drcontext, OP_jl, opnd_create_instr(skip_call)));
dr_insert_clean_call(drcontext, bb, instr, (void *)hit_instr_count_threshold,
false /*fpstate */, 0);
false /*fpstate */, 1,
OPND_CREATE_INTPTR((ptr_uint_t)instr_get_app_pc(instr)));
MINSERT(bb, instr, skip_call);
if (scratch != DR_REG_NULL) {
if (drreg_unreserve_register(drcontext, bb, instr, scratch) != DRREG_SUCCESS)
@@ -1463,7 +1483,8 @@
// XXX: drx_insert_counter_update doesn't support 64-bit, and there's no
// XINST_CREATE_load_8bytes. For now we pay the cost of a clean call every time.
dr_insert_clean_call(drcontext, bb, instr, (void *)check_instr_count_threshold,
false /*fpstate */, 1, OPND_CREATE_INT32(num_instrs));
false /*fpstate */, 2, OPND_CREATE_INT32(num_instrs),
OPND_CREATE_INTPTR((ptr_uint_t)instr_get_app_pc(instr)));
#endif
return DR_EMIT_DEFAULT;
}
3 changes: 2 additions & 1 deletion core/dispatch.c
@@ -902,7 +902,8 @@ dispatch_enter_dynamorio(dcontext_t *dcontext)
app_pc begin = (app_pc)dcontext->local_state->spill_space.r2;
app_pc end = (app_pc)dcontext->local_state->spill_space.r3;
dcontext->next_tag = (app_pc)dcontext->local_state->spill_space.r4;
flush_fragments_from_region(dcontext, begin, end - begin, true);
flush_fragments_from_region(dcontext, begin, end - begin, true,
NULL /*flush_completion_callback*/);
}
#endif

8 changes: 8 additions & 0 deletions core/dynamo.c
@@ -2184,6 +2184,14 @@ add_thread(IF_WINDOWS_ELSE_NP(HANDLE hthread, process_id_t pid), thread_id_t tid
RSTATS_ADD_PEAK(num_threads, 1);
RSTATS_INC(num_threads_created);
num_known_threads++;
DOSTATS({
if (d_r_stats != NULL &&
(uint)GLOBAL_STAT(num_threads_created) ==
INTERNAL_OPTION(reset_at_thread_count)) {
ASSERT(INTERNAL_OPTION(reset_at_thread_count) != 0);
schedule_reset(RESET_ALL);
}
});
d_r_mutex_unlock(&all_threads_lock);
}

13 changes: 10 additions & 3 deletions core/fragment.c
@@ -5659,12 +5659,14 @@ process_client_flush_requests(dcontext_t *dcontext, dcontext_t *alloc_dcontext,
* that we can inform the client right away, it might be nice to use
* the more performant regular flush when possible. */
flush_fragments_from_region(dcontext, iter->start, iter->size,
true /*force synchall*/);
true /*force synchall*/,
NULL /*flush_completion_callback*/);
(*iter->flush_callback)(iter->flush_id);
} else {
/* do a regular flush */
flush_fragments_from_region(dcontext, iter->start, iter->size,
false /*don't force synchall*/);
false /*don't force synchall*/,
NULL /*flush_completion_callback*/);
}
}
HEAP_TYPE_FREE(alloc_dcontext, iter, client_flush_req_t, ACCT_CLIENT,
@@ -6881,10 +6883,11 @@ flush_fragments_and_remove_region(dcontext_t *dcontext, app_pc base, size_t size

/* Flushes fragments from the region without any changes to the exec list.
* Does not free futures and caller can't be holding the initexit lock.
* Invokes the given callback after flushing and before resuming threads.
* FIXME - add argument parameters (free futures etc.) as needed. */
void
flush_fragments_from_region(dcontext_t *dcontext, app_pc base, size_t size,
bool force_synchall)
bool force_synchall, void (*flush_completion_callback)())
{
/* we pass false to flush_fragments_in_region_start() below for owning the initexit
* lock */
@@ -6895,6 +6898,10 @@ flush_fragments_from_region(dcontext_t *dcontext, app_pc base, size_t size,
flush_fragments_in_region_start(dcontext, base, size, false /*don't own initexit*/,
false /*don't free futures*/, false /*exec valid*/,
force_synchall _IF_DGCDIAG(NULL));
if (flush_completion_callback != NULL) {
(*flush_completion_callback)();
}

flush_fragments_in_region_finish(dcontext, false);
}

2 changes: 1 addition & 1 deletion core/fragment.h
@@ -1141,7 +1141,7 @@ flush_fragments_and_remove_region(dcontext_t *dcontext, app_pc base, size_t size

void
flush_fragments_from_region(dcontext_t *dcontext, app_pc base, size_t size,
bool force_synchall);
bool force_synchall, void (*flush_completion_callback)());

void
flush_fragments_custom_list(dcontext_t *dcontext, fragment_t *list,
16 changes: 11 additions & 5 deletions core/lib/instrument.c
@@ -7174,7 +7174,7 @@ DR_API
* to be !couldbelinking (xref PR 199115, 227619). Caller must use
* dr_redirect_execution() to return to the cache. */
bool
dr_flush_region(app_pc start, size_t size)
dr_flush_region(app_pc start, size_t size, void (*flush_completion_callback)())
{
dcontext_t *dcontext = get_thread_private_dcontext();
CLIENT_ASSERT(!standalone_library, "API not supported in standalone mode");
@@ -7201,13 +7201,18 @@
CLIENT_ASSERT(size != 0, "dr_flush_region: 0 is invalid size for flush");

/* release build check of requirements, as many as possible at least */
if (size == 0 || is_couldbelinking(dcontext))
if (size == 0 || is_couldbelinking(dcontext)) {
(*flush_completion_callback)();
return false;
}

if (!executable_vm_area_executed_from(start, start + size))
if (!executable_vm_area_executed_from(start, start + size)) {
(*flush_completion_callback)();
return true;
}

flush_fragments_from_region(dcontext, start, size, true /*force synchall*/);
flush_fragments_from_region(dcontext, start, size, true /*force synchall*/,
flush_completion_callback);

return true;
}
@@ -7259,7 +7264,8 @@ dr_unlink_flush_region(app_pc start, size_t size)
if (!executable_vm_area_executed_from(start, start + size))
return true;

flush_fragments_from_region(dcontext, start, size, false /*don't force synchall*/);
flush_fragments_from_region(dcontext, start, size, false /*don't force synchall*/,
NULL /*flush_completion_callback*/);

return true;
}
5 changes: 4 additions & 1 deletion core/lib/instrument_api.h
@@ -6228,11 +6228,14 @@ DR_API
* \note Use \p size == 1 to flush fragments containing the instruction at address
* \p start. A flush of \p size == 0 is not allowed.
*
* \note Use \p flush_completion_callback to specify logic to be executed after the
* flush completes and before the threads are resumed.
*
* \note As currently implemented, dr_delay_flush_region() with no completion callback
* routine specified can be substantially more performant.
*/
bool
dr_flush_region(app_pc start, size_t size);
dr_flush_region(app_pc start, size_t size, void (*flush_completion_callback)());

/* FIXME - get rid of the no locks requirement by making event callbacks !couldbelinking
* and no dr locks (see PR 227619) so that client locks owned by this thread can't block
6 changes: 4 additions & 2 deletions core/options.c
@@ -1502,8 +1502,10 @@ check_option_compatibility_helper(int recurse_count)
changed_options = true;
}
# ifdef EXPOSE_INTERNAL_OPTIONS
else if (INTERNAL_OPTION(reset_at_fragment_count)) {
USAGE_ERROR("-reset_at_fragment_count requires -enable_reset, enabling");
else if (INTERNAL_OPTION(reset_at_thread_count) ||
INTERNAL_OPTION(reset_at_fragment_count)) {
USAGE_ERROR("-reset_at_thread_count and -reset_at_fragment_count require "
"-enable_reset, enabling");
dynamo_options.enable_reset = true;
changed_options = true;
}
3 changes: 3 additions & 0 deletions core/optionsx.h
@@ -137,6 +137,7 @@
{ \
(prefix)->enable_reset = false; \
IF_INTERNAL((prefix)->reset_at_fragment_count = 0;) \
IF_INTERNAL((prefix)->reset_at_thread_count = 0;) \
(prefix)->reset_at_nth_thread = 0; \
(prefix)->reset_at_switch_to_os_at_vmm_limit = false; \
(prefix)->reset_at_vmm_percent_free_limit = 0; \
@@ -1494,6 +1495,8 @@ OPTION_COMMAND(bool, enable_reset, IF_X86_ELSE(true, false), "enable_reset",

OPTION_DEFAULT_INTERNAL(uint, reset_at_fragment_count, 0,
"reset all caches at a certain fragment count")
OPTION_DEFAULT_INTERNAL(uint, reset_at_thread_count, 0,
"reset all caches when thread count reaches given value")
OPTION(uint, reset_at_nth_thread,
"reset all caches when the nth thread is explicitly created")
/* FIXME - is potentially using up all the os allocation leaving nothing for the
6 changes: 6 additions & 0 deletions core/synch.c
@@ -1749,6 +1749,12 @@ translate_from_synchall_to_dispatch(thread_record_t *tr, thread_synch_state_t sy
arch_mcontext_reset_stolen_reg(dcontext, mc);
}
});
IF_AARCHXX({
set_stolen_reg_val(mc, (reg_t)os_get_dr_tls_base(dcontext));
// XXX: This path is tested by linux.thread-reset and linux.clone-reset.
// We just haven't run those on ARM yet.
IF_ARM(ASSERT_NOT_TESTED());
});
/* We send all threads, regardless of whether was in DR or not, to
* re-interp from translated cxt, to avoid having to handle stale
* local state problems if we simply resumed.
3 changes: 2 additions & 1 deletion core/unix/os.c
@@ -7655,7 +7655,8 @@ pre_system_call(dcontext_t *dcontext)
* use synch to ensure other threads see the
* new code.
*/
false /*don't force synchall*/);
false /*don't force synchall*/,
NULL /*flush_completion_callback*/);
break;
}
# endif /* ARM */