Create simulator interface which provides thread scheduling + speculative fetching #5843
Adds a new scheduler component to drmemtrace which provides flexibility in combining input traces and is meant to supply key features for simulation of traces. This first stage adds a base scheduler which only supports the two analyzer modes: parallel software thread streams or a single serial stream. The input file opening code and the input-to-worker code is moved from the analyzer to the scheduler. The analyzer now has to look at the tid fields in the stream records to identify shards to tools, but the input-to-worker does belong in the scheduler. Removes the analyzer external iterator interface; tools should instead use the scheduler directly. Updates histogram_launcher and two tests to do this. Adds a new scheduler unit test with a mocked reader that takes vectors of records, containing some initial sanity tests. The scheduler takes in either file paths and opens its own readers for those, or it can be passed readers. This latter interface is used for online IPC readers, as well as for the unit test using a mocked reader. The IPC reader requires a delayed init() call which is handled by paying for a flag check on each stream advance. To support -skip_instrs, region-of-interest code is implemented here. However, it requires fixing a problem in reader_t::skip_instructions() by adding a queue and a new use-prior-record method. (The queue can be merged with the file_reader_t queue later.) It might be nicer to separate that out but that would leave -skip_instrs not working. Future work includes moving the serial mode interleaving from the file reader to the scheduler, and then adding new scheduling and simulation features. Issue: #5843
There are many design points here; documenting some smaller ones and will probably put the rest in a separate doc: Lots of little issues with the scheduler -- here is one: the output streams
The scheduler though does the skip and asks the zipfile reader what the new
What is the best solution? Does it have to query both ordinals before and after every input advance? Have separate "effective" and "presented" ordinals? Use the same get_last_record_ordinal() proposed for scheduler-inserted Maybe these inserted records should all be reported as the same prior I seem to recall a prior discussion where we came up with the 0 and liked Decision:
Adds a new scheduler component to drmemtrace which provides flexibility in combining input traces and is meant to supply key features for simulation of traces. This first stage adds a base scheduler which only supports the two analyzer modes: parallel software thread streams or a single serial stream. The input file opening code and the input-to-worker code is moved from the analyzer to the scheduler. The analyzer now has to look at the tid fields in the stream records to identify shards to tools, but the input-to-worker does belong in the scheduler. Removes the analyzer external iterator interface; tools should instead use the scheduler directly. Updates histogram_launcher and two tests to do this. Adds a new scheduler unit test with a mocked reader that takes vectors of records, containing some initial sanity tests. The scheduler takes in either file paths and opens its own readers for those, or it can be passed readers. This latter interface is used for online IPC readers, as well as for the unit test using a mocked reader. The IPC reader requires a delayed init() call which is handled by paying for a flag check on each stream advance. To support -skip_instrs, region-of-interest code is implemented here. However, it requires fixing a problem in reader_t::skip_instructions() by adding a queue and a new use-prior-record method. (The queue can be merged with the file_reader_t queue later.) It might be nicer to separate that out but that would leave -skip_instrs not working. To support skipping with multiple inputs, changes how synthetic records are treated: Eliminates synthetic records being considered to have a 0 record ordinal: instead they have the ordinal of the prior record. A new memtrace_stream_t function is_record_synthetic() is introduced for identifying synthetic records. This change is required to allow the scheduler_t layer to properly figure out output stream ordinals. Updates the reader, zipfile reader, and tests.
Adds a new test to test both synthetic and real headers after a skip. Future work includes moving the serial mode interleaving from the file reader to the scheduler, and then adding new scheduling and simulation features. Issue: #5843
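The synthetic-record ordinal rule described above can be sketched roughly as follows. This is a minimal illustration, not the drmemtrace implementation: `ordinal_tracker_t` and `on_record()` are hypothetical names, and only the idea that a synthetic record reports the prior record's ordinal (rather than 0) comes from the message above.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a stream's ordinal bookkeeping; this is not
// the real memtrace_stream_t class.
struct ordinal_tracker_t {
    uint64_t record_ordinal = 0;
    bool last_was_synthetic = false;

    // Called once for each record presented to the consumer.
    void
    on_record(bool synthetic)
    {
        last_was_synthetic = synthetic;
        // A synthetic record does not advance the ordinal, so it reports
        // the ordinal of the prior real record instead of 0.
        if (!synthetic)
            ++record_ordinal;
    }
    uint64_t
    get_record_ordinal() const
    {
        return record_ordinal;
    }
    bool
    is_record_synthetic() const
    {
        return last_was_synthetic;
    }
};
```

Because the ordinal never drops back to 0, a layer sitting on top of several inputs can compare ordinals before and after an advance without special-casing inserted records.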
Implements timestamp ordering in scheduler_t rather than relying on the old implementation inside file_reader_t. Adds a sanity test. Removing the file_reader_t code, along with eliminating the thread-as-sub-reader API routines, will be done as a separate refactoring. Issue: #5843
Implements timestamp ordering in scheduler_t rather than relying on the old implementation inside file_reader_t. Adds a sanity test. Fixes a bug with only_threads and adds a simple test. Removing the file_reader_t code, along with eliminating the thread-as-sub-reader API routines, will be done as a separate refactoring. Issue: #5843
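As a rough picture of timestamp ordering across inputs, the sketch below merges several per-thread inputs into one serial stream using a min-heap keyed on timestamp. It is a simplification under stated assumptions: `record_t` and `interleave_by_timestamp()` are hypothetical, and every record here carries its own timestamp, whereas real traces carry separate timestamp markers.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <queue>
#include <tuple>
#include <vector>

// Hypothetical simplified record: a timestamp plus the owning thread id.
struct record_t {
    uint64_t timestamp;
    int tid;
};

// Merges inputs (each already sorted by timestamp) into a single stream
// ordered by timestamp; ties go to the lower input index.
std::vector<record_t>
interleave_by_timestamp(const std::vector<std::vector<record_t>> &inputs)
{
    // (timestamp, input index, position within that input)
    using entry_t = std::tuple<uint64_t, size_t, size_t>;
    std::priority_queue<entry_t, std::vector<entry_t>, std::greater<entry_t>> heap;
    for (size_t i = 0; i < inputs.size(); ++i) {
        if (!inputs[i].empty())
            heap.push(entry_t(inputs[i][0].timestamp, i, size_t(0)));
    }
    std::vector<record_t> out;
    while (!heap.empty()) {
        entry_t top = heap.top();
        heap.pop();
        size_t in = std::get<1>(top);
        size_t pos = std::get<2>(top);
        out.push_back(inputs[in][pos]);
        // Re-insert the input keyed on its next record, if it has one.
        if (pos + 1 < inputs[in].size())
            heap.push(entry_t(inputs[in][pos + 1].timestamp, in, pos + 1));
    }
    return out;
}
```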
Removes multi-input support from file_reader_t and other readers now that the scheduler_t owns that. Specifically:
+ Removes read_next_thread_entry() and requires that read_next_entry() always check the queue (via a provided helper function).
+ Removes skip_thread_instructions() and refactors the pre-skip header reading and the post-skip walking while remembering timestamps. Places these latter two inside reader_t for use by all readers, with zipfile overriding just the fast skip in the middle and sharing all the other code. This refactoring and sharing solves the problem of missing timestamps when skipping from the middle.
+ Removes the arrays of data for multiple inputs from file_reader_t and all subclasses.
Updates the view_test to use a scheduler for its multiple-input mock reader. While at it, removes is_complete(). Issue: #5843, #5538
Adds get_input_stream_count() and get_input_stream_name() to the scheduler_t drmemtrace interface. Adds a test of these to the scheduler unit tests which uses real files and also serves as a test of only_threads for real files, whose code paths are different enough it had a bug which we fix here as well. Issue: #5843
Adds to the scheduler interface a query to obtain the current input stream's memtrace_stream_t handle. Adds a new scheduler flag SCHEDULER_USE_INPUT_ORDINALS and sets it by default for parallel mode so the output stream's ordinals are suppressed and instead the current input stream's ordinals are presented on the output stream. This fixes a problem where the parallel analysis tool framework saw accumulated ordinals across inputs. Adds a similar flag SCHEDULER_USE_SINGLE_INPUT_ORDINALS which causes the first flag to be set if there is a single input and single output. This solves a serial mode problem where an analysis tool does want to see input gaps when there is no interleaving as there is only one input. Adds a test. Also manually tested a real analysis tool to confirm by tweaking the view tool to operate in parallel:
Before:
===========================================================================
[analyzer] Worker 0 starting on trace shard 0 stream is 0x562a2b0ff480
1 0: 3443916 <marker: version 4>
2 0: 3443916 <marker: filetype 0x240>
...
1479 585: 3443916 <thread 3443916 exited>
[analyzer] Worker 0 starting on trace shard 1 stream is 0x562a2b0ff480
------------------------------------------------------------
1480 585: 3443921 <marker: version 4>
1481 585: 3443921 <marker: filetype 0x240>
===========================================================================
After:
===========================================================================
[analyzer] Worker 0 starting on trace shard 0 stream is 0x555cebc44480
1 0: 3443916 <marker: version 4>
2 0: 3443916 <marker: filetype 0x240>
...
1479 585: 3443916 <thread 3443916 exited>
[analyzer] Worker 0 starting on trace shard 1 stream is 0x555cebc44480
------------------------------------------------------------
1 0: 3443921 <marker: version 4>
2 0: 3443921 <marker: filetype 0x240>
===========================================================================
Issue: #5843
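The difference between accumulated output ordinals and per-input ordinals can be modeled with a small sketch. The names `ordinal_mode_t` and `presented_ordinals()` are hypothetical and only mirror the idea behind SCHEDULER_USE_INPUT_ORDINALS; the real flag lives in the scheduler's own option set.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical mode switch mirroring the idea of SCHEDULER_USE_INPUT_ORDINALS.
enum ordinal_mode_t { USE_OUTPUT_ORDINALS, USE_INPUT_ORDINALS };

// Returns the ordinal a tool would see for each record when one output
// consumes several inputs back to back. input_sizes[i] is the number of
// records in input i.
std::vector<uint64_t>
presented_ordinals(const std::vector<uint64_t> &input_sizes, ordinal_mode_t mode)
{
    std::vector<uint64_t> shown;
    uint64_t output_ordinal = 0; // accumulates across all inputs
    for (uint64_t size : input_sizes) {
        for (uint64_t input_ordinal = 1; input_ordinal <= size; ++input_ordinal) {
            ++output_ordinal;
            shown.push_back(mode == USE_INPUT_ORDINALS ? input_ordinal
                                                       : output_ordinal);
        }
    }
    return shown;
}
```

With two inputs of two records each, the output-ordinal mode keeps counting across the shard boundary while the input-ordinal mode restarts at 1, which is the same shape as the before and after view output in the message.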
Fixes some fencepost errors in scheduler input region of interest handling. Adds a test of regions of interest which actually contains timestamps, which is what revealed the errors. Refactors the scheduler unit tests to use trace_entry_t instead of memref_t, which is required to properly test the scheduler's input readers, as that is the record type they operate on. This results in no longer needing reader_t::use_prev() which is removed here. Issue: #5843
Adds initial support for MAP_TO_ANY_OUTPUT with multiple outputs. Uses a simple queue of ready-to-schedule inputs and implements an instruction-based scheduling quantum. Adds a test. Issue: #5843
Adds initial support for MAP_TO_ANY_OUTPUT with multiple outputs. Uses a simple queue of ready-to-schedule inputs and implements an instruction-based scheduling quantum. Adds a test. Adds new types input_ordinal_t and output_ordinal_t and corresponding invalid constants and updates all existing code to use these. Issue: #5843
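A ready queue with an instruction-based quantum might look roughly like the sketch below. Everything in it is an assumption made for illustration: `input_t` and `schedule_any_output()` are hypothetical, inputs are reduced to a count of remaining instructions, and the outputs advance in lockstep one instruction at a time, which is not necessarily how the real scheduler drives its outputs.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <deque>
#include <string>
#include <vector>

// Hypothetical input: a one-letter name and a count of instructions left.
struct input_t {
    char name;
    uint64_t remaining;
};

// Runs the inputs on num_outputs outputs. An input that exhausts its
// quantum goes to the back of the ready queue and may resume on any output.
// Returns, per output, the sequence of input names it executed.
std::vector<std::string>
schedule_any_output(std::vector<input_t> inputs, size_t num_outputs, uint64_t quantum)
{
    assert(quantum > 0);
    const size_t NONE = inputs.size();
    std::deque<size_t> ready;
    for (size_t i = 0; i < inputs.size(); ++i) {
        if (inputs[i].remaining > 0)
            ready.push_back(i);
    }
    std::vector<std::string> trace(num_outputs);
    std::vector<size_t> current(num_outputs, NONE);
    std::vector<uint64_t> used(num_outputs, 0);
    bool active = true;
    while (active) {
        active = false;
        for (size_t out = 0; out < num_outputs; ++out) {
            if (current[out] == NONE) {
                if (ready.empty())
                    continue; // this output idles for this step
                current[out] = ready.front();
                ready.pop_front();
                used[out] = 0;
            }
            input_t &in = inputs[current[out]];
            trace[out].push_back(in.name);
            --in.remaining;
            ++used[out];
            active = true;
            if (in.remaining == 0) {
                current[out] = NONE; // input finished
            } else if (used[out] == quantum) {
                ready.push_back(current[out]); // preempted: back of the queue
                current[out] = NONE;
            }
        }
    }
    return trace;
}
```

In this model an input preempted on one output can be picked up by a different output, which is the behavior that distinguishes an any-output mapping from a fixed input-to-output assignment.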
Implements initial speculation support, supplying nops. Speculation is separated into its own class where we can fill in different strategies in the future. The start_speculation() function takes a flag controlling whether the scheduler queues up the current record and re-returns it as the first record after speculation stops. This is often what a simulator wants as it has to read the instruction record following a branch to determine whether it is on the wrong path, and it would like to resume with that already-read instruction after speculation. Adds a unit test. Issue: #5843
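The queue-and-re-return behavior can be sketched with a toy stream. This is a model of the idea only: `spec_stream_t` and `record_t` are hypothetical, and the simplified `start_speculation(bool)` signature here is an assumption that need not match the real interface.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical record: either a real instruction with an id or a nop.
struct record_t {
    bool is_nop;
    int id;
};

// Toy stream that supplies nops while speculating and can re-return the
// record read just before speculation began.
class spec_stream_t {
public:
    explicit spec_stream_t(std::vector<int> ids)
        : ids_(std::move(ids))
    {
    }
    // Returns false at the end of the input.
    bool
    next_record(record_t &out)
    {
        if (speculating_) {
            out = { true, -1 }; // wrong-path placeholder
            return true;
        }
        if (has_queued_) {
            has_queued_ = false;
            out = queued_; // resume with the already-read record
            return true;
        }
        if (pos_ >= ids_.size())
            return false;
        out = { false, ids_[pos_++] };
        last_ = out;
        have_last_ = true;
        return true;
    }
    void
    start_speculation(bool queue_current_record)
    {
        speculating_ = true;
        if (queue_current_record && have_last_) {
            queued_ = last_;
            has_queued_ = true;
        }
    }
    void
    stop_speculation()
    {
        speculating_ = false;
    }

private:
    std::vector<int> ids_;
    size_t pos_ = 0;
    bool speculating_ = false;
    bool have_last_ = false;
    bool has_queued_ = false;
    record_t last_ = { false, 0 };
    record_t queued_ = { false, 0 };
};
```

A simulator would read the record after a branch, notice it is on the wrong path, call `start_speculation(true)`, consume nops, and after `stop_speculation()` receive that same post-branch record again.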
Adds a lock for each input to enforce missing synchronization during scheduling decisions. Fixes a bug with the existing scheduler lock. Adds a multi-threaded test. Tested a similar multi-threaded test under ThreadSanitizer which now reports no races (it did before these code changes). Fixes #5843
Adds support for the TRACE_MARKER_TYPE_DIRECT_THREAD_SWITCH marker, when it appears after TRACE_MARKER_TYPE_MAYBE_BLOCKING_SYSCALL. The scheduler directly switches to the target thread if it is on the ready queue. Performing a forced migration if the target is running on another output is not yet implemented. Once i/o wait states are added, waking up a target thread will be added, but that is future work as well. Adds a DEPENDENCY_DIRECT_SWITCH_BITFIELD and renames DEPENDENCY_TIMESTAMPS to DEPENDENCY_TIMESTAMP_BITFIELD so we can combine them, and makes a new enum entry DEPENDENCY_TIMESTAMPS which combines the two bitfields, which is what nearly every use case should want while still giving us control and without really breaking compatibility (and by providing bits and combinations the enum type is all that's needed still). Adds a unit test where the schedule would clearly be different without the switch target. Issue: #5843
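The bitfield arrangement described above can be pictured as below. The three DEPENDENCY_ names containing TIMESTAMP or DIRECT_SWITCH come from the message; the enum name `dependency_t`, the `DEPENDENCY_IGNORE` entry, the numeric values, and the helper functions are assumptions made for this sketch.

```cpp
#include <cassert>

// Illustrative layout only: the enum name, DEPENDENCY_IGNORE, and the
// numeric values are assumptions, not taken from the real header.
enum dependency_t {
    DEPENDENCY_IGNORE = 0x00,
    DEPENDENCY_TIMESTAMP_BITFIELD = 0x01,
    DEPENDENCY_DIRECT_SWITCH_BITFIELD = 0x02,
    // The combined entry that most users would pick.
    DEPENDENCY_TIMESTAMPS =
        (DEPENDENCY_TIMESTAMP_BITFIELD | DEPENDENCY_DIRECT_SWITCH_BITFIELD),
};

// Hypothetical helpers that test the individual bits.
inline bool
honors_timestamps(dependency_t dep)
{
    return (dep & DEPENDENCY_TIMESTAMP_BITFIELD) != 0;
}

inline bool
honors_direct_switches(dependency_t dep)
{
    return (dep & DEPENDENCY_DIRECT_SWITCH_BITFIELD) != 0;
}
```

Keeping the old name for the combined value means existing callers that pass DEPENDENCY_TIMESTAMPS gain direct-switch handling without a source change, while callers that want only one behavior can pass a single bit.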
Rather than context switching on every syscall labeled maybe-blocking, the scheduler uses the now-available syscall latency to decide whether the syscall should block and result in a context switch. Adds two new command line options, -sched_syscall_switch_us (default 500us) and -sched_blocking_switch_us (default 100us), and corresponding scheduler_t inputs, to control the latency thresholds. To avoid relying too much on the maybe-blocking labels, we do consider a very-high-latency syscall not marked as maybe-blocking to block. Adds a new unit test. Tested in a large proprietary app where this reduces the context switch rate from ~100x too high down to ~10x too high. The next step of adding i/o wait times should further improve the representativeness. Issue: #5843
Rather than context switching on every syscall labeled maybe-blocking, the scheduler uses the now-available syscall latency to decide whether the syscall should block and result in a context switch. Adds two new command line options, -sched_syscall_switch_us (default 500us) and -sched_blocking_switch_us (default 100us), and corresponding scheduler_t inputs, to control the latency thresholds. To avoid relying too much on the maybe-blocking labels, we do consider a very-high-latency syscall not marked as maybe-blocking to result in a context switch. Adds a new schedule_stats unit test. Tested in a large proprietary app where this reduces the context switch rate from ~100x too high down to ~10x too high. The next step of adding i/o wait times should further improve the representativeness. Issue: #5843
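One plausible reading of the two thresholds is sketched below: a syscall labeled maybe-blocking is compared against the lower blocking threshold, and any other syscall against the higher general threshold. The struct and function names are hypothetical, the defaults are the ones quoted in the message, and the choice of an inclusive comparison is an assumption.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical container for the two latency thresholds, using the
// defaults quoted in the message above.
struct switch_thresholds_t {
    uint64_t syscall_switch_us = 500;  // applies to any syscall
    uint64_t blocking_switch_us = 100; // applies to maybe-blocking syscalls
};

// Returns whether a syscall with the given observed latency should be
// treated as blocking and so trigger a context switch.
// Assumption: the comparison is inclusive.
bool
syscall_causes_switch(uint64_t latency_us, bool maybe_blocking,
                      const switch_thresholds_t &thresholds = switch_thresholds_t())
{
    uint64_t threshold = maybe_blocking ? thresholds.blocking_switch_us
                                        : thresholds.syscall_switch_us;
    return latency_us >= threshold;
}
```

Under this reading a 150us syscall switches only if it carries the maybe-blocking label, while a 600us syscall switches regardless of the label.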
Changes the quanta accounting to match the real kernel by accumulating it across executions if a prior execution was terminated early due to a voluntary context switch. Adds new testing, and updates old tests with the behavior change. Scheduler unit test string changes were carefully vetted. E.g., for test_synthetic_with_syscalls_multiple(): the output strings changed because H's quantum accumulates and it hits a preempt in the middle of its second HH sequence, which decrements B's quantum, causing B to become available sooner. Issue: #5843
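The accounting change can be illustrated with a per-input counter that survives a voluntary switch. This is a sketch under assumptions: `input_state_t`, `retire_instruction()`, and `voluntary_switch()` are hypothetical names, and the `accumulate` parameter exists only to contrast the old and new behavior.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical per-input state: instructions charged to the current quantum.
struct input_state_t {
    uint64_t instrs_in_quantum = 0;
};

// Retires one instruction. Returns true when the input has used up its
// quantum and should be preempted.
bool
retire_instruction(input_state_t &input, uint64_t quantum)
{
    ++input.instrs_in_quantum;
    if (input.instrs_in_quantum >= quantum) {
        input.instrs_in_quantum = 0; // a fresh quantum after a preempt
        return true;
    }
    return false;
}

// The input gives up its output voluntarily (e.g., a blocking syscall).
// With accumulate set, the partial quantum carries over to the next
// execution; without it, the counter starts over as in the old behavior.
void
voluntary_switch(input_state_t &input, bool accumulate)
{
    if (!accumulate)
        input.instrs_in_quantum = 0;
}
```

With a quantum of 3, an input that ran 2 instructions before a voluntary switch is preempted after 1 more instruction when accumulating, instead of getting a full 3 again.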
Adds a new scheduler option field honor_direct_switches and a corresponding command-line parameter -sched_disable_direct_switches to allow a way to disable direct thread switches, primarily for scheduling experimentation. Adds a unit test. Issue #5843
Fixes an inconsistency in the CLI drmemtrace scheduler quantum and the internal API by making them both the same at 6 million. We pick 6 million to match 2 instructions per nanosecond with a 3ms quantum. The scheduler_launcher default is also made to match. Issue: #5843
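The arithmetic behind the 6 million figure can be checked at compile time. The constant names below are hypothetical; only the inputs (2 instructions per nanosecond, a 3ms quantum) and the result come from the message.

```cpp
#include <cassert>
#include <cstdint>

// Inputs stated in the message above.
constexpr uint64_t kInstrsPerNanosecond = 2;
constexpr uint64_t kQuantumMilliseconds = 3;
// Unit conversion: 1 ms = 1,000,000 ns.
constexpr uint64_t kNanosecondsPerMillisecond = 1000000;

// 2 instrs/ns * 3 ms * 1,000,000 ns/ms = 6,000,000 instructions.
constexpr uint64_t kQuantumInstrs =
    kInstrsPerNanosecond * kQuantumMilliseconds * kNanosecondsPerMillisecond;

static_assert(kQuantumInstrs == 6000000, "quantum should be 6 million instructions");
```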
We make logging available in release builds to help with diagnosing issues and understanding scheduler behavior. We assume the extra branches do not add undue overhead. Issue: #5843
Improves speculation error checking by printing the error message on failure. Adds missing return status checks to the speculation calls in the speculator unit test. Issue: #5843
Promotes the scheduler section to a full page and adds information on the deficiencies of the as-traced schedule, details of dynamic scheduling, simulated time, idle time, record-replay, regions of interest, and speculation support. Updates the example code to include multiple threads each processing one core. Issue: #5843
Sets the scheduler_t system call switch thresholds to match the command-line interface defaults, which were raised a long time ago to reflect better understanding of real behavior. Issue: #5843
Rather than having each simulator figure out how to schedule traced software threads onto simulated cores in their own ad hoc way, we would like to provide a scheduler service, which should result in several benefits:
Xref #5694: provide per-core iterator.
That may become subsumed by this new broader-scope feature.