Update documentation for oneTBB 2021.9 (#1060)
Showing 9 changed files with 194 additions and 5 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,40 @@ | ||
.. _hybrid_cpu_support: | ||
|
||
Hybrid CPU and NUMA Support | ||
*************************** | ||
|
||
NUMA and Hybrid CPU support in oneTBB requires HWLOC* to be installed on your system. | ||
|
||
HWLOC* (Hardware Locality) is a library that provides a portable abstraction of the hierarchical topology of modern architectures (NUMA, hybrid CPU systems, etc). | ||
oneTBB relies on HWLOC* to identify the underlying topology of the system to optimize thread scheduling and memory allocation. | ||
|
||
Without HWLOC*, oneTBB may not be able to take advantage of NUMA/Hybrid CPU support. Make sure that HWLOC* is installed before using oneTBB on such systems. | ||
|
||
Check HWLOC* on the System | ||
^^^^^^^^^^^^^^^^^^^^^^^^^^ | ||
|
||
To check if HWLOC* is already installed on your system, run ``hwloc-ls``: | ||
|
||
* For Linux* OS, in the command line. | ||
* For Windows* OS, in the command prompt. | ||
|
||
If HWLOC* is installed, the command displays information about the hardware topology of your system. | ||
If it is not installed, you receive an error message saying that the command ``hwloc-ls`` could not be found. | ||
|
||
.. note:: For Hybrid CPU support, make sure that HWLOC* is version 2.5 or higher. | ||
For NUMA support, install HWLOC* version 1.11 or higher. | ||
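
The version thresholds in the note above can be checked programmatically. The following is a minimal sketch, assuming the installed HWLOC* version is already available as a string of the form ``major.minor`` or ``major.minor.patch``. The names ``HwlocVersion``, ``parse_hwloc_version``, ``supports_numa``, and ``supports_hybrid_cpu`` are hypothetical helpers for illustration and are not part of oneTBB or HWLOC*.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Hypothetical helper type; not part of oneTBB or HWLOC*.
struct HwlocVersion {
    int major = 0;
    int minor = 0;
};

// Parses "major.minor" or "major.minor.patch". Returns false on malformed input.
inline bool parse_hwloc_version(const std::string& text, HwlocVersion& out) {
    int major = 0;
    int minor = 0;
    if (std::sscanf(text.c_str(), "%d.%d", &major, &minor) != 2) return false;
    if (major < 0 || minor < 0) return false;
    out.major = major;
    out.minor = minor;
    return true;
}

// Numeric comparison: 1.9 is older than 1.11, although "1.9" > "1.11" as text.
inline bool at_least(const HwlocVersion& v, int major, int minor) {
    return v.major > major || (v.major == major && v.minor >= minor);
}

// Thresholds taken from the note above.
inline bool supports_numa(const HwlocVersion& v)       { return at_least(v, 1, 11); }
inline bool supports_hybrid_cpu(const HwlocVersion& v) { return at_least(v, 2, 5); }
```

The comparison is done on integers rather than on the raw string, because a textual comparison would wrongly rank version 1.9 above version 1.11.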
|
||
Install HWLOC* | ||
^^^^^^^^^^^^^^ | ||
|
||
To install HWLOC*, visit the official Portable Hardware Locality website (https://www-lb.open-mpi.org/projects/hwloc/). | ||
|
||
* For Windows* OS, binaries are available for download. | ||
* For Linux* OS, only the source code is provided, so you need to build the binaries yourself. | ||
|
||
On Linux* OS, HWLOC* can also be installed with a package manager, such as APT* or YUM*. | ||
For example, with APT*, run: ``sudo apt install hwloc``. | ||
|
||
|
||
.. note:: For Hybrid CPU support, make sure that HWLOC* is version 2.5 or higher. | ||
For NUMA support, install HWLOC* version 1.11 or higher. |
84 changes: 84 additions & 0 deletions
doc/main/reference/scalable_memory_pools/malloc_replacement_log.rst
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,84 @@ | ||
.. _malloc_replacement_log: | ||
|
||
TBB_malloc_replacement_log Function | ||
=================================== | ||
|
||
.. note:: This function is for Windows* OS only. | ||
|
||
Summary | ||
******* | ||
|
||
Provides information about the status of dynamic memory allocation replacement. | ||
|
||
Syntax | ||
******* | ||
|
||
:: | ||
|
||
extern "C" int TBB_malloc_replacement_log(char *** log_ptr); | ||
|
||
Header | ||
****** | ||
|
||
:: | ||
|
||
#include "oneapi/tbb/tbbmalloc_proxy.h" | ||
|
||
|
||
Description | ||
*********** | ||
|
||
Dynamic replacement of memory allocation functions on Windows* OS uses in-memory binary instrumentation techniques. | ||
To make sure that such instrumentation is safe, oneTBB first searches for a subset of replaced functions in the Visual C++* runtime DLLs | ||
and checks if each one has a known bytecode pattern. If any required function is not found or its bytecode pattern is unknown, the replacement is skipped, | ||
and the program continues to use the standard memory allocation functions. | ||
|
||
The ``TBB_malloc_replacement_log`` function allows the program to check whether the dynamic memory replacement took place and to get a log of the performed checks. | ||
|
||
**Returns:** | ||
|
||
* 0, if all necessary functions are successfully found and the replacement takes place. | ||
* 1, otherwise. | ||
|
||
The ``log_ptr`` parameter must be either ``NULL`` or the address of a ``char**`` variable. If it is not ``NULL``, the function stores in that variable the address of an array of | ||
NULL-terminated strings containing detailed information about the searched functions in the following format: | ||
|
||
:: | ||
|
||
search_status: function_name (dll_name), byte pattern: <bytecodes> | ||
|
||
For more information about the replacement of dynamic memory allocation functions, see :ref:`Windows_C_Dynamic_Memory_Interface_Replacement`. | ||
|
||
|
||
Example | ||
******* | ||
|
||
:: | ||
|
||
#include "oneapi/tbb/tbbmalloc_proxy.h" | ||
#include <stdio.h> | ||
|
||
int main(){ | ||
char **func_replacement_log; | ||
int func_replacement_status = TBB_malloc_replacement_log(&func_replacement_log); | ||
|
||
if (func_replacement_status != 0) { | ||
printf("tbbmalloc_proxy cannot replace memory allocation routines\n"); | ||
for (char** log_string = func_replacement_log; *log_string != 0; log_string++) { | ||
printf("%s\n",*log_string); | ||
} | ||
} | ||
|
||
return 0; | ||
} | ||
|
||
|
||
Example output: | ||
|
||
:: | ||
|
||
tbbmalloc_proxy cannot replace memory allocation routines | ||
Success: free (ucrtbase.dll), byte pattern: <C7442410000000008B4424> | ||
Fail: _msize (ucrtbase.dll), byte pattern: <E90B000000CCCCCCCCCCCC> |
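
Each log string follows the ``search_status: function_name (dll_name), byte pattern: <bytecodes>`` format described above, so a program can split it into fields, for example to report only the functions that failed the check. The following is a minimal sketch under that assumption. ``ReplacementLogEntry`` and ``parse_log_line`` are hypothetical names introduced for illustration; oneTBB does not provide them, and the sketch does not depend on Windows* OS or on oneTBB itself.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical record for one log string; not part of oneTBB.
struct ReplacementLogEntry {
    std::string status;         // for example, "Success" or "Fail"
    std::string function_name;  // for example, "_msize"
    std::string dll_name;       // for example, "ucrtbase.dll"
    std::string byte_pattern;   // hexadecimal text between '<' and '>'
};

// Splits one log string into its fields. Returns false if the string
// does not match the documented format.
inline bool parse_log_line(const std::string& line, ReplacementLogEntry& out) {
    const std::string pattern_tag = "), byte pattern: <";

    const std::size_t colon = line.find(": ");
    if (colon == std::string::npos) return false;

    const std::size_t open = line.find(" (", colon + 2);
    if (open == std::string::npos) return false;

    const std::size_t tag = line.find(pattern_tag, open + 2);
    if (tag == std::string::npos) return false;

    const std::size_t pattern_begin = tag + pattern_tag.size();
    const std::size_t close = line.rfind('>');
    if (close == std::string::npos || close < pattern_begin) return false;

    out.status = line.substr(0, colon);
    out.function_name = line.substr(colon + 2, open - (colon + 2));
    out.dll_name = line.substr(open + 2, tag - (open + 2));
    out.byte_pattern = line.substr(pattern_begin, close - pattern_begin);
    return true;
}
```

In the example program above, ``parse_log_line`` could be called on each ``*log_string`` inside the loop, printing only the entries whose ``status`` is not ``Success``.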
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,60 @@ | ||
.. _Floating_Point_Settings: | ||
|
||
Floating-point Settings | ||
======================= | ||
|
||
To propagate CPU-specific settings for floating-point computations to tasks executed by the task scheduler, you can use one of the following two methods: | ||
|
||
* When a ``task_arena`` or the task scheduler for a given application thread is initialized, it captures the current floating-point settings of the thread. | ||
* The ``task_group_context`` class has a method to capture the current floating-point settings. | ||
|
||
By default, worker threads use floating-point settings obtained during the initialization of a ``task_arena`` or the implicit arena of the application thread. The settings are applied to all computations within that ``task_arena`` or started by that application thread. | ||
|
||
|
||
For better control over floating-point behavior, a thread may capture the current settings in a task group context. To do so, pass a special flag to the constructor when creating the context: | ||
|
||
:: | ||
|
||
task_group_context ctx( task_group_context::isolated, | ||
task_group_context::default_traits | task_group_context::fp_settings ); | ||
|
||
|
||
Or call the ``capture_fp_settings`` method: | ||
|
||
:: | ||
|
||
task_group_context ctx; | ||
ctx.capture_fp_settings(); | ||
|
||
|
||
You can then pass the task group context to most parallel algorithms, including ``flow::graph``, to ensure that all tasks related to this algorithm use the specified floating-point settings. | ||
It is possible to execute the parallel algorithms with different floating-point settings captured to separate contexts, even at the same time. | ||
|
||
Floating-point settings captured to a task group context prevail over the settings captured during task scheduler initialization. That is, if a context with captured floating-point settings is passed to a parallel algorithm, the settings captured to the context are used. | ||
Otherwise, if floating-point settings are not captured to the context, or a context is not explicitly specified, the settings captured during the task arena initialization are used. | ||
|
||
In a nested call to a parallel algorithm that does not use the context of a task group with explicitly captured floating-point settings, the outer-level settings are used. | ||
If none of the outer-level contexts capture floating-point settings, the settings captured during task arena initialization are used. | ||
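
The precedence rules above can be summarized as a lookup: use the settings of the innermost context that captured them, and fall back to the arena settings otherwise. The following is a toy model of that rule, not the oneTBB implementation. ``FpSettings``, ``ContextModel``, and ``effective_settings`` are hypothetical names, and the single ``rounding_mode`` field stands in for the full set of captured floating-point settings.

```cpp
#include <cassert>

// Toy stand-in for captured floating-point settings; hypothetical type.
struct FpSettings {
    int rounding_mode;
};

// Toy stand-in for a task group context; hypothetical type.
struct ContextModel {
    bool has_captured;          // true if this context captured settings
    FpSettings captured;        // meaningful only when has_captured is true
    const ContextModel* outer;  // enclosing (outer-level) context, or nullptr
};

// Returns the settings a task would run with under the rules above:
// the innermost captured settings win, otherwise the arena settings apply.
inline FpSettings effective_settings(const ContextModel* ctx,
                                     const FpSettings& arena_settings) {
    for (; ctx != nullptr; ctx = ctx->outer) {
        if (ctx->has_captured) return ctx->captured;
    }
    return arena_settings;
}
```

For example, a nested algorithm whose own context did not capture settings resolves to the settings of the nearest outer context that did, and a context chain with no captured settings resolves to the arena settings.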
|
||
This behavior guarantees that: | ||
|
||
* Floating-point settings are applied to all tasks executed within a task arena, if they are captured: | ||
|
||
* To a task group context. | ||
* During the arena initialization. | ||
|
||
* A call to a oneTBB parallel algorithm does not change the floating-point settings of the calling thread, even if the algorithm uses different settings. | ||
|
||
.. note:: | ||
The guarantees above apply only under the following conditions: | ||
|
||
* User code inside a task should either: | ||
|
||
* Not change the floating-point settings, or | ||
* Revert any modifications and restore the previous settings before the end of the task. | ||
|
||
* oneTBB task scheduler observers are not used to set or modify floating-point settings. | ||
|
||
Otherwise, the stated guarantees are not valid and the behavior related to floating-point settings is undefined. | ||
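
One way for user code to meet the condition on restoring settings is a scope guard built on the standard ``<cfenv>`` facilities. The following is a minimal sketch that uses only the C++ standard library. ``ScopedFpEnv`` and ``rounding_mode_inside_task`` are hypothetical names for illustration, not oneTBB APIs, and the function body stands in for the body of a task.

```cpp
#include <cassert>
#include <cfenv>

// Hypothetical helper: saves the floating-point environment of the calling
// thread on construction and restores it on destruction.
class ScopedFpEnv {
public:
    ScopedFpEnv() { std::fegetenv(&saved_); }
    ~ScopedFpEnv() { std::fesetenv(&saved_); }

    ScopedFpEnv(const ScopedFpEnv&) = delete;
    ScopedFpEnv& operator=(const ScopedFpEnv&) = delete;

private:
    std::fenv_t saved_;
};

// Stand-in for a task body that temporarily changes the rounding mode.
// Returns the rounding mode that was active inside the body.
inline int rounding_mode_inside_task() {
    ScopedFpEnv guard;                // saves the current settings
    std::fesetround(FE_TOWARDZERO);   // temporary change inside the task
    return std::fegetround();         // guard restores the settings on return
}
```

Because the guard restores the environment in its destructor, the previous settings are also restored when the task body exits early or throws an exception.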
|