Add buffer pool/arena to enable re-use of temporary buffers during graph execution #108

This extracts the data buffer from a tensor without making it contiguous.

This will be useful for a tensor pool/arena in the rten crate.

This initializes a `Tensor<MaybeUninit<T>>` by copying data from an existing view/tensor.
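As a rough illustration of the pattern this commit describes, here is a minimal sketch that uses plain `Vec`s and slices rather than rten's tensor types. The function name `init_from_slice` is hypothetical and is not part of the rten API.

```rust
use std::mem::{ManuallyDrop, MaybeUninit};

/// Hypothetical helper: fill an uninitialized buffer by copying from `src`,
/// then return it as an initialized `Vec<T>`.
fn init_from_slice<T: Copy>(mut dest: Vec<MaybeUninit<T>>, src: &[T]) -> Vec<T> {
    assert_eq!(dest.len(), src.len(), "source and destination lengths differ");

    // Write every element, so the whole buffer is initialized afterwards.
    for (d, s) in dest.iter_mut().zip(src) {
        d.write(*s);
    }

    // Prevent the original Vec from freeing the allocation we are about to reuse.
    let mut dest = ManuallyDrop::new(dest);

    // SAFETY: all `len` elements were written above, and `MaybeUninit<T>` has
    // the same size and alignment as `T`, so the allocation can be reinterpreted.
    unsafe { Vec::from_raw_parts(dest.as_mut_ptr() as *mut T, dest.len(), dest.capacity()) }
}
```

The point of the pattern is that a buffer obtained without initialization (for example, from a pool) never needs to be zero-filled before the real data is copied in.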

Improve buffer re-use during graph execution by adding a pool from which operators can allocate output buffers, and into which buffers are added when their ref count drops to zero (i.e. when they are no longer needed by subsequent graph execution steps). This significantly reduces how often execution needs to allocate "fresh" buffers from the system allocator and free them back.

In this initial implementation, a reference to the pool is passed to all operators via `Operator::run`, but only a subset actually use the pool. This subset was chosen to benefit the YOLOv8 example.

- Add `pool` argument to `Operator::run`, specifying a pool from which operators should allocate their outputs
- Create a pool at the start of graph execution and release it at the end. Intermediate values that are no longer needed are added to the pool after each operator runs.
- Report the number of allocations from the pool and the hit rate (how often the pool was able to satisfy allocations) as part of timing info.
- Modify an initial subset of operators to allocate from the pool, based on what helps the YOLOv8 example.
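The pool behaviour described above can be sketched roughly as follows. This is not the code from this PR: the `BufferPool` type, its method names, the first-fit search and the `f32`-only buffers are all assumptions made for illustration.

```rust
use std::cell::{Cell, RefCell};

/// Hypothetical pool of reusable `f32` buffers (not rten's actual type).
struct BufferPool {
    /// Buffers that have been released and are available for re-use.
    free: RefCell<Vec<Vec<f32>>>,
    /// Total number of allocation requests.
    allocs: Cell<usize>,
    /// Number of requests satisfied from `free`.
    hits: Cell<usize>,
}

impl BufferPool {
    fn new() -> Self {
        BufferPool {
            free: RefCell::new(Vec::new()),
            allocs: Cell::new(0),
            hits: Cell::new(0),
        }
    }

    /// Return an empty buffer with at least `capacity` elements of capacity,
    /// re-using a pooled buffer if one is large enough.
    fn alloc(&self, capacity: usize) -> Vec<f32> {
        self.allocs.set(self.allocs.get() + 1);
        let mut free = self.free.borrow_mut();
        // First-fit search; an assumption, chosen for simplicity.
        if let Some(idx) = free.iter().position(|b| b.capacity() >= capacity) {
            self.hits.set(self.hits.get() + 1);
            let mut buf = free.swap_remove(idx);
            buf.clear();
            buf
        } else {
            // Pool miss: fall back to the system allocator.
            Vec::with_capacity(capacity)
        }
    }

    /// Add a buffer that is no longer needed to the pool.
    fn add(&self, buf: Vec<f32>) {
        self.free.borrow_mut().push(buf);
    }

    fn alloc_count(&self) -> usize {
        self.allocs.get()
    }

    fn hit_count(&self) -> usize {
        self.hits.get()
    }

    /// Fraction of allocation requests served from the pool.
    fn hit_rate(&self) -> f64 {
        match self.allocs.get() {
            0 => 0.0,
            n => self.hits.get() as f64 / n as f64,
        }
    }
}
```

In this sketch the counters use interior mutability so that operators can share the pool through a plain `&BufferPool` reference, matching the idea of passing a pool reference into each operator.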

If the `RTEN_USE_POOL` env var is set, the pool will be used. Otherwise the pool is still created, but buffers are never added to it, so all allocations go through the system allocator as before.
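A minimal sketch of this gating, assuming that "set" means the variable is present with any value (the PR does not say how the value is interpreted). The helper names `pool_enabled` and `release` are hypothetical.

```rust
use std::env;

/// Hypothetical helper: `true` if `RTEN_USE_POOL` is present in the
/// environment, regardless of its value (an assumption).
fn pool_enabled() -> bool {
    env::var_os("RTEN_USE_POOL").is_some()
}

/// Return a released buffer to `free_list` only when pooling is enabled.
/// Otherwise the buffer is dropped here, which frees it through the
/// system allocator as before.
fn release(free_list: &mut Vec<Vec<f32>>, buf: Vec<f32>, enabled: bool) {
    if enabled {
        free_list.push(buf);
    }
}
```

Because buffers are never added when the flag is off, every later allocation request misses the pool and falls through to the system allocator.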


Commits on Apr 22, 2024

Commits on Apr 23, 2024