Add buffer pool/arena to enable re-use of temporary buffers during graph execution #108

Merged 5 commits on Apr 23, 2024

Commits on Apr 22, 2024

  1. Add Tensor::into_non_contiguous_data

    This extracts the data buffer from a tensor without making it contiguous.
    robertknight committed Apr 22, 2024
    Commit b6a7fa8
  2. Export IntoLayout from rten-tensor

    This will be useful for a tensor pool/arena in the rten crate.
    robertknight committed Apr 22, 2024
    Commit 4c02605

Commits on Apr 23, 2024

  1. Add Tensor::init_from

    This initializes a `Tensor<MaybeUninit<T>>` by copying data from an existing
    view/tensor.
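
The idea behind initializing an uninitialized buffer by copying can be sketched with plain `Vec`s from the standard library. This is only an illustration of the pattern, not the `Tensor::init_from` implementation: the free functions `uninit_vec` and `init_from` below are hypothetical stand-ins, and a real tensor would also carry shape and stride information.

```rust
use std::mem::{ManuallyDrop, MaybeUninit};

/// Hypothetical helper: create a buffer of `len` uninitialized elements.
fn uninit_vec<T>(len: usize) -> Vec<MaybeUninit<T>> {
    let mut buf = Vec::with_capacity(len);
    buf.resize_with(len, MaybeUninit::uninit);
    buf
}

/// Hypothetical stand-in for the pattern described above: write every
/// element of `src` into `buf`, then return `buf` as an initialized `Vec<T>`.
fn init_from<T: Copy>(mut buf: Vec<MaybeUninit<T>>, src: &[T]) -> Vec<T> {
    assert_eq!(buf.len(), src.len(), "source and destination lengths differ");
    for (dst, &x) in buf.iter_mut().zip(src) {
        dst.write(x);
    }
    // Prevent the original Vec from freeing the allocation we are about to re-use.
    let mut buf = ManuallyDrop::new(buf);
    // SAFETY: every element was written above, and `MaybeUninit<T>` has the
    // same size and alignment as `T`, so the allocation can be reinterpreted.
    unsafe { Vec::from_raw_parts(buf.as_mut_ptr() as *mut T, buf.len(), buf.capacity()) }
}
```

The point of the pattern is that the destination buffer never has to be zero-filled before the copy, which matters when the buffer came from a pool rather than a fresh allocation.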
    robertknight committed Apr 23, 2024
    Commit a3e2dd9
  2. Add an arena/pool to enable tensor buffer re-use during graph execution

    Improve buffer re-use during graph execution by adding a pool from which
    operators can allocate output buffers, and into which buffers are added
    when their ref count drops to zero (i.e. when they are no longer needed
    by subsequent graph execution steps). This significantly reduces how
    often execution needs to allocate "fresh" buffers from the system
    allocator and free them back.
    
    In this initial implementation, a reference to the pool is passed to all
    operators via `Operator::run`, but only a subset actually uses the pool.
    This subset was chosen to benefit the YOLOv8 example.
    
     - Add `pool` argument to `Operator::run`, specifying a pool from which
       operators should allocate their outputs
    
     - Create a pool at the start of graph execution and release it at the end.
       Intermediate values that are no longer needed are added to the pool after
       each operator runs.
    
     - Report the number of allocations from the pool and the hit rate (how often
       the pool was able to satisfy allocations) as part of timing info.
    
     - Modify an initial subset of operators to allocate from the pool, based on
       what helps the YOLOv8 example.
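
The scheme described in this commit can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code in this PR: the `BufferPool` type, its `alloc`/`add`/`hit_rate` methods, the `f32`-only buffers, and the best-fit policy are all hypothetical choices made for the example.

```rust
/// Hypothetical buffer pool. Names and policy are illustrative only.
#[derive(Default)]
struct BufferPool {
    /// Buffers released by earlier execution steps, available for re-use.
    free: Vec<Vec<f32>>,
    /// Total number of allocation requests.
    alloc_count: usize,
    /// Requests that were satisfied from `free`.
    hit_count: usize,
}

impl BufferPool {
    /// Return an empty buffer with capacity for at least `len` elements,
    /// re-using a pooled buffer when one is large enough.
    fn alloc(&mut self, len: usize) -> Vec<f32> {
        self.alloc_count += 1;
        // Best fit: pick the smallest pooled buffer that is large enough.
        let best = self
            .free
            .iter()
            .enumerate()
            .filter(|(_, buf)| buf.capacity() >= len)
            .min_by_key(|(_, buf)| buf.capacity())
            .map(|(idx, _)| idx);
        match best {
            Some(idx) => {
                self.hit_count += 1;
                let mut buf = self.free.swap_remove(idx);
                buf.clear();
                buf
            }
            // Miss: fall back to the system allocator.
            None => Vec::with_capacity(len),
        }
    }

    /// Return a buffer to the pool once no later step needs it.
    fn add(&mut self, buf: Vec<f32>) {
        self.free.push(buf);
    }

    /// Fraction of allocation requests served from the pool.
    fn hit_rate(&self) -> f64 {
        if self.alloc_count == 0 {
            0.0
        } else {
            self.hit_count as f64 / self.alloc_count as f64
        }
    }
}
```

In this sketch the graph executor would create one `BufferPool` per run, pass `&mut pool` to each operator, and call `add` on an intermediate value's buffer when its ref count reaches zero. The two counters correspond to the allocation count and hit rate reported alongside the timing info.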
    robertknight committed Apr 23, 2024
    Commit 11be82a
  3. Add a temporary feature flag for the memory pool

    If the `RTEN_USE_POOL` env var is set, the pool will be used. Otherwise the pool
    is still created, but buffers are never added to it, so all allocations go
    through the system allocator as before.
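
The behaviour of the flag can be sketched like this. Only the `RTEN_USE_POOL` variable name comes from the commit message; the functions `pool_enabled` and `release_buffer` and the use of a plain `Vec<Vec<f32>>` as the pool are assumptions made for illustration.

```rust
use std::env;

/// Returns true if the `RTEN_USE_POOL` environment variable is set to any value.
fn pool_enabled() -> bool {
    env::var_os("RTEN_USE_POOL").is_some()
}

/// Hypothetical release step: keep the buffer for re-use only when the flag
/// is on. Otherwise the buffer is dropped here and freed by the system
/// allocator, so later allocations never find anything in the pool.
fn release_buffer(free: &mut Vec<Vec<f32>>, buf: Vec<f32>, use_pool: bool) {
    if use_pool {
        free.push(buf);
    }
}
```

Gating only the release path keeps the rest of the code identical in both modes: operators always ask the pool for a buffer, and with the flag off every request simply falls through to a fresh allocation.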
    robertknight committed Apr 23, 2024
    Commit e49ab2c