Convert tensor copies to allocate from pool #113

robertknight · 2024-04-25T07:28:29Z

Add a `TensorBase::to_tensor_buf` API that creates a contiguous copy of a tensor or view in a pre-allocated output buffer, and use it to convert the Identity, BatchNormalization, Slice, Softmax, LogSoftmax and InstanceNormalization ops to allocate from the pool.

See #109
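
To illustrate the idea of copying a view into a pre-allocated buffer, here is a minimal sketch. It does not use the library's actual types: `View2`, its fields, and the `to_vec_buf` signature shown here are hypothetical stand-ins, assumed only for illustration. The sketch copies a transposed (non-contiguous) 2-D view into a caller-supplied `Vec` in row-major order, so the caller decides where the storage comes from.

```rust
/// Hypothetical 2-D strided view over a flat slice.
/// This is an illustrative stand-in, not the library's real view type.
struct View2<'a> {
    data: &'a [f32],
    shape: [usize; 2],
    strides: [usize; 2],
}

impl<'a> View2<'a> {
    /// Copy the view's elements into `buf` in row-major (contiguous) order.
    /// The buffer's existing capacity is reused when it is large enough.
    fn to_vec_buf(&self, mut buf: Vec<f32>) -> Vec<f32> {
        buf.clear();
        buf.reserve(self.shape[0] * self.shape[1]);
        for row in 0..self.shape[0] {
            for col in 0..self.shape[1] {
                let offset = row * self.strides[0] + col * self.strides[1];
                buf.push(self.data[offset]);
            }
        }
        buf
    }
}

/// Copy the transpose of a 2x3 matrix into a pre-allocated buffer.
/// Returns the copy and whether the original allocation was reused.
fn transposed_copy() -> (Vec<f32>, bool) {
    let data: [f32; 6] = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]; // 2x3, row-major
    // Transposing swaps the shape and the strides; no data is moved.
    let transposed = View2 {
        data: &data,
        shape: [3, 2],
        strides: [1, 3],
    };
    let buf: Vec<f32> = Vec::with_capacity(16);
    let ptr = buf.as_ptr();
    let out = transposed.to_vec_buf(buf);
    let reused = out.as_ptr() == ptr;
    (out, reused)
}
```

Because the buffer is passed in by value and returned, the same allocation can be handed back to whatever owns it (such as a pool) once the copy is no longer needed.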

robertknight added 5 commits April 25, 2024 07:32

Add -s, --summary flag to YOLOv8 example

813b24d

When using this example for profiling, it is useful to be able to suppress the full output.

Add TensorBase::{to_tensor_buf, to_vec_buf}

3c4fd9a

These allow operators that need to copy their inputs to control how the buffer for the cloned tensor/view is allocated.
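
As a rough sketch of what letting an operator handle allocation can look like, the example below pairs a simple buffer pool with an Identity-style operator. `BufferPool`, `alloc`, `release`, and `identity` are hypothetical names chosen for this illustration and are not taken from the library; its real pool may behave differently.

```rust
/// Hypothetical buffer pool that recycles released `Vec<f32>` buffers.
/// Illustrative only; the library's real pool type may differ.
struct BufferPool {
    free: Vec<Vec<f32>>,
}

impl BufferPool {
    fn new() -> Self {
        BufferPool { free: Vec::new() }
    }

    /// Return an empty buffer with at least `capacity` elements of space,
    /// reusing a previously released buffer when one is large enough.
    fn alloc(&mut self, capacity: usize) -> Vec<f32> {
        match self.free.iter().position(|b| b.capacity() >= capacity) {
            Some(idx) => self.free.swap_remove(idx),
            None => Vec::with_capacity(capacity),
        }
    }

    /// Hand a buffer back to the pool so a later `alloc` can reuse it.
    fn release(&mut self, mut buf: Vec<f32>) {
        buf.clear();
        self.free.push(buf);
    }
}

/// Identity-style operator: the output is a copy of the input, with its
/// storage taken from the pool instead of a fresh allocation.
fn identity(pool: &mut BufferPool, input: &[f32]) -> Vec<f32> {
    let mut out = pool.alloc(input.len());
    out.extend_from_slice(input);
    out
}

/// Run the operator twice, releasing the first output in between.
/// Returns the second output and whether it reused the first allocation.
fn run_twice() -> (Vec<f32>, bool) {
    let mut pool = BufferPool::new();
    let first = identity(&mut pool, &[1.0, 2.0, 3.0]);
    let first_ptr = first.as_ptr();
    pool.release(first);
    let second = identity(&mut pool, &[4.0, 5.0]);
    let reused = second.as_ptr() == first_ptr;
    (second, reused)
}
```

In this sketch the second call performs no new heap allocation, which is the kind of saving that matters when the same graph of operators is run repeatedly.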

Convert the Identity op to allocate from the pool

d47ca70

Convert most normalization ops to allocate from the pool

80fa290

`LayerNormalization` is an exception as it is built on lower-level operators, not all of which have been converted to use the pool yet.

Convert Slice op to allocate from the pool

fdccd02

robertknight merged commit 6a0aa11 into main Apr 25, 2024
2 checks passed

robertknight deleted the tensor-copy-pool branch April 25, 2024 07:31
