[Major Change] Enforcing tensor alignment #148

Narsil · 2023-01-04T16:39:41Z

- The header now automatically aligns itself to 8 bytes (the size of an f64) by appending extra spaces as necessary.
- This allows very fast memory mapping by reinterpreting bytes directly as f32/f64, etc. Unaligned bytes do not allow for this: https://www.reddit.com/r/rust/comments/tanaxm/mutating_a_buffer_of_u8s_as_f32s_in_place/
- This does not change the contiguousness of tensors.
- This does not change the actual spec (we're just putting extra valid bytes in the header and using a different serialization ordering).
- Readers should still be able to read old files; the data would just need to be copied before being cast to its final type when using mmap.
- This has no effect for GPU, since a copy is already necessary (*I think*; it actually depends on whether the CUDA API allows filling f32 addresses from raw unaligned bytes).
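To illustrate why alignment matters for the zero-copy path, here is a minimal sketch of the two cases described above. The function names `view_as_f32` and `copy_as_f32` are hypothetical and are not part of the crate; the sketch also assumes the tensor bytes are little-endian f32 values.

```rust
use std::mem::{align_of, size_of};

/// Hypothetical helper: view `bytes` as `f32`s without copying.
/// Returns `None` when the slice does not start on an `f32` boundary
/// or its length is not a whole number of `f32`s.
fn view_as_f32(bytes: &[u8]) -> Option<&[f32]> {
    let aligned = (bytes.as_ptr() as usize) % align_of::<f32>() == 0;
    let whole = bytes.len() % size_of::<f32>() == 0;
    if aligned && whole {
        // SAFETY: the pointer is aligned for f32, the length covers only
        // whole elements, and every bit pattern is a valid f32.
        Some(unsafe {
            std::slice::from_raw_parts(
                bytes.as_ptr() as *const f32,
                bytes.len() / size_of::<f32>(),
            )
        })
    } else {
        None
    }
}

/// Hypothetical fallback for unaligned data (e.g. old files): copy the
/// little-endian bytes into a freshly allocated, properly aligned Vec.
fn copy_as_f32(bytes: &[u8]) -> Vec<f32> {
    bytes
        .chunks_exact(4)
        .map(|c| f32::from_le_bytes([c[0], c[1], c[2], c[3]]))
        .collect()
}
```

A reader could try `view_as_f32` first and fall back to `copy_as_f32` when it returns `None`, which is the "copy before cast" path for files written without the padding.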

This change will only be interesting if things like https://github.com/Narsil/fast_gpt2
actually pick up. And even with the copy, load times are still vastly
faster than pytorch/transformers.

julien-c · 2023-01-04T18:58:46Z

would this change require changes to #44? or would it be transparent?

> And even with the copy, load times are still vastly
> superior to pytorch/transformers.

"superior" is ambiguous, I would add "(i.e., lower than)"

Narsil · 2023-01-04T20:21:01Z

> would this change require changes to #44? or would it be transparent?

Transparent, it's only making sure the header JSON is of the appropriate size by adding extra padding.
(Also tensor order is modified, which readers should be agnostic to anyway).
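As a rough sketch of the padding described here: the JSON header is extended with trailing spaces (valid JSON whitespace) until its length is a multiple of 8. The names `pad_header` and `frame_header` are hypothetical, and the 8-byte little-endian length prefix reflects my understanding of the file layout rather than anything stated in this thread.

```rust
/// Alignment target in bytes (the size of an f64).
const ALIGN: usize = 8;

/// Hypothetical helper: pad the JSON header with spaces so that its
/// length becomes a multiple of `ALIGN`.
fn pad_header(json: &str) -> Vec<u8> {
    let mut header = json.as_bytes().to_vec();
    // Round the length up to the next multiple of ALIGN.
    let padded_len = (header.len() + ALIGN - 1) / ALIGN * ALIGN;
    header.resize(padded_len, b' ');
    header
}

/// Hypothetical helper: prefix the padded header with its length as a
/// little-endian u64 (assumed layout), so tensor data that follows
/// starts at an offset that is a multiple of 8.
fn frame_header(json: &str) -> Vec<u8> {
    let header = pad_header(json);
    let mut out = (header.len() as u64).to_le_bytes().to_vec();
    out.extend_from_slice(&header);
    out
}
```

Because the padding is plain whitespace after the closing brace, a reader that parses the header as JSON should not need any change, which is why the change can be transparent.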

> And even with the copy, load times are still vastly
> superior to pytorch/transformers.
>
> superior is ambiguous, I would add "(i.e., lower than)"

True!

McPatate

lgtm, what prompted the change from BTreeMap to HashMap ?

Narsil · 2023-02-20T15:45:31Z

> lgtm, what prompted the change from BTreeMap to HashMap ?

Complexity: O(log n) vs. O(1). It shouldn't really matter in practice, but since I no longer rely on the structure for ordering, I don't need a BTreeMap anymore.
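To make the point concrete, here is a small sketch of storing tensors in a `HashMap` and deciding the order explicitly at serialization time. The `Info` struct and the sort key (larger element size first, then name) are assumptions for illustration only, not the crate's actual types or ordering.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a tensor's metadata; not the crate's real type.
struct Info {
    /// Size of one element in bytes (e.g. 4 for f32, 8 for f64).
    elem_size: usize,
}

/// Returns tensor names in an explicit serialization order. The map itself
/// carries no ordering, so the order is imposed here by sorting:
/// larger element sizes first (assumed key), ties broken by name.
fn serialization_order(tensors: &HashMap<String, Info>) -> Vec<&str> {
    let mut entries: Vec<(&String, &Info)> = tensors.iter().collect();
    entries.sort_by(|(a_name, a_info), (b_name, b_info)| {
        b_info
            .elem_size
            .cmp(&a_info.elem_size)
            .then_with(|| a_name.cmp(b_name))
    });
    entries.into_iter().map(|(name, _)| name.as_str()).collect()
}
```

With the order computed this way, lookups by name stay O(1) on average and the container's iteration order no longer matters.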

Narsil marked this pull request as draft January 4, 2023 16:39

Narsil requested review from OlivierDehaene and thomasw21 January 4, 2023 16:39

Narsil force-pushed the forcing_alignment branch from 7808b6d to d39d645 Compare January 22, 2023 16:03

Narsil added 2 commits February 8, 2023 12:15

Fixup.

f5d27a4

Narsil force-pushed the forcing_alignment branch from 268e7f9 to f5d27a4 Compare February 8, 2023 11:16

Narsil added 2 commits February 8, 2023 12:25

Clippy fix.

fb9869b

Cargo fmt (clippy --fix broke it ? :( )

f5ef88d

Narsil marked this pull request as ready for review February 8, 2023 11:42

Narsil mentioned this pull request Feb 9, 2023

Explicit automatic alignment of header #178

Closed

Narsil mentioned this pull request Feb 20, 2023

Safetensors support. coreylowman/dfdx#381

Merged

McPatate approved these changes Feb 20, 2023

Narsil merged commit 0c5b3a6 into main Feb 21, 2023

Narsil deleted the forcing_alignment branch February 21, 2023 14:23
