
Commit

Support fp8_e4m3/fp8_e5m2 (#383)
* Support fp8_e4m3/fp8_e5m2

* Moving to regular README include, which is easier to manage.

* Update README.md
Narsil authored Nov 17, 2023
1 parent bfd22b3 commit 7faab77
Showing 6 changed files with 16 additions and 395 deletions.
10 changes: 0 additions & 10 deletions .github/workflows/rust.yml
@@ -29,10 +29,6 @@ jobs:
if: matrix.os == 'ubuntu-latest'
run: cargo install cargo-llvm-cov

-- name: Install cargo-readme for Ubuntu
-if: matrix.os == 'ubuntu-latest'
-run: cargo install cargo-readme
-
- name: Build
run: cargo build --all-targets --verbose

@@ -57,9 +53,3 @@ jobs:
token: ${{ secrets.CODECOV_TOKEN }} # not required for public repos
working-directory: ./safetensors
fail_ci_if_error: true
-
-# Verify that Readme.md is up to date.
-- name: Make sure, Readme generated from lib.rs matches actual Readme
-if: matrix.os == 'ubuntu-latest'
-shell: bash
-run: cargo readme > must_match_readme.md && diff must_match_readme.md README.md && diff must_match_readme.md ../README.md
9 changes: 6 additions & 3 deletions README.md
@@ -99,7 +99,10 @@ Notes:
from traditional tensor libraries perspective (torch, tensorflow, numpy, ..).
- 0-rank Tensors (tensors with shape `[]`) are allowed, they are merely a scalar.
- The byte buffer needs to be entirely indexed, and cannot contain holes. This prevents
-the creation of polyglot files.
+the creation of polyglot files.
+- Endianness: Little-endian.
+moment.
+- Order: 'C' or row-major.


### Yet another format ?
@@ -113,7 +116,7 @@ formats.
Let's take a look at alternatives and why this format is deemed interesting.
This is my very personal and probably biased view:

-| Format | Safe | Zero-copy | Lazy loading | No file size limit | Layout control | Flexibility | Bfloat16
+| Format | Safe | Zero-copy | Lazy loading | No file size limit | Layout control | Flexibility | Bfloat16/Fp8
| ----------------------- | --- | --- | --- | --- | --- | --- | --- |
| pickle (PyTorch) | ✗ | ✗ | ✗ | 🗸 | ✗ | 🗸 | 🗸 |
| H5 (Tensorflow) | 🗸 | ✗ | 🗸 | 🗸 | ~ | ~ | ✗ |
@@ -133,7 +136,7 @@ some tensors in it without scanning the whole file (distributed setting) ?
- Layout control: Lazy loading, is not necessarily enough since if the information about tensors is spread out in your file, then even if the information is lazily accessible you might have to access most of your file to read the available tensors (incurring many DISK -> RAM copies). Controlling the layout to keep fast access to single tensors is important.
- No file size limit: Is there a limit to the file size ?
- Flexibility: Can I save custom code in the format and be able to use it later with zero extra code ? (~ means we can store more than pure tensors, but no custom code)
-- Bfloat16: Does the format support native bfloat16 (meaning no weird workarounds are
+- Bfloat16/Fp8: Does the format support native bfloat16/fp8 (meaning no weird workarounds are
necessary)? This is becoming increasingly important in the ML world.


190 changes: 0 additions & 190 deletions safetensors/README.md

This file was deleted.

1 change: 1 addition & 0 deletions safetensors/README.md
27 changes: 0 additions & 27 deletions safetensors/README.tpl

This file was deleted.

