SafeTensors.jl

This packages loads data stored in safetensor format. Since Python is row-major and Julia is column-major, the dimensions are permuted such the tensor has the same shape as in python, but everything is correctly ordered. This includes a performance penalty in sense that we cannot be completely copy-free.

The main function is load_safetensors which returns a Dict{String,V} where keys are names of tensors and values are tensors. An example from runtests is as follows

julia> using SafeTensors

julia> d = load_safetensors("test/model.safetensors")
Dict{String, Array} with 27 entries:
  "int32_357"   => Int32[0 7 … 21 28; 35 42 … 56 63; 70 77 … 91 98;;; 1 8 … 22 29…
  "uint8_3"     => UInt8[0x00, 0x01, 0x02]
  "float16_35"  => Float16[0.0 1.0 … 3.0 4.0; 5.0 6.0 … 8.0 9.0; 10.0 11.0 … 13.0…
  "bool_3"      => Bool[0, 1, 0]
  "int64_3"     => [0, 1, 2]
  "int64_35"    => [0 1 … 3 4; 5 6 … 8 9; 10 11 … 13 14]
  "float32_357" => Float32[0.0 7.0 … 21.0 28.0; 35.0 42.0 … 56.0 63.0; 70.0 77.0 …
  "bool_35"     => Bool[0 1 … 1 0; 1 0 … 0 1; 0 1 … 1 0]
  "float32_35"  => Float32[0.0 1.0 … 3.0 4.0; 5.0 6.0 … 8.0 9.0; 10.0 11.0 … 13.0…
  "float32_3"   => Float32[0.0, 1.0, 2.0]
  "uint8_35"    => UInt8[0x00 0x01 … 0x03 0x04; 0x05 0x06 … 0x08 0x09; 0x0a 0x0b …
  "float16_3"   => Float16[0.0, 1.0, 2.0]
  "int16_357"   => Int16[0 7 … 21 28; 35 42 … 56 63; 70 77 … 91 98;;; 1 8 … 22 29…
  "int16_3"     => Int16[0, 1, 2]
  "float64_357" => [0.0 7.0 … 21.0 28.0; 35.0 42.0 … 56.0 63.0; 70.0 77.0 … 91.0 …
  "uint8_357"   => UInt8[0x00 0x07 … 0x15 0x1c; 0x23 0x2a … 0x38 0x3f; 0x46 0x4d …
  "float16_357" => Float16[0.0 7.0 … 21.0 28.0; 35.0 42.0 … 56.0 63.0; 70.0 77.0 …
  "int32_3"     => Int32[0, 1, 2]
  "int16_35"    => Int16[0 1 … 3 4; 5 6 … 8 9; 10 11 … 13 14]
  "int8_357"    => Int8[0 7 … 21 28; 35 42 … 56 63; 70 77 … 91 98;;; 1 8 … 22 29;…
  "int8_35"     => Int8[0 1 … 3 4; 5 6 … 8 9; 10 11 … 13 14]
  "bool_357"    => Bool[0 1 … 1 0; 1 0 … 0 1; 0 1 … 1 0;;; 1 0 … 0 1; 0 1 … 1 0; …
  "float64_35"  => [0.0 1.0 … 3.0 4.0; 5.0 6.0 … 8.0 9.0; 10.0 11.0 … 13.0 14.0]
  "int8_3"      => Int8[0, 1, 2]
  "int64_357"   => [0 7 … 21 28; 35 42 … 56 63; 70 77 … 91 98;;; 1 8 … 22 29; 36 …
  "int32_35"    => Int32[0 1 … 3 4; 5 6 … 8 9; 10 11 … 13 14]
  "float64_3"   => [0.0, 1.0, 2.0]

It can also perform a lazy loading with SafeTensors.deserialize("model.safetensors") which mmap the file and return a Dict-like object:

julia> tensors = SafeTensors.deserialize("test/model.safetensors"; mmap = true #= default to `true`=#);

julia> tensors["float32_35"]
3×5 mappedarray(ltoh, PermutedDimsArray(reshape(reinterpret(Float32, view(::Vector{UInt8}, 0x0000000000000ef5:0x0000000000000f30)), 5, 3), (2, 1))) with eltype Float32:
  0.0   1.0   2.0   3.0   4.0
  5.0   6.0   7.0   8.0   9.0
 10.0  11.0  12.0  13.0  14.0

Serialization is also supported:

julia> using Random, BFloat16s

julia> weights = Dict("W"=>randn(BFloat16, 3, 5), "b"=>rand(BFloat16, 3))
Dict{String, Array{BFloat16}} with 2 entries:
  "W" => [0.617188 0.695312 … 0.390625 -2.0; -0.65625 -0.617188 … 0.652344 0.244141; 0.226562 2.70312 … -0.174805 -0.7773…
  "b" => [0.111816, 0.566406, 0.283203]

julia> f = tempname();

julia> SafeTensors.serialize(f, weights)

julia> loaded = SafeTensors.deserialize(f);

julia> loaded["W"] ≈ weights["W"]
true

julia> SafeTensors.serialize(f, weights, Dict("Package"=>"SafeTensors.jl", "version"=>"1"))

julia> loaded = SafeTensors.deserialize(f);

julia> loaded.metadata
Dict{String, String} with 2 entries:
  "Package" => "SafeTensors.jl"
  "version" => "1"

Working with gpu:

julia> loaded["W"]
3×5 mappedarray(ltoh, PermutedDimsArray(reshape(reinterpret(BFloat16, view(::Vector{UInt8}, 0x00000000000000b9:0x00000000000000d6)), 5, 3), (2, 1))) with eltype BFloat16:
  0.542969    0.201172   1.38281    -0.255859  -1.55469
  0.172852   -0.949219   0.0561523  -1.34375   -0.206055
 -0.0854492   1.17969   -0.265625   -0.871094   2.25

julia> using CUDA; CUDA.allowscalar(false)

julia> CuArray(loaded["W"])
3×5 CuArray{BFloat16, 2, CUDA.Mem.DeviceBuffer}:
  0.542969    0.201172   1.38281    -0.255859  -1.55469
  0.172852   -0.949219   0.0561523  -1.34375   -0.206055
 -0.0854492   1.17969   -0.265625   -0.871094   2.25

julia> gpu_weights = Dict("W"=>CuArray(loaded["W"]), "b"=>CuArray(loaded["b"]))
Dict{String, CuArray{BFloat16, N, CUDA.Mem.DeviceBuffer} where N} with 2 entries:
  "W" => [0.542969 0.201172 … -0.255859 -1.55469; 0.172852 -0.949219 … -1.34375 -0.206055; -0.0854492 1.17969 … -0.871094…
  "b" => BFloat16[0.871094, 0.773438, 0.703125]

julia> f = tempname();

julia> SafeTensors.serialize(f, gpu_weights)

julia> SafeTensors.deserialize(f)
SafeTensors.SafeTensor{SubArray{UInt8, 1, Vector{UInt8}, Tuple{UnitRange{UInt64}}, true}} with 2 entries:
  "W" => BFloat16[0.542969 0.201172 … -0.255859 -1.55469; 0.172852 -0.949219 … -1.34375 -0.206055; -0.0854492 1.17969 … -…
  "b" => BFloat16[0.871094, 0.773438, 0.703125]

Name		Name	Last commit message	Last commit date
Latest commit History 32 Commits
.github/workflows		.github/workflows
src		src
test		test
.gitignore		.gitignore
LICENSE		LICENSE
Project.toml		Project.toml
README.md		README.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

SafeTensors.jl

About

Releases 3

Sponsor this project

Packages

Contributors 5

Languages

License

FluxML/SafeTensors.jl

Folders and files

Latest commit

History

Repository files navigation

SafeTensors.jl

About

Resources

License

Code of conduct

Stars

Watchers

Forks

Releases 3

Sponsor this project

Packages 0

Contributors 5

Languages

Packages