gguf-split add a default option to not include tensors data in first shard #6463

Closed
phymbert opened this issue Apr 3, 2024 · 0 comments · Fixed by #7072
Labels
enhancement (New feature or request), good first issue (Good for newcomers), help wanted (Extra attention is needed), split (GGUF split model sharding)

Comments

@phymbert
Collaborator

phymbert commented Apr 3, 2024

Motivation

Be able to make a split where the first shard is very small and contains primarily the metadata, so that it can be downloaded quickly and the download of the other shards can then start without waiting on a large first file.

Proposition

Add an option to not include tensor data in the first file. Maybe it should be enabled by default.
This should be well tested.

ggml_alloc should not be called for a first shard that holds no tensor data, as it will complain with: "WARNING: Behavior may be unexpected when allocating 0 bytes for ggml_malloc!"

For example, we can add extra metadata in the first file that describes all the tensors in the shards.
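
To make the idea concrete, here is a rough sketch of how tensors could be assigned to shards when the first one is reserved for metadata. This is only an illustration in Python, not the gguf-split implementation: the function name plan_shards, the no_tensor_first_split parameter, and the (name, size) tensor representation are all hypothetical.

```python
from typing import List, Tuple


def plan_shards(
    tensors: List[Tuple[str, int]],
    max_shard_bytes: int,
    no_tensor_first_split: bool = True,
) -> List[List[str]]:
    """Group tensor names into shards, in order.

    Hypothetical helper for illustration only. When no_tensor_first_split
    is set, shard 0 receives no tensors, so it would carry metadata only.
    """
    if max_shard_bytes <= 0:
        raise ValueError("max_shard_bytes must be positive")

    shards: List[List[str]] = [[]]
    used = 0  # bytes of tensor data in the current (last) shard
    for name, size in tensors:
        # Shard 0 is reserved for metadata, so the first tensor opens shard 1.
        first_is_reserved = no_tensor_first_split and len(shards) == 1
        # A non-empty shard that would exceed the limit is closed.
        overflow = bool(shards[-1]) and used + size > max_shard_bytes
        if first_is_reserved or overflow:
            shards.append([])
            used = 0
        # A tensor larger than the limit still gets a shard of its own.
        shards[-1].append(name)
        used += size
    return shards


if __name__ == "__main__":
    demo = [("tok_embd", 4), ("blk.0.attn", 4), ("output", 4)]
    print(plan_shards(demo, max_shard_bytes=8))
```

With this layout the loader can tell from an empty tensor list that the first shard needs no data buffer, which is one way to avoid the zero-byte allocation mentioned above.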

References
