Couldn't build proto file into descriptor pool! Invalid proto descriptor for file "sentencepiece_model.proto": sentencepiece_model.proto: A file with this name is already in the pool. #1266
Still no change ... it will save the 16-bit merged model and the LoRA, but not the GGUF ... that error comes up. Oh well.
Using the vanilla new script with ONLY my token and HF model name added to the save-GGUF part of the code: https://colab.research.google.com/drive/1PrX2o1VXJJfG1n8GXpzpBr3qY9NPgucM?usp=sharing This works ... So the issue is somewhere in my code ... Let me see if I can substitute my MODEL and HF TOKEN to test. Will report back.
My model fails with the identical code ... So it is the model. IDK.
Still off and on ... I am using this code to start the scripts:

%%capture
# Also get the latest nightly Unsloth!
!pip uninstall unsloth -y && pip install --upgrade --no-cache-dir "unsloth[colab-new] @ git+https://github.com/unslothai/unsloth.git"

Results:
- Nemo: saved, as expected.
- Small Mistral: failed, same error as above.
- Mistral 7b Instruct: failed, same error as above. (I used your provided script, only adding my HF user name and token at the indicated places and increasing the context window to its max.)
- A fine-tune of Dolphin 2.9.3 (IIRC): that worked perfectly. (Same script with only the model changed; this one worked while the ones above errored.)

So it seems to be BOTH model and TRL dependent, but that is above my level ... I can save everything as merged 16-bit and LoRA so they are not lost. Just GGUF is giving me a go again.
I have tried a new HF token as well ... no dice. The old one is valid and works for pushing the merged model to HF, well and in order. ONLY GGUF fails ... I have tried all sorts of things. The manual save ... I can't figure out how to use it ... above my level, that is all.
Still an issue ... I can save using the manual method, but it won't push to HF as GGUF ...
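If the GGUF file can be produced locally but push_to_hub_gguf fails, one stopgap is to upload the file directly with huggingface_hub. A minimal sketch, assuming the GGUF was already written by the manual save (all paths and repo names below are placeholders):

```python
from huggingface_hub import HfApi

api = HfApi(token="hf_...")  # your HF write token

# Create the target repo if it does not exist yet.
api.create_repo(repo_id="your-username/your-model-GGUF", exist_ok=True)

# Upload the locally saved GGUF file into the repo.
api.upload_file(
    path_or_fileobj="model/unsloth.Q4_K_M.gguf",  # local file from the manual save
    path_in_repo="unsloth.Q4_K_M.gguf",           # filename inside the HF repo
    repo_id="your-username/your-model-GGUF",
)
```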
Hmm, so Mistral 7b Instruct is the main culprit? @Erland366 can you take a look at exporting Mistral Instruct? Thanks
Yes, basically Mistral 7b Instruct and its 'clones' seem to have issues. Thanks a lot again.
Referenced commit: "…ress issue unslothai#1266" (this reverts commit 9fc1307).
%%capture
!pip install unsloth "xformers==0.0.28.post2"
# Also get the latest nightly Unsloth!
!pip uninstall unsloth -y && pip install --upgrade --no-cache-dir "unsloth[colab-new] @ git+https://github.com/unslothai/unsloth.git"
!pip install --upgrade --no-cache-dir --no-deps unsloth transformers git+https://github.com/huggingface/trl.git

AND

%%capture
!pip install unsloth "xformers==0.0.28.post2"
# Also get the latest nightly Unsloth!
!pip uninstall unsloth -y && pip install --upgrade --no-cache-dir "unsloth[colab-new] @ git+https://github.com/unslothai/unsloth.git"
#!pip install --upgrade --no-cache-dir --no-deps unsloth transformers git+https://github.com/huggingface/trl.git

The error is the same with both.
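Since the duplicate-descriptor error points at the transformers/protobuf interaction, it may also help to record which versions each environment resolves to and compare a failing run against a working one. A quick check using only standard version attributes:

```python
import google.protobuf
import transformers

# Compare these between a failing and a working environment.
print("transformers:", transformers.__version__)
print("protobuf:", google.protobuf.__version__)
```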
if True:
    model.push_to_hub_gguf(
        "HF/Model",  # Change hf to your username!
        tokenizer,
        quantization_method = ["q4_0", "q4_k_m", "q5_k_m"],
        token = "hf_KCorrect_Token_Here",  # Get a token at https://huggingface.co/settings/tokens
    )
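To narrow down whether the failure is in the conversion or in the upload, the same export can be attempted locally first with Unsloth's save_pretrained_gguf. A minimal sketch, assuming the same model and tokenizer objects (the output directory name is a placeholder):

```python
# If this local save raises the same sentencepiece_model.proto error,
# the problem is in the GGUF conversion itself, not the HF push.
model.save_pretrained_gguf(
    "local_gguf",  # hypothetical output directory
    tokenizer,
    quantization_method = "q4_k_m",
)
```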
Unsloth: [0] Installing llama.cpp. This will take 3 minutes...
Unsloth: [1] Converting model at XXX into f16 GGUF format.
The output location will be /content/CCC/dddd/unsloth.F16.gguf
This will take 3 minutes...
TypeError Traceback (most recent call last)
in <cell line: 1>()
1 if True:
----> 2 model.push_to_hub_gguf(
3 "HF/Model", # Change hf to your username!
4 tokenizer,
5 quantization_method = ["q4_0","q4_k_m","q5_k_m",],
4 frames
/usr/local/lib/python3.10/dist-packages/unsloth/save.py in unsloth_push_to_hub_gguf(self, repo_id, tokenizer, quantization_method, first_conversion, use_temp_dir, commit_message, private, token, max_shard_size, create_pr, safe_serialization, revision, commit_description, tags, temporary_location, maximum_memory_usage)
1859
1860 # Save to GGUF
-> 1861 all_file_locations, want_full_precision = save_to_gguf(
1862 model_type, model_dtype, is_sentencepiece_model,
1863 new_save_directory, quantization_method, first_conversion, makefile,
/usr/local/lib/python3.10/dist-packages/unsloth/save.py in save_to_gguf(model_type, model_dtype, is_sentencepiece, model_directory, quantization_method, first_conversion, _run_installer)
1091 vocab_type = "spm,hfft,bpe"
1092 # Fix Sentencepiece model as well!
-> 1093 fix_sentencepiece_gguf(model_directory)
1094 else:
1095 vocab_type = "bpe"
/usr/local/lib/python3.10/dist-packages/unsloth/tokenizer_utils.py in fix_sentencepiece_gguf(saved_location)
402 """
403 from copy import deepcopy
--> 404 from transformers.utils import sentencepiece_model_pb2
405 import json
406 from enum import IntEnum
/usr/local/lib/python3.10/dist-packages/transformers/utils/sentencepiece_model_pb2.py in <module>
26
27
---> 28 DESCRIPTOR = _descriptor.FileDescriptor(
29 name="sentencepiece_model.proto",
30 package="sentencepiece",
/usr/local/lib/python3.10/dist-packages/google/protobuf/descriptor.py in __new__(cls, name, package, options, serialized_options, serialized_pb, dependencies, public_dependencies, syntax, pool, create_key)
1022 raise RuntimeError('Please link in cpp generated lib for %s' % (name))
1023 elif serialized_pb:
-> 1024 return _message.default_pool.AddSerializedFile(serialized_pb)
1025 else:
1026 return super(FileDescriptor, cls).__new__(cls)
TypeError: Couldn't build proto file into descriptor pool!
Invalid proto descriptor for file "sentencepiece_model.proto":
sentencepiece_model.proto: A file with this name is already in the pool.
This happened last week too; IIRC it was a transformers thing then, but I can't find a workaround.
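For this class of protobuf error ("A file with this name is already in the pool"), one workaround that sometimes helps is forcing the pure-Python protobuf implementation before transformers (and its generated sentencepiece_model_pb2 module) is imported. This is a general protobuf workaround, not a confirmed fix for this issue:

```python
import os

# Must be set before transformers / the generated proto modules are imported.
# The pure-Python implementation tolerates re-registering a proto file in the
# descriptor pool better than the C++ backend does.
os.environ["PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION"] = "python"
```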