This repository has been archived by the owner on May 12, 2023. It is now read-only.

invalid model file (bad magic [got 0x67676d66 want 0x67676a74]) #58

Closed
qaiwiz opened this issue Apr 12, 2023 · 13 comments

Comments

@qaiwiz

qaiwiz commented Apr 12, 2023

I am working on Linux Debian 11, and after the pip install and downloading the most recent model, gpt4all-lora-quantized-ggml.bin, I tried to test the example, but I get the following error:

./gpt4all-lora-quantized-ggml.bin: invalid model file (bad magic [got 0x67676d66 want 0x67676a74])
you most likely need to regenerate your ggml files
the benefit is you'll get 10-100x faster load times
see ggerganov/llama.cpp#91
use convert-pth-to-ggml.py to regenerate from original pth
use migrate-ggml-2023-03-30-pr613.py if you deleted originals
llama_init_from_file: failed to load model

I tried this: pyllamacpp-convert-gpt4all ./gpt4all-lora-quantized-ggml.bin ./llama_tokenizer ./gpt4all-converted.bin but I am not sure where the tokenizer is stored!
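For context, the two numbers in the error are just ASCII tags read as 32-bit integers: 0x67676d66 is the older "ggmf" header that this download carries, while 0x67676a74 is the newer "ggjt" (mmap-able) header the current llama.cpp loader expects, which is why it tells you to regenerate or migrate the file. A minimal sketch (the path is only an example) to check which generation a given .bin actually is, reading the header the same way the loader does on a little-endian machine:

import struct

# Known llama.cpp header magics (ASCII tags stored as 32-bit values).
MAGICS = {
    0x67676D6C: "ggml (original, unversioned)",
    0x67676D66: "ggmf (versioned, pre-mmap)",
    0x67676A74: "ggjt (current mmap-able format)",
}

with open("./gpt4all-lora-quantized-ggml.bin", "rb") as f:
    magic = struct.unpack("<I", f.read(4))[0]

print(hex(magic), "->", MAGICS.get(magic, "not a known ggml header"))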

@abdeladim-s
Collaborator

@qaiwiz you should download the tokenizer as well (it's a small file), please see #5
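As far as I can tell, the tokenizer the convert script wants is the original LLaMA SentencePiece tokenizer.model, not something contained in the .bin itself. A quick sanity check once it is downloaded, assuming the sentencepiece package is installed and using a placeholder path:

from sentencepiece import SentencePieceProcessor

# Path is hypothetical - point it at wherever the downloaded tokenizer.model was saved.
sp = SentencePieceProcessor(model_file="./llama_tokenizer/tokenizer.model")
print(sp.vocab_size())                        # the LLaMA tokenizer has 32000 pieces
print(sp.encode("User: How are you doing?"))  # should print a list of token ids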

@qaiwiz
Author

qaiwiz commented Apr 12, 2023

@abdeladim-s thanks, I just came back to post that one has to download the tokenizer, as you pointed out (#5). I actually did, but then I get:
File "/root/env39/bin/pyllamacpp-convert-gpt4all", line 8, in
sys.exit(main())
File "/root/env39/lib/python3.9/site-packages/pyllamacpp/scripts/convert_gpt4all.py", line 19, in main
convert_one_file(args.gpt4all_model, tokenizer)
File "/root/env39/lib/python3.9/site-packages/pyllamacpp/scripts/convert.py", line 92, in convert_one_file
write_header(f_out, read_header(f_in))
File "/root/env39/lib/python3.9/site-packages/pyllamacpp/scripts/convert.py", line 34, in write_header
raise Exception('Invalid file magic. Must be an old style ggml file.')

What does it mean by an old style file? I actually downloaded the most recent model .bin file from that link ([gpt4all-lora-quantized-ggml.bin] 05-Apr-2023 13:07 4G). Now I am wondering how I should fix this to get the model working.

@qaiwiz
Author

qaiwiz commented Apr 12, 2023

I couldn't get it to work, so I re-downloaded the already-converted model: https://huggingface.co/LLukas22/gpt4all-lora-quantized-ggjt. I am trying this on my server with 2 cores and 8 GB of RAM (I know that is at the limit), and I tried to bring down the temperature and ease up some of the parameters, yet it is stalling! Typically, how fast should I expect this to run on such a server?

from pyllamacpp.model import Model

# Load the converted ggjt model
model = Model(ggml_model="ggjt-model.bin", n_ctx=2000)

# Generate a short completion
prompt = "User: How are you doing?\nBot:"

result = model.generate(prompt, n_predict=50, temp=0, top_k=3, top_p=0.95,
                        repeat_last_n=64, repeat_penalty=1.1)

Is there any hyperparameter I can tune to make it run faster?
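One thing that may matter more than the sampling parameters: a smaller context window keeps memory use down on an 8 GB machine, and thread count matters more than temp/top_k (which only change sampling, not speed). A sketch under those assumptions; whether the binding forwards n_threads as a generate keyword is an assumption, so drop it if the call complains:

from pyllamacpp.model import Model

# Smaller context window than 2000 to reduce memory pressure (and possible swapping).
model = Model(ggml_model="ggjt-model.bin", n_ctx=512)

prompt = "User: How are you doing?\nBot:"

# n_threads is a llama.cpp generation parameter; passing it here as a keyword is an
# assumption about the binding - remove it if generate() rejects it.
result = model.generate(prompt, n_predict=50, temp=0, top_k=3, top_p=0.95,
                        repeat_last_n=64, repeat_penalty=1.1, n_threads=2)
print(result)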

@abdeladim-s
Collaborator

@qaiwiz the spec you are using is very low; you should have at least a quad-core CPU.
Also, if the CPU you are using does not have AVX acceleration, it will be even worse.
You won't get much speed even if you change the hyper-parameters.

@qaiwiz
Author

qaiwiz commented Apr 12, 2023

Here is the system config:
system_info: n_threads = 2 / 2 | AVX = 1 | AVX2 = 1 | AVX512 = 0 | FMA = 1 | NEON = 0 | ARM_FMA = 0 | F16C = 1 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 0 | SSE3 = 1 | VSX = 0 |
sampling: temp = 0.000000, top_k = 3, top_p = 0.950000, repeat_last_n = 64, repeat_penalty = 1.100000
generate: n_ctx = 2000, n_batch = 8, n_predict = 50, n_keep = 0

@qaiwiz
Author

qaiwiz commented Apr 12, 2023

Here is the output:

llama_print_timings: load time = 71340.45 ms
llama_print_timings: sample time = 299.64 ms / 55 runs ( 5.45 ms per run)
llama_print_timings: prompt eval time = 292639.93 ms / 36 tokens ( 8128.89 ms per token)
llama_print_timings: eval time = 2361021.55 ms / 52 runs (45404.26 ms per run)
llama_print_timings: total time = 2812682.00 ms

result
' User: How are you doing?\nBot:\x01\x01\x01\x01\x01\x01\x01\x01\x01\x01\x01\x01\x01\x01\x01\x01\x01\x01\x01\x01\x01\x01\x01\x01\x01\x01\x01\x01\x01\x01\x01\x01\x01\x01\x01\x01\x01\x01\x01\x01\x01\x01\x01\x01\x01\x01\x01\x01\x01\x01'

@qaiwiz
Author

qaiwiz commented Apr 13, 2023

I couldn't get it to work, so I re-downloaded the already-converted model: https://huggingface.co/LLukas22/gpt4all-lora-quantized-ggjt.

@qaiwiz qaiwiz closed this as completed Apr 13, 2023
@andzejsp

guys you borked the lama again?

Checking discussions database...
llama_model_load: loading model from './models/gpt4all-lora-quantized-ggml.bin' - please wait ...
./models/gpt4all-lora-quantized-ggml.bin: invalid model file (bad magic [got 0x6e756f46 want 0x67676a74])
        you most likely need to regenerate your ggml files
        the benefit is you'll get 10-100x faster load times
        see https://github.com/ggerganov/llama.cpp/issues/91
        use convert-pth-to-ggml.py to regenerate from original pth
        use migrate-ggml-2023-03-30-pr613.py if you deleted originals
llama_init_from_file: failed to load model
Chatbot created successfully
 * Serving Flask app 'GPT4All-WebUI'

It was working until I did a git pull today. So, what's going on? How do you convert to the right magic? We (GPT4ALL-UI) just recently converted all the models and uploaded them to HF, but now they are dead...

Issue: ParisNeo/lollms-webui#96
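Worth noting: 0x6e756f46 is not any of the ggml header tags at all. On a typical little-endian machine that value means the first four bytes of the file are the ASCII text "Foun", which looks like the start of a text response such as an HTTP "Found" redirect page rather than model weights, so the download itself may be broken rather than the format. A quick check, using the path from the log:

# A real ggml/ggjt model starts with a small binary header, not readable English text.
with open("./models/gpt4all-lora-quantized-ggml.bin", "rb") as f:
    print(f.read(64))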

@mahmoodfathy

@andzejsp I am facing the same issue as well :/ I just tried it now with the latest model and it doesn't work.

@andzejsp

@andzejsp I am facing the same issue as well :/ I just tried it now with the latest model and it doesn't work.

In my case it's working with the ggml-vicuna-13b-4bit-rev1.bin model; not sure why the other model died...

@mahmoodfathy

@andzejsp can you give me a download link to it, if you have one, so I can try it?

@andzejsp

@andzejsp can you give me a download link to it, if you have one, so I can try it?

https://github.com/nomic-ai/gpt4all-ui#supported-models

@abdeladim-s
Collaborator

@andzejsp We didn't touch anything; we haven't pushed any updates for a week now. You can take a look at the commit history.
Please make sure you are doing the right thing!!
