
ERROR - failed to load model from ./models/gpt4all-lora-quantized-ggml.bin #96

Closed
Datou opened this issue Apr 18, 2023 · 8 comments

Datou commented Apr 18, 2023

Current Behavior

The default model file (gpt4all-lora-quantized-ggml.bin) already exists. Do you want to replace it? Press B to download it with a browser (faster). [Y,N,B]?N

Skipping download of model file...
Cleaning tmp folder
Virtual environment created and packages installed successfully.
Launching application...
Checking discussions database...
[2023-04-18 10:11:49,423] {model.py:73} INFO - Loading model ...
llama_model_load: loading model from './models/gpt4all-lora-quantized-ggml.bin' - please wait ...
llama_model_load: invalid model file './models/gpt4all-lora-quantized-ggml.bin' (bad magic)
[2023-04-18 10:11:49,424] {model.py:75} ERROR - failed to load model from ./models/gpt4all-lora-quantized-ggml.bin

Steps to Reproduce

run webui.bat

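A "bad magic" error means the first four bytes of the file do not match any model magic the loader knows. Here is a minimal sketch for inspecting those bytes yourself, assuming the ggml layout llama.cpp used at the time (the magic stored as a little-endian uint32 at offset 0); the path is the one from the log above:

import struct
from pathlib import Path

model = Path("./models/gpt4all-lora-quantized-ggml.bin")  # path from the log above

with model.open("rb") as f:
    raw = f.read(4)

magic = struct.unpack("<I", raw)[0]  # llama.cpp reads the magic as a little-endian uint32
print(f"first bytes: {raw!r}  magic: {magic:#010x}")

# Known ggml magics at the time: 0x67676d6c ('ggml', unversioned),
# 0x67676d66 ('ggmf', versioned), 0x67676a74 ('ggjt', mmap-able).
# Anything else means the loader cannot read the file.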

@andzejsp (Contributor)

What are your system specs? OS, CPU, RAM? Did you download the model correctly? It may be corrupted; try downloading it with a browser and then copy it to the /models/ folder.

@Datou (Author) commented Apr 18, 2023

Windows 11, i9-9900K, 32 GB RAM.

The file was downloaded completely without any issues. I tried downloading it again using the browser, but the file size and error message were the same.

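An identical file size does not rule out corruption. A hedged way to verify a download is to hash it and compare the digest against a checksum published by the model host; no reference value appears in this thread, so that comparison is left to the reader:

import hashlib

def sha256_of(path: str, chunk: int = 1 << 20) -> str:
    # Stream in 1 MiB chunks so a multi-GB model never has to fit in RAM.
    h = hashlib.sha256()
    with open(path, "rb") as f:
        while block := f.read(chunk):
            h.update(block)
    return h.hexdigest()

print(sha256_of("./models/gpt4all-lora-quantized-ggml.bin"))
# Compare the printed digest with the one published at the download source.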

@andzejsp (Contributor) commented Apr 18, 2023

The CPU seems to support AVX2.

I'm running this in an Ubuntu VM, and it seems to load just fine:

Checking discussions database...
llama_model_load: loading model from './models/gpt4all-lora-quantized-ggml.bin' - please wait ...
llama_model_load: n_vocab = 32001
llama_model_load: n_ctx   = 512
llama_model_load: n_embd  = 4096
llama_model_load: n_mult  = 256
llama_model_load: n_head  = 32
llama_model_load: n_layer = 32
llama_model_load: n_rot   = 128
llama_model_load: f16     = 2
llama_model_load: n_ff    = 11008
llama_model_load: n_parts = 1
llama_model_load: type    = 1
llama_model_load: ggml map size = 4017.70 MB
llama_model_load: ggml ctx size =  81.25 KB
llama_model_load: mem required  = 5809.78 MB (+ 2052.00 MB per state)
llama_model_load: loading tensors from './models/gpt4all-lora-quantized-ggml.bin'
llama_model_load: model size =  4017.27 MB / num tensors = 291
llama_init_from_file: kv self size  =  512.00 MB
Chatbot created successfully
 * Serving Flask app 'GPT4All-WebUI'
 * Debug mode: off
[2023-04-18 09:28:50,988] {_internal.py:224} INFO - WARNING: This is a development server. Do not use it in a production deployment. Use a production WSGI server instead.
 * Running on all addresses (0.0.0.0)
 * Running on http://127.0.0.1:9600
 * Running on http://mm.jj.ss:9600
[2023-04-18 09:28:50,988] {_internal.py:224} INFO - Press CTRL+C to quit

Well, I did a git pull just now on the VM, and now I can't load it anymore either:

Checking discussions database...
llama_model_load: loading model from './models/gpt4all-lora-quantized-ggml.bin' - please wait ...
./models/gpt4all-lora-quantized-ggml.bin: invalid model file (bad magic [got 0x6e756f46 want 0x67676a74])
        you most likely need to regenerate your ggml files
        the benefit is you'll get 10-100x faster load times
        see https://github.com/ggerganov/llama.cpp/issues/91
        use convert-pth-to-ggml.py to regenerate from original pth
        use migrate-ggml-2023-03-30-pr613.py if you deleted originals
llama_init_from_file: failed to load model
Chatbot created successfully
 * Serving Flask app 'GPT4All-WebUI'
 * Debug mode: off
[2023-04-18 09:35:03,952] {_internal.py:224} INFO - WARNING: This is a development server. Do not use it in a production deployment. Use a production WSGI server instead.
 * Running on all addresses (0.0.0.0)
 * Running on http://127.0.0.1:9600
 * Running on http://mm.jj.ss:9600
[2023-04-18 09:35:03,952] {_internal.py:224} INFO - Press CTRL+C to quit

@ParisNeo something is borked again. Maybe pyllamacpp... My hate towards Python keeps growing.
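As an aside, this second log is more informative than the first: it prints both the magic it found and the one it expected. Decoding the two constants takes only a couple of lines (a sketch, not from the thread; it just reinterprets the numbers printed in the log):

import struct

got, want = 0x6E756F46, 0x67676A74  # the two values printed in the log above
print(struct.pack("<I", got))   # b'Foun' -- the file's first four bytes are ASCII text
print(struct.pack("<I", want))  # b'tjgg' -- the on-disk form of the 'ggjt' magic

b'Foun' suggests the file on disk begins with text such as "Found. Redirecting...", i.e. a saved HTTP redirect or error page instead of model data, while 'ggjt' is the newer mmap-able format whose migration script (migrate-ggml-2023-03-30-pr613.py) the log itself mentions. So a "bad magic" here can mean either a botched download or a valid model in an outdated format.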

@Datou (Author) commented Apr 18, 2023

You got "bad magic" too.

@andzejsp (Contributor)

> You got "bad magic" too.

Most likely the model-loading script was updated. This UI relies on and depends on other packages/repos, so when they get updated it takes some time for the main dev of this repo to look through the code and fix it. He's on vacation right now, so just hang tight; it will be fixed eventually.
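For a file that really is a valid model in the old format (unlike the "Foun" case above), the loader's own hint names the fix: run the migration script that ships with llama.cpp. A sketch of the invocation, assuming the script's usual <input> <output> argument order (check its usage message first; the output filename here is my own choice, and the script cannot repair a corrupted download):

python migrate-ggml-2023-03-30-pr613.py ./models/gpt4all-lora-quantized-ggml.bin ./models/gpt4all-lora-quantized-ggml-ggjt.bin

Point the webui at the migrated file afterwards (or rename it back), since the script writes a new file rather than converting in place.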

@andzejsp (Contributor)

@Datou Hi, try this model (ggml-vicuna-13b-4bit-rev1.bin, as in the log below); it works. The original model is messed up, I don't know why.

Checking discussions database...
llama_model_load: loading model from './models/ggml-vicuna-13b-4bit-rev1.bin' - please wait ...
llama_model_load: n_vocab = 32001
llama_model_load: n_ctx   = 512
llama_model_load: n_embd  = 5120
llama_model_load: n_mult  = 256
llama_model_load: n_head  = 40
llama_model_load: n_layer = 40
llama_model_load: n_rot   = 128
llama_model_load: f16     = 2
llama_model_load: n_ff    = 13824
llama_model_load: n_parts = 2
llama_model_load: type    = 2
llama_model_load: ggml map size = 7759.84 MB
llama_model_load: ggml ctx size = 101.25 KB
llama_model_load: mem required  = 9807.93 MB (+ 3216.00 MB per state)
llama_model_load: loading tensors from './models/ggml-vicuna-13b-4bit-rev1.bin'
llama_model_load: model size =  7759.40 MB / num tensors = 363
llama_init_from_file: kv self size  =  800.00 MB
Chatbot created successfully
 * Serving Flask app 'GPT4All-WebUI'
 * Debug mode: off
[2023-04-18 12:36:03,700] {_internal.py:224} INFO - WARNING: This is a development server. Do not use it in a production deployment. Use a production WSGI server instead.
 * Running on all addresses (0.0.0.0)
 * Running on http://127.0.0.1:9600
 * Running on http://mama-mia.juu:9600
[2023-04-18 12:36:03,700] {_internal.py:224} INFO - Press CTRL+C to quit

@andzejsp (Contributor)

For me it loaded after I pulled the newest changes from git and redownloaded the gpt4all-lora-quantized-ggml.bin model.

@ParisNeo (Owner)

Sorry guys, I have a very slow connection these days and I lost connectivity yesterday. It should work now. If the problem is solved, please make sure to close the issue.
