error loading model: create_tensor: tensor 'blk.0.ffn_gate.weight' not found #4881

Closed
hherpa opened this issue Dec 11, 2023 · 9 comments
Labels: bug, stale

hherpa commented Dec 11, 2023

Describe the bug

I get an error when running mixtral-8x7b via text-generation-webui. This is the error:

error loading model: create_tensor: tensor 'blk.0.ffn_gate.weight' not found
llama_load_model_from_file: failed to load model
AVX = 1 | AVX2 = 1 | AVX512 = 0 | AVX512_VBMI = 0 | AVX512_VNNI = 0 | FMA = 1 | NEON = 0 | ARM_FMA = 0 | F16C = 1 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 1 | SSE3 = 1 | SSSE3 = 1 | VSX = 0 | 
Traceback (most recent call last):
  File "/content/text-generation-webui/server.py", line 236, in <module>
    shared.model, shared.tokenizer = load_model(model_name)
  File "/content/text-generation-webui/modules/models.py", line 88, in load_model
    output = load_func_map[loader](model_name)
  File "/content/text-generation-webui/modules/models.py", line 253, in llamacpp_loader
    model, tokenizer = LlamaCppModel.from_pretrained(model_file)
  File "/content/text-generation-webui/modules/llamacpp_model.py", line 91, in from_pretrained
    result.model = Llama(**params)
  File "/usr/local/lib/python3.10/dist-packages/llama_cpp_cuda/llama.py", line 923, in __init__
    self._n_vocab = self.n_vocab()
  File "/usr/local/lib/python3.10/dist-packages/llama_cpp_cuda/llama.py", line 2184, in n_vocab
    return self._model.n_vocab()
  File "/usr/local/lib/python3.10/dist-packages/llama_cpp_cuda/llama.py", line 250, in n_vocab
    assert self.model is not None
AssertionError
Exception ignored in: <function LlamaCppModel.__del__ at 0x7f37ccebe950>
Traceback (most recent call last):
  File "/content/text-generation-webui/modules/llamacpp_model.py", line 49, in __del__
    del self.model
AttributeError: model

Is there an existing issue for this?

  • I have searched the existing issues

Reproduction

I used this Google Colab notebook - link.
It performs the following steps:

  1. Install text-generation-webui:
import torch
from pathlib import Path

if Path.cwd().name != 'text-generation-webui':
  print("Installing the webui...")

  !git clone https://github.com/oobabooga/text-generation-webui
  %cd text-generation-webui

  torver = torch.__version__
  print(f"TORCH: {torver}")
  is_cuda118 = '+cu118' in torver  # 2.1.0+cu118
  is_cuda117 = '+cu117' in torver  # 2.0.1+cu117

  textgen_requirements = open('requirements.txt').read().splitlines()
  if is_cuda117:
      textgen_requirements = [req.replace('+cu121', '+cu117').replace('+cu122', '+cu117').replace('torch2.1', 'torch2.0') for req in textgen_requirements]
  elif is_cuda118:
      textgen_requirements = [req.replace('+cu121', '+cu118').replace('+cu122', '+cu118') for req in textgen_requirements]
  with open('temp_requirements.txt', 'w') as file:
      file.write('\n'.join(textgen_requirements))

  !pip install -r extensions/api/requirements.txt --upgrade
  !pip install -r temp_requirements.txt --upgrade

  print("\033[1;32;1m\n --> If you see a warning about \"previously imported packages\", just ignore it.\033[0;37;0m")
  print("\033[1;32;1m\n --> There is no need to restart the runtime.\n\033[0;37;0m")

  # a broken or incompatible flash_attn install raises on import, so remove it
  try:
    import flash_attn
  except:
    !pip uninstall -y flash_attn
  2. Create and enter the model directory:
%cd models
!mkdir mixtral-8x7b
%cd mixtral-8x7b
  3. Download the model:
!wget https://huggingface.co/TheBloke/Mixtral-8x7B-v0.1-GGUF/resolve/main/mixtral-8x7b-v0.1.Q2_K.gguf
!wget https://huggingface.co/TheBloke/Mixtral-8x7B-v0.1-GGUF/resolve/main/config.json
  4. Return to the repo root and install extra requirements:
%cd ../..
!pip install tiktoken sentence_transformers SpeechRecognition
!pip install sse_starlette
!pip install flask_cloudflared
  5. Launch:
!python server.py --share --model mixtral-8x7b --extensions openai --n-gpu-layers 125 --n_ctx 10000 --public-api
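
Before launching, you can check which llama.cpp binding the notebook actually installed (a quick sketch, not part of the original notebook; the package names are a guess based on the llama_cpp_cuda module in the traceback above):

!pip show llama_cpp_python llama_cpp_python_cuda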


Logs

No additional logs beyond the traceback above.

System Info

System info: I use Google Colab.
hherpa added the bug label on Dec 11, 2023
@baas-hans

Does llama.cpp even support mixtral yet?
ggerganov/llama.cpp#4381

Chanka0 commented Dec 11, 2023

llama.cpp has Mixtral support in the works but it's not part of the master branch yet. You need to wait for it to be merged into the master branch and for llama.cpp python bindings to get updated before it can be added to ooba.
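
Once a new release lands, you can check which binding version you have (a minimal sketch, assuming the llama_cpp module from the tracebacks in this thread):

import llama_cpp
# older builds may not expose __version__, hence the fallback
print(getattr(llama_cpp, "__version__", "unknown"))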

thistleknot commented Dec 12, 2023

Came to say I get the same error when trying to use llama.cpp directly:

error loading model: create_tensor: tensor 'blk.0.ffn_gate.weight' not found
llama_load_model_from_file: failed to load model
AVX = 1 | AVX2 = 1 | AVX512 = 0 | AVX512_VBMI = 0 | AVX512_VNNI = 0 | FMA = 1 | NEON = 0 | ARM_FMA = 0 | F16C = 1 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 1 | SSE3 = 1 | SSSE3 = 1 | VSX = 0 |
Traceback (most recent call last):
  File "/data/text-generation-webui/models/mixtral/mixtral.py", line 4, in <module>
    llm = Llama(
  File "/root/miniconda3/envs/textgen/lib/python3.10/site-packages/llama_cpp/llama.py", line 923, in __init__
    self._n_vocab = self.n_vocab()
  File "/root/miniconda3/envs/textgen/lib/python3.10/site-packages/llama_cpp/llama.py", line 2184, in n_vocab
    return self._model.n_vocab()
  File "/root/miniconda3/envs/textgen/lib/python3.10/site-packages/llama_cpp/llama.py", line 250, in n_vocab
    assert self.model is not None
AssertionError
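
For reference, my mixtral.py is essentially just constructing a Llama object (a minimal sketch; the model path is from my setup):

from llama_cpp import Llama

# loading the GGUF is enough to trigger the error; with a binding built
# before Mixtral support, create_tensor fails on blk.0.ffn_gate.weight
llm = Llama(
    model_path="mixtral-8x7b-v0.1.Q2_K.gguf",
    n_gpu_layers=0,  # CPU-only here; offloading doesn't change the failure
)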

I find it funny that

https://huggingface.co/TheBloke/Mixtral-8x7B-Instruct-v0.1-GGUF

says Mixtral works with llama.cpp from August 27th onwards, but I guess that's a generic message that applies to GGUF files in general.

@waefrebeorn

> llama.cpp has Mixtral support in the works but it's not part of the master branch yet. You need to wait for it to be merged into the master branch and for llama.cpp python bindings to get updated before it can be added to ooba.

Today everyone is saying it merged, but all Ooba users finding this thread should know Ooba hasn't pulled it into the script update yet; you need to manually refresh the llama.cpp install, along the lines sketched below.
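
Something like this (a sketch, not an official procedure; the CUDA package name is a guess based on the llama_cpp_cuda module in the tracebacks above, and v0.2.23 is the llama-cpp-python release that reportedly added Mixtral support, per abetlen/llama-cpp-python#1000):

!pip uninstall -y llama-cpp-python llama_cpp_python_cuda
!pip install --upgrade "llama-cpp-python>=0.2.23"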

@Nicoolodion2

> llama.cpp has Mixtral support in the works but it's not part of the master branch yet. You need to wait for it to be merged into the master branch and for llama.cpp python bindings to get updated before it can be added to ooba.

> today everyone is saying it merged, but all Ooba users finding this thread should know Ooba hasn't merged it yet to the script update, you need to manually refresh the llama.cpp install

Can you tell me how I do that?

EMRD95 commented Dec 13, 2023

Finally managed to run it on Windows. I had to install https://developer.nvidia.com/cuda-12-1-0-download-archive (getting the correct version is important) along with the other dependencies, install the webui manually with conda (without the one-click installer), then just follow this: https://old.reddit.com/r/Oobabooga/comments/18gijyx/simple_tutorial_using_mixtral_8x7b_gguf_in_ooba/
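
One way to rebuild the llama.cpp binding against that CUDA toolkit (a sketch; the CMAKE_ARGS flag is taken from the llama-cpp-python README of this period, and the exact invocation for your shell is an assumption):

rem build from source so the wheel links against the CUDA 12.1 toolkit
set CMAKE_ARGS=-DLLAMA_CUBLAS=on
pip install llama-cpp-python --upgrade --force-reinstall --no-cache-dir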

ejsnews commented Dec 18, 2023

I run my AIs on Windows 11 with CPUs only, so CUDA doesn't concern me, but I'll still look at that link and try the updates. At least we know the GGUF file isn't the cause.

TheDarkTrumpet commented Dec 31, 2023

I don't think these workarounds are required any more. According to abetlen/llama-cpp-python#1000, Mixtral support is available in v0.2.23 of this Python library.

I needed to fix my virtual environment after moving it anyway, so I wiped the entire environment and rebuilt from scratch (I'm using pip, so it was just a matter of removing the venv, setting up the base environment again, then pip install -r requirements.txt). It works out of the box at this point.
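
Concretely, the rebuild was roughly this (a sketch; the venv path and name are from my setup):

rm -rf venv
python -m venv venv
source venv/bin/activate
pip install -r requirements.txt  # the current requirements pull in a new enough llama-cpp-python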

github-actions bot added the stale label Feb 12, 2024

This issue has been closed due to inactivity for 6 weeks. If you believe it is still relevant, please leave a comment below. You can tag a developer in your comment.
