error loading model: create_tensor: tensor 'blk.0.ffn_gate.weight' not found #4881

Closed
hherpa opened this issue Dec 11, 2023 · 9 comments
Labels: bug, stale

hherpa commented Dec 11, 2023

Describe the bug

I get an error when running mixtral-8x7b via text-generation-webui. This is the error:

error loading model: create_tensor: tensor 'blk.0.ffn_gate.weight' not found
llama_load_model_from_file: failed to load model
AVX = 1 | AVX2 = 1 | AVX512 = 0 | AVX512_VBMI = 0 | AVX512_VNNI = 0 | FMA = 1 | NEON = 0 | ARM_FMA = 0 | F16C = 1 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 1 | SSE3 = 1 | SSSE3 = 1 | VSX = 0 | 
Traceback (most recent call last):
  File "/content/text-generation-webui/server.py", line 236, in <module>
    shared.model, shared.tokenizer = load_model(model_name)
  File "/content/text-generation-webui/modules/models.py", line 88, in load_model
    output = load_func_map[loader](model_name)
  File "/content/text-generation-webui/modules/models.py", line 253, in llamacpp_loader
    model, tokenizer = LlamaCppModel.from_pretrained(model_file)
  File "/content/text-generation-webui/modules/llamacpp_model.py", line 91, in from_pretrained
    result.model = Llama(**params)
  File "/usr/local/lib/python3.10/dist-packages/llama_cpp_cuda/llama.py", line 923, in __init__
    self._n_vocab = self.n_vocab()
  File "/usr/local/lib/python3.10/dist-packages/llama_cpp_cuda/llama.py", line 2184, in n_vocab
    return self._model.n_vocab()
  File "/usr/local/lib/python3.10/dist-packages/llama_cpp_cuda/llama.py", line 250, in n_vocab
    assert self.model is not None
AssertionError
Exception ignored in: <function LlamaCppModel.__del__ at 0x7f37ccebe950>
Traceback (most recent call last):
  File "/content/text-generation-webui/modules/llamacpp_model.py", line 49, in __del__
    del self.model
AttributeError: model

Is there an existing issue for this?

  • I have searched the existing issues

Reproduction

I used this Google Colab notebook - link.
It performs the following steps:

  1. Install text-generation-webui:
import torch
from pathlib import Path

if Path.cwd().name != 'text-generation-webui':
  print("Installing the webui...")

  !git clone https://github.com/oobabooga/text-generation-webui
  %cd text-generation-webui

  torver = torch.__version__
  print(f"TORCH: {torver}")
  is_cuda118 = '+cu118' in torver  # 2.1.0+cu118
  is_cuda117 = '+cu117' in torver  # 2.0.1+cu117

  textgen_requirements = open('requirements.txt').read().splitlines()
  if is_cuda117:
      textgen_requirements = [req.replace('+cu121', '+cu117').replace('+cu122', '+cu117').replace('torch2.1', 'torch2.0') for req in textgen_requirements]
  elif is_cuda118:
      textgen_requirements = [req.replace('+cu121', '+cu118').replace('+cu122', '+cu118') for req in textgen_requirements]
  with open('temp_requirements.txt', 'w') as file:
      file.write('\n'.join(textgen_requirements))

  !pip install -r extensions/api/requirements.txt --upgrade
  !pip install -r temp_requirements.txt --upgrade

  print("\033[1;32;1m\n --> If you see a warning about \"previously imported packages\", just ignore it.\033[0;37;0m")
  print("\033[1;32;1m\n --> There is no need to restart the runtime.\n\033[0;37;0m")

  # a broken or incompatible flash_attn install raises on import, so remove it
  try:
    import flash_attn
  except:
    !pip uninstall -y flash_attn
  2. Create and enter the model directory:
%cd models
!mkdir mixtral-8x7b
%cd mixtral-8x7b
  3. Download the model:
!wget https://huggingface.co/TheBloke/Mixtral-8x7B-v0.1-GGUF/resolve/main/mixtral-8x7b-v0.1.Q2_K.gguf
!wget https://huggingface.co/TheBloke/Mixtral-8x7B-v0.1-GGUF/resolve/main/config.json
  4. Return to the repo root and install extra requirements:
%cd ../..
!pip install tiktoken sentence_transformers SpeechRecognition
!pip install sse_starlette
!pip install flask_cloudflared
  5. Launch:
!python server.py --share --model mixtral-8x7b --extensions openai --n-gpu-layers 125 --n_ctx 10000 --public-api
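
Before launching, you can check which llama.cpp binding the notebook actually installed (a quick sketch, not part of the original notebook; the package names are a guess based on the llama_cpp_cuda module in the traceback above):

!pip show llama_cpp_python llama_cpp_python_cuda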


Logs

No additional logs beyond the traceback above.

System Info

System info: I use Google Colab.
hherpa added the bug label on Dec 11, 2023
@baas-hans

Does llama.cpp even support mixtral yet?
ggerganov/llama.cpp#4381

Chanka0 commented Dec 11, 2023

llama.cpp has Mixtral support in the works but it's not part of the master branch yet. You need to wait for it to be merged into the master branch and for llama.cpp python bindings to get updated before it can be added to ooba.
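
Once a new release lands, you can check which binding version you have (a minimal sketch, assuming the llama_cpp module from the tracebacks in this thread):

import llama_cpp
# older builds may not expose __version__, hence the fallback
print(getattr(llama_cpp, "__version__", "unknown"))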

thistleknot commented Dec 12, 2023

Came to say I get the same error when trying to use llama.cpp directly:

error loading model: create_tensor: tensor 'blk.0.ffn_gate.weight' not found
llama_load_model_from_file: failed to load model
AVX = 1 | AVX2 = 1 | AVX512 = 0 | AVX512_VBMI = 0 | AVX512_VNNI = 0 | FMA = 1 | NEON = 0 | ARM_FMA = 0 | F16C = 1 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 1 | SSE3 = 1 | SSSE3 = 1 | VSX = 0 |
Traceback (most recent call last):
  File "/data/text-generation-webui/models/mixtral/mixtral.py", line 4, in <module>
    llm = Llama(
  File "/root/miniconda3/envs/textgen/lib/python3.10/site-packages/llama_cpp/llama.py", line 923, in __init__
    self._n_vocab = self.n_vocab()
  File "/root/miniconda3/envs/textgen/lib/python3.10/site-packages/llama_cpp/llama.py", line 2184, in n_vocab
    return self._model.n_vocab()
  File "/root/miniconda3/envs/textgen/lib/python3.10/site-packages/llama_cpp/llama.py", line 250, in n_vocab
    assert self.model is not None
AssertionError
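
For reference, my mixtral.py is essentially just constructing a Llama object (a minimal sketch; the model path is from my setup):

from llama_cpp import Llama

# loading the GGUF is enough to trigger the error; with a binding built
# before Mixtral support, create_tensor fails on blk.0.ffn_gate.weight
llm = Llama(
    model_path="mixtral-8x7b-v0.1.Q2_K.gguf",
    n_gpu_layers=0,  # CPU-only here; offloading doesn't change the failure
)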

I find it funny that

https://huggingface.co/TheBloke/Mixtral-8x7B-Instruct-v0.1-GGUF

says Mixtral works with llama.cpp from August 27th onwards, but I guess that's a generic message that applies to GGUF files in general.

@waefrebeorn

> llama.cpp has Mixtral support in the works but it's not part of the master branch yet. You need to wait for it to be merged into the master branch and for llama.cpp python bindings to get updated before it can be added to ooba.

Today everyone is saying it merged, but all Ooba users finding this thread should know Ooba hasn't pulled it into the script update yet; you need to manually refresh the llama.cpp install, along the lines sketched below.
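
Something like this (a sketch, not an official procedure; the CUDA package name is a guess based on the llama_cpp_cuda module in the tracebacks above, and v0.2.23 is the llama-cpp-python release that reportedly added Mixtral support, per abetlen/llama-cpp-python#1000):

!pip uninstall -y llama-cpp-python llama_cpp_python_cuda
!pip install --upgrade "llama-cpp-python>=0.2.23"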

@Nicoolodion2

> llama.cpp has Mixtral support in the works but it's not part of the master branch yet. You need to wait for it to be merged into the master branch and for llama.cpp python bindings to get updated before it can be added to ooba.

> today everyone is saying it merged, but all Ooba users finding this thread should know Ooba hasn't merged it yet to the script update, you need to manually refresh the llama.cpp install

Can you tell me how I do that?

EMRD95 commented Dec 13, 2023

Finally managed to run it on Windows. I had to install https://developer.nvidia.com/cuda-12-1-0-download-archive (getting the correct version is important) along with the other dependencies, install the webui manually with conda (without the one-click installer), then just follow this: https://old.reddit.com/r/Oobabooga/comments/18gijyx/simple_tutorial_using_mixtral_8x7b_gguf_in_ooba/
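
One way to rebuild the llama.cpp binding against that CUDA toolkit (a sketch; the CMAKE_ARGS flag is taken from the llama-cpp-python README of this period, and the exact invocation for your shell is an assumption):

rem build from source so the wheel links against the CUDA 12.1 toolkit
set CMAKE_ARGS=-DLLAMA_CUBLAS=on
pip install llama-cpp-python --upgrade --force-reinstall --no-cache-dir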

ejsnews commented Dec 18, 2023

I run my AIs on Windows 11 with CPUs only, so CUDA doesn't concern me, but I'll still look at that link and try the updates. At least we know the GGUF file isn't the cause.

TheDarkTrumpet commented Dec 31, 2023

I don't think these workarounds are required any more. According to abetlen/llama-cpp-python#1000, Mixtral support is available in v0.2.23 of this Python library.

I needed to fix my virtual environment after moving it anyway, so I wiped the entire environment and rebuilt from scratch (I'm using pip, so it was just a matter of removing the venv, setting up the base environment again, then pip install -r requirements.txt). It works out of the box at this point.
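
Concretely, the rebuild was roughly this (a sketch; the venv path and name are from my setup):

rm -rf venv
python -m venv venv
source venv/bin/activate
pip install -r requirements.txt  # the current requirements pull in a new enough llama-cpp-python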

github-actions bot added the stale label Feb 12, 2024

This issue has been closed due to inactivity for 6 weeks. If you believe it is still relevant, please leave a comment below. You can tag a developer in your comment.
