
Example Chat-UI (ChatGPT OSS Alternative) causing crash of API with preloaded model #574

Closed
typoworx-de opened this issue Jun 12, 2023 · 11 comments · Fixed by #2232
Labels: bug (Something isn't working)


typoworx-de commented Jun 12, 2023

LocalAI version:
quay.io/go-skynet/local-ai:latest

Environment, CPU architecture, OS, and Version:
IBM x3400 server with:

  • VMware Host (x86-64 CPU Arch)
  • VM Guest: Ubuntu 20.04 (x86-64 CPU Arch)
  • Docker version 24.0.2, build cb74dfc
  • docker-compose version 1.29.2

Describe the bug
I'm new to LocalAI and was trying to set up the "ChatGPT OSS Alternative" example presented on the LocalAI homepage. Link to the example: https://github.com/go-skynet/LocalAI/tree/master/examples/chatbot-ui

At first it looks like the localai-api is running fine, but sending any prompt from the chat-ui to the API makes it crash (see attached logs).

To Reproduce
Try this example:
https://github.com/go-skynet/LocalAI/tree/master/examples/chatbot-ui

This is my resulting docker-compose.yaml, after trying to adapt it:

version: '3.8'

services:
  api:
    # https://localai.io/basics/getting_started/index.html#run-localai-in-kubernetes
    #image: quay.io/go-skynet/local-ai:v1.18.0
    image: quay.io/go-skynet/local-ai:latest
    build:
      context: .
      dockerfile: Dockerfile
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:8080/readyz"]
      interval: 1m
      timeout: 20m
      retries: 20
    ports:
      - 8080:8080
    env_file:
      - .env
    environment:
      #- DEBUG=true
      - MODELS_PATH=/models
      # You can preload different models here as well.
      # See: https://github.com/go-skynet/model-gallery
      - 'PRELOAD_MODELS=[{"url": "github:go-skynet/model-gallery/gpt4all-j.yaml", "name": "gpt-3.5-turbo"}]'
    volumes:
      - "./models:/models:cached"
    command: ["/usr/bin/local-ai" ]

  chatgpt:
    depends_on:
      api:
        condition: service_healthy
    image: ghcr.io/mckaywrigley/chatbot-ui:main
    ports:
      - 3000:3000
    environment:
      - 'OPENAI_API_KEY=sk-XXXXXXXXXXXXXXXXXXXX'
      - 'OPENAI_API_HOST=http://api:8080'
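
To narrow down whether the crash sits in the API or in the UI, the API can also be exercised directly with curl against LocalAI's OpenAI-compatible chat endpoint (a minimal sketch, using the model name preloaded above):

curl http://localhost:8080/v1/chat/completions -H "Content-Type: application/json" -d '{"model": "gpt-3.5-turbo", "messages": [{"role": "user", "content": "How are you?"}]}'

If this request alone brings the container down, the chat UI can be ruled out as the cause.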

Expected behavior
I expected a working example with at least some output from the ChatGPT-like prompt, but only an "internal error" response pops up.

Logs
Log file from the Docker container (attached in a comment below).

Additional context

typoworx-de added the bug (Something isn't working) label on Jun 12, 2023
typoworx-de (Author) commented:

local-ai_api_1_logs.txt

typoworx-de (Author) commented Jun 12, 2023

Possibly related to these issues as well:
#195, #192

typoworx-de (Author) commented:

Just leaving this here in case others have similar problems ... obviously my Docker VM did not have enough RAM assigned, causing the crash when the models were loaded into memory. I'll try again with more memory assigned to the VM and report back here if that works.

typoworx-de (Author) commented:

Tried with 16 GB RAM attached; the Docker container for the localai-api still crashes, without any useful exception pointing out what's going wrong.
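
If this were a plain out-of-memory kill, Docker and the kernel would normally say so. A quick check (standard Docker and dmesg tooling; the container name is assumed from the log file attached above):

docker inspect --format '{{.State.OOMKilled}}' local-ai_api_1
dmesg | grep -i 'out of memory'

If OOMKilled reports false and dmesg is silent, the crash is likely not memory-related.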

typoworx-de (Author) commented:

I've cross-checked now and deployed the same docker-compose setup on my notebook workstation (Intel(R) Core(TM) i7-9750H CPU @ 2.60GHz) running Ubuntu and Docker. There it works!

The previous deployment that caused problems was on my IBM server, which runs VMware ESXi with an Intel(R) Xeon(R) CPU E5620 @ 2.40GHz and an Ubuntu/Docker VM.

So the local-ai stack apparently has some kind of problem either with VMware virtualisation or with the Intel Xeon E5620 CPU?!
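
For reference: AVX was introduced with Sandy Bridge, and Westmere-era Xeons such as the E5620 (and the E5649 mentioned below) predate it. Whether the CPU, or the virtual CPU VMware presents to the guest, exposes AVX can be checked from inside the VM, for example:

grep -o -m1 'avx[^ ]*' /proc/cpuinfo || echo 'no AVX'

If that prints nothing but 'no AVX', a binary built with AVX instructions will die with SIGILL, which matches the crashes reported in this thread.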

kroshira commented:

I have a Xeon E5649 CPU and have the same issue with the API crashing. I suspect it is an incompatible CPU.

Server specs:

  • Dell R710
  • 96 GB RAM
  • 2x Xeon E5649 @ 2.53GHz (12 cores)
  • 28 TB storage
  • Ubuntu 20.04 LTS, 5.4.0-86-generic kernel

My docker-compose file:

version: '3.6'

services:
  api:
    image: quay.io/go-skynet/local-ai:latest
    # As initially LocalAI will download the models defined in PRELOAD_MODELS
    # you might need to tweak the healthcheck values here according to your network connection.
    # Here we give a timespan of 20m to download all the required files.
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:8080/readyz"]
      interval: 1m
      timeout: 20m
      retries: 20
    build:
      context: ./
      dockerfile: Dockerfile
    ports:
      - 8050:8080
    environment:
      - DEBUG=true
      - REBUILD=true
      - BUILD_TYPE=generic
      - MODELS_PATH=/models
      - THREADS=14
      - CMAKE_ARGS="-DLLAMA_F16C=OFF -DLLAMA_AVX512=OFF -DLLAMA_AVX2=OFF -DLLAMA_FMA=OFF"
      # You can preload different models here as well.
      # See: https://github.com/go-skynet/model-gallery
      - 'PRELOAD_MODELS=[{"url": "github:go-skynet/model-gallery/mpt-7b-chat.yaml", "name": "mpt-7b-chat"},{"url": "github:go-skynet/model-gallery/gpt4all-j.yaml", "name": "gpt-3.5-turbo"}, { "url": "github:go-skynet/model-gallery/bert-embeddings.yaml", "name": "text-embedding-ada-002"},{"url": "github:go-skynet/model-gallery/stablediffusion.yaml"}]'
    volumes:
      - ./models:/models:cached
    command: ["/usr/bin/local-ai" ]
  chatgpt:
    depends_on:
      api:
        condition: service_healthy
    image: ghcr.io/mckaywrigley/chatbot-ui:main
    ports:
      - 3500:3000
    environment:
      - 'OPENAI_API_KEY=sk-XXXXXXXXXXXXXXXXXXXX'
      - 'OPENAI_API_HOST=http://api:8080'
    volumes:
      - ./models:/models:cached

Failure message (there is additional output that I can provide, but I'll truncate it here as this seems the most relevant):

5:53PM DBG Loading model llama from WizardLM-7B-uncensored.ggmlv3.q5_1
5:53PM DBG Loading model in memory from file: /models/WizardLM-7B-uncensored.ggmlv3.q5_1
SIGILL: illegal instruction
PC=0xa1ab80 m=9 sigcode=2
signal arrived during cgo execution
instruction bytes: 0xc5 0xf9 0x6f 0x5 0x98 0xbe 0x8c 0x0 0xc7 0x47 0x10 0x0 0x0 0x0 0x0 0x48

Note: I have tried multiple models. Best case, they return no response; worst case, it crashes like this. Would love to get this working on my server just for funsies, but I'm pretty sure the CPU is the limiting factor here. I know for a fact it does not have AVX, so... that's a bad sign from the get-go.
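
Those instruction bytes back that up: the leading 0xc5 is a VEX prefix, which only AVX-encoded instructions use. They can be decoded with stock binutils (a quick sketch; the temp file name is arbitrary):

printf '\xc5\xf9\x6f\x05\x98\xbe\x8c\x00' > /tmp/sigill.bin
objdump -D -b binary -m i386:x86-64 /tmp/sigill.bin
# => vmovdqa 0x8cbe98(%rip),%xmm0 -- an AVX instruction, illegal on a pre-AVX Xeon

So the prebuilt binary was compiled with AVX, and the E5649 faults on the first AVX instruction it hits.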

bnusunny (Contributor) commented:

This is most likely caused by missing AVX support. You can compile local-ai on this machine to get a build optimized for it.
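
A minimal sketch of that route, assuming git, Go, and cmake are installed on the host (the flag set mirrors the workaround quoted below, with -DLLAMA_AVX=OFF added, since pre-Sandy Bridge Xeons lack base AVX as well):

git clone https://github.com/go-skynet/LocalAI
cd LocalAI
CMAKE_ARGS="-DLLAMA_AVX=OFF -DLLAMA_AVX2=OFF -DLLAMA_AVX512=OFF -DLLAMA_FMA=OFF -DLLAMA_F16C=OFF" make build
# then run it, pointing at your models directory (binary and flag names per the LocalAI README)
./local-ai --models-path ./models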

cstuart1310 commented:

For anyone else scouring through the issues for a solution: build it locally like bnusunny mentions above, like this (but it does make it incredibly slow):
CMAKE_ARGS="-DLLAMA_F16C=OFF -DLLAMA_AVX512=OFF -DLLAMA_AVX2=OFF -DLLAMA_FMA=OFF" make build

#288 (comment)

bnusunny (Contributor) commented:

Yeah, without AVX, ggml will be slow on CPU.

stereotypy commented:

Anyone else scouring through the issues for a solution, build it locally like bnusunny mentions above like this, but it does make it incredibly slow CMAKE_ARGS="-DLLAMA_F16C=OFF -DLLAMA_AVX512=OFF -DLLAMA_AVX2=OFF -DLLAMA_FMA=OFF" make build

#288 (comment)

Does this work in the docker container? For me, I got it to run locally with just CMAKE_ARGS="-DLLAMA_AVX2=OFF", but it was still crashing in the docker container.
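
One thing worth double-checking (an assumption on my part, based on kroshira's compose file above): with list-style environment entries, the surrounding double quotes become part of the value, so CMAKE_ARGS="-D..." reaches the in-container rebuild with literal quote characters; and the flag sets quoted so far also leave base AVX on, which the E5620/E5649 lack. A compose sketch that forces the rebuild with AVX fully off:

    environment:
      - REBUILD=true
      - BUILD_TYPE=generic
      - CMAKE_ARGS=-DLLAMA_AVX=OFF -DLLAMA_AVX2=OFF -DLLAMA_AVX512=OFF -DLLAMA_FMA=OFF -DLLAMA_F16C=OFF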

localai-bot (Contributor) commented:

⚠️⚠️⚠️⚠️⚠️

Hi! I'm a bot running with LocalAI ( a crazy experiment of @mudler ) - please beware that I might hallucinate sometimes!

but... I can also be funny or helpful 😸 and I can provide, generally speaking, good tips or places to look in the documentation or in the code, based on what you wrote in the issue.

Don't engage in conversation with me;
I don't support replying (yet)!

⚠️⚠️⚠️⚠️⚠️

I see that the issue is related to the LocalAI API image version. The latest version of LocalAI is 1.20.0, and the image used in the example is 1.18.0. This could be causing the issue. Please try updating the image tag in the api service in your docker-compose.yaml file to use the latest version of the LocalAI API image:

image: quay.io/go-skynet/local-ai:latest

Also, make sure that you have installed all the required dependencies and packages for LocalAI on your system. You can do this by following the installation instructions provided in the LocalAI documentation.

