
[BUG]: RTX 4080 - crash while loading Vulkan #886

Open
SeriousOldMan opened this issue Aug 1, 2024 · 8 comments
SeriousOldMan commented Aug 1, 2024

Description

System: Win 11, RTX 4080 with the absolute latest Nvidia driver

  1. No CUDA installed on the system
  2. cuda 11, cuda 12 and vulkan all installed in the runtimes folder
  3. When starting up and loading the model, an exception is thrown while initializing Vulkan
  4. The error also occurs when cuda 11 and cuda 12 are not present

This error goes away once CUDA is installed from the Nvidia site, but I wanted to use Vulkan, since (in theory) it does not require installing additional drivers from Nvidia.

Reproduction Steps

See above.

Code:

using LLama.Common;
using LLama;
using System.Globalization;
using System.Text;

namespace LLMRuntime;

public class LLMExecutor
{
    string ModelPath;
    double Temperature;
    int MaxTokens;
    int GPULayers;

    ModelParams Parameters;
    LLamaWeights Model;
    InteractiveExecutor Executor;

    public LLMExecutor(string modelPath, double temperature, int maxTokens, int gpuLayers)
    {
        ModelPath = modelPath;
        Temperature = temperature;
        MaxTokens = maxTokens;
        GPULayers = gpuLayers;

        Parameters = new ModelParams(modelPath)
        {
            ContextSize = 32768,
            GpuLayerCount = gpuLayers 
        };
        Model = LLamaWeights.LoadFromFile(Parameters);
        Executor = new InteractiveExecutor(Model.CreateContext(Parameters));
    }

    public string ParsePrompt(ChatHistory chatHistory, string prompt)
    {
        void addMessage(AuthorRole role, string message)
        {
            if (role != AuthorRole.Unknown)
                chatHistory.AddMessage(role, message);
        }

        AuthorRole role = AuthorRole.Unknown;
        string message = "";

        foreach (string line in prompt.Split(new string[] { Environment.NewLine, "\n" }, StringSplitOptions.None))
        {
            string input = line.Trim();

            if (input.StartsWith("<|###"))
            {
                addMessage(role, message);

                message = "";

                if (input == "<|### System ###|>")
                    role = AuthorRole.System;
                else if (input == "<|### Assistant ###|>")
                    role = AuthorRole.Assistant;
                else if (input == "<|### User ###|>")
                    role = AuthorRole.User;
            }
            else
                message += (input + Environment.NewLine);
        }

        return (role == AuthorRole.User) ? message : "";
    }

    public async Task<string> AskAsync(string prompt)
    {
        // Add chat histories as prompt to tell AI how to act.
        var chatHistory = new ChatHistory();

        string userInput = ParsePrompt(chatHistory, prompt);

        ChatSession session = new(Executor, chatHistory);

        InferenceParams inferenceParams = new InferenceParams()
        {
            MaxTokens = MaxTokens,
            AntiPrompts = new List<string> { "User:" }
        };

        string result = "";

        await foreach (
            var text
            in session.ChatAsync(
                new ChatHistory.Message(AuthorRole.User, userInput),
                inferenceParams))
            result += text;

        return result;
    }

    public string Ask(string prompt)
    {

        return AskAsync(prompt).Result;
    }
}

static class Program
{
    static string WaitForPrompt(string fileName)
    {
        while (true)
        {
            if (File.Exists(fileName))
            {
                // Read the prompt, then remove the file so the next prompt can be written
                string prompt;

                using (StreamReader promptStream = new StreamReader(fileName))
                    prompt = promptStream.ReadToEnd();

                File.Delete(fileName);

                return prompt;
            }

            Thread.Sleep(100);
        }
    }

    [STAThread]
    static void Main(string[] args)
    {
        Thread.CurrentThread.CurrentCulture = CultureInfo.CreateSpecificCulture("en-US");

        try
        {
            LLMExecutor executor = new LLMExecutor(args[2],
                                                   (args.Length > 3) ? Double.Parse(args[3]) : 0.5,
                                                   (args.Length > 4) ? int.Parse(args[4]) : 2048,
                                                   (args.Length > 5) ? int.Parse(args[5]) : 0);

            while (true)
            {
                string prompt = WaitForPrompt(args[0]);

                if (prompt.Trim() == "Exit")
                    break;

                try
                {
                    string answer = executor.Ask(prompt);

                    using (StreamWriter outStream = new StreamWriter(args[1], false, Encoding.Unicode))
                        outStream.Write(answer);
                }
                catch (Exception)
                {
                    using (StreamWriter outStream = new StreamWriter(args[1], false, Encoding.Unicode))
                        outStream.Write("Error");
                }
            }
        }
        catch (Exception)
        {
            System.Environment.Exit(1);
        }
    }
}

Environment & Configuration

See description...

Known Workarounds

Install CUDA from the Nvidia site.
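
An alternative workaround sketch: instead of installing CUDA system-wide, tell LLamaSharp's native library loader which backend to prefer, so the crashing Vulkan runtime is never probed. This is an assumption-laden sketch, not the author's code: the `NativeLibraryConfig` API has changed between LLamaSharp releases, so the exact calls may need adjusting for your version.

```csharp
using LLama.Native;

// Sketch only: steer native backend selection before any model is loaded.
// NativeLibraryConfig must be configured before the first native call
// (e.g. before LLamaWeights.LoadFromFile), otherwise it has no effect.
NativeLibraryConfig.Instance
    .WithCuda()   // prefer the CUDA backend over Vulkan
    .WithLogs();  // log which native library actually gets selected
```

The logging call is worth keeping during diagnosis, since it shows whether the loader really picked the backend you expected.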

m0nsky (Contributor) commented Aug 2, 2024

What is the error?

SeriousOldMan (Author) commented Aug 2, 2024

> What is the error?

Hi, it is the same error as described in issue #887 (opened just after my report):

ggml_vulkan: Found 1 Vulkan devices:
Vulkan0: NVIDIA GeForce RTX 4080 (NVIDIA) | uma: 0 | fp16: 1 | warp size: 32
Fatal error. System.AccessViolationException: Attempted to read or write protected memory. This is often an indication that other memory is corrupt.
Repeat 2 times:
   at LLama.Native.SafeLlamaModelHandle.llama_load_model_from_file(System.String, LLama.Native.LLamaModelParams)
   at LLama.Native.SafeLlamaModelHandle.LoadFromFile(System.String, LLama.Native.LLamaModelParams)
   at LLama.LLamaWeights.LoadFromFile(LLama.Abstractions.IModelParams)
   at LLMRuntime.LLMExecutor..ctor(System.String, Double, Int32, Int32)
   at LLMRuntime.Program.Main(System.String[])

@SeriousOldMan (Author)

By the way, it is independent of the ContextSize in the ModelParams. I tried different values; Vulkan always crashes, while CUDA is fine.


m0nsky commented Aug 6, 2024

Seems like this is a llama.cpp issue, not a LLamaSharp issue.

Not sure why installing CUDA impacts it, though. Are you sure it was not a coincidence? It seems like the native library loader is correctly selecting the Vulkan backend and it's throwing an error on the llama.cpp side.

@LSXAxeller

Ah yes, I submitted an issue on llama.cpp after testing their binaries a few days ago, so this looks like an upstream issue. I thought it was just a problem with AMD, but it appears to be a general issue with the Vulkan backend.

@SeriousOldMan (Author)

> Seems like this is a llama.cpp issue, not a LLamaSharp issue.
>
> Not sure why installing CUDA impacts it, though. Are you sure it was not a coincidence? It seems like the native library loader is correctly selecting the Vulkan backend and it's throwing an error on the llama.cpp side.

It may be that installing CUDA fixed it because CUDA is then selected first and Vulkan is never touched. By the way, it only fixes it if the CUDA driver from Nvidia is also installed; otherwise CUDA is skipped and Vulkan is tried next. Seems reasonable...

@GalactixGod

I had this issue using Vulkan as well. I don't have a GPU on that device, but it would always crash.

Oddly, setting the GPU layer count to 1 got it running for me: a count of 0 crashed and a count of 2 crashed, both with the same error you are reporting about an attempt to read/write protected memory. YMMV.
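
Applied to the `ModelParams` construction from the original report, that workaround would look roughly like the sketch below. Note that `GpuLayerCount = 1` is purely empirical, taken from this comment, not a documented fix:

```csharp
var parameters = new ModelParams(modelPath)
{
    ContextSize = 32768,
    // Empirical workaround from this thread: 0 and 2 crashed under Vulkan,
    // 1 happened to run on the commenter's machine. Your mileage may vary.
    GpuLayerCount = 1
};
```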

@LSXAxeller

It's a driver crash during device initialization caused by an external program hooking into the Vulkan driver. In my case it was the Mirillis Action! game recorder; after uninstalling it, everything ran fine. Going back to my llama.cpp issue, user 0cc4m guided me to the fix.
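
Programs like game recorders and overlays typically hook Vulkan through implicit layers, which the Khronos loader injects into every Vulkan application. On Windows, these are registered in the registry, so a quick way to spot such a third-party hook (a diagnostic sketch; the keys below are the standard loader locations, output depends entirely on your machine) is:

```shell
:: List implicit Vulkan layers registered system-wide (Windows).
:: Any third-party overlay/recorder listed here is injected into every Vulkan app.
reg query "HKLM\SOFTWARE\Khronos\Vulkan\ImplicitLayers"
reg query "HKLM\SOFTWARE\WOW6432Node\Khronos\Vulkan\ImplicitLayers"
:: Per-user registrations:
reg query "HKCU\SOFTWARE\Khronos\Vulkan\ImplicitLayers"
```

Each value points at a layer manifest JSON; the file path usually makes the owning program obvious, which is how a recorder like Action! can be identified and removed.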
