whisper.cpp ggml-cuda the provided PTX was compiled with an unsupported toolchain. #1627

Closed
EverThingy opened this issue Jan 23, 2024 · 2 comments
Labels: bug, unconfirmed

Comments

EverThingy commented Jan 23, 2024

LocalAI version:
quay.io/go-skynet/local-ai:v2.3.0-cublas-cuda12-ffmpeg-core

Environment, CPU architecture, OS, and Version:
Linux pc 6.1.0-17-amd64 #1 SMP PREEMPT_DYNAMIC Debian 6.1.69-1 (2023-12-30) x86_64 GNU/Linux
GPU 4x Nvidia RTX 4090
CPU AMD Ryzen Threadripper PRO 5975WX 32-Cores

.env variables: REBUILD=false, BUILD_TYPE=cuBLAS; the rest are left at their defaults.

Describe the bug
In LocalAI v2.4.0, running the following request:

curl http://localhost:8080/v1/audio/transcriptions -H "Content-Type: multipart/form-data" -F file="@/home/user/Documents/George_W_Bush_Columbia_FINAL.ogg" -F model="whisper-1"

Returns the following CUDA error:
ggml-cuda the provided PTX was compiled with an unsupported toolchain.
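(Per the stderr trace further down in the logs, this is CUDA error 222 at ggml-cuda.cu:7796, i.e. cudaErrorUnsupportedPtxVersion.)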

Logs

@@@@@
Skipping rebuild
@@@@@
If you are experiencing issues with the pre-compiled builds, try setting REBUILD=true
If you are still experiencing issues with the build, try setting CMAKE_ARGS and disable the instructions set as needed:
CMAKE_ARGS="-DLLAMA_F16C=OFF -DLLAMA_AVX512=OFF -DLLAMA_AVX2=OFF -DLLAMA_FMA=OFF"
see the documentation at: https://localai.io/basics/build/index.html
Note: See also https://github.com/go-skynet/LocalAI/issues/288
@@@@@
CPU info:
model name	: AMD Ryzen Threadripper PRO 5975WX 32-Cores
flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good nopl nonstop_tsc cpuid extd_apicid aperfmperf rapl pni pclmulqdq monitor ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt aes xsave avx f16c rdrand lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs skinit wdt tce topoext perfctr_core perfctr_nb bpext perfctr_llc mwaitx cpb cat_l3 cdp_l3 invpcid_single hw_pstate ssbd mba ibrs ibpb stibp vmmcall fsgsbase bmi1 avx2 smep bmi2 erms invpcid cqm rdt_a rdseed adx smap clflushopt clwb sha_ni xsaveopt xsavec xgetbv1 xsaves cqm_llc cqm_occup_llc cqm_mbm_total cqm_mbm_local clzero irperf xsaveerptr rdpru wbnoinvd amd_ppin brs arat npt lbrv svm_lock nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold avic v_vmsave_vmload vgif v_spec_ctrl umip pku ospke vaes vpclmulqdq rdpid overflow_recov succor smca fsrm
CPU:    AVX    found OK
CPU:    AVX2   found OK
CPU: no AVX512 found
@@@@@
10:14AM INF Starting LocalAI using 32 threads, with models path: /models
10:14AM INF LocalAI version: de28867 (de28867374c2334f23f78af3b3930d8105d2808d)
10:14AM INF Preloading models from /models
10:14AM DBG Model: gpt-3.5-turbo (config: {PredictionOptions:{Model:wizardlm-13b-v1.2.ggmlv3.q4_0.bin Language: N:0 TopP:0.7 TopK:80 Temperature:0.2 Maxtokens:0 Echo:false Batch:0 F16:false IgnoreEOS:false RepeatPenalty:0 Keep:0 MirostatETA:0 MirostatTAU:0 Mirostat:0 FrequencyPenalty:0 TFZ:0 TypicalP:0 Seed:0 NegativePrompt: RopeFreqBase:0 RopeFreqScale:0 NegativePromptScale:0 UseFastTokenizer:false ClipSkip:0 Tokenizer:} Name:gpt-3.5-turbo F16:true Threads:0 Debug:false Roles:map[assistant:### Response: system:### System: user:### Instruction:] Embeddings:false Backend:llama-stable TemplateConfig:{Chat:wizardlm-chat ChatMessage: Completion:wizardlm-completion Edit: Functions:} PromptStrings:[] InputStrings:[] InputToken:[] functionCallString: functionCallNameString: FunctionsConfig:{DisableNoAction:false NoActionFunctionName: NoActionDescriptionName:} FeatureFlag:map[] LLMConfig:{SystemPrompt: TensorSplit: MainGPU: RMSNormEps:0 NGQA:0 PromptCachePath: PromptCacheAll:false PromptCacheRO:false MirostatETA:0 MirostatTAU:0 Mirostat:0 NGPULayers:16 MMap:true MMlock:false LowVRAM:true Grammar: StopWords:[] Cutstrings:[] TrimSpace:[] TrimSuffix:[] ContextSize:2048 NUMA:false LoraAdapter: LoraBase: LoraScale:0 NoMulMatQ:false DraftModel: NDraft:0 Quantization: MMProj: RopeScaling: YarnExtFactor:0 YarnAttnFactor:0 YarnBetaFast:0 YarnBetaSlow:0} AutoGPTQ:{ModelBaseName: Device: Triton:false UseFastTokenizer:false} Diffusers:{CUDA:false PipelineType: SchedulerType: EnableParameters: CFGScale:0 IMG2IMG:false ClipSkip:0 ClipModel: ClipSubFolder: ControlNet:} Step:0 GRPC:{Attempts:0 AttemptsSleepTime:0} VallE:{AudioPath:} CUDA:false DownloadFiles:[]})
10:14AM DBG Model: gpt-3.5-turbo-2 (config: {PredictionOptions:{Model: Language: N:0 TopP:0 TopK:0 Temperature:0 Maxtokens:0 Echo:false Batch:0 F16:false IgnoreEOS:false RepeatPenalty:0 Keep:0 MirostatETA:0 MirostatTAU:0 Mirostat:0 FrequencyPenalty:0 TFZ:0 TypicalP:0 Seed:0 NegativePrompt: RopeFreqBase:0 RopeFreqScale:0 NegativePromptScale:0 UseFastTokenizer:false ClipSkip:0 Tokenizer:} Name:gpt-3.5-turbo-2 F16:false Threads:0 Debug:false Roles:map[] Embeddings:false Backend: TemplateConfig:{Chat: ChatMessage: Completion: Edit: Functions:} PromptStrings:[] InputStrings:[] InputToken:[] functionCallString: functionCallNameString: FunctionsConfig:{DisableNoAction:false NoActionFunctionName: NoActionDescriptionName:} FeatureFlag:map[] LLMConfig:{SystemPrompt: TensorSplit: MainGPU: RMSNormEps:0 NGQA:0 PromptCachePath: PromptCacheAll:false PromptCacheRO:false MirostatETA:0 MirostatTAU:0 Mirostat:0 NGPULayers:0 MMap:false MMlock:false LowVRAM:false Grammar: StopWords:[] Cutstrings:[] TrimSpace:[] TrimSuffix:[] ContextSize:0 NUMA:false LoraAdapter: LoraBase: LoraScale:0 NoMulMatQ:false DraftModel: NDraft:0 Quantization: MMProj: RopeScaling: YarnExtFactor:0 YarnAttnFactor:0 YarnBetaFast:0 YarnBetaSlow:0} AutoGPTQ:{ModelBaseName: Device: Triton:false UseFastTokenizer:false} Diffusers:{CUDA:false PipelineType: SchedulerType: EnableParameters: CFGScale:0 IMG2IMG:false ClipSkip:0 ClipModel: ClipSubFolder: ControlNet:} Step:0 GRPC:{Attempts:0 AttemptsSleepTime:0} VallE:{AudioPath:} CUDA:false DownloadFiles:[]})
10:14AM DBG Extracting backend assets files to /tmp/localai/backend_data

 ┌───────────────────────────────────────────────────┐ 
 │                   Fiber v2.50.0                   │ 
 │               http://127.0.0.1:8080               │ 
 │       (bound on host 0.0.0.0 and port 8080)       │ 
 │                                                   │ 
 │ Handlers ............ 74  Processes ........... 1 │ 
 │ Prefork ....... Disabled  PID ................ 14 │ 
 └───────────────────────────────────────────────────┘ 
9:23AM DBG Request received:
9:23AM DBG Audio file copied to: /tmp/whisper3752305688/George_W_Bush_Columbia_FINAL.ogg
9:23AM INF Loading model 'ggml-whisper-base.bin' with backend whisper
9:23AM DBG Model already loaded in memory: ggml-whisper-base.bin
rpc error: code = Unavailable desc = connection error: desc = "transport: Error while dialing: dial tcp 127.0.0.1:35593: connect: connection refused"
9:23AM DBG GRPC Model not responding: ggml-whisper-base.bin
9:23AM DBG GRPC Process is not responding: ggml-whisper-base.bin
9:23AM DBG Loading model in memory from file: /models/ggml-whisper-base.bin
9:23AM DBG Loading Model ggml-whisper-base.bin with gRPC (file: /models/ggml-whisper-base.bin) (backend: whisper): {backendString:whisper model:ggml-whisper-base.bin threads:32 assetDir:/tmp/localai/backend_data context:{emptyCtx:{}} gRPCOptions:0xc000426780 externalBackends:map[autogptq:/build/backend/python/autogptq/run.sh bark:/build/backend/python/bark/run.sh coqui:/build/backend/python/coqui/run.sh diffusers:/build/backend/python/diffusers/run.sh exllama:/build/backend/python/exllama/run.sh exllama2:/build/backend/python/exllama2/run.sh huggingface-embeddings:/build/backend/python/sentencetransformers/run.sh petals:/build/backend/python/petals/run.sh sentencetransformers:/build/backend/python/sentencetransformers/run.sh transformers:/build/backend/python/transformers/run.sh transformers-musicgen:/build/backend/python/transformers-musicgen/run.sh vall-e-x:/build/backend/python/vall-e-x/run.sh vllm:/build/backend/python/vllm/run.sh] grpcAttempts:20 grpcAttemptsDelay:2 singleActiveBackend:false parallelRequests:false}
9:23AM DBG Loading GRPC Process: /tmp/localai/backend_data/backend-assets/grpc/whisper
9:23AM DBG GRPC Service for ggml-whisper-base.bin will be running at: '127.0.0.1:39981'
9:23AM DBG GRPC Service state dir: /tmp/go-processmanager1770906042
9:23AM DBG GRPC Service Started
rpc error: code = Unavailable desc = connection error: desc = "transport: Error while dialing: dial tcp 127.0.0.1:39981: connect: connection refused"
9:23AM DBG GRPC(ggml-whisper-base.bin-127.0.0.1:39981): stderr 2024/01/23 09:23:42 gRPC Server listening at 127.0.0.1:39981
9:23AM DBG GRPC Service Ready
9:23AM DBG GRPC: Loading model with options: {state:{NoUnkeyedLiterals:{} DoNotCompare:[] DoNotCopy:[] atomicMessageInfo:<nil>} sizeCache:0 unknownFields:[] Model:ggml-whisper-base.bin ContextSize:0 Seed:0 NBatch:0 F16Memory:false MLock:false MMap:false VocabOnly:false LowVRAM:false Embeddings:false NUMA:false NGPULayers:0 MainGPU: TensorSplit: Threads:0 LibrarySearchPath: RopeFreqBase:0 RopeFreqScale:0 RMSNormEps:0 NGQA:0 ModelFile:/models/ggml-whisper-base.bin Device: UseTriton:false ModelBaseName: UseFastTokenizer:false PipelineType: SchedulerType: CUDA:false CFGScale:0 IMG2IMG:false CLIPModel: CLIPSubfolder: CLIPSkip:0 ControlNet: Tokenizer: LoraBase: LoraAdapter: LoraScale:0 NoMulMatQ:false DraftModel: AudioPath: Quantization: MMProj: RopeScaling: YarnExtFactor:0 YarnAttnFactor:0 YarnBetaFast:0 YarnBetaSlow:0}
9:23AM DBG GRPC(ggml-whisper-base.bin-127.0.0.1:39981): stderr whisper_init_from_file_with_params_no_state: loading model from '/models/ggml-whisper-base.bin'
9:23AM DBG GRPC(ggml-whisper-base.bin-127.0.0.1:39981): stderr whisper_model_load: loading model
9:23AM DBG GRPC(ggml-whisper-base.bin-127.0.0.1:39981): stderr whisper_model_load: n_vocab       = 51865
9:23AM DBG GRPC(ggml-whisper-base.bin-127.0.0.1:39981): stderr whisper_model_load: n_audio_ctx   = 1500
9:23AM DBG GRPC(ggml-whisper-base.bin-127.0.0.1:39981): stderr whisper_model_load: n_audio_state = 512
9:23AM DBG GRPC(ggml-whisper-base.bin-127.0.0.1:39981): stderr whisper_model_load: n_audio_head  = 8
9:23AM DBG GRPC(ggml-whisper-base.bin-127.0.0.1:39981): stderr whisper_model_load: n_audio_layer = 6
9:23AM DBG GRPC(ggml-whisper-base.bin-127.0.0.1:39981): stderr whisper_model_load: n_text_ctx    = 448
9:23AM DBG GRPC(ggml-whisper-base.bin-127.0.0.1:39981): stderr whisper_model_load: n_text_state  = 512
9:23AM DBG GRPC(ggml-whisper-base.bin-127.0.0.1:39981): stderr whisper_model_load: n_text_head   = 8
9:23AM DBG GRPC(ggml-whisper-base.bin-127.0.0.1:39981): stderr whisper_model_load: n_text_layer  = 6
9:23AM DBG GRPC(ggml-whisper-base.bin-127.0.0.1:39981): stderr whisper_model_load: n_mels        = 80
9:23AM DBG GRPC(ggml-whisper-base.bin-127.0.0.1:39981): stderr whisper_model_load: ftype         = 1
9:23AM DBG GRPC(ggml-whisper-base.bin-127.0.0.1:39981): stderr whisper_model_load: qntvr         = 0
9:23AM DBG GRPC(ggml-whisper-base.bin-127.0.0.1:39981): stderr whisper_model_load: type          = 2 (base)
9:23AM DBG GRPC(ggml-whisper-base.bin-127.0.0.1:39981): stderr whisper_model_load: adding 1608 extra tokens
9:23AM DBG GRPC(ggml-whisper-base.bin-127.0.0.1:39981): stderr whisper_model_load: n_langs       = 99
9:23AM DBG GRPC(ggml-whisper-base.bin-127.0.0.1:39981): stderr ggml_init_cublas: GGML_CUDA_FORCE_MMQ:   no
9:23AM DBG GRPC(ggml-whisper-base.bin-127.0.0.1:39981): stderr ggml_init_cublas: CUDA_USE_TENSOR_CORES: yes
9:23AM DBG GRPC(ggml-whisper-base.bin-127.0.0.1:39981): stderr ggml_init_cublas: found 4 CUDA devices:
9:23AM DBG GRPC(ggml-whisper-base.bin-127.0.0.1:39981): stderr   Device 0: NVIDIA GeForce RTX 4090, compute capability 8.9
9:23AM DBG GRPC(ggml-whisper-base.bin-127.0.0.1:39981): stderr   Device 1: NVIDIA GeForce RTX 4090, compute capability 8.9
9:23AM DBG GRPC(ggml-whisper-base.bin-127.0.0.1:39981): stderr   Device 2: NVIDIA GeForce RTX 4090, compute capability 8.9
9:23AM DBG GRPC(ggml-whisper-base.bin-127.0.0.1:39981): stderr   Device 3: NVIDIA GeForce RTX 4090, compute capability 8.9
9:23AM DBG GRPC(ggml-whisper-base.bin-127.0.0.1:39981): stderr whisper_backend_init: using CUDA backend
9:23AM DBG GRPC(ggml-whisper-base.bin-127.0.0.1:39981): stderr whisper_model_load:     CUDA buffer size =   147.46 MB
9:23AM DBG GRPC(ggml-whisper-base.bin-127.0.0.1:39981): stderr whisper_model_load: model size    =  147.37 MB
9:23AM DBG GRPC(ggml-whisper-base.bin-127.0.0.1:39981): stderr whisper_backend_init: using CUDA backend
9:23AM DBG GRPC(ggml-whisper-base.bin-127.0.0.1:39981): stderr whisper_init_state: kv self size  =   16.52 MB
9:23AM DBG GRPC(ggml-whisper-base.bin-127.0.0.1:39981): stderr whisper_init_state: kv cross size =   18.43 MB
9:23AM DBG GRPC(ggml-whisper-base.bin-127.0.0.1:39981): stderr whisper_init_state: compute buffer (conv)   =   14.86 MB
9:23AM DBG GRPC(ggml-whisper-base.bin-127.0.0.1:39981): stderr whisper_init_state: compute buffer (encode) =   85.99 MB
9:23AM DBG GRPC(ggml-whisper-base.bin-127.0.0.1:39981): stderr whisper_init_state: compute buffer (cross)  =    4.78 MB
9:23AM DBG GRPC(ggml-whisper-base.bin-127.0.0.1:39981): stderr whisper_init_state: compute buffer (decode) =   96.48 MB
9:23AM DBG GRPC(ggml-whisper-base.bin-127.0.0.1:39981): stderr
9:23AM DBG GRPC(ggml-whisper-base.bin-127.0.0.1:39981): stderr CUDA error 222 at ggml-cuda.cu:7796: the provided PTX was compiled with an unsupported toolchain.
9:23AM DBG GRPC(ggml-whisper-base.bin-127.0.0.1:39981): stderr current device: 0
9:23AM DBG GRPC(ggml-whisper-base.bin-127.0.0.1:39981): stderr GGML_ASSERT: ggml-cuda.cu:7796: !"CUDA error"
9:23AM DBG GRPC(ggml-whisper-base.bin-127.0.0.1:39981): stderr SIGABRT: abort
9:23AM DBG GRPC(ggml-whisper-base.bin-127.0.0.1:39981): stderr PC=0x7f046ee01ce1 m=0 sigcode=18446744073709551610
9:23AM DBG GRPC(ggml-whisper-base.bin-127.0.0.1:39981): stderr signal arrived during cgo execution
9:23AM DBG GRPC(ggml-whisper-base.bin-127.0.0.1:39981): stderr
9:23AM DBG GRPC(ggml-whisper-base.bin-127.0.0.1:39981): stderr rax    0x0
9:23AM DBG GRPC(ggml-whisper-base.bin-127.0.0.1:39981): stderr rbx    0x7f044cdea000
9:23AM DBG GRPC(ggml-whisper-base.bin-127.0.0.1:39981): stderr rcx    0x7f046ee01ce1
9:23AM DBG GRPC(ggml-whisper-base.bin-127.0.0.1:39981): stderr rdx    0x0
9:23AM DBG GRPC(ggml-whisper-base.bin-127.0.0.1:39981): stderr rdi    0x2
9:23AM DBG GRPC(ggml-whisper-base.bin-127.0.0.1:39981): stderr rsi    0x7ffd6ed59b80
9:23AM DBG GRPC(ggml-whisper-base.bin-127.0.0.1:39981): stderr rbp    0x7f026ce23310
9:23AM DBG GRPC(ggml-whisper-base.bin-127.0.0.1:39981): stderr rsp    0x7ffd6ed59b80
9:23AM DBG GRPC(ggml-whisper-base.bin-127.0.0.1:39981): stderr r8     0x0
9:23AM DBG GRPC(ggml-whisper-base.bin-127.0.0.1:39981): stderr r9     0x7ffd6ed59b80
9:23AM DBG GRPC(ggml-whisper-base.bin-127.0.0.1:39981): stderr r10    0x8
9:23AM DBG GRPC(ggml-whisper-base.bin-127.0.0.1:39981): stderr r11    0x246
9:23AM DBG GRPC(ggml-whisper-base.bin-127.0.0.1:39981): stderr r12    0xde
9:23AM DBG GRPC(ggml-whisper-base.bin-127.0.0.1:39981): stderr r13    0x7f02732ea600
9:23AM DBG GRPC(ggml-whisper-base.bin-127.0.0.1:39981): stderr r14    0x7f0273200000
9:23AM DBG GRPC(ggml-whisper-base.bin-127.0.0.1:39981): stderr r15    0x7f026a7dec00
9:23AM DBG GRPC(ggml-whisper-base.bin-127.0.0.1:39981): stderr rip    0x7f046ee01ce1
9:23AM DBG GRPC(ggml-whisper-base.bin-127.0.0.1:39981): stderr rflags 0x246
9:23AM DBG GRPC(ggml-whisper-base.bin-127.0.0.1:39981): stderr cs     0x33
9:23AM DBG GRPC(ggml-whisper-base.bin-127.0.0.1:39981): stderr fs     0x0
9:23AM DBG GRPC(ggml-whisper-base.bin-127.0.0.1:39981): stderr gs     0x0
[172.23.0.1]:54264 500 - POST /v1/audio/transcriptions

Additional context

EverThingy added the bug and unconfirmed labels on Jan 23, 2024
vshapenko commented
Looks like a CUDA version issue. Installing the latest CUDA version (12.3) and the latest drivers fixed it for me.

EverThingy (Author) commented Jan 23, 2024

Thanks! Indeed, installing the latest version of the CUDA Toolkit (12.3) resolved the issue.

Old: NVIDIA-SMI 525.147.05 | Driver Version: 525.147.05 | CUDA Version: 12.0
New: NVIDIA-SMI 545.23.08 | Driver Version: 545.23.08 | CUDA Version: 12.3
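
For future readers: CUDA error 222 (cudaErrorUnsupportedPtxVersion) is raised when the PTX embedded in a binary was produced by a CUDA toolkit newer than what the installed driver can JIT-compile, which matches the old 525.xx driver (CUDA 12.0) running a cuda12 container image above. As a quick sanity check before upgrading, a minimal standalone diagnostic along these lines (a sketch, not part of LocalAI; the file name and messages are made up) prints the two versions side by side:

// check_cuda_versions.cu — hypothetical diagnostic, not part of LocalAI.
// Build with: nvcc check_cuda_versions.cu -o check_cuda_versions
#include <cstdio>
#include <cuda_runtime.h>

int main() {
    int driverVersion = 0, runtimeVersion = 0;
    cudaDriverGetVersion(&driverVersion);   // highest CUDA version the installed driver supports
    cudaRuntimeGetVersion(&runtimeVersion); // CUDA runtime this binary was built against
    printf("driver supports CUDA %d.%d, runtime built for CUDA %d.%d\n",
           driverVersion / 1000, (driverVersion % 100) / 10,
           runtimeVersion / 1000, (runtimeVersion % 100) / 10);
    if (driverVersion < runtimeVersion)
        printf("driver is older than the runtime: PTX JIT can fail with error 222 (cudaErrorUnsupportedPtxVersion)\n");
    return 0;
}

The "CUDA Version" field that nvidia-smi prints is the same driver-supported version, so comparing it against the toolkit the container was built with (12.x for the cublas-cuda12 images) is an equivalent check without compiling anything.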
