Replies: 8 comments 18 replies
-
I don't yet have a GPU that can run LLMs, but looking at this model on Hugging Face, TheBloke changed how it works with AutoGPTQ so you no longer pass in a model_basename: https://huggingface.co/TheBloke/vicuna-13b-v1.3.0-GPTQ/discussions/3 Maybe this is the problem?
-
Just got the ESP32-S3-BOX-3. I know it's not yet supported, but just to confirm: the ESP32-S3-BOX firmware doesn't seem to work on it :)
-
If anyone wants to take a look, here is the serial output after flashing; the errors are in WILLOW/CONFIG and WILLOW/MAIN:
-
I reflashed with WAS and attached a serial monitor; here is the end of the log. You can see errors relating to the I2C LCD panel:
I'll also mention that the wake word doesn't seem to be triggering, but with the other hardware issues I won't worry about that for now :)
-
Generally speaking, with LLMs we'll be taking a different route shortly, which will eventually include removing LLM support from WIS. With WAS our intention is to insert WAS into the flow between Willow devices and WIS/HA/etc., with WAS applications/integrations enabling all kinds of interesting pipelining of combined functionality: Willow -> WAS -> WIS transcript -> LLM via the OpenAI API/vllm/lmdeploy/TGI/etc. with WAS -> potentially other things -> what we do today.
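To illustrate, that pipeline stage could be sketched as a small relay that wraps a WIS transcript in an OpenAI-style chat request and posts it to any OpenAI-compatible server (vLLM, TGI, lmdeploy, ...). This is my own illustration, not WAS code; the endpoint URL and function names are assumptions:

```python
import json
import urllib.request

def build_chat_request(transcript, model="vicuna-13b"):
    """Wrap a WIS transcript in an OpenAI-style chat completion payload."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": transcript}],
    }

def forward_transcript(transcript, endpoint="http://localhost:8000/v1/chat/completions"):
    """POST the transcript to any OpenAI-compatible server and return the reply text."""
    payload = json.dumps(build_chat_request(transcript)).encode()
    req = urllib.request.Request(
        endpoint, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]
```

The appeal of routing through an OpenAI-compatible API is that the same relay works whether the backend is a hosted service or a local vLLM/TGI instance.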
-
@paulgrove - If you want to test, I have a branch with several updates and fixes including chatbot support:
Oh, one additional thing - you will probably want to adjust your
-
@kristiankielhofner I finally decided to try this test branch. The chatbot is disabled even with `support_chatbot: bool = True` in the `custom_settings.py` configuration; it says that Device 0 is pre-Volta. The normal "Hi ESP" wake word and Home Assistant functions work the same as in the original WIS.
-
Nvidia has a handy reference showing the compute capability of their GPUs. Compute capability is what the code actually uses to configure/enable/disable functionality, as it's what gets returned in software. The CUDA Wikipedia page is also generally accurate, and it includes the microarchitecture friendly names that we use in WIS log messages to the user.
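For context, the pre-Volta gate mentioned above boils down to comparing the device's (major, minor) compute capability tuple against Volta's 7.0. A minimal sketch (the helper name is mine, not the actual WIS code):

```python
# Volta (e.g. V100) is compute capability 7.0; earlier architectures such as
# Pascal (6.x) lack the Tensor Core features some quantized-LLM kernels need.
VOLTA = (7, 0)

def is_pre_volta(capability):
    """capability is a (major, minor) tuple, e.g. from torch.cuda.get_device_capability(0)."""
    return capability < VOLTA

# In a CUDA environment one would query the device, roughly:
#   import torch
#   if is_pre_volta(torch.cuda.get_device_capability(0)):
#       print("Device 0 is pre-Volta; disabling chatbot support")
```

A 3090 (Ampere, compute capability 8.6) passes this check, so seeing a pre-Volta message on that card would suggest the wrong device is being selected.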
-
Hello Willow community, what a great project - I can't wait to get involved.
I've ordered an ESP32-S3-BOX-3, which should arrive in a few weeks (I couldn't find an ESP32-S3-BOX), and I'm super excited to get everything set up.
I'll run WAS on a Linux server, and for now I'll run WIS on my Windows desktop (with a 3090) under WSL2 + Docker.
Really hoping I can learn about this project and contribute.
I've got WIS working and have tested STT via the web interface; it works great and was super easy to set up!
However, I am struggling to configure the generative chatbot. With the default settings I get the following error:
I can confirm that the URL returns 401.
I had a search on huggingface and I found a very similarly named model called
TheBloke/vicuna-13b-v1.3.0-GPTQ
If I switch to this one I get another error:
I've also tried selecting a few other models, but I've yet to find one that works.
Can you advise how I can find and configure a compatible model from huggingface?
Thanks a million,
Paul