w∆ | Winterdelta
voice-ai/swift is now uninterruptible:

- If the AI has started speaking, it will continue until it has finished speaking.
- This aids conversational fluency a little.
- The overall conversational experience should be less 'jittery': e.g. if there is ambient background noise, the assistant won't unexpectedly stop or start speaking.
- Easier to listen to, basically, and easier to chat with, overall.
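The uninterruptible behaviour above can be sketched as a small gate that VAD callbacks consult before cutting off playback. This is a minimal illustration, not the project's actual code; the class and method names (`SpeechGate`, `onUserSpeechStart`) are assumptions.

```typescript
// Minimal sketch of uninterruptible playback, assuming a VAD that fires
// a callback when it detects speech. While the assistant is speaking,
// detected speech (including misclassified ambient noise) is ignored
// rather than stopping playback mid-sentence.
class SpeechGate {
  private speaking = false;

  // Call when the assistant begins audio playback.
  startSpeaking(): void {
    this.speaking = true;
  }

  // Call when playback has fully finished.
  finishSpeaking(): void {
    this.speaking = false;
  }

  // VAD speech-start callback: returns true only if the assistant
  // should stop and listen to the user.
  onUserSpeechStart(): boolean {
    return !this.speaking;
  }
}
```

The point of the design is that interruption decisions are centralised in one place, so noisy VAD triggers can never pause playback halfway through a response.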
Swift is a fast AI voice assistant.
- Groq is used for fast inference of OpenAI Whisper (for transcription) and Meta Llama 3 (for generating the text response).
- Cartesia's Sonic voice model is used for fast speech synthesis, which is streamed to the frontend.
- VAD is used to detect when the user is talking, and to run callbacks on speech segments.
- The app is a Next.js project written in TypeScript and deployed to Vercel.
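The pipeline described above can be sketched as one orchestration function: transcribe the user's audio, generate a text reply, then synthesize speech. The function and type names here are illustrative assumptions; the three callbacks stand in for Whisper on Groq, Llama 3 on Groq, and Cartesia's Sonic, which are not shown.

```typescript
// Hedged sketch of the transcribe → generate → synthesize pipeline.
// The concrete API clients are injected as functions so the flow is
// visible without depending on any particular SDK.
type Transcribe = (audio: ArrayBuffer) => Promise<string>;
type Generate = (prompt: string) => Promise<string>;
type Synthesize = (text: string) => Promise<ArrayBuffer>;

async function handleUtterance(
  audio: ArrayBuffer,
  transcribe: Transcribe, // e.g. Whisper via Groq
  generate: Generate,     // e.g. Llama 3 via Groq
  synthesize: Synthesize, // e.g. Sonic via Cartesia
): Promise<{ transcript: string; reply: string; speech: ArrayBuffer }> {
  const transcript = await transcribe(audio);
  const reply = await generate(transcript);
  const speech = await synthesize(reply);
  return { transcript, reply, speech };
}
```

In the real app the synthesized audio is streamed to the frontend rather than returned as a single buffer; a buffer is used here only to keep the sketch self-contained.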
Thank you to the teams at Groq and Cartesia for providing access to their APIs for this demo!
- Clone the repository.
- Copy `.env.example` to `.env.local` and fill in the environment variables.
- Run `pnpm install` to install dependencies.
- Run `pnpm dev` to start the development server.
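The setup steps above amount to a short shell session. The repository URL is a placeholder (it is not stated in this document); substitute the project's actual Git URL.

```shell
# Quick-start sketch of the steps above.
# <repository-url> is a placeholder, not a real address.
git clone <repository-url> swift
cd swift
cp .env.example .env.local   # then fill in the environment variables
pnpm install                 # install dependencies
pnpm dev                     # start the Next.js development server
```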