## What is it?

A tool that accepts an audio file of dictated notes, transcribes the file into text, and uses an LLM to create a summary.
## How would it work?
1. The user uploads an audio file.
2. A chunking function cuts the file into 30-second chunks (Whisper ASR processes audio in 30-second windows) and saves them to the filesystem.
3. A transcription function processes the chunks one at a time, passes each to Whisper ASR, and writes the transcript to a text file.
4. The finished transcript is passed to the summarisation function, which runs it through an LLM prompted with something like "Summarise these dictated notes in markdown format."
5. The finished transcript and summary are saved to the filesystem.
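The chunking step (2) could be sketched with only the standard library, assuming the upload is a WAV file — a real implementation would more likely use pydub or ffmpeg to handle other formats. The function name and chunk filenames here are illustrative, not decided:

```python
import os
import wave

CHUNK_SECONDS = 30  # Whisper works on 30-second windows of audio


def chunk_wav(path, out_dir, chunk_seconds=CHUNK_SECONDS):
    """Split a WAV file into fixed-length chunks saved to out_dir.

    A minimal stdlib-only sketch that assumes WAV input; handling
    arbitrary formats (m4a, mp3, ...) would need pydub/ffmpeg.
    """
    os.makedirs(out_dir, exist_ok=True)
    paths = []
    with wave.open(path, "rb") as src:
        params = src.getparams()
        frames_per_chunk = src.getframerate() * chunk_seconds
        index = 0
        while True:
            frames = src.readframes(frames_per_chunk)
            if not frames:
                break
            out_path = os.path.join(out_dir, f"chunk_{index:03d}.wav")
            with wave.open(out_path, "wb") as dst:
                dst.setnchannels(params.nchannels)
                dst.setsampwidth(params.sampwidth)
                dst.setframerate(params.framerate)
                dst.writeframes(frames)
            paths.append(out_path)
            index += 1
    return paths
```

The last chunk will simply be shorter than 30 seconds, which Whisper handles fine.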
## Tech stack
- Python (and Flask?)
- Whisper ASR (model run locally)
- An LLM for text summarisation (ChatGPT? I'd prefer to do this for free...)
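For the local-Whisper part of the stack, the openai-whisper package exposes `whisper.load_model(...)`, and `model.transcribe(path)` returns a dict with a `"text"` key. A sketch of the transcription loop, with the model passed in as a parameter so any Whisper-compatible object (or a test stub) can be used:

```python
from pathlib import Path


def transcribe_chunks(chunk_paths, model, out_path):
    """Transcribe each chunk in order and write one combined transcript.

    `model` is anything with a Whisper-style .transcribe(path) method
    returning a dict with a "text" key -- e.g. the result of
    whisper.load_model("base") from the openai-whisper package.
    """
    pieces = []
    for chunk in chunk_paths:
        result = model.transcribe(str(chunk))
        pieces.append(result["text"].strip())
    transcript = "\n".join(pieces)
    Path(out_path).write_text(transcript, encoding="utf-8")
    return transcript
```

Processing chunks strictly in order matters here, since the transcript is just the concatenation of per-chunk text.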
## Issues
- Using Python because I couldn't find any evidence that Whisper ASR can be run locally from Node, but there is a Python package for it.
- I don't know if there's an LLM I can use for free to do the summarisation.
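On the free-LLM question: locally run open models (served by tools such as Ollama or llama-cpp-python) cost nothing per call, so one option is to keep the model behind a plain callable and decide later. A sketch of the summarisation function with the completion call injected — the prompt wording comes from the plan above; everything else is an assumption:

```python
PROMPT = "Summarise these dictated notes in markdown format."


def summarise(transcript, complete):
    """Build the summarisation prompt and run it through `complete`.

    `complete` is any function mapping a prompt string to generated
    text -- for example a thin wrapper around a free local model
    (Ollama, llama-cpp-python) or a paid API if cost turns out to be
    acceptable. Keeping it injected defers that choice.
    """
    prompt = f"{PROMPT}\n\n{transcript}"
    return complete(prompt).strip()
```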
## Enhancements
- Generate tags from the summarised notes, and save the tagged summary to an Obsidian vault for future reference.
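The Obsidian enhancement mostly reduces to writing a markdown file with a YAML frontmatter block, which Obsidian reads as note properties. A sketch, where the vault path, note title, and the idea of getting `tags` from an extra LLM call are all assumptions:

```python
from pathlib import Path


def save_to_vault(vault_dir, title, summary, tags):
    """Write the tagged summary as a markdown note Obsidian can index.

    Obsidian treats a leading YAML frontmatter block as note
    properties, including tags. `tags` would come from a further LLM
    prompt such as "List 3-5 topic tags for these notes"; filenames
    and layout here are illustrative.
    """
    note = "---\ntags:\n"
    note += "".join(f"  - {tag}\n" for tag in tags)
    note += "---\n\n" + summary + "\n"
    path = Path(vault_dir) / f"{title}.md"
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(note, encoding="utf-8")
    return path
```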
## Proof of concept
There's a proof of concept of the file chunking and transcription parts of the programme in this Gist.