## What is it?

A tool that accepts an audio file of dictated notes, transcribes the file into text, and uses an LLM to create a summary.
## How would it work?
1. The user uploads an audio file.
2. A chunking function cuts the file into 30-second chunks (Whisper ASR processes audio in 30-second windows) and saves them to the filesystem.
3. A transcription function processes the chunks one at a time, passes each to Whisper ASR, and writes the transcript to a text file.
4. The finished transcript is passed to the summarisation function, which runs it through an LLM prompted with something like "Summarise these dictated notes in markdown format."
5. The finished transcript and summary are saved to the filesystem.
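The chunking step (2) could be sketched with only the standard library, assuming the upload is a WAV file — a real implementation would more likely use pydub or ffmpeg to handle other formats. The function name and chunk filenames here are illustrative, not decided:

```python
import os
import wave

CHUNK_SECONDS = 30  # Whisper works on 30-second windows of audio


def chunk_wav(path, out_dir, chunk_seconds=CHUNK_SECONDS):
    """Split a WAV file into fixed-length chunks saved to out_dir.

    A minimal stdlib-only sketch that assumes WAV input; handling
    arbitrary formats (m4a, mp3, ...) would need pydub/ffmpeg.
    """
    os.makedirs(out_dir, exist_ok=True)
    paths = []
    with wave.open(path, "rb") as src:
        params = src.getparams()
        frames_per_chunk = src.getframerate() * chunk_seconds
        index = 0
        while True:
            frames = src.readframes(frames_per_chunk)
            if not frames:
                break
            out_path = os.path.join(out_dir, f"chunk_{index:03d}.wav")
            with wave.open(out_path, "wb") as dst:
                dst.setnchannels(params.nchannels)
                dst.setsampwidth(params.sampwidth)
                dst.setframerate(params.framerate)
                dst.writeframes(frames)
            paths.append(out_path)
            index += 1
    return paths
```

The last chunk will simply be shorter than 30 seconds, which Whisper handles fine.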
## Tech stack
- Python (and Flask?)
- Whisper ASR (model run locally)
- An LLM for text summarisation (ChatGPT? I'd prefer to do this for free...)
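For the local-Whisper part of the stack, the openai-whisper package exposes `whisper.load_model(...)`, and `model.transcribe(path)` returns a dict with a `"text"` key. A sketch of the transcription loop, with the model passed in as a parameter so any Whisper-compatible object (or a test stub) can be used:

```python
from pathlib import Path


def transcribe_chunks(chunk_paths, model, out_path):
    """Transcribe each chunk in order and write one combined transcript.

    `model` is anything with a Whisper-style .transcribe(path) method
    returning a dict with a "text" key -- e.g. the result of
    whisper.load_model("base") from the openai-whisper package.
    """
    pieces = []
    for chunk in chunk_paths:
        result = model.transcribe(str(chunk))
        pieces.append(result["text"].strip())
    transcript = "\n".join(pieces)
    Path(out_path).write_text(transcript, encoding="utf-8")
    return transcript
```

Processing chunks strictly in order matters here, since the transcript is just the concatenation of per-chunk text.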
## Issues
- Using Python because I couldn't find any evidence that Whisper ASR can be run locally from Node, but there is a Python package for it.
- I don't know if there's an LLM I can use for free to do the summarisation.
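On the free-LLM question: locally run open models (served by tools such as Ollama or llama-cpp-python) cost nothing per call, so one option is to keep the model behind a plain callable and decide later. A sketch of the summarisation function with the completion call injected — the prompt wording comes from the plan above; everything else is an assumption:

```python
PROMPT = "Summarise these dictated notes in markdown format."


def summarise(transcript, complete):
    """Build the summarisation prompt and run it through `complete`.

    `complete` is any function mapping a prompt string to generated
    text -- for example a thin wrapper around a free local model
    (Ollama, llama-cpp-python) or a paid API if cost turns out to be
    acceptable. Keeping it injected defers that choice.
    """
    prompt = f"{PROMPT}\n\n{transcript}"
    return complete(prompt).strip()
```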
## Enhancements
- Generate tags from the summarised notes, and save the tagged summary to an Obsidian vault for future reference.
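The Obsidian enhancement mostly reduces to writing a markdown file with a YAML frontmatter block, which Obsidian reads as note properties. A sketch, where the vault path, note title, and the idea of getting `tags` from an extra LLM call are all assumptions:

```python
from pathlib import Path


def save_to_vault(vault_dir, title, summary, tags):
    """Write the tagged summary as a markdown note Obsidian can index.

    Obsidian treats a leading YAML frontmatter block as note
    properties, including tags. `tags` would come from a further LLM
    prompt such as "List 3-5 topic tags for these notes"; filenames
    and layout here are illustrative.
    """
    note = "---\ntags:\n"
    note += "".join(f"  - {tag}\n" for tag in tags)
    note += "---\n\n" + summary + "\n"
    path = Path(vault_dir) / f"{title}.md"
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(note, encoding="utf-8")
    return path
```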
## Proof of concept
There's a proof of concept of the file chunking and transcription parts of the programme in this Gist.