Segments a wav file into several smaller audio clips using an accompanying .srt closed captioning file.
usage: srt-parse [-h] [--output-dir OUTPUT_DIR]
                 [--audio-out-file-pattern AUDIO_OUT_FILE_PATTERN]
                 [--text-out-file-pattern TEXT_OUT_FILE_PATTERN]
                 [--output-type {txt,csv}] [--csv-seperator CSV_SEPERATOR]
                 [--csv-filename CSV_FILENAME]
                 [--update-increment UPDATE_INCREMENT]
                 [--in-encoding IN_ENCODING] [--out-encoding OUT_ENCODING]
                 audio_input srt_input

Segment wav files according to a provided .srt closed caption file

positional arguments:
  audio_input           Location of .wav file to be processed
  srt_input             Location of .srt file to be processed

optional arguments:
  -h, --help            show this help message and exit
  --output-dir OUTPUT_DIR
                        Directory for processed files to be saved to
  --audio-out-file-pattern AUDIO_OUT_FILE_PATTERN
                        A python-style f-string for saving audio files
  --text-out-file-pattern TEXT_OUT_FILE_PATTERN
                        A python-style f-string for saving text files
  --output-type {txt,csv}
                        Output filetype
  --csv-seperator CSV_SEPERATOR
                        Character sequence used to seperate values in csv
  --csv-filename CSV_FILENAME
                        Name of file to write as csv
  --update-increment UPDATE_INCREMENT
                        Print progress after every specified amount of
                        segments.
  --in-encoding IN_ENCODING
                        Encoding used to read the .srt file
  --out-encoding OUT_ENCODING
                        Encoding to use when writing text data to file
Using srt-parse:
python3 srt-parse.py foo.wav foo.srt
will produce the following files in the output directory (by default .\out\):
0-audio.wav
1-audio.wav
2-audio.wav
3-audio.wav
...
out.csv
One audio file is created for each subtitle in the .srt file, and out.csv maps each audio file to its transcript.
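
To make the behaviour above concrete, here is a minimal sketch of the general technique using only the Python standard library. It is not the actual srt-parse implementation: the function names `parse_srt` and `segment` are hypothetical, and the two-column CSV layout (file name, transcript) is an assumption. Only the `N-audio.wav` and `out.csv` names are taken from the default output listed above.

```python
import csv
import re
import wave
from pathlib import Path

# Matches an SRT timing line such as "00:00:01,000 --> 00:00:04,500".
TIMING = re.compile(
    r"(\d+):(\d{2}):(\d{2})[,.](\d{3})\s*-->\s*(\d+):(\d{2}):(\d{2})[,.](\d{3})"
)


def parse_srt(text):
    """Return a list of (start_seconds, end_seconds, caption) tuples."""
    cues = []
    text = text.replace("\r\n", "\n").strip()
    # Subtitle blocks are separated by blank lines.
    for block in re.split(r"\n\s*\n", text):
        lines = block.strip().splitlines()
        for i, line in enumerate(lines):
            match = TIMING.search(line)
            if match:
                h1, m1, s1, ms1, h2, m2, s2, ms2 = map(int, match.groups())
                start = h1 * 3600 + m1 * 60 + s1 + ms1 / 1000
                end = h2 * 3600 + m2 * 60 + s2 + ms2 / 1000
                # Everything after the timing line is the caption text.
                caption = " ".join(part.strip() for part in lines[i + 1:])
                cues.append((start, end, caption))
                break
    return cues


def segment(audio_path, srt_path, out_dir="out",
            in_encoding="utf-8", out_encoding="utf-8"):
    """Write one wav clip per subtitle plus an out.csv index (assumed layout)."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    cues = parse_srt(Path(srt_path).read_text(encoding=in_encoding))

    with wave.open(str(audio_path), "rb") as src, \
            open(out / "out.csv", "w", newline="", encoding=out_encoding) as f:
        writer = csv.writer(f)
        params = src.getparams()
        rate = src.getframerate()
        for index, (start, end, caption) in enumerate(cues):
            # Convert subtitle times to frame offsets in the source file.
            first = int(start * rate)
            count = max(0, int(end * rate) - first)
            src.setpos(min(first, src.getnframes()))
            frames = src.readframes(count)

            name = f"{index}-audio.wav"
            with wave.open(str(out / name), "wb") as dst:
                # Reuse the source format; the frame count in the header
                # is corrected when the clip is closed.
                dst.setparams(params)
                dst.writeframes(frames)
            writer.writerow([name, caption])
    return len(cues)
```

Calling `segment("foo.wav", "foo.srt")` under these assumptions would write the clips and `out.csv` into `out`, one row per subtitle.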