macSubtitleOCR is a tool written entirely in Swift that converts bitmap subtitles into the SubRip subtitle format (SRT) using Optical Character Recognition (OCR). It currently supports both PGS and VobSub bitmap subtitles. The tool utilizes the built-in macOS OCR engine, offering highly accurate text recognition.
For more details on performance, refer to the Accuracy section below.
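To make the target format concrete, here is a minimal sketch of how one recognized subtitle could be written as an SRT entry. The `Cue` type and the `pad`, `srtTimestamp`, and `srtBlock` helpers are hypothetical names chosen for illustration and are not taken from the macSubtitleOCR source. The only thing assumed about the tool is that it emits standard SubRip entries: an index line, a `HH:MM:SS,mmm --> HH:MM:SS,mmm` timing line, then the text.

```swift
// Hypothetical model of one subtitle; not the tool's internal type.
struct Cue {
    let index: Int
    let start: Double  // seconds from the beginning of the stream
    let end: Double    // seconds from the beginning of the stream
    let text: String
}

// Left-pads a non-negative integer with zeros to the given width.
func pad(_ value: Int, _ width: Int) -> String {
    let digits = String(value)
    return String(repeating: "0", count: max(0, width - digits.count)) + digits
}

// Formats seconds as an SRT timestamp (HH:MM:SS,mmm).
func srtTimestamp(_ seconds: Double) -> String {
    let totalMs = Int((seconds * 1000).rounded())
    let ms = totalMs % 1000
    let s = (totalMs / 1000) % 60
    let m = (totalMs / 60_000) % 60
    let h = totalMs / 3_600_000
    return "\(pad(h, 2)):\(pad(m, 2)):\(pad(s, 2)),\(pad(ms, 3))"
}

// Builds the index line, timing line, and text line of one SRT entry.
func srtBlock(_ cue: Cue) -> String {
    "\(cue.index)\n\(srtTimestamp(cue.start)) --> \(srtTimestamp(cue.end))\n\(cue.text)\n"
}

let cue = Cue(index: 1, start: 1.0, end: 2.5, text: "Hello, world!")
print(srtBlock(cue))
```

In an SRT file, consecutive entries are separated by a blank line.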
- Export .png images of subtitles for manual correction of OCR output.
- Use the macOS OCR engine's language recognition feature to enhance accuracy by validating character sequences as real words.
- Export raw JSON output from the OCR engine for further analysis.
- Experimental internal decoder for development (mostly working, VobSub gives occasional errors)
- PGS (.mkv, .sup)
- VobSub (.sub, .idx)
Important
This project requires Swift 6 to compile and run correctly. It also requires FFmpeg to be installed on your system. Currently only arm64 is supported; PRs adding support for other architectures are welcome.
To build macSubtitleOCR, follow these steps:
brew install ffmpeg
git clone https://github.com/ecdye/macSubtitleOCR
cd macSubtitleOCR
swift build
The compiled build will be available in the .build/debug directory.
The testing process compares OCR output against known correct results. Because slight differences may occur between machines, the tests aim for at least 95% accuracy rather than an exact match.
swift test
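The idea of a threshold-based comparison can be sketched as follows. This is an illustration only and assumes a character-level metric (Levenshtein edit distance, normalized by the longer string); the project's actual test suite may measure accuracy differently. `editDistance` and `similarity` are hypothetical helper names, and the sample strings are made up.

```swift
// Hypothetical helper, not taken from the macSubtitleOCR test suite.
// Computes the Levenshtein edit distance between two strings.
func editDistance(_ a: String, _ b: String) -> Int {
    let s = Array(a), t = Array(b)
    if s.isEmpty { return t.count }
    if t.isEmpty { return s.count }
    // previous[j] holds the distance between s[0..<i-1] and t[0..<j].
    var previous = Array(0...t.count)
    for i in 1...s.count {
        var current = [i] + Array(repeating: 0, count: t.count)
        for j in 1...t.count {
            let cost = s[i - 1] == t[j - 1] ? 0 : 1
            current[j] = min(previous[j] + 1,        // deletion
                             current[j - 1] + 1,     // insertion
                             previous[j - 1] + cost) // substitution
        }
        previous = current
    }
    return previous[t.count]
}

// Similarity in 0...1, where 1 means the strings are identical.
func similarity(_ ocr: String, _ expected: String) -> Double {
    let longest = max(ocr.count, expected.count)
    if longest == 0 { return 1.0 }
    return 1.0 - Double(editDistance(ocr, expected)) / Double(longest)
}

// Made-up sample: the OCR output confuses a capital 'I' with a lowercase 'l'.
let expected = "I think this subtitle is perfectly fine."
let ocrOutput = "l think this subtitle is perfectly fine."
let score = similarity(ocrOutput, expected)
print(score >= 0.95 ? "pass" : "fail")
```

A tolerance like this lets a test accept a single misread character in a long line while still failing on output that is substantially wrong.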
In tests comparing macSubtitleOCR with the Tesseract OCR engine, the macOS OCR engine often outperforms Tesseract, particularly with challenging cases like the letter 'I'. While methods like binary image comparison, used by tools such as SubtitleEdit, may offer slightly better accuracy in some cases, the macOS OCR engine provides excellent results for most use cases.
For information on how to contribute to the project, please refer to CONTRIBUTING.md.
If you're interested in working on specific features or improvements, check out issues tagged as enhancements.