Warning
I no longer use Hydrus Network. I'll do my best to fix critical bugs, but I won't be adding new features.
This project runs OCR on images located in Hydrus Network using an external daemon and a third-party library.
Caution
I am not liable if this destroys your data. Make backups regularly.
- Create a tag service in Hydrus. It can be called whatever you like, but we recommend `ocr` so you remember what it's for. Save the service key for later.
- Enable the client API.
- Create a client API access key (documented above). Give it the `edit file notes`, `edit file tags`, and `search for and fetch files` permissions. Save the access key for later.
- Install `hydrus-ocr` and its Python dependencies with `pip install https://github.com/tomodachi94/hydrus-ocr/releases/download/v0.2.0/hydrus_ocr-0.2.0-py3-none-any.whl`.
- Install either `tesseract`/`libtesseract` or `cuneiform` and ensure it is available on your `$PATH`.
- Copy `env.example` to `.env` (or to another place where you can set environment variables) and fill in the values.
- Run the daemon using `python3 -m hydrus_ocr daemon`. If you want to get fancy, you can configure it to start up automatically with `systemd`, but that is outside of the scope of these docs.
  - If you only want to run this once (e.g. for running this with `cron`), run `python3 -m hydrus_ocr singular`.
- Select a file (or a bunch of files!) and right-click them. Select `manage > tags`, select `ocr` (or the name you selected for the tag service), and add the `ocr wanted` tag to the file(s). Apply the changes.
- Wait for the daemon to do its job. Depending on the number of files queued, it could take a bit to OCR the files.
- Profit. Check the notes for the file; look for a note titled `ocr`.
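The difference between the `daemon` and `singular` modes in the steps above can be pictured as a polling loop. This is only a minimal sketch under stated assumptions, not the project's actual code: `run_once`, `run_daemon`, the `process_queue` callback, and the `max_cycles` and `sleep` parameters are all hypothetical names introduced for illustration, and the 10-second delay simply mirrors the default check interval this README describes.

```python
import time
from typing import Callable, Optional


def run_once(process_queue: Callable[[], int]) -> int:
    """Process whatever is queued right now and return how many files were handled."""
    return process_queue()


def run_daemon(
    process_queue: Callable[[], int],
    loop_delay: float = 10.0,
    max_cycles: Optional[int] = None,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Call process_queue repeatedly, pausing loop_delay seconds after each check.

    max_cycles and sleep are hypothetical knobs so the loop can be bounded
    and observed in tests; a long-running daemon would leave them at their
    defaults and run until interrupted.
    """
    total = 0
    cycles = 0
    while max_cycles is None or cycles < max_cycles:
        total += run_once(process_queue)
        cycles += 1
        sleep(loop_delay)
    return total
```

Under this picture, a `cron` job would correspond to a single `run_once` call per invocation, while the daemon keeps looping with a delay between checks.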
This program is configured entirely through environment variables. Here's what they do:
- `HYDRUS_OCR_ACCESS_KEY`: The access key for the client API. This is a long hexadecimal string.
- `HYDRUS_OCR_API_URL`: The base URL for the client API. This looks like `http://localhost:45869` by default.
- `HYDRUS_OCR_TAG_SERVICE_KEY`: The service key for the tag service. This is a long hexadecimal string.
- `HYDRUS_OCR_LOOP_DELAY`: This controls the frequency at which the program checks for files to OCR. The default value causes a check every 10 seconds; increase or decrease depending on how many requests your Hydrus server can handle at once.
- `HYDRUS_OCR_LANGUAGE`: The language to OCR the text in (defaults to English). See the Tesseract documentation for a full list of languages. Make sure to install the language(s) you want if it isn't available by default. Multiple languages are supported by separating each with a plus (like `eng+deu+jpn`).
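To make the variables above concrete, here is a minimal sketch of how they could be read and validated. It is not the program's own loader: the `Config` class and `load_config` function are hypothetical, and the sketch assumes that the loop delay is expressed in seconds, that `eng` is the code used for the English default, and that the access key and tag service key are the only required values.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Tuple

REQUIRED = ("HYDRUS_OCR_ACCESS_KEY", "HYDRUS_OCR_TAG_SERVICE_KEY")


@dataclass(frozen=True)
class Config:
    access_key: str
    tag_service_key: str
    api_url: str
    loop_delay: float
    languages: Tuple[str, ...]


def load_config(env: Mapping[str, str] = os.environ) -> Config:
    """Build a Config from environment-style key/value pairs."""
    # Assumption: only the two keys are mandatory; everything else has a default.
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise ValueError("missing required settings: " + ", ".join(missing))

    # Assumption: "eng" stands in for the documented English default.
    language = env.get("HYDRUS_OCR_LANGUAGE", "eng")
    return Config(
        access_key=env["HYDRUS_OCR_ACCESS_KEY"],
        tag_service_key=env["HYDRUS_OCR_TAG_SERVICE_KEY"],
        api_url=env.get("HYDRUS_OCR_API_URL", "http://localhost:45869").rstrip("/"),
        # Assumption: the delay is given in seconds, matching the 10-second default.
        loop_delay=float(env.get("HYDRUS_OCR_LOOP_DELAY", "10")),
        # "eng+deu+jpn" becomes ("eng", "deu", "jpn").
        languages=tuple(language.split("+")),
    )
```

Passing a plain dictionary instead of `os.environ` makes it easy to try different combinations without touching the real environment.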
This is a glossary of all possible user-caused errors.
The program couldn't find Tesseract or Cuneiform. See § Installation for more information.
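If you hit this error, a quick way to see what is visible on your `$PATH` is a small check like the one below. It is a sketch rather than the lookup the program itself performs: `find_ocr_engine` and `OCR_ENGINES` are hypothetical names, the executable names `tesseract` and `cuneiform` are assumed from the installation steps, and only executables are checked (not the `libtesseract` library).

```python
import shutil
from typing import Callable, Optional

# Assumption: these are the executable names installed by your package manager.
OCR_ENGINES = ("tesseract", "cuneiform")


def find_ocr_engine(
    which: Callable[[str], Optional[str]] = shutil.which,
) -> Optional[str]:
    """Return the path of the first OCR engine found on $PATH, or None."""
    for name in OCR_ENGINES:
        path = which(name)
        if path:
            return path
    return None


if __name__ == "__main__":
    found = find_ocr_engine()
    print(found or "No OCR engine found on $PATH")
```

If this prints a path but the error persists, the shell that runs the daemon (for example under `cron` or `systemd`) may have a different `$PATH` from your interactive shell.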
The program couldn't find the client API access key and/or the tag service key. See § Configuration for more information.
The changelog is maintained in `./CHANGELOG.md`.
You shouldn't. You should read the source code yourself. I've tried to make the code as easy-to-read as possible, with docstrings for all (internal) functions and comments for ambiguous lines of code.
I used Hydrus to store a large repository of screenshots of chat logs. I wanted to find a way to search their text, and this is the result.
This program uses Tesseract to do most of the heavy lifting. Tesseract is notoriously bad at OCRing specific types of images, as well as images of lower quality.
Aside from the fact that this would likely be rejected in a PR, OCR can be a resource-intensive operation, and I didn't want to risk the stability of my Hydrus application.