program shutdown in tesseract step due to language files #45

Bacchushlg · 2018-01-22T09:54:53Z

The tesseract step outputs errors:

Failed loading language 'fra'
Failed loading language 'eng'
read_params_file: parameter not found:

After this Audiveris shuts down with no additional errors.

I have tried two setups: putting the language files in a directory and setting TESSDATA_PREFIX to that directory, or creating the directory "C:\Program Files (x86)\tesseract-ocr\tessdata" and putting the language files there. I get the error in both cases.
The language files date from 15.01.2018.

hbitteur · 2018-01-22T10:47:00Z

These error messages don't originate from the Audiveris Java code, so we can assume they come from the Tesseract C++ binary code.

IIRC there is a caveat with TESSDATA_PREFIX. It is not meant to point to the language file directory itself, but rather to the directory that contains the tessdata directory, which in turn contains your language files. Or something like that :-)

For example, see this issue tesseract-ocr/tesseract#221
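
The layout described above can be checked with a small script. This is only a minimal sketch, assuming the `<prefix>/tessdata/<lang>.traineddata` layout as recalled in this comment; the helper name `find_traineddata` is hypothetical and is not part of Audiveris or Tesseract, and the expected layout may differ between Tesseract versions.

```python
import os
from pathlib import Path


def find_traineddata(prefix, languages):
    """Map each language code to its traineddata path, or None if missing.

    Assumption: language files live in <prefix>/tessdata/<lang>.traineddata.
    """
    tessdata = Path(prefix) / "tessdata"
    result = {}
    for lang in languages:
        candidate = tessdata / f"{lang}.traineddata"
        result[lang] = candidate if candidate.is_file() else None
    return result


if __name__ == "__main__":
    prefix = os.environ.get("TESSDATA_PREFIX")
    if prefix is None:
        print("TESSDATA_PREFIX is not set")
    else:
        for lang, path in find_traineddata(prefix, ["eng", "fra"]).items():
            print(lang, "->", path if path else "missing")
```

If TESSDATA_PREFIX points directly at the folder holding the language files, this check reports every language as missing, which matches the caveat described above.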

maximumspatium · 2018-01-22T11:25:23Z

@Bacchushlg
Some more info would be helpful. Which Tesseract version is shown at Audiveris' startup?
Which Java/OS are you running?

Bacchushlg · 2018-01-22T11:31:39Z

Audiveris: 5.0.0:743f229a9
OS: Windows 10 10.0
Architecture: amd64
Java VM: Java HotSpot(TM) 64-Bit Server VM (build 25.152-b16, mixed mode)
OCR Engine: Tesseract OCR, version 3.04.01

Just one more question: I have just the 4 language files in the tessdata directory: deu, eng, fra and ita.traineddata
Should there be more?

maximumspatium · 2018-01-22T11:51:15Z

@Bacchushlg Thanks!

Please give me a detailed description of how you installed the Tesseract language files (where they are located, where they came from, etc.).

I have just the 4 language files in the tessdata directory: deu, eng, fra and ita.traineddata
Should there be more?

It depends on your targets. In the minimal configuration, only eng.traineddata + config files are required. For every further language in your scores, you'll need to add an appropriate OCR language.
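
As a rough illustration of this rule, the sketch below lists which of the languages you need are not yet present in a tessdata directory. It is a hedged example only: `missing_languages` is a hypothetical helper, it looks only at `*.traineddata` file names, and it does not check the config files or whether the files match the installed Tesseract version.

```python
from pathlib import Path


def missing_languages(tessdata_dir, needed):
    """Return the needed language codes that have no <lang>.traineddata file."""
    directory = Path(tessdata_dir)
    if not directory.is_dir():
        # Nothing is installed if the directory itself does not exist.
        return sorted(set(needed))
    installed = {p.stem for p in directory.glob("*.traineddata")}
    return sorted(set(needed) - installed)
```

For a score with English and Spanish lyrics, one would call `missing_languages(tessdata_dir, ["eng", "spa"])` and download whatever codes it returns.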

Bacchushlg · 2018-01-22T12:12:36Z

I found the reason for my problem: I had downloaded the latest .traineddata files, which only work with Tesseract version 4, while Audiveris uses version 3.
I downloaded the correct ones now and it works fine.
Just one more question: is there some documentation about the GUI of audiveris? I don't understand some features; in particular, I don't understand how to train elements (e.g. make audiveris understand that a certain "3" belongs to a triplet).

maximumspatium · 2018-01-22T12:33:20Z

I downloaded the correct ones now and it works fine.

I'm glad you solved your problem!

is there some documentation about the GUI of audiveris?

Currently no, but we'll add one very soon because v5.1 is about to be released.

I don't understand how to train elements (e.g. make audiveris understand that a certain "3" belongs to a triplet).

Left-click on your "3" in the score, go to the "Shape" palette in the panel on the right, click on the pedal mark (𝆮), then double-click on the TUPLET_THREE symbol. With a bit of luck, your symbol will be converted to the desired tuplet...

maximumspatium · 2018-01-22T12:34:20Z

I'll close this issue because the original problem has been solved.

hbitteur · 2018-01-22T13:13:24Z

@Bacchushlg
Writing documentation for the UI provided by the upcoming 5.1 release stands high on our to-do list, right after fixing some hot issues like the popup menu on macOS (see #2), or lyric lines (see #44), etc. (8 issues as of today).
I'm closing yours since Tesseract is now OK for you. So, that's 7 issues left.

Doc should be available very shortly. Keep in mind however, that the training of classifiers is a bit more complex than plain end-user actions, but we'll address it as well.

maximumspatium added the help wanted label Jan 22, 2018

maximumspatium closed this as completed Jan 22, 2018
