Support different help texts for normal and advanced users and restore legacy mode #1325
The legacy engine is still needed to get text attributes that are unsupported by the LSTM engine, and it also has better recognition rates for some texts. Signed-off-by: Stefan Weil <[email protected]>
Signed-off-by: Stefan Weil <[email protected]>
The old option --help now shows a very basic help text. The new option --help-extra shows the full help information, which now also includes a hint that Tesseract supports lists of images. Also fix the indentation in the PSM help and use a more neutral text in the OEM help. Signed-off-by: Stefan Weil <[email protected]>
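The split between a basic and a full help text described in this commit message could be sketched roughly as follows. This is only an illustration, not the code from this pull request: `HelpLevel`, `ParseHelpFlag`, `HelpText`, and the placeholder strings are all hypothetical names; only the option names `--help` and `--help-extra` come from the commit message.

```cpp
#include <string>

// Hypothetical help levels; these names are illustrative and are not
// taken from the Tesseract sources.
enum class HelpLevel { kNone, kBasic, kExtra };

// Map one command-line argument to the help level it requests.
// Any argument that is not a help option yields kNone.
HelpLevel ParseHelpFlag(const std::string& arg) {
  if (arg == "--help") return HelpLevel::kBasic;
  if (arg == "--help-extra") return HelpLevel::kExtra;
  return HelpLevel::kNone;
}

// Build the help text for a level. The extra help is the basic help
// plus additional sections, so normal users see a short text and
// advanced users see everything. The strings are placeholders.
std::string HelpText(HelpLevel level) {
  if (level == HelpLevel::kNone) return "";
  std::string text = "Basic usage (placeholder text).\n";
  if (level == HelpLevel::kExtra) {
    text += "Page segmentation modes (placeholder text).\n";
    text += "OCR engine modes (placeholder text).\n";
  }
  return text;
}
```

Building the extra text on top of the basic text keeps the two variants from drifting apart, at the cost of fixing the order of the sections.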
New help texts:
Signed-off-by: Stefan Weil <[email protected]>
My notes:
See issue #707 and also #1074 (comment) on the role of the legacy engine.
@stweil
The crash scenario arises because the traineddata files from tessdata_fast do not contain legacy models (at least for some languages).
For some languages, such as Hindi, Sanskrit, etc., this is intentional, as accuracy is much improved with the LSTM engine and model.
However, for other languages based on Latin script, the 'tesseract' (legacy) engine may provide better results (as you have mentioned).
I will test and post some specific crash scenarios for you. I am sure the crash can be avoided by checking for available models in the traineddata files. I will also add links to the issues where I have commented about this.
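The idea of avoiding the crash by checking which models a traineddata file provides could look roughly like the sketch below. It is an assumption-laden illustration, not Tesseract's API: `ModelInfo`, `Engine`, `EngineAvailable`, and `EngineError` are hypothetical names, and how the flags would be filled from a real traineddata file is left open.

```cpp
#include <string>

// Hypothetical summary of the models found in one traineddata file.
// How these flags are obtained from the file is not shown here.
struct ModelInfo {
  bool has_legacy;
  bool has_lstm;
};

// Hypothetical engine selector; not Tesseract's own enum.
enum class Engine { kLegacy, kLstm };

// Return true if the requested engine has a model in the traineddata.
bool EngineAvailable(Engine requested, const ModelInfo& info) {
  return requested == Engine::kLegacy ? info.has_legacy : info.has_lstm;
}

// Return an empty string if the engine can be used, otherwise an error
// message that the caller can print instead of running into a crash.
std::string EngineError(Engine requested, const ModelInfo& info) {
  if (EngineAvailable(requested, info)) return "";
  return requested == Engine::kLegacy
             ? "traineddata contains no legacy model"
             : "traineddata contains no LSTM model";
}
```

With a check like this done before initialization, requesting the legacy engine with an LSTM-only file would produce a clear error message rather than a crash.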
What about:
?
Fine with me. Adding "engine" to the descriptions does indeed look better. Do you want to send a pull request, or should I prepare one?
You.
I think the periods are unnecessary.
The new text was suggested by Amit Dovev, see tesseract-ocr#1325 (comment). Signed-off-by: Stefan Weil <[email protected]>
PrintHelpForPSM also has periods, and so do the other help texts. I'd keep them for now, but yes, it would also be possible to have a good help text without those periods.
The new text was suggested by Amit Dovev, see #1325 (comment). Signed-off-by: Stefan Weil <[email protected]>
'extra' => 'advanced'
This series partially reverts commit 173ad2b.