Detect text rotation without running recognition #3836
Comments
For such image preprocessing, I would suggest having a look at the Leptonica program/function examples flipdetect_reg, skewtest, skew_reg, and maybe dewarptest2. Of course there are limitations (see e.g. issue 622), but they are fast and reliable for most of my cases. IMHO such preprocessing should be done outside of Tesseract. |
Thanks for your response, I will review the Leptonica scripts linked before deciding how to implement. |
I found a much, much faster solution to detect page rotation. Call SetImage followed by DetectOrientationScript, and then call Pix *rotated = pixRotateOrth(pix, (360 - degree) / 90); However, there is currently a bug that causes this to fail randomly, so you need my short patch from #4062. |
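A minimal sketch of the quadrant arithmetic in the comment above, assuming `degree` is the `orient_deg` value that `DetectOrientationScript` writes (0, 90, 180, or 270). The helper name `QuadsForOrientation` is hypothetical and is not part of Tesseract or Leptonica. As far as I know, `pixRotateOrth` only accepts a quadrant count from 0 to 3, so the sketch wraps the result with `% 4`; without that, an upright page (0 degrees) would yield `(360 - 0) / 90 = 4`.

```cpp
// Hypothetical helper, not a Tesseract or Leptonica function.
// Converts a detected orientation in degrees into the number of
// 90-degree steps to pass as the second argument of pixRotateOrth.
int QuadsForOrientation(int orient_deg) {
    // Normalize into [0, 360) so negative or >= 360 inputs are handled.
    int normalized = ((orient_deg % 360) + 360) % 360;
    // Same formula as the comment, wrapped so 0 degrees maps to 0
    // quadrants instead of 4.
    return ((360 - normalized) / 90) % 4;
}
```

With this helper, the call from the comment would read `pixRotateOrth(pix, QuadsForOrientation(degree))`.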
That is the API to rotate an image, but not the API to detect if it is rotated. Tesseract docs and some StackOverflow comments recommend Recognize(), but that is extremely slow. On a sample TIFF I used, it took 0.9 seconds for DetectOrientationScript vs 2.1 seconds for Recognize, when both were followed by a 90-degree rotation and another Recognize to extract text. |
@todd-richmond, you are talking about orientation detection: 0 / 90 / 180 / 270 degrees. @Balearica is talking about a page with some parts that are skewed |
Never mind. I missed the "not" 90 when reading. De-skewing is much more challenging so we haven't bothered dealing with that for now |
Did you try using tesseract/include/tesseract/baseapi.h Lines 433 to 449 in bf7c134
|
@amitdo I did not end up implementing this way, but do believe that running I ended up creating a branch that allows for retrieving the number Tesseract already calculates, which I pushed to #4070. I think this is the most direct approach, and the only approach that does not involve redundant calculations. |
As noted in the documentation, Tesseract performs poorly when the page is at an angle (not a multiple of 90 degrees). This limitation is not problematic from an accuracy standpoint: Tesseract accurately reports the angle of text lines, so my existing pipeline rotates and re-runs recognition on any image where the angle is significant. However, this is computationally inefficient, as there does not appear to be any way to get the page angle without also running recognition (despite the page angle/gradient estimate being one of the first things calculated).
Therefore, it would be of significant benefit to be able to get the page angle without running the entire recognition process. I'll work on a build that does this myself. My initial thought is to add a config option that tells Tesseract to report the page angle and quit early (before recognition) if the median line angle is above a user-defined threshold. Let me know if others have thoughts on implementation.
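The early-exit rule proposed above could look roughly like the following sketch. It is not Tesseract code: the function names, the input vector of per-line angles, and the threshold parameter are all assumptions made for illustration. It takes the median of the per-line angles (in degrees) and reports whether its magnitude exceeds the user-defined threshold.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper: median of per-line angles in degrees.
// Takes the vector by value so the caller's data is not reordered.
// Returns 0.0 for an empty input (treated as "no measurable skew").
double MedianLineAngle(std::vector<double> angles_deg) {
    if (angles_deg.empty()) return 0.0;
    std::sort(angles_deg.begin(), angles_deg.end());
    const std::size_t n = angles_deg.size();
    if (n % 2 == 1) return angles_deg[n / 2];
    return (angles_deg[n / 2 - 1] + angles_deg[n / 2]) / 2.0;
}

// Hypothetical decision rule: stop before recognition when the median
// line angle exceeds the threshold in either direction.
bool ShouldStopBeforeRecognition(const std::vector<double>& angles_deg,
                                 double threshold_deg) {
    return std::fabs(MedianLineAngle(angles_deg)) > threshold_deg;
}
```

The median is used rather than the mean so that a few badly estimated lines do not trigger or suppress the early exit on their own.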