I'm using tesseract to perform OCR on a document that contains an email address. With the eng trained data, it recognizes the email address correctly (but naturally fails on all accented characters). When I switch to the por trained data, it picks up all Portuguese characters correctly, but fails to recognize the @ in email addresses.
Is there any additional configuration needed for these special characters?
Thank you!
That's correct. See por.unicharset and tur.unicharset, neither of which contains the @ character. That character was not part of the training data, so those models will never recognize it.
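To check this for yourself, a rough sketch like the one below can tell you whether a given character is listed in a unicharset file. It assumes the usual unicharset text layout (a first line with the entry count, then one entry per line whose first space-separated field is the character) and that you have already extracted the file from the traineddata, for example with the `combine_tessdata -u` tool. The helper names and the toy sample contents are made up for illustration and are not taken from the real por.unicharset.

```python
def unicharset_chars(text: str) -> set:
    """Collect the characters listed in the text of a unicharset file.

    Assumes line 0 is the entry count and every later line starts with
    the character, followed by a space and its property columns.
    """
    chars = set()
    for line in text.splitlines()[1:]:
        first_field = line.split(" ", 1)[0]
        if first_field:
            chars.add(first_field)
    return chars


def supports_char(text: str, char: str) -> bool:
    """Return True if `char` appears as an entry in the unicharset text."""
    return char in unicharset_chars(text)


# Toy contents with simplified property columns (hypothetical, for illustration).
SAMPLE_WITHOUT_AT = "\n".join(["3", "NULL 0", "a 3", "ã 3"])
SAMPLE_WITH_AT = "\n".join(["4", "NULL 0", "a 3", "ã 3", "@ 10"])

if __name__ == "__main__":
    print(supports_char(SAMPLE_WITHOUT_AT, "@"))
    print(supports_char(SAMPLE_WITH_AT, "@"))
```

To run it against a real model, read the extracted file with `open(path, encoding="utf-8").read()` and pass the result to `supports_char`.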