Problems with setting the "classify_bln_numeric_mode" variable #52
Comments
I've had a look and been able to reproduce the issue. It looks like a problem in tesseract 3.02. I'll try to set up a build using the latest tesseract sources over the weekend and see if that resolves the issue; if not, I'll file an issue with them. For reference, I found https://code.google.com/p/tesseract-ocr/issues/detail?id=261 from the 3.01 days with a comment that looks like the same issue; however, it looks like it wasn't resolved (no further comments). |
I have the same issue with 3.02. Did you ever try building against the latest svn sources? |
No, not yet; however, I've just noticed they've released version 3.3, so I'll be looking to upgrade soon. |
Where did you see that they released 3.3? The latest version I see on the On Sat, Jan 11, 2014 at 5:41 PM, Charles Weld [email protected]:
|
https://code.google.com/p/tesseract-ocr/source/browse/trunk/ChangeLog looks like they haven't produced the binaries yet. |
OK, looks like I got a little ahead of myself; see: https://groups.google.com/forum/#!topic/tesseract-dev/UvqR53IfCgA. Anyway, it looks like we're stuck with 3.02 until 3.03 is released, as the language data files are tied to a given version of tesseract. |
@charlesw: Can you test 3.03 from svn? If you need changes in it, you need to send a patch ASAP - the official release should be at the end of January... |
Notes:
* x86 only (no x64 build yet)
* Based on rev 987
* Highly likely that tesseract and leptonica have different VS runtimes (tesseract was compiled with VS 2013, leptonica with VS 2012); this will be fixed when I generate the x64 builds.
* Confirmed Issue #52 has been resolved.
* Could be Interop incompatibilities (UNTESTED)
@zdenop I can confirm that this issue has been resolved in the latest 3.03 release (revision 987); no patches are required. Thanks |
Thanks, works for me as well. |
Stupid question: does this parameter help with recognizing only digits? I On Tue, Jan 14, 2014 at 5:11 AM, simonmb [email protected] wrote:
|
Yes, I believe that's its purpose. Note from my very brief tests it will also recognise other characters that could be in numbers; for example |
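Since numeric mode may still return characters other than digits, a caller who needs digits only could post-filter the recognised text. The following is a minimal sketch, not part of this wrapper: the `DigitFilter` class and `KeepDigits` method are hypothetical names, and it assumes the text has already been obtained from the wrapper (for example from the page returned by `engine.Process(img)`).

```csharp
using System;
using System.Linq;

// Hypothetical helper, not part of the wrapper or of tesseract itself.
public static class DigitFilter
{
    // Returns the input with everything except the ASCII digits 0-9 removed.
    public static string KeepDigits(string text)
    {
        if (text == null) throw new ArgumentNullException(nameof(text));
        return new string(text.Where(c => c >= '0' && c <= '9').ToArray());
    }
}
```

For instance, `DigitFilter.KeepDigits("12-34")` returns `"1234"`. Whether dropping separators this way is acceptable depends on the document being scanned; it discards decimal points and signs along with any misrecognised characters.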
FWIW, I compiled the new binaries from revision 998 of svn and it still _(lldb) _frame info frame #12: 0x00258964 which is less than clear. On Sat, Jan 11, 2014 at 6:50 PM, Charles Weld [email protected]:
|
Did you try just using the |
No .. actually I discovered that I hadn't cleaned since the last time I But when I actually build from source, I can build the library w/o errors, "tesseract::TessBaseAPI::GetUTF8Text()", referenced from: Has this class actually been removed? I can see the source for it, but it On Wed, Jan 15, 2014 at 5:23 PM, Charles Weld [email protected]:
|
Ok, so I missed a build error - it's trying to include "tiff.h", even I looked for your branch dev_3.0.3 and don't see it anywhere here: Are you looking somewhere else? Rob. On Wed, Jan 15, 2014 at 7:37 PM, Robert Mathews <
|
Well, this diff looks pretty suspicious: And I'm 2 revisions before that .. looks like I need revision 990 On Wed, Jan 15, 2014 at 8:30 PM, Robert Mathews <
|
and revision 990 doesn't compile on OSX, because it is trying to include To be precise, I get this error: ld: library not found for -lrt clang: error: linker command failed with exit code 1 (use -v to see complete build log here: https://gist.github.com/8448535 Help? On Wed, Jan 15, 2014 at 8:34 PM, Robert Mathews <
|
@robmathews: I can try to fix it, but I need OSX tester - please send me your e-mail zdenop (at) gmail (dot) com. |
@robmathews looks like we're at cross purposes: I was talking about this .NET wrapper, not about compiling / linking the native tesseract-ocr library. The dev_3.03 branch is on THIS repository, not tesseract-ocr's, and only includes the native DLLs I compiled using VS 2013 Express on Windows 8.1. I'm not an expert on C++ / C and haven't really done any work involving C++ for a LONG time, so I'm going to be of limited help here :/ |
@charles - which is "THIS" repository? Is there some other repository that @zdenop - I have a project here for iOS [ Rob. On Thu, Jan 16, 2014 at 5:30 AM, Charles Weld [email protected]:
|
@robmathews "THIS" repository is the one hosting this issue (https://github.com/charlesw/tesseract) and is an unofficial .NET wrapper for tesseract. It doesn't contain the tesseract sources; for that you'll need the svn repo from the official site (http://tesseract-ocr.googlecode.com/svn/trunk/), which I'm assuming you've got. |
@charles. yes, exactly. On Thu, Jan 16, 2014 at 5:19 PM, Charles Weld [email protected]:
|
Hi, I'm trying to read some numbers from a scanned document.
My first test was using the BaseApiTester. The only things I changed in the project were the path to the image and the following line, which I added:
bool ret = engine.SetVariable("classify_bln_numeric_mode", 1);
before the line:
using (var page = engine.Process(img)){
Whenever I set the variable the program will crash. If I don't set the variable the program does what it is supposed to.
I checked here (http://www.sk-spell.sk.cx/tesseract-ocr-parameters-in-302-version)
to see what variable to set.
The error I'm getting is:
System.AccessViolationException
Attempted to read or write protected memory. This is often an indication that other memory is corrupt.
And the Command Window:
Process image
first_unichar != NULL:Error:Assert failed:in file .\wordrec\language_model.cpp, line 445
Thanks