Error: Illegal Parameter specification! with Tesseract4Alpha #1010
Comments
Please use the latest source from master branch of github and inform whether you still get the error. |
I'm already using the latest source. I get the same error (*) on two different OSes. (*) Error: Illegal Parameter specification! |
what version of c++ are you using?
which ones? I have been able to build on ubuntu 14.04. Travis and appveyor builds are building ok. |
Also, are you able to run tesseract from command line ? tesseract -v also try to OCR the sample image from testing folder. |
@Shreeshrii I'm using g++ 7.1.1 in Arch and 4.8.2 in CentOS 6.7.
|
what about tesseract --list_langs Are you able to OCR an image from command line with the 4.0 version? Do you have 4.00.00alpha version of traineddata files? Download 4.0 traineddata to a different folder and refer to that |
No problems detecting languages with tesseract --list_langs (eng, spa and osd trained data files for the LSTM-based 4.00.00alpha version). As for command-line recognition, an example from the testing folder was recognized properly. Perhaps some Java code has changed for this 4Alpha version? |
@nachobit Please see Quan's Java JNA wrapper for Tesseract OCR API at https://github.com/nguyenq/tess4j |
The problem with version 3.05.01 is that I get different results on the two OSes, using the same Leptonica and Tesseract versions, when recognizing a PDF. Example: 0000 0340 º71º ZL (on CentOS) and 0000 0340 0710 ZL (on Arch). For that reason I'd like to move to 4Alpha, but that is impossible because of the error mentioned a few comments back. |
If you have an issue with a wrapper to Tesseract's C/C++ API, please report the issue to the developers of that software. |
Ray said, many years ago, that you can get different results with different compilers. |
Yes. |
After updating to the latest libs from Tess4J-3.4.0-src I get the same error when launching the OCR from Java code. With version 3.05.01, is there any way to fix the failure to recognize zeros (º instead of 0)? |
3.4.0 does not include the 4.00 changes. |
You can try to compile with a newer version of gcc. I can't promise that this 'solution' will help you with this issue. |
OK, so that's the problem? Tess4J-3.4.0 (Java) does not support the 4.00Alpha release? Then I will try compiling with a newer version of GCC. |
I assume that's the source of the problem (It's Tess4J 3.4.0 that seems to not have support for Tesseract 4.00, not vice versa). To be sure, ask the developer. https://sourceforge.net/p/tess4j/discussion/1202294/ |
nguyenq/tess4j@74c8509
+Version 4.0.0 beta (8 June 2017)
+- Upgrade to Tesseract 4.0.0 alpha (8c29e68)
+- Update Lept4J to 1.5.0 (Leptonica 1.74.2)
|
tess4j's master branch is for Tesseract 4.0alpha and includes the latest Tesseract 4.0alpha Windows binary. All of its unit tests passed on Windows 10. We have not tested on Linux OS yet. Since you link against Leptonica 1.74.4, make sure you use lept4j-1.6.0. |
We do not support third-party software, including Tesseract wrappers. Please reproduce the error with C++. |
Hi .. it seems you just have to add environment variable LC_NUMERIC="C" ... and it works. :) |
I dug into Tesseract's code and found that the string "Illegal Parameter specification" exists in only one place, namely in the file classify/clusttool.cpp. After some debugging I realised that the function ReadParamDesc() calls sscanf() at line 82 (for git commit hash 2b854e3), which is locale dependent. It fails because the numeric input (two floating point values) is written with dots (example: 1.23), but a locale other than en_US for LC_NUMERIC may cause sscanf() to expect other characters, like commas (1,23). In other words, the error is in Tesseract, which assumes a locale instead of setting it explicitly. The workaround is to set LC_NUMERIC=en_US.UTF-8. |
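The failure mode described above can be reproduced outside Tesseract. The following is a minimal sketch, not Tesseract's code: ParseTwoFloats is a hypothetical helper, and the "%f %f" format is an assumption standing in for the two floating point values mentioned in the comment.

```cpp
#include <clocale>
#include <cstdio>
#include <string>

// Hypothetical helper, not part of Tesseract.
// Parses two floats from `text` with sscanf() while LC_NUMERIC is temporarily
// set to `locale_name`. Returns the number of fields sscanf() converted, or
// -1 if the requested locale is not available on this system.
int ParseTwoFloats(const char* locale_name, const char* text,
                   float* first, float* second) {
  // Save the active LC_NUMERIC name; the pointer returned by setlocale()
  // may be invalidated by the next call, so copy it into a std::string.
  const char* active = std::setlocale(LC_NUMERIC, nullptr);
  const std::string saved = active ? active : "C";

  if (std::setlocale(LC_NUMERIC, locale_name) == nullptr) {
    return -1;  // locale not installed; nothing was changed
  }
  const int converted = std::sscanf(text, "%f %f", first, second);
  std::setlocale(LC_NUMERIC, saved.c_str());  // restore the previous setting
  return converted;
}
```

Calling ParseTwoFloats("C", "1.23 4.56", &a, &b) converts both fields. With a comma-decimal locale (de_DE.UTF-8 is one example, if it is installed), sscanf() is expected to stop at the dot and convert fewer than two fields, which matches the failure described above.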
I am facing the same issue. Could you please share the file that has to be changed, so that I can just replace that specific file and continue creating the traineddata? |
@nizzeberra: I see the same files you mention, but I don't know where to place the code. Could you please share that file? |
@stweil Is it possible to address this for final 4.0.0? |
Setting I wonder whether the @nizzeberra, which systems / C libraries show that strange behaviour? Do you have links to documentation? PS: These code locations use
|
@hamduu I'm not sure I understand what you are asking. The file and the line that I pointed out is where the error is triggered, and should probably not be changed. And LC_NUMERIC is just an environment variable that you can set manually. @stweil I have built and tested tesseract on Linux Mint and I have no info about specific libraries right now. |
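For a C++ caller that cannot control the environment, the same setting can be applied inside the process. This is a minimal sketch under that assumption; ScopedNumericLocale is a hypothetical name, not a Tesseract or Tess4J API, and whether this helps when Tesseract is loaded through a Java wrapper is not established in this thread.

```cpp
#include <clocale>
#include <string>

// Hypothetical RAII guard, not a Tesseract or Tess4J API.
// Sets LC_NUMERIC for the whole process on construction and restores the
// previous value on destruction. setlocale() affects every thread, so this
// is only safe while no other thread depends on the numeric locale.
class ScopedNumericLocale {
 public:
  explicit ScopedNumericLocale(const char* name) {
    const char* active = std::setlocale(LC_NUMERIC, nullptr);
    saved_ = active ? active : "C";
    ok_ = std::setlocale(LC_NUMERIC, name) != nullptr;
  }
  ~ScopedNumericLocale() { std::setlocale(LC_NUMERIC, saved_.c_str()); }

  ScopedNumericLocale(const ScopedNumericLocale&) = delete;
  ScopedNumericLocale& operator=(const ScopedNumericLocale&) = delete;

  // True if the requested locale was available and is now active.
  bool ok() const { return ok_; }

 private:
  std::string saved_;
  bool ok_ = false;
};
```

A caller would create ScopedNumericLocale guard("C"); in the scope that invokes the OCR code and check guard.ok() before continuing.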
Here is the man page, and it's pretty clear about the locale: |
Linux Mint uses Debian packages, so the result should not be much different. The man page only says that LC_NUMERIC can be used to allow separators for multiples of thousand. Here is the test scenario which I used (maybe you can try it on Linux Mint):
|
https://en.cppreference.com/w/cpp/locale/setlocale Here |
That's interesting. So C/C++ programs don't use the locale which was set in the environment, but start running with the "C" locale. That is exactly what I observed in my test. Only if I set That implies that we have no problem for the tesseract executable or the training programs which are provided by Tesseract. Nor will external software have a problem as long as it does not set Maybe Java uses the environment settings to set In addition to the problem with
@Shreeshrii, I don't think that all that code can be found and rewritten for 4.0.0. It would be possible to report a warning when the Tesseract initialisation code detects an unsupported locale setting. |
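Two points from this comment can be sketched in code: a program starts in the "C" locale until something in the process calls setlocale(), and initialisation code could warn when the active setting would break number parsing. The function names below are hypothetical and this is not Tesseract's implementation; the check assumes that "unsupported" means a decimal separator other than a dot.

```cpp
#include <clocale>
#include <cstdio>
#include <cstring>
#include <string>

// Name of the LC_NUMERIC locale currently active in this process.
// At program start this is "C"; the environment is only consulted after
// a call such as setlocale(LC_ALL, "").
std::string ActiveNumericLocale() {
  const char* name = std::setlocale(LC_NUMERIC, nullptr);
  return name ? name : "";
}

// True if the active locale uses '.' as the decimal separator, which is
// what dot-formatted input such as 1.23 requires.
bool DecimalPointIsDot() {
  const std::lconv* conv = std::localeconv();
  return conv != nullptr && std::strcmp(conv->decimal_point, ".") == 0;
}

// Hypothetical init-time check: prints a warning to stderr and returns true
// if the active numeric locale would misparse dot-formatted numbers.
bool WarnIfNumericLocaleUnsupported() {
  if (DecimalPointIsDot()) {
    return false;
  }
  std::fprintf(stderr,
               "Warning: LC_NUMERIC locale '%s' does not use '.' as the "
               "decimal separator; numeric parsing may fail.\n",
               ActiveNumericLocale().c_str());
  return true;
}
```

In a freshly started program, ActiveNumericLocale() returns "C" and no warning is printed; the warning path is only reached after the host process has switched to a locale with a different decimal separator.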
@stweil Thanks for the investigation. Yes, please make possible changes to point the users in the right direction. Related issue reg: locales. |
Sorry to ask about this in detail; I have no expertise in this. I even tried changing the locale from the terminal, but I still get the same error. I have added screenshots of it. How do I make this command work? Can you please tell me where exactly I need to make changes to set the locale differently? Please help me out. Regards |
Hi, Tesseract works fine in a main method (Java), but when I try to run it in a web application I get the error below:
A fatal error has been detected by the Java Runtime Environment:
SIGSEGV (0xb) at pc=0x00007f96274dbac7, pid=4516, tid=0x00007f9699212700
JRE version: OpenJDK Runtime Environment (8.0_171-b11) (build 1.8.0_171-8u171-b11-0ubuntu0.16.04.1-b11)
Java VM: OpenJDK 64-Bit Server VM (25.171-b11 mixed mode linux-amd64 compressed oops)
Problematic frame:
C [libtesseract.so.3.0.4+0x9dac7] tesseract::Tesseract::recog_all_words(PAGE_RES*, ETEXT_DESC*, TBOX const*, char const*, int)+0x5e7
Failed to write core dump. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again
An error report file with more information is saved as:
/home/ahextech/suresh/softwares/eclipse/hs_err_pid4516.log
If you would like to submit a bug report, please visit:
http://bugreport.java.com/bugreport/crash.jsp
The crash happened outside the Java Virtual Machine in native code.
See problematic frame for where to report the bug.
My sample code is:
|
After upgrading to Tesseract 4 alpha, I get this error when running OCR from my Java code:
ITesseract instance = new Tesseract();
instance.setDatapath("/usr/share/tessdata/");
instance.setLanguage("spa");
(...)
result = instance.doOCR(imageFile);
Environment
Current Behavior:
Error: Illegal Parameter specification!
"Fatal error encountered!" == NULL:Error:Assert failed:in file globaloc.cpp, line 75
A fatal error has been detected by the Java Runtime Environment:
SIGSEGV (0xb) at pc=0x00007ff1b3098549, pid=25091, tid=0x00007ff29d7d7700
JRE version: OpenJDK Runtime Environment (8.0_121-b13) (build 1.8.0_121-b13)
Java VM: OpenJDK 64-Bit Server VM (25.121-b13 mixed mode linux-amd64 compressed oops)
Problematic frame:
C [libtesseract.so+0x26f549] ERRCODE::error(char const*, TessErrorLogCode, char const*, ...) const+0x129
Failed to write core dump. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again
An error report file with more information is saved as:
/opt/wildfly/wildfly-10.1.0.Final/hs_err_pid25091.log
If you would like to submit a bug report, please visit:
http://bugreport.java.com/bugreport/crash.jsp
The crash happened outside the Java Virtual Machine in native code.
See problematic frame for where to report the bug.
*** JBossAS process (25091) received ABRT signal ***
Suggested Fix:
Any idea?