Docker Image with Java 11 + tess4j:5.2.0 + Spring Boot 2.6.6 not working #231

Closed
vipulpatel2103 opened this issue Apr 26, 2022 · 9 comments

Comments

@vipulpatel2103

Dockerfile

FROM openjdk:11
ARG JAR_FILE=build/libs/ocr-0.0.1-SNAPSHOT.jar
COPY ${JAR_FILE} document-ocr.jar
ENTRYPOINT ["java","-jar","/document-ocr.jar"]

Spring Boot version: 2.6.6
Tess4j: 5.2.0

The application fails with an UnsatisfiedLinkError when JNA tries to load the native library. It runs fine locally, but not in the Docker image.

2022-04-26 04:28:30.458 ERROR 5556 --- [nio-8080-exec-1] o.a.c.c.C.[.[.[/].[dispatcherServlet]    : Servlet.service() for servlet [dispatcherServlet] in context with path [] threw exception [Handler dispatch failed; nested exception is java.lang.UnsatisfiedLinkError: The specified module could not be found.
] with root cause

java.lang.UnsatisfiedLinkError: The specified module could not be found.

        at com.sun.jna.Native.open(Native Method) ~[jna-5.10.0.jar!/:5.10.0 (b0)]
        at com.sun.jna.NativeLibrary.loadLibrary(NativeLibrary.java:277) ~[jna-5.10.0.jar!/:5.10.0 (b0)]
        at com.sun.jna.NativeLibrary.getInstance(NativeLibrary.java:461) ~[jna-5.10.0.jar!/:5.10.0 (b0)]
        at com.sun.jna.Library$Handler.<init>(Library.java:192) ~[jna-5.10.0.jar!/:5.10.0 (b0)]
        at com.sun.jna.Native.loadLibrary(Native.java:672) ~[jna-5.10.0.jar!/:5.10.0 (b0)]
        at com.sun.jna.Native.loadLibrary(Native.java:656) ~[jna-5.10.0.jar!/:5.10.0 (b0)]
        at net.sourceforge.tess4j.util.LoadLibs.getTessAPIInstance(LoadLibs.java:85) ~[tess4j-5.2.0.jar!/:5.2.0]
        at net.sourceforge.tess4j.TessAPI.<clinit>(TessAPI.java:42) ~[tess4j-5.2.0.jar!/:5.2.0]
        at net.sourceforge.tess4j.Tesseract.init(Tesseract.java:441) ~[tess4j-5.2.0.jar!/:5.2.0]
        at net.sourceforge.tess4j.Tesseract.doOCR(Tesseract.java:325) ~[tess4j-5.2.0.jar!/:5.2.0]
        at net.sourceforge.tess4j.Tesseract.doOCR(Tesseract.java:308) ~[tess4j-5.2.0.jar!/:5.2.0]
        at net.sourceforge.tess4j.Tesseract.doOCR(Tesseract.java:289) ~[tess4j-5.2.0.jar!/:5.2.0]
        at net.sourceforge.tess4j.Tesseract.doOCR(Tesseract.java:273) ~[tess4j-5.2.0.jar!/:5.2.0]
        at com.doc.ocr.processor.image.ImageProcessor.getSegments(ImageProcessor.java:35) ~[classes!/:na]
        at com.doc.ocr.processor.DocProcessorService.processDocument(DocProcessorService.java:30) ~[classes!/:na]
        at com.doc.ocr.rest.DocOCRController.doOCR(DocOCRController.java:29) ~[classes!/:na]
        at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[na:na]
        at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[na:na]
        at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[na:na]
        at java.base/java.lang.reflect.Method.invoke(Method.java:566) ~[na:na]
@nguyenq
Owner

nguyenq commented Apr 27, 2022

Which OS are you running on? Do you have the prerequisites (such as VC++ Runtime, if on Windows; or Tesseract, if on others) installed?
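
One way to answer these questions from inside the container is to print the properties that affect native library loading. The text "The specified module could not be found" is the Windows loader's message, which suggests the failing process is running on Windows rather than Linux. The following is a minimal sketch that uses only JDK system properties; `NativeEnvReport` is a hypothetical helper (not part of tess4j or JNA), and `jna.library.path` is the property JNA consults for extra search directories.

```java
import java.util.Properties;

// Hypothetical diagnostic helper; not part of tess4j or JNA.
class NativeEnvReport {
    // Properties that influence where the JVM and JNA look for native libraries.
    private static final String[] KEYS = {
        "os.name", "os.arch", "java.version", "jna.library.path", "java.library.path"
    };

    // Builds one "key=value" line per property, using "<unset>" for missing ones.
    static String describe(Properties props) {
        StringBuilder sb = new StringBuilder();
        for (String key : KEYS) {
            sb.append(key)
              .append('=')
              .append(props.getProperty(key, "<unset>"))
              .append(System.lineSeparator());
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Run this inside the container to see which platform the JVM reports.
        System.out.print(describe(System.getProperties()));
    }
}
```

If `os.name` reports Windows inside the container, the VC++ Runtime has to be present in the image itself; having it on the host is not enough.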

@vipulpatel2103
Author

Hello @nguyenq, thank you for the response.
OS - Windows Server 2019 Datacenter
Docker - Docker version 20.10.9, build 591094d

It works fine when I run it directly on the target OS, which has VC++ installed.
But when I build and run it as a Docker image, it fails. I am not sure whether the Docker base image (i.e. openjdk:11) has VC++ installed.

@vipulpatel2103
Author

I also tried an OpenJDK base image with tesseract-ocr installed. It fails with the error below.

FROM openjdk:8-jdk-alpine
RUN apk update
RUN apk add \
    tesseract-ocr
ARG JAR_FILE=build/libs/ocr-0.0.1-SNAPSHOT.jar
COPY ${JAR_FILE} document-ocr.jar
ENTRYPOINT ["java","-jar","/document-ocr.jar"]

ERROR

#
# A fatal error has been detected by the Java Runtime Environment:
#
#  SIGSEGV (0xb) at pc=0x00007ffaaa1fe2b6, pid=1, tid=0x00007ffaab26eb10
#
# JRE version: OpenJDK Runtime Environment (8.0_212-b04) (build 1.8.0_212-b04)
# Java VM: OpenJDK 64-Bit Server VM (25.212-b04 mixed mode linux-amd64 compressed oops)
# Derivative: IcedTea 3.12.0
# Distribution: Custom build (Sat May  4 17:33:35 UTC 2019)
# Problematic frame:
# C  [libtesseract.so.4.0.0+0x1d62b6]  ERRCODE::error(char const*, TessErrorLogCode, char const*, ...) const+0x164
#
# Failed to write core dump. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again
#
# An error report file with more information is saved as:
# //hs_err_pid1.log
!strcmp(locale, "C"):Error:Assert failed:in file baseapi.cpp, line 209
#
# If you would like to submit a bug report, please include
# instructions on how to reproduce the bug and visit:
#   https://icedtea.classpath.org/bugzilla
# The crash happened outside the Java Virtual Machine in native code.
# See problematic frame for where to report the bug.
#

@bromine0x23

bromine0x23 commented Apr 27, 2022

I hit a similar error with Java 17.0.2 + tess4j 5.2.0 + Spring Boot 2.6.6 + CentOS 8.4.2105 + tesseract-4.1.1-2.el8.x86_64.rpm

# 
# A fatal error has been detected by the Java Runtime Environment: 
# 
#  SIGSEGV (0xb) at pc=0x00007f86b16e2920, pid=1, tid=25 
# 
# JRE version: OpenJDK Runtime Environment Zulu17.32+13-CA (17.0.2+8) (build 17.0.2+8-LTS) 
# Java VM: OpenJDK 64-Bit Server VM Zulu17.32+13-CA (17.0.2+8-LTS, mixed mode, emulated-client, sharing, tiered, compressed oops, compressed class ptrs, serial gc, linux-amd64) 
# Problematic frame: 
# C  [libtesseract.so.4.0.1+0x268920]  PAGE_RES_IT::ReplaceCurrentWord(tesseract::PointerVector<WERD_RES>*)+0xd0 
# 
# Core dump will be written. Default location: Core dumps may be processed with "/usr/libexec/abrt-hook-ccpp %s %c %p %u %g %t e %P %I %h" (or dumping to //core.1) 
# 
# An error report file with more information is saved as: 
# //hs_err_pid1.log 
# 
# If you would like to submit a bug report, please visit: 
#   http://www.azul.com/support/ 
# The crash happened outside the Java Virtual Machine in native code. 
# See problematic frame for where to report the bug. 
# 

@vipulpatel2103
Author

@bromine0x23, I am not sure, but you appear to be using different versions of Tess4J and the native Tesseract library. Please check whether that is the issue.

@bromine0x23

bromine0x23 commented Apr 28, 2022

@vipulpatel2103 I also tested with tess4j 4.1.1 and 4.6.1 and still got the error.
Also, the error did not appear at the beginning; it appeared suddenly after the service had run successfully for several days.

@bromine0x23

Just tested again; no errors occurred with any of these versions: 4.6.1, 5.2.0 and 5.2.1.
Could this be a time-related problem?

@nguyenq
Owner

nguyenq commented Apr 28, 2022

As for the !strcmp(locale, "C") assertion failure, it looks like you are using a pretty old Tesseract version. The locale issue was fixed a while back, so you definitely want to update your Tesseract version.

tesseract-ocr/tesseract#1670
#105
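
Until the native library is upgraded, it can help to see which locale the container environment requests before Tesseract is initialized. The sketch below is only an approximation: `LocaleCheck` is a hypothetical helper, it looks at just `LC_ALL`, `LC_CTYPE` and `LANG` in the usual POSIX precedence order, and it does not query the process locale that the native assertion actually tests. Setting `LC_ALL=C` in the container environment is a commonly reported workaround for this assertion, but nothing in this thread confirms it.

```java
import java.util.Map;

// Hypothetical helper; approximates the locale requested via environment variables.
class LocaleCheck {
    // POSIX precedence: LC_ALL overrides LC_CTYPE, which overrides LANG.
    private static final String[] PRECEDENCE = {"LC_ALL", "LC_CTYPE", "LANG"};

    // Returns the first non-empty value in precedence order, or "C" if none is set.
    static String effectiveLocale(Map<String, String> env) {
        for (String name : PRECEDENCE) {
            String value = env.get(name);
            if (value != null && !value.isEmpty()) {
                return value;
            }
        }
        return "C";
    }

    // "POSIX" is treated as an alias of the "C" locale.
    static boolean isCLocale(Map<String, String> env) {
        String locale = effectiveLocale(env);
        return locale.equals("C") || locale.equals("POSIX");
    }

    public static void main(String[] args) {
        Map<String, String> env = System.getenv();
        System.out.println("Requested locale: " + effectiveLocale(env));
        if (!isCLocale(env)) {
            System.out.println("Warning: old Tesseract 4.0.x builds may assert on a non-C locale.");
        }
    }
}
```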

@vipulpatel2103
Author

Issue resolved with Java 8 + openjdk:8-jdk-alpine + tess4j:4.5.4.
It seems the issue is not with Tess4J itself but with the Tesseract version shipped in the Docker base image.
Closing the ticket.
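
The working combination pairs a 4.x Tess4J with the libtesseract.so.4.0.0 seen in the Alpine crash log above. The sketch below compares just the major version numbers, on the assumption that the Tess4J major version should match the major version of the native library. `VersionMatch` is a hypothetical helper, not a tess4j API. The shared-object suffix is not the Tesseract release number (the CentOS log shows libtesseract.so.4.0.1 for tesseract 4.1.1), so only the leading number is compared.

```java
// Hypothetical helper; compares major versions only, under the stated assumption.
class VersionMatch {
    // Extracts the leading integer of a dotted version, e.g. "5.2.0" gives 5.
    static int major(String version) {
        int dot = version.indexOf('.');
        String head = dot < 0 ? version : version.substring(0, dot);
        return Integer.parseInt(head.trim());
    }

    // Extracts the suffix after ".so.", e.g. "libtesseract.so.4.0.0" gives "4.0.0".
    static String soVersion(String fileName) {
        String marker = ".so.";
        int index = fileName.indexOf(marker);
        if (index < 0) {
            throw new IllegalArgumentException("No .so version suffix in: " + fileName);
        }
        return fileName.substring(index + marker.length());
    }

    // True when the Tess4J major version equals the native library's major version.
    static boolean majorsMatch(String tess4jVersion, String libFileName) {
        return major(tess4jVersion) == major(soVersion(libFileName));
    }

    public static void main(String[] args) {
        String lib = "libtesseract.so.4.0.0"; // file name taken from the crash log
        System.out.println("tess4j 5.2.0 vs " + lib + ": " + majorsMatch("5.2.0", lib));
        System.out.println("tess4j 4.5.4 vs " + lib + ": " + majorsMatch("4.5.4", lib));
    }
}
```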
