This commit fixes JBIG2 and JPEG 2000 support by:
1) using maven-shade instead of maven-assembly so that META-INF/services files are merged
(previously, because every ImageIO library must use the same file names, only one
library's copy was being included)

2) adding a proxy Spi class that declares support for JPX, because the jai-imageio
JPEG2000 package doesn't declare it

3) adding a parser for JPEG2000 images, because TesseractOCRParser doesn't declare
support for them either, even though Tesseract handles them when Leptonica is compiled
with jp2 support
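
To illustrate point 1: each ImageIO library ships provider-configuration files under the same names (for example `META-INF/services/javax.imageio.spi.ImageWriterSpi`), so a fat jar that keeps only one copy silently drops the other libraries' providers. The sketch below is a simplified illustration of what merging such files means. It is not the shade plugin's actual code, and the class and method names (`ServiceFileMerge`, `providers`, `merge`) are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Illustration only: combines provider-configuration files that share one name. */
final class ServiceFileMerge {

    /** Returns the provider class names in one services file, skipping '#' comments and blanks. */
    static List<String> providers(String fileContent) {
        List<String> names = new ArrayList<>();
        for (String line : fileContent.split("\\R")) {
            int hash = line.indexOf('#');
            String name = (hash >= 0 ? line.substring(0, hash) : line).trim();
            if (!name.isEmpty()) {
                names.add(name);
            }
        }
        return names;
    }

    /** Unions the providers of all same-named files, keeping first-seen order. */
    static List<String> merge(List<String> fileContents) {
        Set<String> merged = new LinkedHashSet<>();
        for (String content : fileContents) {
            merged.addAll(providers(content));
        }
        return new ArrayList<>(merged);
    }

    public static void main(String[] args) {
        // Two files with the same path, as shipped by two different jars.
        String jaiCore = "# --- JAI-Image I/O ImageWriter plug-ins ---\n"
                + "com.github.jaiimageio.impl.plugins.bmp.BMPImageWriterSpi\n"
                + "com.github.jaiimageio.impl.plugins.tiff.TIFFImageWriterSpi\n";
        String jpeg2000 = "com.github.jaiimageio.jpeg2000.impl.J2KImageWriterSpi\n";
        merge(Arrays.asList(jaiCore, jpeg2000)).forEach(System.out::println);
    }
}
```

With an overwrite-style packaging step only one of the two inputs would survive, which matches the symptom described above.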

Issues filed:

jai-imageio/jai-imageio-jpeg2000#8
https://issues.apache.org/jira/browse/TIKA-2174
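
The proxy idea from point 2 can be sketched roughly as follows. The commit's own proxy class is not shown on this page, so this is a hypothetical reader-side variant using only `javax.imageio` types: `ExtraNamesReaderSpi` and its constructor are assumptions made for illustration. It forwards everything to an existing SPI and only extends the list of advertised format names.

```java
import java.io.IOException;
import java.util.Arrays;
import java.util.Locale;
import javax.imageio.ImageReader;
import javax.imageio.spi.ImageReaderSpi;

/** Hypothetical proxy: forwards to an existing reader SPI but advertises extra format names. */
final class ExtraNamesReaderSpi extends ImageReaderSpi {

    private final ImageReaderSpi delegate;
    private final String[] extraNames;

    ExtraNamesReaderSpi(ImageReaderSpi delegate, String... extraNames) {
        this.delegate = delegate;
        this.extraNames = extraNames.clone();
    }

    /** The delegate's names followed by the extra ones (e.g. "jpx"). */
    @Override
    public String[] getFormatNames() {
        String[] base = delegate.getFormatNames();
        String[] all = Arrays.copyOf(base, base.length + extraNames.length);
        System.arraycopy(extraNames, 0, all, base.length, extraNames.length);
        return all;
    }

    // Everything else is passed straight through to the wrapped provider.
    @Override public String[] getFileSuffixes() { return delegate.getFileSuffixes(); }
    @Override public String[] getMIMETypes() { return delegate.getMIMETypes(); }
    @Override public Class<?>[] getInputTypes() { return delegate.getInputTypes(); }
    @Override public String getVendorName() { return delegate.getVendorName(); }
    @Override public String getVersion() { return delegate.getVersion(); }
    @Override public String getDescription(Locale locale) { return delegate.getDescription(locale); }

    @Override
    public boolean canDecodeInput(Object source) throws IOException {
        return delegate.canDecodeInput(source);
    }

    @Override
    public ImageReader createReaderInstance(Object extension) throws IOException {
        return delegate.createReaderInstance(extension);
    }
}
```

A provider that is registered through a `META-INF/services` file needs a public no-argument constructor, so a real proxy would construct its delegate itself rather than take it as a parameter as this sketch does.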
mattcg committed Nov 9, 2016
1 parent 1d69db0 commit 0e912af
Showing 11 changed files with 416 additions and 68 deletions.
105 changes: 105 additions & 0 deletions META-INF/services/javax.imageio.spi.ImageWriterSpi
@@ -0,0 +1,105 @@
org.icij.imageio.jpx.JPXImageWriterSpi
#
# $RCSfile: javax.imageio.spi.ImageWriterSpi,v $
#
#
# Copyright (c) 2005 Sun Microsystems, Inc. All Rights Reserved.
#
# Redistribution and use in source and binary forms, with or without
# modification, are permitted provided that the following conditions
# are met:
#
# - Redistribution of source code must retain the above copyright
# notice, this list of conditions and the following disclaimer.
#
# - Redistribution in binary form must reproduce the above copyright
# notice, this list of conditions and the following disclaimer in
# the documentation and/or other materials provided with the
# distribution.
#
# Neither the name of Sun Microsystems, Inc. or the names of
# contributors may be used to endorse or promote products derived
# from this software without specific prior written permission.
#
# This software is provided "AS IS," without a warranty of any
# kind. ALL EXPRESS OR IMPLIED CONDITIONS, REPRESENTATIONS AND
# WARRANTIES, INCLUDING ANY IMPLIED WARRANTY OF MERCHANTABILITY,
# FITNESS FOR A PARTICULAR PURPOSE OR NON-INFRINGEMENT, ARE HEREBY
# EXCLUDED. SUN MIDROSYSTEMS, INC. ("SUN") AND ITS LICENSORS SHALL
# NOT BE LIABLE FOR ANY DAMAGES SUFFERED BY LICENSEE AS A RESULT OF
# USING, MODIFYING OR DISTRIBUTING THIS SOFTWARE OR ITS
# DERIVATIVES. IN NO EVENT WILL SUN OR ITS LICENSORS BE LIABLE FOR
# ANY LOST REVENUE, PROFIT OR DATA, OR FOR DIRECT, INDIRECT, SPECIAL,
# CONSEQUENTIAL, INCIDENTAL OR PUNITIVE DAMAGES, HOWEVER CAUSED AND
# REGARDLESS OF THE THEORY OF LIABILITY, ARISING OUT OF THE USE OF OR
# INABILITY TO USE THIS SOFTWARE, EVEN IF SUN HAS BEEN ADVISED OF THE
# POSSIBILITY OF SUCH DAMAGES.
#
# You acknowledge that this software is not designed or intended for
# use in the design, construction, operation or maintenance of any
# nuclear facility.
#
# $Revision: 1.2 $
# $Date: 2007/09/05 00:21:08 $
# $State: Exp $
#
# --- JAI-Image I/O ImageWriter plug-ins ---
#
#com.github.jaiimageio.impl.plugins.jpeg.CLibJPEGImageWriterSpi
#com.github.jaiimageio.impl.plugins.png.CLibPNGImageWriterSpi
#com.github.jaiimageio.impl.plugins.jpeg2000.J2KImageWriterSpi
#com.github.jaiimageio.impl.plugins.jpeg2000.J2KImageWriterCodecLibSpi
com.github.jaiimageio.impl.plugins.wbmp.WBMPImageWriterSpi
com.github.jaiimageio.impl.plugins.bmp.BMPImageWriterSpi
com.github.jaiimageio.impl.plugins.gif.GIFImageWriterSpi
com.github.jaiimageio.impl.plugins.pcx.PCXImageWriterSpi
com.github.jaiimageio.impl.plugins.pnm.PNMImageWriterSpi
com.github.jaiimageio.impl.plugins.raw.RawImageWriterSpi
com.github.jaiimageio.impl.plugins.tiff.TIFFImageWriterSpi
#
# $RCSfile: javax.imageio.spi.ImageWriterSpi,v $
#
#
# Copyright (c) 2005 Sun Microsystems, Inc. All Rights Reserved.
#
# Redistribution and use in source and binary forms, with or without
# modification, are permitted provided that the following conditions
# are met:
#
# - Redistribution of source code must retain the above copyright
# notice, this list of conditions and the following disclaimer.
#
# - Redistribution in binary form must reproduce the above copyright
# notice, this list of conditions and the following disclaimer in
# the documentation and/or other materials provided with the
# distribution.
#
# Neither the name of Sun Microsystems, Inc. or the names of
# contributors may be used to endorse or promote products derived
# from this software without specific prior written permission.
#
# This software is provided "AS IS," without a warranty of any
# kind. ALL EXPRESS OR IMPLIED CONDITIONS, REPRESENTATIONS AND
# WARRANTIES, INCLUDING ANY IMPLIED WARRANTY OF MERCHANTABILITY,
# FITNESS FOR A PARTICULAR PURPOSE OR NON-INFRINGEMENT, ARE HEREBY
# EXCLUDED. SUN MIDROSYSTEMS, INC. ("SUN") AND ITS LICENSORS SHALL
# NOT BE LIABLE FOR ANY DAMAGES SUFFERED BY LICENSEE AS A RESULT OF
# USING, MODIFYING OR DISTRIBUTING THIS SOFTWARE OR ITS
# DERIVATIVES. IN NO EVENT WILL SUN OR ITS LICENSORS BE LIABLE FOR
# ANY LOST REVENUE, PROFIT OR DATA, OR FOR DIRECT, INDIRECT, SPECIAL,
# CONSEQUENTIAL, INCIDENTAL OR PUNITIVE DAMAGES, HOWEVER CAUSED AND
# REGARDLESS OF THE THEORY OF LIABILITY, ARISING OUT OF THE USE OF OR
# INABILITY TO USE THIS SOFTWARE, EVEN IF SUN HAS BEEN ADVISED OF THE
# POSSIBILITY OF SUCH DAMAGES.
#
# You acknowledge that this software is not designed or intended for
# use in the design, construction, operation or maintenance of any
# nuclear facility.
#
# $Revision: 1.2 $
# $Date: 2007/09/05 00:21:08 $
# $State: Exp $
#
# --- JAI-Image I/O ImageWriter plug-ins ---
#
com.github.jaiimageio.jpeg2000.impl.J2KImageWriterSpi
10 changes: 5 additions & 5 deletions extract.iml
@@ -7,7 +7,7 @@
<sourceFolder url="file://$MODULE_DIR$/src/main/java" isTestSource="false" />
<sourceFolder url="file://$MODULE_DIR$/src/main/resources" type="java-resource" />
<sourceFolder url="file://$MODULE_DIR$/src/test/java" isTestSource="true" />
<sourceFolder url="file://$MODULE_DIR$/src/test/resources" type="java-resource" />
<sourceFolder url="file://$MODULE_DIR$/src/test/resources" type="java-test-resource" />
<excludeFolder url="file://$MODULE_DIR$/target" />
</content>
<orderEntry type="inheritedJdk" />
@@ -93,7 +93,7 @@
<orderEntry type="library" name="Maven: edu.ucar:grib:4.5.5" level="project" />
<orderEntry type="library" name="Maven: org.jdom:jdom2:2.0.4" level="project" />
<orderEntry type="library" name="Maven: org.jsoup:jsoup:1.7.2" level="project" />
<orderEntry type="library" name="Maven: edu.ucar:jj2000:5.2" level="project" />
<orderEntry type="library" name="Maven: edu.ucar:jj2000:5.3" level="project" />
<orderEntry type="library" name="Maven: org.itadaki:bzip2:0.9.1" level="project" />
<orderEntry type="library" name="Maven: edu.ucar:cdm:4.5.5" level="project" />
<orderEntry type="library" name="Maven: edu.ucar:udunits:4.5.5" level="project" />
@@ -109,9 +109,9 @@
<orderEntry type="library" name="Maven: org.apache.sis.core:sis-metadata:0.6" level="project" />
<orderEntry type="library" name="Maven: org.opengis:geoapi:3.0.0" level="project" />
<orderEntry type="library" name="Maven: javax.measure:jsr-275:0.9.3" level="project" />
<orderEntry type="library" scope="RUNTIME" name="Maven: com.github.jai-imageio:jai-imageio-core:1.3.1" level="project" />
<orderEntry type="library" scope="RUNTIME" name="Maven: com.github.jai-imageio:jai-imageio-jpeg2000:1.3.0" level="project" />
<orderEntry type="library" scope="RUNTIME" name="Maven: com.levigo.jbig2:levigo-jbig2-imageio:1.6.5" level="project" />
<orderEntry type="library" name="Maven: com.levigo.jbig2:levigo-jbig2-imageio:1.6.5" level="project" />
<orderEntry type="library" name="Maven: com.github.jai-imageio:jai-imageio-core:1.3.1" level="project" />
<orderEntry type="library" name="Maven: com.github.jai-imageio:jai-imageio-jpeg2000:1.3.0" level="project" />
<orderEntry type="library" name="Maven: org.apache.solr:solr-solrj:6.2.1" level="project" />
<orderEntry type="library" name="Maven: com.fasterxml.jackson.core:jackson-annotations:2.5.4" level="project" />
<orderEntry type="library" name="Maven: com.fasterxml.jackson.core:jackson-databind:2.5.4" level="project" />
195 changes: 136 additions & 59 deletions pom.xml
@@ -22,6 +22,14 @@
<properties>
<project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
<java.version>1.8</java.version>

<app.main.class>org.icij.extract.cli.Main</app.main.class>

<maven-compiler.version>3.6.0</maven-compiler.version>
<maven-dependency.version>2.10</maven-dependency.version>
<maven-jar.version>2.6</maven-jar.version>
<maven-shade.version>2.4.3</maven-shade.version>
<maven-install.version>2.5.2</maven-install.version>
</properties>

<repositories>
@@ -69,27 +77,24 @@
<version>1.14</version>
</dependency>

<!-- Optional PDFBox dependency for parsing JBIG2 format images in PDF files. -->
<dependency>
<groupId>com.levigo.jbig2</groupId>
<artifactId>levigo-jbig2-imageio</artifactId>
<version>1.6.5</version>
</dependency>

<!-- Optional PDFBox dependency for parsing JPEG2000 and TIFF format images in PDF files. -->
<dependency>
<groupId>com.github.jai-imageio</groupId>
<artifactId>jai-imageio-core</artifactId>
<version>1.3.1</version>
<scope>runtime</scope>
</dependency>

<dependency>
<groupId>com.github.jai-imageio</groupId>
<artifactId>jai-imageio-jpeg2000</artifactId>
<version>1.3.0</version>
<scope>runtime</scope>
</dependency>

<!-- Optional PDFBox dependency for parsing JBIG2 format images in PDF files. -->
<dependency>
<groupId>com.levigo.jbig2</groupId>
<artifactId>levigo-jbig2-imageio</artifactId>
<version>1.6.5</version>
<scope>runtime</scope>
</dependency>

<dependency>
@@ -157,43 +162,141 @@

</dependencies>

<dependencyManagement>

<dependencies>

<!-- Upgraded from 5.2 (Tika dep) to avoid exception:
Caused by: java.lang.NoSuchMethodError:
jj2000.j2k.fileformat.reader.FileFormatReader.<init>(
Ljj2000/j2k/io/RandomAccessIO;Lcom/sun/media/imageioimpl/plugins/
jpeg2000/J2KMetadata;)V
-->
<dependency>
<groupId>edu.ucar</groupId>
<artifactId>jj2000</artifactId>
<version>5.3</version>
</dependency>

</dependencies>

</dependencyManagement>

<build>
<finalName>extract</finalName>
<plugins>

<!-- Compilation -->
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-compiler-plugin</artifactId>
<version>${maven-compiler.version}</version>
<configuration>
<source>${java.version}</source>
<target>${java.version}</target>
<compilerArgument>-Xlint:all</compilerArgument>
</configuration>
</plugin>

<!-- Packaging / Jar -->
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-jar-plugin</artifactId>
<version>${maven-jar.version}</version>
</plugin>

<!-- Resources -->
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-resources-plugin</artifactId>
<version>3.0.1</version>
<configuration>
<includeEmptyDirs>true</includeEmptyDirs>
</configuration>
</plugin>

<!-- Shade -->
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-shade-plugin</artifactId>
<version>${maven-shade.version}</version>
<configuration>
<createDependencyReducedPom>false</createDependencyReducedPom>
</configuration>
<executions>
<execution>
<phase>package</phase>
<goals>
<goal>shade</goal>
</goals>
<configuration>
<minimizeJar>false</minimizeJar>
<filters>
<filter>
<artifact>*:*</artifact>
<excludes>
<exclude>META-INF/*.SF</exclude>
<exclude>META-INF/*.DSA</exclude>
<exclude>META-INF/*.RSA</exclude>
<exclude>META-INF/*.EC</exclude>
</excludes>
</filter>
</filters>
<transformers>
<transformer implementation="org.apache.maven.plugins.shade.resource.ServicesResourceTransformer"/>
<transformer implementation="org.apache.maven.plugins.shade.resource.ManifestResourceTransformer">
<manifestEntries>
<Main-Class>org.icij.extract.cli.Main</Main-Class>
</manifestEntries>
</transformer>
</transformers>
</configuration>
</execution>
</executions>
</plugin>

<!-- Dependency Unpacking, Purging -->
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-dependency-plugin</artifactId>
<version>2.10</version>
<version>${maven-dependency.version}</version>
<executions>
<execution>
<id>copy-dependencies</id>
<id>unpack-dependencies</id>
<phase>package</phase>
<goals>
<goal>copy-dependencies</goal>
<goal>unpack-dependencies</goal>
</goals>
<configuration>
<excludeScope>system</excludeScope>
<excludes>META-INF/*.SF</excludes>
<excludes>META-INF/*.DSA</excludes>
<excludes>META-INF/*.RSA</excludes>
<excludes>META-INF/*.EC</excludes>
<excludeGroupIds>junit,org.mockito,org.hamcrest</excludeGroupIds>
<outputDirectory>${project.build.directory}/classes</outputDirectory>
</configuration>
</execution>
<execution>
<id>purge-local-dependencies</id>
<phase>clean</phase>
<goals>
<goal>purge-local-repository</goal>
</goals>
<configuration>
<outputDirectory>${project.build.directory}/dependencies</outputDirectory>
<overWriteReleases>false</overWriteReleases>
<overWriteSnapshots>false</overWriteSnapshots>
<overWriteIfNewer>true</overWriteIfNewer>
</configuration>
</execution>
</executions>
</plugin>

<!-- COMPILER -->
<!-- Installation -->
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-compiler-plugin</artifactId>
<version>3.5.1</version>
<configuration>
<source>${java.version}</source>
<target>${java.version}</target>
<compilerArgument>-Xlint:all</compilerArgument>
</configuration>
<artifactId>maven-install-plugin</artifactId>
<version>${maven-install.version}</version>
</plugin>

<!-- Test -->
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-surefire-plugin</artifactId>
@@ -215,47 +318,21 @@
</systemProperties>
</configuration>
</plugin>

<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-resources-plugin</artifactId>
<version>3.0.1</version>
<configuration>
<includeEmptyDirs>true</includeEmptyDirs>
</configuration>
</plugin>

<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-assembly-plugin</artifactId>
<executions>
<execution>
<phase>package</phase>
<goals>
<goal>single</goal>
</goals>
</execution>
</executions>
<configuration>
<descriptorRefs>
<descriptorRef>jar-with-dependencies</descriptorRef>
</descriptorRefs>
<archive>
<manifest>
<mainClass>org.icij.extract.cli.Main</mainClass>
</manifest>
</archive>
</configuration>
</plugin>
</plugins>

<resources>
<resource>
<directory>src/main/resources</directory>
</resource>
</resources>

<testResources>
<testResource>
<directory>src/test/resources</directory>
<excludes>
<exclude>**/.keep</exclude>
</excludes>
</resource>
</resources>
</testResource>
</testResources>
</build>
</project>