Exception in apache.pdfbox/fontbox - null font #224

Closed
BbIKTOP opened this issue Jun 1, 2018 · 24 comments

Comments

@BbIKTOP

BbIKTOP commented Jun 1, 2018

Hi,

Got this, have no idea why:

java.io.IOException: The TrueType font null does not contain a 'cmap' table
        at org.apache.fontbox.ttf.TrueTypeFont.getUnicodeCmapImpl(TrueTypeFont.java:548)
        at org.apache.fontbox.ttf.TrueTypeFont.getUnicodeCmapLookup(TrueTypeFont.java:528)
        at org.apache.fontbox.ttf.TrueTypeFont.getUnicodeCmapLookup(TrueTypeFont.java:514)
        at org.apache.fontbox.ttf.TTFSubsetter.<init>(TTFSubsetter.java:91)
        at org.apache.pdfbox.pdmodel.font.TrueTypeEmbedder.subset(TrueTypeEmbedder.java:321)
        at org.apache.pdfbox.pdmodel.font.PDType0Font.subset(PDType0Font.java:239)
        at com.openhtmltopdf.pdfboxout.PdfBoxFontResolver.close(PdfBoxFontResolver.java:86)
        at com.openhtmltopdf.pdfboxout.PdfBoxRenderer.cleanup(PdfBoxRenderer.java:848)
        at com.openhtmltopdf.pdfboxout.PdfBoxRenderer.close(PdfBoxRenderer.java:870)
        at com.openhtmltopdf.pdfboxout.PdfRendererBuilder.run(PdfRendererBuilder.java:37)
        at web.MakePdf.processRequest(MakePdf.java:97)
        at web.MakePdf.doPost(MakePdf.java:116)
        at javax.servlet.http.HttpServlet.service(HttpServlet.java:159)
        at javax.servlet.http.HttpServlet.service(HttpServlet.java:97)
        at com.caucho.server.dispatch.ServletFilterChain.doFilter(ServletFilterChain.java:109)
        at web.ur.CheckAccess.doFilter(CheckAccess.java:49)
        at com.caucho.server.dispatch.FilterFilterChain.doFilter(FilterFilterChain.java:89)
        at com.caucho.server.webapp.WebAppFilterChain.doFilter(WebAppFilterChain.java:156)
        at com.caucho.server.webapp.AccessLogFilterChain.doFilter(AccessLogFilterChain.java:95)
        at com.caucho.server.dispatch.ServletInvocation.service(ServletInvocation.java:290)
        at com.caucho.server.hmux.HmuxRequest.handleInvocation(HmuxRequest.java:476)
        at com.caucho.server.hmux.HmuxRequest.handleRequestImpl(HmuxRequest.java:374)
        at com.caucho.server.hmux.HmuxRequest.handleRequest(HmuxRequest.java:341)
        at com.caucho.network.listen.TcpSocketLink.dispatchRequest(TcpSocketLink.java:1362)
        at com.caucho.network.listen.TcpSocketLink.handleRequest(TcpSocketLink.java:1318)
        at com.caucho.network.listen.TcpSocketLink.handleRequestsImpl(TcpSocketLink.java:1302)
        at com.caucho.network.listen.TcpSocketLink.handleRequests(TcpSocketLink.java:1210)
        at com.caucho.network.listen.TcpSocketLink.handleAcceptTaskImpl(TcpSocketLink.java:1006)
        at com.caucho.network.listen.ConnectionTask.runThread(ConnectionTask.java:117)
        at com.caucho.network.listen.ConnectionTask.run(ConnectionTask.java:93)
        at com.caucho.network.listen.SocketLinkThreadLauncher.handleTasks(SocketLinkThreadLauncher.java:169)
        at com.caucho.network.listen.TcpSocketAcceptThread.run(TcpSocketAcceptThread.java:61)
        at com.caucho.env.thread2.ResinThread2.runTasks(ResinThread2.java:173)
        at com.caucho.env.thread2.ResinThread2.run(ResinThread2.java:118)

This Exception#printStackTrace call should probably be removed, although I don't understand how the font can be null:

public void close() {
		for (FontDescription fontDescription : _fontCache.values()) {
			/*
			 * If the font is not yet subset, we must subset it, otherwise we may leak a
			 * file handle because the PDType0Font may still have the font file open.
			 */
			if (fontDescription._font != null && fontDescription._font.willBeSubset()) {
				try {
					fontDescription._font.subset();
				} catch (IOException e) {
					e.printStackTrace();
				}
			}
		}
		_fontCache.clear();

		// Close all still open TrueTypeCollections
		for (TrueTypeCollection collection : _collectionsToClose) {
			try {
				collection.close();
			} catch (IOException e) {
				e.printStackTrace();
			}
		}
		_collectionsToClose.clear();
	}

@danfickle
Owner

@rototor I'm getting the same thing. If you get time, could you take a look please?

@BbIKTOP
Author

BbIKTOP commented Jun 11, 2018

I see "COSDictionary{COSStream has been closed and cannot be read. Perhaps its enclosing PDDocument has been closed?}"
Maybe the font has already been closed? fontDescription#_font#ttf is probably null because it's an OTF font.

@rototor
Contributor

rototor commented Jun 18, 2018

What font are you using? Can this be reproduced with some of our example files?

We call subset() here only as a workaround. Maybe @THausherr has some clue what went wrong here.

@THausherr

"null" is just the name of the font.

@THausherr

"COSStream has been closed and cannot be read. Perhaps its enclosing PDDocument has been closed" happens if you close a document and then read a stream resource. This usually happens if you use a resource in two documents and close one of them.

@THausherr

Yes, it would be useful to have the font to see what happens when doing ordinary subsetting with it.

@BbIKTOP
Author

BbIKTOP commented Jun 18, 2018

I put the example here: https://base.etogo.net/_generated.html
This is the servlet-generated HTML, exactly the same as what produces the error. All the fonts are referenced in its CSS: https://base.etogo.net/css/html2pdf.css
Thank you, guys. I commented out this printStackTrace in your sources, but it would be great to have a real solution.

@rototor
Contributor

rototor commented Jun 18, 2018

@BbIKTOP By commenting out the printStackTrace() you can of course reduce the log noise, but your server might still leak file handles with every report generated... See also here to check how many file handles are used by the web server process.

@BbIKTOP
Author

BbIKTOP commented Jun 18, 2018

Sorry I forgot to include this. Here's the complete code block from the servlet, no other resources are opened or closed:

        OutputStream os = resp.getOutputStream();

        try
        {
            PdfRendererBuilder builder = new PdfRendererBuilder();

            builder.withProducer("DMate ltd.");
            builder.useFastMode();

            builder.withHtmlContent(buf.toString(), url.toString());
            builder.toStream(os);

            builder.run();

            os.flush();
            os.close();
        } catch (Exception e)
        {
            System.out.println("PDF generation exception :" + e.toString());
            //e.printStackTrace();
        }
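
In the snippet above, os.close() is skipped whenever builder.run() throws. A minimal sketch of the same flow with try-with-resources follows, assuming only the JDK: the Renderer interface and the TrackingStream class are hypothetical stand-ins for the PDF rendering step and the servlet output stream, not openhtmltopdf or servlet API.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

final class StreamCleanupSketch {
    private StreamCleanupSketch() {}

    /** Hypothetical stand-in for the PDF rendering step (e.g. a builder's run()). */
    interface Renderer {
        void renderTo(OutputStream out) throws IOException;
    }

    /** In-memory stream that records whether close() was called. */
    static final class TrackingStream extends ByteArrayOutputStream {
        boolean closed;

        @Override
        public void close() throws IOException {
            closed = true;
            super.close();
        }
    }

    /** Renders into target; the stream is closed on both the success and the failure path. */
    static boolean render(Renderer renderer, OutputStream target) {
        try (OutputStream os = target) {
            renderer.renderTo(os);
            os.flush();
            return true;
        } catch (IOException e) {
            // The stream has already been closed by the time this handler runs.
            System.err.println("PDF generation exception: " + e);
            return false;
        }
    }

    public static void main(String[] args) {
        TrackingStream ok = new TrackingStream();
        boolean first = render(out -> out.write("pdf".getBytes(StandardCharsets.US_ASCII)), ok);
        System.out.println("success=" + first + ", closed=" + ok.closed);

        TrackingStream failing = new TrackingStream();
        boolean second = render(out -> { throw new IOException("simulated failure"); }, failing);
        System.out.println("success=" + second + ", closed=" + failing.closed);
    }
}
```

Whether the servlet container would close the response stream anyway depends on the container, so this only shows how to make the cleanup explicit.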

@BbIKTOP
Author

BbIKTOP commented Jun 18, 2018

Not sure, I suppose the container would close all resources after the servlet thread's run completes. Anyway, I suppose you'll fix it!😉
I tried it myself, but the code is a bit hard to understand because I'm not familiar with it.

@BbIKTOP
Author

BbIKTOP commented Jun 18, 2018

Just ran some tests: no leaks. I launched it about 20-30 times. It consumes some resources while the PDF is being generated, but after some time of inactivity the GC releases everything, as far as I can see:

while true; do ps=$(ps -ef|grep resin|awk '{ print $2 }');date; lsof -p $(echo $ps|sed 's/\s/,/g')|wc -l;sleep 60;done
Mon Jun 18 20:50:37 EEST 2018
404
Mon Jun 18 20:51:37 EEST 2018
407
...
Mon Jun 18 20:57:37 EEST 2018
406
Mon Jun 18 20:58:38 EEST 2018
403
Mon Jun 18 20:59:38 EEST 2018
399
Mon Jun 18 21:00:38 EEST 2018
391
Mon Jun 18 21:01:38 EEST 2018
391
Mon Jun 18 21:02:38 EEST 2018
397
Mon Jun 18 21:03:38 EEST 2018
401
Mon Jun 18 21:04:38 EEST 2018
403
Mon Jun 18 21:05:38 EEST 2018
403
Mon Jun 18 21:06:38 EEST 2018
403
Mon Jun 18 21:07:39 EEST 2018
403
Mon Jun 18 21:08:39 EEST 2018
403
Mon Jun 18 21:09:39 EEST 2018
403
Mon Jun 18 21:10:39 EEST 2018
405
Mon Jun 18 21:11:39 EEST 2018
399
Mon Jun 18 21:12:39 EEST 2018
398
Mon Jun 18 21:13:39 EEST 2018
398
Mon Jun 18 21:14:39 EEST 2018
398
Mon Jun 18 21:15:39 EEST 2018
398
Mon Jun 18 21:16:40 EEST 2018
400
Mon Jun 18 21:17:40 EEST 2018
400
Mon Jun 18 21:18:40 EEST 2018
402
...
Mon Jun 18 21:20:40 EEST 2018
398
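
The same kind of measurement can be taken from inside the JVM. A minimal sketch, assuming a Unix-like platform where the JDK exposes com.sun.management.UnixOperatingSystemMXBean; the OpenFileHandles class is a hypothetical helper, not part of openhtmltopdf or PDFBox.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.OperatingSystemMXBean;

final class OpenFileHandles {
    private OpenFileHandles() {}

    /**
     * Returns the number of file descriptors currently open in this JVM process,
     * or -1 if the platform bean does not expose that figure (e.g. on Windows).
     */
    static long count() {
        OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
        if (os instanceof com.sun.management.UnixOperatingSystemMXBean) {
            return ((com.sun.management.UnixOperatingSystemMXBean) os).getOpenFileDescriptorCount();
        }
        return -1L;
    }

    public static void main(String[] args) {
        System.out.println("open file descriptors: " + count());
    }
}
```

Logging this value before and after each PDF generation would show a leak as a steadily growing number. The absolute figures will not match the lsof counts above, because lsof also lists entries such as memory-mapped files and the working directory.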

@THausherr

I found out that the cmap exception also comes if font.subset() is called twice.

@nullquery

nullquery commented Jul 10, 2018

I'm running into this problem too. To start off, I've had to use TTC files instead of multiple TTF files to get fonts to work properly at all. This is likely because the multiple TTF files I have are for different styles (normal, bold, italic, bold+italic), but for some reason java.awt.Font#getStyle always returns 0 (java.awt.Font.PLAIN) for all of the fonts. I guess this information is not available in the TTF file? (I'm not an expert on fonts...)

So now I've got a TTC font and I can set subset to true or false. If I set it to true, I get the aforementioned "TrueType font does not contain a 'cmap' table" exception, but everything seems to work. Setting it to false avoids the exception, but embeds the font in the resulting PDF file even if the font is never used.

At a bare minimum, I'd say that all of these printStackTrace calls have to be changed to use the logging mechanism provided (in my case log4j), but as I understand it the exception means that file handles may remain open every time a PDF is generated.

@nullquery

It seems to me that the error can be ignored in my use case, because TrueType Collections are closed manually after the loop. See com.openhtmltopdf.pdfboxout.PdfBoxFontResolver#close

If this is the case, then the issue can be resolved (at least for me) by passing the error to the com.openhtmltopdf.util.XRLog class.
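
A minimal sketch of that idea: a cleanup loop that reports failures through a logger instead of printStackTrace. It uses java.util.logging as a stand-in because the XRLog API is not shown in this thread, and the QuietCloser class is hypothetical.

```java
import java.io.Closeable;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.logging.Level;
import java.util.logging.Logger;

final class QuietCloser {
    // Stand-in logger; a real integration would route this through the project's own logging.
    private static final Logger LOG = Logger.getLogger(QuietCloser.class.getName());

    private QuietCloser() {}

    /** Closes every resource, logs each failure, and returns how many closes failed. */
    static int closeAll(List<? extends Closeable> resources) {
        int failures = 0;
        for (Closeable resource : resources) {
            try {
                resource.close();
            } catch (IOException e) {
                failures++;
                LOG.log(Level.WARNING, "Could not close resource during cleanup", e);
            }
        }
        return failures;
    }

    public static void main(String[] args) {
        List<Closeable> resources = new ArrayList<>();
        resources.add(() -> { });
        resources.add(() -> { throw new IOException("simulated close failure"); });
        System.out.println("failures: " + closeAll(resources));
    }
}
```

Returning the failure count lets the caller decide whether a failed cleanup matters, which printStackTrace does not allow.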

@THausherr

There have been some recent changes in PDFBox that can be tested by using the latest jar at
https://repository.apache.org/content/groups/snapshots/org/apache/pdfbox/pdfbox-app/2.0.12-SNAPSHOT/
I don't know if this helps with the problems here... to investigate I need some PDFBox-related code. You mentioned "the aforementioned NullPointerException" but there was no NullPointerException mentioned. The "The TrueType font null does not contain a 'cmap' table" exception would happen with a bad font or when calling subset() twice. (I wonder if I should throw an IllegalStateException for that one.)
The PDFBox code does look suspicious... the subsetter and PDType0Font do close ttf after subsetting even if they didn't open the ttf themselves.
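
The double-call case can be made explicit with a guard. The following is a toy sketch, not PDFBox code: OnceOnlySubsetter, needsSubset() and subsetRuns() are hypothetical names used only to illustrate failing fast with an IllegalStateException.

```java
/** Toy stand-in for a font object that may be subset only once. */
final class OnceOnlySubsetter {
    private boolean subsetDone;
    private int subsetRuns;

    /** True until subset() has completed once. */
    boolean needsSubset() {
        return !subsetDone;
    }

    /** Performs the (simulated) subsetting; a second call fails fast with a clear message. */
    void subset() {
        if (subsetDone) {
            throw new IllegalStateException("subset() was already called on this font");
        }
        subsetDone = true;
        subsetRuns++;
    }

    /** Number of times the simulated subsetting actually ran. */
    int subsetRuns() {
        return subsetRuns;
    }

    public static void main(String[] args) {
        OnceOnlySubsetter font = new OnceOnlySubsetter();
        // A cleanup loop can check the flag instead of calling subset() unconditionally.
        if (font.needsSubset()) {
            font.subset();
        }
        if (font.needsSubset()) {
            font.subset(); // not reached: the flag is now false
        }
        System.out.println("subset runs: " + font.subsetRuns());
    }
}
```

With such a guard, a caller that subsets twice would get an error naming the actual mistake rather than a message about a missing 'cmap' table.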

@nullquery

Oops! I meant that exception, sorry. I've updated my OP to reflect this.

I've also noticed that this only occurs when com.openhtmltopdf.pdfboxout.PdfBoxRenderer#close is called, which is after the PDF is generated. If I don't close the PdfBoxRenderer object, I don't see the exception in my log. This seems to point to an issue with cleanup, not an issue with rendering the PDF.

@nullquery

Is this issue related to #215?

@THausherr

Yes, possibly. It's the same code area, but the issue there has been solved in the snapshot, which is why I'd like it to be retested.

@danfickle
Owner

@THausherr

I tested the following code with both PDFBox 2.0.11 and 2.0.12-SNAPSHOT:

    public static void main(String...args) throws Exception {
        for (int i = 0; i < 100000; i++) {
            File hand = new File("/Users/me/Documents/pdf-issues/fonts/JustAnotherHand.ttf");
            
            PDDocument doc = new PDDocument();
            PDFont unused = PDType0Font.load(doc, hand);
            doc.close(); // Does this close unused?
            System.out.println(i);
        }
    }

The result on 2.0.11 was:

8767
Exception in thread "main" java.io.FileNotFoundException: /Users/me/Documents/pdf-issues/fonts/JustAnotherHand.ttf (Too many open files in system)
	at java.io.RandomAccessFile.open0(Native Method)
	at java.io.RandomAccessFile.open(RandomAccessFile.java:316)
	at java.io.RandomAccessFile.<init>(RandomAccessFile.java:243)
	at org.apache.fontbox.ttf.BufferedRandomAccessFile.<init>(BufferedRandomAccessFile.java:88)
	at org.apache.fontbox.ttf.RAFDataStream.<init>(RAFDataStream.java:63)
	at org.apache.fontbox.ttf.TTFParser.parse(TTFParser.java:84)
	at org.apache.pdfbox.pdmodel.font.PDType0Font.load(PDType0Font.java:66)
	at DefaultTestBed.main(DefaultTestBed.java:690)

2.0.12-SNAPSHOT completed successfully, so the issue seems to be fixed in the snapshot. Big thanks! We can remove our kludgy workaround code as soon as 2.0.12 is released.

danfickle added a commit that referenced this issue Jul 14, 2018
…ith our close font workaround. [ci skip]

This really needs a release of PDFBOX 2.0.12 for a proper fix.
@THausherr

Yes, the file leak was fixed in https://issues.apache.org/jira/browse/PDFBOX-4242 .

@rekhubs

rekhubs commented Aug 15, 2018

I ran into the same problem, then I checked this post:
#225 (comment)

and tried a bit with FSSupplier, and it's working now:

  • an internal class implementing FSSupplier
  • subset set to false (using the longer useFont() overload)

@danfickle
Owner

Closing now: the workaround code has been removed because the fix was published in our dependency, PDFBox 2.0.12. Thanks everyone for reporting.

@f-necas

f-necas commented Mar 11, 2020

I'm still seeing the issue in 2.0.18:

 java.io.IOException: The TrueType font null does not contain a 'cmap' table
 	at org.apache.fontbox.ttf.TrueTypeFont.getUnicodeCmapImpl(TrueTypeFont.java:553)
 	at org.apache.fontbox.ttf.TrueTypeFont.getUnicodeCmapLookup(TrueTypeFont.java:533)
 	at org.apache.fontbox.ttf.TrueTypeFont.getUnicodeCmapLookup(TrueTypeFont.java:519)
 	at org.apache.fontbox.ttf.TTFSubsetter.<init>(TTFSubsetter.java:90)
 	at org.apache.pdfbox.pdmodel.font.TrueTypeEmbedder.subset(TrueTypeEmbedder.java:321)
 	at org.apache.pdfbox.pdmodel.font.PDType0Font.subset(PDType0Font.java:256)
 	at org.apache.pdfbox.pdmodel.PDDocument.save(PDDocument.java:1349)
 	at org.apache.pdfbox.pdmodel.PDDocument.save(PDDocument.java:1328)
 	at org.apache.pdfbox.pdmodel.PDDocument.save(PDDocument.java:1316)
 	at com.seripress.framework.utils.CopyTestHelper.generateCopyTest(CopyTestHelper.java:432)
 	at com.seripress.servlet.ImageServlet.generateFinalRendering(ImageServlet.java:488)
 	at com.seripress.servlet.ImageServlet.handleRequestedType(ImageServlet.java:313)
 	at com.seripress.servlet.ImageServlet.handleFileRequest(ImageServlet.java:140)
 	at com.seripress.servlet.ImageServlet.doGet(ImageServlet.java:94)
 	at javax.servlet.http.HttpServlet.service(HttpServlet.java:686)
 	at javax.servlet.http.HttpServlet.service(HttpServlet.java:791)
 	at io.undertow.servlet.handlers.ServletHandler.handleRequest(ServletHandler.java:74)
 	at io.undertow.servlet.handlers.FilterHandler$FilterChainImpl.doFilter(FilterHandler.java:129)
 	at io.undertow.websockets.jsr.JsrWebSocketFilter.doFilter(JsrWebSocketFilter.java:173)
 	at io.undertow.servlet.core.ManagedFilter.doFilter(ManagedFilter.java:61)
 	at io.undertow.servlet.handlers.FilterHandler$FilterChainImpl.doFilter(FilterHandler.java:131)
 	at com.seripress.framework.filter.CharacterEncodingFilter.doFilter(CharacterEncodingFilter.java:29)
 	at io.undertow.servlet.core.ManagedFilter.doFilter(ManagedFilter.java:61)
 	at io.undertow.servlet.handlers.FilterHandler$FilterChainImpl.doFilter(FilterHandler.java:131)
 	at com.seripress.filter.SessionFilter.doFilter(SessionFilter.java:61)
 	at io.undertow.servlet.core.ManagedFilter.doFilter(ManagedFilter.java:61)
 	at io.undertow.servlet.handlers.FilterHandler$FilterChainImpl.doFilter(FilterHandler.java:131)
 	at io.opentracing.contrib.jaxrs2.server.SpanFinishingFilter.doFilter(SpanFinishingFilter.java:55)
 	at io.undertow.servlet.core.ManagedFilter.doFilter(ManagedFilter.java:61)
 	at io.undertow.servlet.handlers.FilterHandler$FilterChainImpl.doFilter(FilterHandler.java:131)
 	at io.undertow.servlet.handlers.FilterHandler.handleRequest(FilterHandler.java:84)
 	at io.undertow.servlet.handlers.security.ServletSecurityRoleHandler.handleRequest(ServletSecurityRoleHandler.java:62)
 	at io.undertow.servlet.handlers.ServletChain$1.handleRequest(ServletChain.java:68)
 	at io.undertow.servlet.handlers.ServletDispatchingHandler.handleRequest(ServletDispatchingHandler.java:36)
 	at org.wildfly.extension.undertow.security.SecurityContextAssociationHandler.handleRequest(SecurityContextAssociationHandler.java:78)
 	at io.undertow.server.handlers.PredicateHandler.handleRequest(PredicateHandler.java:43)
 	at io.undertow.servlet.handlers.security.SSLInformationAssociationHandler.handleRequest(SSLInformationAssociationHandler.java:132)
 	at io.undertow.servlet.handlers.security.ServletAuthenticationCallHandler.handleRequest(ServletAuthenticationCallHandler.java:57)
 	at io.undertow.server.handlers.PredicateHandler.handleRequest(PredicateHandler.java:43)
 	at io.undertow.security.handlers.AuthenticationConstraintHandler.handleRequest(AuthenticationConstraintHandler.java:53)
 	at io.undertow.security.handlers.AbstractConfidentialityHandler.handleRequest(AbstractConfidentialityHandler.java:46)
 	at io.undertow.servlet.handlers.security.ServletConfidentialityConstraintHandler.handleRequest(ServletConfidentialityConstraintHandler.java:64)
 	at io.undertow.servlet.handlers.security.ServletSecurityConstraintHandler.handleRequest(ServletSecurityConstraintHandler.java:59)
 	at io.undertow.security.handlers.AuthenticationMechanismsHandler.handleRequest(AuthenticationMechanismsHandler.java:60)
 	at io.undertow.servlet.handlers.security.CachedAuthenticatedSessionHandler.handleRequest(CachedAuthenticatedSessionHandler.java:77)
 	at io.undertow.security.handlers.NotificationReceiverHandler.handleRequest(NotificationReceiverHandler.java:50)
 	at io.undertow.security.handlers.AbstractSecurityContextAssociationHandler.handleRequest(AbstractSecurityContextAssociationHandler.java:43)
 	at io.undertow.server.handlers.PredicateHandler.handleRequest(PredicateHandler.java:43)
 	at org.wildfly.extension.undertow.security.jacc.JACCContextIdHandler.handleRequest(JACCContextIdHandler.java:61)
 	at io.undertow.server.handlers.PredicateHandler.handleRequest(PredicateHandler.java:43)
 	at org.wildfly.extension.undertow.deployment.GlobalRequestControllerHandler.handleRequest(GlobalRequestControllerHandler.java:68)
 	at io.undertow.server.handlers.PredicateHandler.handleRequest(PredicateHandler.java:43)
 	at io.undertow.servlet.handlers.ServletInitialHandler.handleFirstRequest(ServletInitialHandler.java:292)
 	at io.undertow.servlet.handlers.ServletInitialHandler.access$100(ServletInitialHandler.java:81)
 	at io.undertow.servlet.handlers.ServletInitialHandler$2.call(ServletInitialHandler.java:138)
 	at io.undertow.servlet.handlers.ServletInitialHandler$2.call(ServletInitialHandler.java:135)
 	at io.undertow.servlet.core.ServletRequestContextThreadSetupAction$1.call(ServletRequestContextThreadSetupAction.java:48)
 	at io.undertow.servlet.core.ContextClassLoaderSetupAction$1.call(ContextClassLoaderSetupAction.java:43)
 	at org.wildfly.extension.undertow.security.SecurityContextThreadSetupAction.lambda$create$0(SecurityContextThreadSetupAction.java:105)
 	at org.wildfly.extension.undertow.deployment.UndertowDeploymentInfoService$UndertowThreadSetupAction.lambda$create$0(UndertowDeploymentInfoService.java:1502)
 	at org.wildfly.extension.undertow.deployment.UndertowDeploymentInfoService$UndertowThreadSetupAction.lambda$create$0(UndertowDeploymentInfoService.java:1502)
 	at org.wildfly.extension.undertow.deployment.UndertowDeploymentInfoService$UndertowThreadSetupAction.lambda$create$0(UndertowDeploymentInfoService.java:1502)
 	at org.wildfly.extension.undertow.deployment.UndertowDeploymentInfoService$UndertowThreadSetupAction.lambda$create$0(UndertowDeploymentInfoService.java:1502)
 	at io.undertow.servlet.handlers.ServletInitialHandler.dispatchRequest(ServletInitialHandler.java:272)
 	at io.undertow.servlet.handlers.ServletInitialHandler.access$000(ServletInitialHandler.java:81)
 	at io.undertow.servlet.handlers.ServletInitialHandler$1.handleRequest(ServletInitialHandler.java:104)
 	at io.undertow.server.Connectors.executeRootHandler(Connectors.java:364)
 	at io.undertow.server.HttpServerExchange$1.run(HttpServerExchange.java:830)
 	at org.jboss.threads.ContextClassLoaderSavingRunnable.run(ContextClassLoaderSavingRunnable.java:35)
 	at org.jboss.threads.EnhancedQueueExecutor.safeRun(EnhancedQueueExecutor.java:1982)
 	at org.jboss.threads.EnhancedQueueExecutor$ThreadBody.doRunTask(EnhancedQueueExecutor.java:1486)
 	at org.jboss.threads.EnhancedQueueExecutor$ThreadBody.run(EnhancedQueueExecutor.java:1377)
 	at java.lang.Thread.run(Thread.java:745)

@THausherr

It would be better to create a new issue with the smallest possible code that reproduces the problem. (Is this related to openhtmltopdf at all? I don't see it in the stack trace.)

7 participants