Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix font filehandle leak #215

Merged
merged 5 commits into from
Apr 25, 2018

Conversation

rototor
Copy link
Contributor

@rototor rototor commented Apr 24, 2018

We've got a file handle leak in production which caused our Tomcat to stop working after some days, and were able to track this down to fonts files which were not closed. After the user generated enough reports the 2048 file descriptors were exhausted and everything stopped working ..

If a font file was registered but not used in the PDF, those font files would leak their file handles.

Otherwise this may leak file handles. Those file handles should be
finalized() sometime in future by the GC, but this may take way to long
before all file handles are exhausted.

It seems strange that you must subset a font to let its file handle close,
but thats the way PDFont is implemented. Usually you always use all the
fonts you load into your PDF document. But if you don't use all fonts,
the unused fonts leak if they are not subsetted.

We also must close all TrueTypeCollections we load.

And we don't load fonts provided by file instantly, but rather use a supplier
for the font. This more or less also workarounds the file handle leak, as
no PDFont is loaded till it is really needed. Before this change the PDFont
always has been loaded, even if it is not used. And of course was leaked if
the font was not subset.
@rototor rototor mentioned this pull request Apr 24, 2018
@danfickle
Copy link
Owner

Thanks @rototor

Are you saying only fonts loaded but not used leak? If so, perhaps this is a bug in PDFBOX. Anyway, is it possible to use the workaround mentioned in comments at link below, rather than calling subset? Ie. Get a TrueTypeFont object and close that in your close method?

https://stackoverflow.com/questions/35983135/pdfbox-leaves-font-files-open-after-generating-pdf

@rototor
Copy link
Contributor Author

rototor commented Apr 25, 2018

@danfickle Yes, that should theoretically work too. But it may throw an NPE as close() in TrueTypeFont is not allowed to be called more then once (which is a violation of the close() contract, but ...). To be exact RAFDataStream.close() will crash if it is called twice. Using subset() is a workaround for this problem...

Maybe we should merge this ugly workaround and open a upstream ticket to fix close() in fontbox? And as soon as a fix is released we can cleanup this here.

@danfickle
Copy link
Owner

OK, I'll file an issue with PDFBOX and merge this now. Thanks again.

@danfickle danfickle merged commit b03a9e8 into danfickle:open-dev-v1 Apr 25, 2018
@rototor rototor deleted the fix_font_filehandle_leak branch April 25, 2018 10:41
@THausherr
Copy link

@danfickle did you ever create an issue in PDFBox about this?

@danfickle
Copy link
Owner

@THausherr

No, I failed to get around to it. Should I file one now? By the way, huge thanks for your work on PDFBOX.

@THausherr
Copy link

No longer needed, see https://issues.apache.org/jira/browse/PDFBOX-4242, please read it and comment if you want. I was just wondering whether we had missed an issue.

@danfickle
Copy link
Owner

Thanks @THausherr

I think that discussion covers it. In regard to this project, I was thinking we are only opening unused fonts for their font metrics, so we should probably cache this information between runs and only load the font if actually used in a document. This would be a performance improvement as well as a workaround for this issue.

danfickle added a commit that referenced this pull request Jul 14, 2018
…ith our close font workaround. [ci skip]

This really needs a release of PDFBOX 2.0.12 for a proper fix.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants