PDF file is broken after updating to 0.41 #523

Closed
flash286 opened this issue Oct 13, 2017 · 8 comments
Labels
bug Existing features not working as expected

Comments

@flash286

flash286 commented Oct 13, 2017

After updating to version 0.41, I started getting errors from Ghostscript:

**** Error: replacing malformed number '--nostringval--' with 0. Output may be incorrect.
**** Error reading a content stream. The page may be incomplete. Output may be incorrect.

I'm using Ghostscript (version 9.21) to compress my documents. The final documents look really broken: some pages, for instance, are blank.

How I run Ghostscript:
gs -sDEVICE=pdfwrite -dNOPAUSE -dQUIET -dBATCH -dPDFSETTINGS=/printer -dCompatibilityLevel=1.4 -sOutputFile=/tmp/890b4e33-99d1-4769-8c61-92de3196804f_printable.pdf ~/Downloads/b235f2a2-1392-4cc6-be69-5333c393fc5c.pdf
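
For anyone who wants to script this step, here is a minimal Python sketch of the same invocation. The function names `build_gs_command` and `compress` are made up for illustration and are not part of WeasyPrint or Ghostscript; the flags are the ones from the command above. It merges stderr into stdout because I am not certain which stream Ghostscript uses for its `****` diagnostics, and it assumes the exit code alone may not reveal them.

```python
import os
import subprocess


def build_gs_command(src, dst, settings="/printer", compat="1.4"):
    """Return the argument list for the compression command quoted above."""
    return [
        "gs",
        "-sDEVICE=pdfwrite",
        "-dNOPAUSE",
        "-dQUIET",
        "-dBATCH",
        "-dPDFSETTINGS=" + settings,
        "-dCompatibilityLevel=" + compat,
        "-sOutputFile=" + dst,
        src,
    ]


def compress(src, dst):
    """Run Ghostscript and return (exit_code, combined_output)."""
    # No shell is involved, so expand "~" ourselves.
    cmd = build_gs_command(os.path.expanduser(src), os.path.expanduser(dst))
    # Merge stderr into stdout so diagnostics are kept whichever stream gs uses.
    proc = subprocess.run(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,
    )
    return proc.returncode, proc.stdout
```

Calling `compress("in.pdf", "out.pdf")` requires `gs` on the PATH; inspecting the returned output for `****` lines is one way to notice a damaged file before it reaches the printer.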

With version 0.40, the same source document generates a file without any of these problems, but with 0.40 I run into #441.

@liZe added the bug label Oct 14, 2017
@liZe
Member

liZe commented Oct 14, 2017

Thanks for reporting this issue.

A lot changed in PDF generation in WeasyPrint 0.41 (see #516), but I was hoping to solve problems, not to create new ones!

I've rendered many websites to try to reproduce this bug, and I've been able to get some random errors like:

   **** Error reading a content stream. The page may be incomplete.
               Output may be incorrect.
   **** Error: Form stream has unbalanced q/Q operators (too many q's)
               Output may be incorrect.
   **** Error: File did not complete the page properly and may be damaged.
               Output may be incorrect.

The generated PDF is OK though, I don't get blank pages. Other PDF libraries (namely poppler and those included in web browsers) don't complain when opening the file generated by WeasyPrint. Oh, and I get larger PDF documents with Ghostscript 😉.
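
To compare runs across machines, it can help to turn Ghostscript's output into a list of messages. The sketch below is only an illustration: `parse_gs_errors` is a hypothetical helper, and it assumes the layout shown above, where each diagnostic starts with `****` and indented follow-up lines belong to the previous diagnostic.

```python
def parse_gs_errors(output):
    """Collect Ghostscript '****' diagnostics, joining continuation lines."""
    messages = []
    for line in output.splitlines():
        text = line.strip()
        if not text:
            continue
        if text.startswith("****"):
            # A new diagnostic: drop the leading '****' marker.
            messages.append(text.lstrip("*").strip())
        elif messages:
            # A follow-up line belongs to the previous diagnostic.
            messages[-1] += " " + text
    return messages
```

An empty result only means that no `****` line was found in the captured text; it does not prove that the PDF is valid.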

Could you please provide a simple HTML+CSS example that gives a blank page for you?

@flash286
Author

@liZe Here is an example of what I'm generating
https://www.dropbox.com/s/s204nsa9nac7b2k/pdf_document_example.zip?dl=0

@liZe
Member

liZe commented Oct 17, 2017

What a great document!

I've tried to generate the PDF with WeasyPrint 0.41 and compress it with Ghostscript, and I don't get any error 😢. The PDF I get is really small and works perfectly; Ghostscript and the PDF viewer don't complain at all.

I probably can't reproduce the problem because I'm using different versions of software and libraries than you. The usual suspects are Python 2.x, Cairo 1.14.x and macOS. Do you use any of those on your server?

@liZe
Member

liZe commented Oct 18, 2017

@flash286 I've seen that the PDF included in your archive was generated by pypdf, according to its metadata. Do you use it in your toolchain?

@flash286
Author

flash286 commented Oct 18, 2017

@liZe Yes, I'm using it to set the bleed, trim... boxes for the printable version. But this step runs after Ghostscript.

By the way, about versions:
I'm using Python 2.x, Cairo 1.14.x and macOS as you.

@liZe
Member

liZe commented Oct 18, 2017

> @liZe Yes, I'm using it to set the bleed, trim... boxes for the printable version. But this step runs after Ghostscript.

Support for bleed was added in 0.41 (I don't know if it's enough for you; I'd be glad to get feedback about this feature).

> I'm using Python 2.x and Cairo 1.14.x as you.

Python 2.x and Cairo 1.14.x are the usual suspects because I don't use them, I use Python 3.x and Cairo 1.15.x 😄. I'll launch some tests tomorrow with your versions and see if I can reproduce.

@liZe
Member

liZe commented Oct 19, 2017

I've reproduced the problem with Cairo 1.14.10 and Python 2.7.14, at least the errors given by Ghostscript (I still get a perfect PDF).

The "fix" was introduced in Cairo 1.15.4, where a lot of work was done to improve PDF generation.

Fun fact: I only get the error with the combination of Python 2.7 and Cairo 1.14.10:

  • Python 2.7 and Cairo 1.14.10: problem
  • Python 2.7 and Cairo 1.15.8: ok
  • Python 3.5/6 and Cairo 1.14.10: ok
  • Python 3.5/6 and Cairo 1.15.8: ok
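
The matrix above can be turned into a quick environment check. This is a rough sketch, not an official diagnostic: `parse_version` and `likely_affected` are hypothetical names, and the 1.15.4 threshold is taken from the comment above, whereas only 1.14.10 and 1.15.8 were actually tested. The optional part at the bottom assumes cairocffi is installed and uses its `cairo_version_string()` function.

```python
import sys


def parse_version(text):
    """Turn a string such as '1.14.10' into (1, 14, 10) for comparison."""
    return tuple(int(part) for part in text.split("."))


def likely_affected(python_version, cairo_version):
    """Return True for the failing combination reported in this thread.

    Assumption: every Cairo release before 1.15.4 behaves like 1.14.10.
    """
    return python_version[0] == 2 and parse_version(cairo_version) < (1, 15, 4)


if __name__ == "__main__":
    try:
        import cairocffi
        cairo = cairocffi.cairo_version_string()
    except (ImportError, OSError):
        # cairocffi or the Cairo shared library is not available here.
        cairo = None
    if cairo is None:
        print("Could not detect the Cairo version")
    else:
        print("Likely affected:", likely_affected(sys.version_info[:3], cairo))
```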

I'm not sure that I want to know what's really happening here 😉. I don't know why the bug is only visible with WeasyPrint 0.41, but I suspect that two different bugs, one in Cairo < 1.15.4 and one in pdfrw + Python 2.x, combine to cause this error in Ghostscript.

If you can try a more recent version of Python and/or Cairo, that may be the easiest workaround. The truth is that I'd rather drop Python 2 support than spend hours debugging encoding problems during PDF generation in third-party libraries 😒.

@flash286
Author

@liZe Thank you for your investigation. I will try to update the Cairo library on my servers or implement a separate module on Python 3.x.
