Set SMaskInData to 1 for PDFs with alpha #7316

Merged
merged 3 commits into from
Aug 4, 2023


radarhere
Member

Helps #7312

Pillow can convert a PNG with transparency to PDF

from PIL import Image
im = Image.open("Tests/images/transparent.png")
im.save("transparent.pdf")

However, what exactly does transparency mean in PDF? There is debate about this in #7312, where various image viewers are discussed, so I'll try to keep it simple here by saying that if I convert the PDF back to PNG with ImageMagick, convert transparent.pdf out.png, at the moment I get a black background instead of transparency.

In https://opensource.adobe.com/dc-acrobat-sdk-docs/pdfstandards/PDF32000_2008.pdf, under 7.4.9 'JPXDecode Filter', it states

SMaskInData specifies whether soft-mask information packaged with the image samples shall be used (see 11.6.5.3, "Soft-Mask Images"); if it is, the SMask entry shall not be present. If SMaskInData is nonzero, there shall be only one opacity channel in the JPEG2000 data and it shall apply to all colour channels.

If I set SMaskInData to 1 when saving the PDF, the roundtripped PNG keeps its transparency.
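
As a rough way to check for the flag without relying on any particular viewer, the raw bytes of a saved PDF can be scanned for the entry. This is only a sketch: the helper name `smask_in_data_values` is hypothetical, the sample bytes are made up for illustration, and it assumes the image dictionary is written uncompressed in the file (it would miss a dictionary stored inside a compressed object stream).

```python
import re
from typing import List

# Matches "/SMaskInData" followed by whitespace and an integer value,
# for example b"/SMaskInData 1".
_SMASK_IN_DATA = re.compile(rb"/SMaskInData\s+(\d+)")


def smask_in_data_values(pdf_bytes: bytes) -> List[int]:
    """Return every integer value found for /SMaskInData in the raw bytes."""
    return [int(match.group(1)) for match in _SMASK_IN_DATA.finditer(pdf_bytes)]


# Hypothetical fragment of an image dictionary, for illustration only.
sample = b"<< /Type /XObject /Subtype /Image /Filter /JPXDecode /SMaskInData 1 >>"
print(smask_in_data_values(sample))
```

For a real file, the same function could be called on the contents of transparent.pdf read in binary mode; an empty list would suggest the entry is absent.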

@radarhere
Member Author

The issue has also pointed out that in the specification I linked to earlier, under 8.9.5 'Image Dictionaries', it describes 'BitsPerComponent' by saying

If the image stream uses the JPXDecode filter, this entry is optional and shall be ignored if present. The bit depth is determined by the conforming reader in the process of decoding the JPEG2000 image.

The issue has also pointed out that the specification, under 7.4.9 'JPXDecode Filter', states

ColorSpace shall be optional since JPEG2000 data contain colour space specifications.

So I've added commits to remove these two entries for JPXDecode, to save space.
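
The rule described in this thread can be sketched as a small function that decides which image dictionary entries to write. This is not Pillow's actual code: `image_dict_entries` and its parameters are hypothetical names, and the dictionary is a plain Python dict rather than a real PDF object.

```python
def image_dict_entries(filter_name, width, height, has_alpha=False,
                       color_space="DeviceRGB", bits_per_component=8):
    """Build the entries for an image dictionary (illustrative only)."""
    entries = {
        "Type": "XObject",
        "Subtype": "Image",
        "Width": width,
        "Height": height,
        "Filter": filter_name,
    }
    if filter_name == "JPXDecode":
        # For JPXDecode, BitsPerComponent is ignored and ColorSpace is
        # optional, so both are left out to save space.
        if has_alpha:
            # Tell the reader to use the opacity channel stored in the
            # JPEG2000 data itself.
            entries["SMaskInData"] = 1
    else:
        entries["BitsPerComponent"] = bits_per_component
        entries["ColorSpace"] = color_space
    return entries


print(image_dict_entries("JPXDecode", 100, 50, has_alpha=True))
print(image_dict_entries("DCTDecode", 100, 50))
```

The first call yields a dictionary with SMaskInData and without BitsPerComponent or ColorSpace; the second keeps both of those entries because the filter is not JPXDecode.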

@homm
Member

homm commented Aug 3, 2023

Have you tested it? Does it solve the problems with pdf.js from #7312?

@radarhere
Member Author

Testing pdf.js, with main, I get
[image: before]
With this branch, I get
[image: after]

This is the same final result that the user reported in the issue. They believe that the inverted colors are a bug in pdf.js, which they have reported in mozilla/pdf.js#16782.

@hugovk hugovk merged commit 917769b into python-pillow:main Aug 4, 2023
49 of 50 checks passed
@radarhere radarhere deleted the pdf_alpha branch August 4, 2023 08:43