pypdf
equivalent of PyMuPDF's pixmap?
#2785
Does pypdf have an equivalent of PyMuPDF's pixmap? I often deal with PDFs where many pages consist primarily of vector graphics (e.g. a plot) along with some image overlays. I would like to extract the vector graphics as raster graphics and perform image processing using other purpose-built tools.
Replies: 1 comment 1 reply
No, pypdf does not have the concept of pixmaps, as it has no actual rendering capabilities.

To render an image of a page or a page region, I tend to use `pdftocairo` directly. Image objects can be extracted as Pillow image objects, either through `page.images` or, for any image content stream not covered by this method, by calling `content_stream.decode_as_image()`.

Speaking of your use case, you should be able to call `image.replace()` to get rid of the overlays if required (see https://pypdf.readthedocs.io/en/latest/modules/PageObject.html#pypdf._page.PageObject.images) and then render the page/page region with your tool of choice.
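
For illustration, here is a rough sketch of how that workflow might look in Python. It is not an official recipe: the helper names (`blank_out_images`, `pdftocairo_command`, `render_page`) are hypothetical, and replacing each overlay with a plain white image of the same size is just one assumed way of "getting rid of" it. The sketch also assumes that pypdf and Pillow are installed and that `pdftocairo` (from poppler-utils) is on `PATH`.

```python
import shutil
import subprocess
from pathlib import Path


def blank_out_images(src_pdf, dst_pdf):
    """Write a copy of src_pdf in which every image XObject is blanked out.

    Assumption: overwriting each image with a white image of the same size
    is an acceptable way to remove the overlays.
    """
    # Imported lazily so the rendering helpers below work without pypdf.
    from PIL import Image
    from pypdf import PdfWriter

    # replace() only works on images that belong to a PdfWriter.
    writer = PdfWriter(clone_from=str(src_pdf))
    for page in writer.pages:
        for image in page.images:
            blank = Image.new("RGB", image.image.size, "white")
            # Note: inline images cannot be replaced and raise a TypeError.
            image.replace(blank)
    writer.write(str(dst_pdf))


def pdftocairo_command(pdf_path, out_prefix, page, dpi=150, crop=None):
    """Build the pdftocairo argument list for rendering one page to PNG.

    crop is an optional (x, y, width, height) tuple in pixels at `dpi`.
    """
    cmd = [
        "pdftocairo", "-png", "-singlefile",
        "-r", str(dpi),
        "-f", str(page), "-l", str(page),
    ]
    if crop is not None:
        x, y, width, height = crop
        cmd += ["-x", str(x), "-y", str(y), "-W", str(width), "-H", str(height)]
    cmd += [str(pdf_path), str(out_prefix)]
    return cmd


def render_page(pdf_path, out_prefix, page, dpi=150, crop=None):
    """Render one page (or a region of it) and return the PNG path."""
    if shutil.which("pdftocairo") is None:
        raise RuntimeError("pdftocairo (poppler-utils) was not found on PATH")
    cmd = pdftocairo_command(pdf_path, out_prefix, page, dpi, crop)
    subprocess.run(cmd, check=True)
    # With -singlefile, pdftocairo writes exactly <out_prefix>.png.
    return Path(f"{out_prefix}.png")
```

A possible usage would be `blank_out_images("plots.pdf", "plots_clean.pdf")` followed by `render_page("plots_clean.pdf", "page1", page=1, dpi=300)`, with the resulting PNG then handed to whatever image-processing tool you prefer. The file names here are placeholders.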