Pages without Resources dictionary #270

Closed
jvalenzuela opened this issue Jun 23, 2016 · 12 comments · Fixed by #1349
Labels
is-robustness-issue: From a user's perspective, this is about robustness
needs-pdf: The issue needs a PDF file to show the problem
needs-test: A test should be added before this PR is merged.

Comments

@jvalenzuela

mergePage() fails on pages that do not have a /Resources dictionary. This appears to be a valid condition, and the page should then inherit the dictionary from its parent. Traceback below:

Traceback (most recent call last):
  File "C:\Python27\lib\lib-tk\Tkinter.py", line 1536, in __call__
    return self.func(*args)
  File "pdfbind\view.py", line 153, in _on_execute_click
    b.bind()
  File "pdfbind\bind.py", line 176, in bind
    page_bufs.append(page_header.merge(p, header_info))
  File "pdfbind\header.py", line 68, in merge
    header_page.mergePage(orig_page)
  File "PyPDF2\pdf.py", line 2211, in mergePage
    self._mergePage(page2)
  File "PyPDF2\pdf.py", line 2221, in _mergePage
    page2Resources = page2["/Resources"].getObject()
  File "PyPDF2\generic.py", line 512, in __getitem__
    return dict.__getitem__(self, key).getObject()
KeyError: '/Resources'
@mstamy2
Collaborator

mstamy2 commented Jun 23, 2016

Could you possibly share the PDF(s) you're working with so I can take a closer look? PyPDF2 does (or is supposed to) support inheritance of missing page attributes from a parent.
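For readers following along, the inheritance mentioned here can be sketched roughly as below. This is a minimal illustration that models pages as plain Python dicts, not PyPDF2's actual implementation; `find_inherited` and the sample page tree are hypothetical names made up for this example.

```python
def find_inherited(page, key, default=None):
    """Look up `key` on a page, then on each ancestor reached via /Parent."""
    node = page
    seen = set()  # guards against malformed files with cyclic /Parent links
    while node is not None and id(node) not in seen:
        seen.add(id(node))
        if key in node:
            return node[key]
        node = node.get("/Parent")
    return default


# Hypothetical page tree: only the root /Pages node carries /Resources.
root = {"/Type": "/Pages", "/Resources": {"/Font": {"/F1": "Helvetica"}}}
page = {"/Type": "/Page", "/Parent": root}
# A page whose ancestors also lack /Resources, as in the reported failure.
orphan = {"/Type": "/Page", "/Parent": {"/Type": "/Pages"}}

inherited = find_inherited(page, "/Resources")
missing = find_inherited(orphan, "/Resources")
```

A direct `page["/Resources"]` lookup skips this walk, which is consistent with the KeyError in the traceback above.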

@jvalenzuela
Author

Here's one of the files causing the problem. Starting with some other page from another document, then calling mergePage() with this PDF results in the above error.
108.pdf

@mstamy2
Collaborator

mstamy2 commented Jun 24, 2016

While PyPDF2 does allow inheriting certain page attributes, it appears that none of the page's parents contain the Resources dictionary either. It is a required entry; however, I'll try to implement a workaround in strict=False mode.
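One possible shape for such a workaround, sketched with plain dicts standing in for page objects: treat a missing /Resources entry as an empty dictionary unless strict behaviour is requested. The helper names `get_resources` and `merge_resources` are hypothetical, and the `strict` parameter only mirrors the idea above; this is not the fix that was eventually merged.

```python
def get_resources(page, strict=False):
    """Return the page's /Resources, or an empty dict in non-strict mode."""
    try:
        return page["/Resources"]
    except KeyError:
        if strict:
            raise  # keep the original failure for callers that want it
        return {}


def merge_resources(page1, page2, strict=False):
    """Combine two pages' resources; entries from page2 win on key clashes."""
    merged = dict(get_resources(page1, strict))
    merged.update(get_resources(page2, strict))
    return merged


# Hypothetical pages: one with fonts, one with no /Resources at all.
header_page = {"/Resources": {"/Font": {"/F1": "Helvetica"}}}
bare_page = {"/Type": "/Page"}

merged = merge_resources(header_page, bare_page)
```

A real merge would also have to combine nested entries such as /Font and /XObject rather than overwrite them; the sketch only shows the missing-key fallback.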

@MartinThoma added the is-robustness-issue label Apr 18, 2022
@sjacob90

Was this issue resolved?

@MartinThoma MartinThoma added is-robustness-issue From a users perspective, this is about robustness and removed is-robustness-issue From a users perspective, this is about robustness Nonconforming labels Jun 19, 2022
@MartinThoma MartinThoma changed the title Pages w/o Resouce dictionary Pages without Resources dictionary Jul 10, 2022
@pubpub-zz
Collaborator

pubpub-zz commented Sep 3, 2022

> Here's one of the files causing the problem. Starting with some other page from another document, then calling mergePage() with this PDF results in the above error.
> 108.pdf

Tested successfully:

import PyPDF2

# Read the attached file and write it back out through PdfMerger.
p = PyPDF2.PdfReader("c:/108.pdf")
m = PyPDF2.PdfMerger()
m.append(p)
with open("c:/tt.pdf", "wb") as f:
    m.write(f)

The issue can be closed.

@MartinThoma
Member

Thank you for checking @pubpub-zz ❤️

@FredrikWallstrom

FredrikWallstrom commented Sep 14, 2022

Maybe I'm missing something, but it looks like #1276 only fixes the _extract_text function.
I'm still having issues with the _merge_page function at this call: original_resources = cast(DictionaryObject, self[PG.RESOURCES].get_object()) when a page is missing the /Resources dict.

  File "/site-packages/PyPDF2/_page.py", line 508, in merge_page
    self._merge_page(page2, expand=expand)
  File "/site-packages/PyPDF2/_page.py", line 532, in _merge_page
    original_resources = cast(DictionaryObject, self[PG.RESOURCES].get_object())
  File "/site-packages/PyPDF2/generic/_data_structures.py", line 149, in __getitem__
    return dict.__getitem__(self, key).get_object()
KeyError: '/Resources'

@MartinThoma MartinThoma reopened this Sep 14, 2022
@MartinThoma
Member

@FredrikWallstrom Which version of PyPDF2 are you using?

@FredrikWallstrom

> Which version of PyPDF2 are you using?

2.10.8

@pubpub-zz
Collaborator

@FredrikWallstrom
To make sure we focus on the real problem, can you provide a test file and code?

Thanks

@MartinThoma added the needs-pdf and needs-test labels Sep 15, 2022
@FredrikWallstrom

FredrikWallstrom commented Sep 15, 2022

PDF: 108.pdf

Stupid code example but the principle is the same:

    from PyPDF2 import PdfReader

    # Merge the first page of 108.pdf into itself to trigger the KeyError.
    reader = PdfReader("108.pdf")
    page_one = reader.pages[0]
    page_two = reader.pages[0]
    page_one.merge_page(page_two)

@pubpub-zz
Collaborator

A good example improves analysis. Thanks.

It should be good now.
