KeyError on malformed PDF #583
Comments
I can confirm this with
Traceback (most recent call last):
  File "/home/moose/foo.py", line 8, in <module>
    pdf.getFields()
  File "/home/moose/Github/py-pdf/PyPDF2/PyPDF2/pdf.py", line 1331, in getFields
    catalog = self.trailer[TK.ROOT]
  File "/home/moose/Github/py-pdf/PyPDF2/PyPDF2/generic.py", line 539, in __getitem__
    return dict.__getitem__(self, key).getObject()
KeyError: '/Root'
According to "TABLE 3.13 Entries in the file trailer dictionary" of the PDF 1.7 specification, the "/Root" key is required in the trailer. So I can confirm that the PDF is malformed. |
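To make the point about required trailer entries concrete, here is a minimal sketch of a pre-check. It assumes the trailer can be treated as a plain mapping; `REQUIRED_TRAILER_KEYS` and `missing_trailer_keys` are hypothetical names for illustration, not part of PyPDF2. Besides "/Root", the same table of the specification also marks "/Size" as required.

```python
# Keys that the trailer-dictionary table of the PDF 1.7 specification
# marks as required. This tuple is an illustrative constant, not a PyPDF2 name.
REQUIRED_TRAILER_KEYS = ("/Size", "/Root")


def missing_trailer_keys(trailer):
    """Return the required keys that are absent from a trailer-like mapping."""
    return [key for key in REQUIRED_TRAILER_KEYS if key not in trailer]


# A plain dict stands in for the parsed trailer of a malformed file.
broken_trailer = {"/Size": 12}
print(missing_trailer_keys(broken_trailer))
```

An empty result would mean both required entries are present; anything else names the entries whose absence makes the file malformed.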
I've just updated the example and confirmed that it still happens. |
Some more examples where this is happening when accessing, e.g.,
They all seem to be broken, however, there is one which can at least be opened in a normal PDF viewer (but it seems to be empty). |
Those files are deeply damaged and cannot be opened with Acrobat Reader (even yaleb_exs ???). @MartinThoma, this issue should be closed. |
+1? (to get below 60😎) |
I just tried to open those files. Yes, my PDF viewer can open some of them (not all), but all of them showed just a blank page. The only thing that sometimes changed was the dimension of that blank page. I'll close this for the moment, as we have more important issues to work on than driving robustness up that much. |
I've linked it in #1210 as we might want to throw PdfReadError instead of the KeyError. |
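One way the idea of raising a read error instead of a bare KeyError could look is sketched below. This is not the library's implementation: the `PdfReadError` class here is a local stand-in defined only so the example runs on its own, and `get_catalog` is a hypothetical helper that takes a plain mapping in place of the real trailer object.

```python
class PdfReadError(Exception):
    """Local stand-in for a library read-error type (assumption for this sketch)."""


def get_catalog(trailer):
    """Look up the document catalog, reporting a missing '/Root' as a read error."""
    try:
        return trailer["/Root"]
    except KeyError as exc:
        # Chain the original KeyError so the low-level cause stays visible.
        raise PdfReadError("trailer has no /Root entry; the PDF is malformed") from exc


try:
    get_catalog({})  # an empty dict mimics the malformed trailer
except PdfReadError as err:
    print(f"could not read PDF: {err}")
```

With this shape, callers can catch one domain-specific exception for malformed input rather than guarding against KeyError from internal dictionary lookups.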
Running the following code with the latest PyPI version of PyPDF2 on the attached input results in an unexpected KeyError.
Edit: Updated to reflect the PyPDF2==2.4.2 version.
MCVE: Code + PDF
PDF file: test.pdf
Traceback