Deleting a widget doesn't remove the object from the fields list, how to remove from the fields list? #2663

ag-gaphp · 2024-05-21T18:41:20Z

I'm having a hard time accomplishing what I thought would be a simple goal: remove a placeholder with pypdf then add a new signature field with pyHanko.

When I remove the widget, it's gone from the annotations list. However, it appears to still remain in the fields list. When pyHanko checks for existing fields, it is iterating the fields list and not the annotations list, so it throws an exception because the field is already there in the fields list.

How can I remove the reference to the field in the fields list?

This is what my function is doing right now:

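# Try using pypdf to remove the placeholder widgets and see if pyHanko still complains.
from pypdf import PdfWriter
from pypdf.constants import (
    AnnotationDictionaryAttributes,
    FieldDictionaryAttributes,
    PageAttributes,
)

boxes = {}  # placeholder name -> page number and rectangle
pypdf_writer = PdfWriter(clone_from=OLD_FILE)
fields = pypdf_writer.get_fields()
for page in pypdf_writer.pages:
    if PageAttributes.ANNOTS not in page:
        continue  # this page has no annotations
    placeholders = []
    for i, annot in enumerate(page[PageAttributes.ANNOTS]):
        annot = annot.get_object()
        if annot[AnnotationDictionaryAttributes.Subtype] == "/Widget":
            try:
                n = annot[FieldDictionaryAttributes.T]
                if n.startswith(('sig', 'init')):
                    rect = annot['/Rect']
                    boxes[n] = {
                        "page": page.page_number,
                        "box": (rect[0], rect[1], rect[2], rect[3])
                    }
                    placeholders.append(i)
                    # Attempted field removal; this is the call that has no effect.
                    fields[n].remove_from_tree()
            except KeyError:
                pass

    # Delete from the end so that earlier indices stay valid.
    for i in placeholders[::-1]:
        del page[PageAttributes.ANNOTS][i]

pypdf_writer.write(NEW_FILE)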

I tried using remove_from_tree() on the field object, but pypdf says it doesn't appear to be a tree item and doesn't touch it. I didn't see any other reference in the docs on removing fields specifically.
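
One way to confirm where the leftover reference lives is to compare the names reachable from a page's /Annots array with the names in the document-level /AcroForm /Fields array. The sketch below is only an illustration: `resolve`, `widget_names`, and `field_names` are hypothetical helpers, not pypdf functions. It relies on plain dict and list behaviour, which pypdf's DictionaryObject and ArrayObject inherit, and it only inspects top-level fields (no /Kids traversal).

```python
def resolve(obj):
    """Follow an indirect reference if the object has one to follow."""
    return obj.get_object() if hasattr(obj, "get_object") else obj


def widget_names(page):
    """Names (/T) of the widget annotations listed in a page's /Annots."""
    names = []
    for ref in resolve(page.get("/Annots")) or []:
        annot = resolve(ref)
        if annot.get("/Subtype") == "/Widget" and "/T" in annot:
            names.append(str(annot["/T"]))
    return names


def field_names(root):
    """Names (/T) of the top-level fields under /AcroForm /Fields."""
    acroform = resolve(root.get("/AcroForm")) or {}
    fields = resolve(acroform.get("/Fields")) or []
    names = []
    for ref in fields:
        field = resolve(ref)
        if "/T" in field:
            names.append(str(field["/T"]))
    return names
```

With pypdf this could presumably be called as `widget_names(page)` for each page in `pypdf_writer.pages` and `field_names(pypdf_writer._root_object)`. A name that shows up only in the second list is the kind of leftover that a tool iterating the fields list would still find.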

pubpub-zz · 2024-05-21T21:45:13Z

pubpub-zz
May 21, 2024
Maintainer

Annotations are stored in an array. Please refer to the PDF reference.

3 replies

ag-gaphp May 22, 2024
Author

I understand they are stored per page in an array called /Annots, but they are also referenced from a document-wide array under /Root.

This additional removal is what I needed for readers/importers that ignore the page's /Annots list and only look through the document root:
writer._root_object['/AcroForm']['/Fields'].remove(annotation.indirect_reference)
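
The one-liner above removes a single known reference. A minimal sketch of the same idea applied by name prefix follows, assuming the placeholders are top-level entries in /Fields (fields nested under /Kids are not handled). `resolve` and `remove_fields_by_name` are hypothetical helper names, not part of pypdf; the code depends only on dict and list behaviour, which pypdf's DictionaryObject and ArrayObject inherit.

```python
def resolve(obj):
    """Follow an indirect reference if the object has one to follow."""
    return obj.get_object() if hasattr(obj, "get_object") else obj


def remove_fields_by_name(root, prefixes):
    """Remove top-level /AcroForm /Fields entries whose /T starts with a prefix.

    Returns the removed names in their original order.
    """
    acroform = resolve(root.get("/AcroForm"))
    if not acroform:
        return []
    fields = resolve(acroform.get("/Fields"))
    if not fields:
        return []
    removed = []
    # Walk backwards so deleting by index does not shift unvisited entries.
    for i in range(len(fields) - 1, -1, -1):
        name = str(resolve(fields[i]).get("/T", ""))
        if name.startswith(tuple(prefixes)):
            removed.append(name)
            del fields[i]
    removed.reverse()
    return removed
```

With pypdf, something like `remove_fields_by_name(writer._root_object, ("sig", "init"))` before writing the file would mirror the prefixes used in the question. Note that this only touches the fields list; the page's /Annots entries still have to be removed separately.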

More context on what triggered this issue for me:
MatthiasValvekens/pyHanko#430

Answer selected by ag-gaphp

pubpub-zz May 22, 2024
Maintainer

Your code deletes the field, not the widget that renders the information on the page. The visual result may look correct only because the viewer finds a broken link, but this is not recommended.
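
Following that point, here is a sketch that removes both sides together: the widget from the page's /Annots and the matching entry from /AcroForm /Fields. It is an illustration under assumptions, not pypdf's own API: `resolve`, `_drop_identical`, and `remove_widget_and_field` are hypothetical names. It covers a field merged with its single widget, and a widget whose name lives on its /Parent field. It matches entries by object identity, which assumes that resolving the same reference twice yields the same Python object.

```python
def resolve(obj):
    """Follow an indirect reference if the object has one to follow."""
    return obj.get_object() if hasattr(obj, "get_object") else obj


def _drop_identical(array, target):
    """Delete every entry of `array` that resolves to the `target` object."""
    for i in range(len(array) - 1, -1, -1):
        if resolve(array[i]) is target:
            del array[i]


def remove_widget_and_field(root, page, name):
    """Remove the widget named `name` from `page` and its field from the catalog.

    Returns True if a matching widget was found on the page.
    """
    annots = resolve(page.get("/Annots")) or []
    acroform = resolve(root.get("/AcroForm")) or {}
    fields = resolve(acroform.get("/Fields")) or []

    found = False
    for ref in list(annots):  # iterate over a copy while deleting
        annot = resolve(ref)
        if annot.get("/Subtype") != "/Widget":
            continue
        parent = resolve(annot.get("/Parent"))
        # The name sits on the widget itself (merged case) or on its parent field.
        owner = annot if "/T" in annot else parent
        if owner is None or str(owner.get("/T")) != name:
            continue
        found = True
        _drop_identical(annots, annot)  # widget side: page /Annots
        _drop_identical(fields, annot)  # field side: merged field/widget
        if parent is not None:
            kids = resolve(parent.get("/Kids")) or []
            _drop_identical(kids, annot)
            if owner is parent and not kids:
                _drop_identical(fields, parent)  # named field has no widgets left
    return found
```

With pypdf the call would presumably look like `remove_widget_and_field(writer._root_object, page, "sig1")` for each page. Deeper field hierarchies and other caches of the field (for example an appearance or calculation order entry) are out of scope for this sketch.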

ag-gaphp May 22, 2024
Author

My full tool would run both operations, but PyMuPDF actually handles single field/widget removal out of the box so I have gone with that for this project.

Thanks for your help.
