[`DETR`] Update the processing to adapt masks & bboxes to reflect padding #28363

amyeroberts · 2024-01-05T20:40:01Z

What does this PR do?

Fixes an issue with the processing of batches of images for DETR and DETR-related models.

Previously, the annotations for these models, specifically the masks and bboxes, weren't updated to account for the new image sizes when padding occurred. This PR pads the masks to match their corresponding images and adjusts the bounding boxes to account for the padding.

The following images show the processing behaviour for annotations when there are two images in the batch, of different sizes.

Each image is resized so that its shortest edge is 800, and the other edge is resized to maintain the aspect ratio. If the resulting longest edge is longer than 1333, then the longest edge is resized to 1333 and the shortest edge is resized to maintain the aspect ratio.
The images are padded to the largest height and width dimensions in the batch.
The masks are padded to match their respective image's padding.
The bounding box values are readjusted to account for the padded image size.
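
The resize and pad sizes described in the steps above can be sketched roughly as follows. This is a minimal illustration, not the library's code: `get_resize_size` and `get_padded_size` are hypothetical helper names, and the use of `round` is an assumption (the actual image processor may truncate instead, which can shift a result by one pixel).

```python
def get_resize_size(height, width, shortest_edge=800, longest_edge=1333):
    """Return (new_height, new_width) for the resize rule described above.

    Hypothetical helper: scale so the shortest edge is `shortest_edge`, unless
    that would push the longest edge past `longest_edge`, in which case scale
    so the longest edge is exactly `longest_edge`.
    """
    short, long = min(height, width), max(height, width)
    scale = shortest_edge / short
    if long * scale > longest_edge:
        scale = longest_edge / long
    # Rounding is an assumption made for this sketch.
    return round(height * scale), round(width * scale)


def get_padded_size(sizes):
    """Largest height and largest width over all (height, width) pairs in the batch."""
    return max(h for h, _ in sizes), max(w for _, w in sizes)


# Two images of different sizes, given as (height, width).
sizes = [get_resize_size(480, 640), get_resize_size(400, 1000)]
padded_size = get_padded_size(sizes)
print(sizes)        # [(800, 1067), (533, 1333)]
print(padded_size)  # (800, 1333)
```

In this example neither image already has the padded size, so both are padded, and both sets of annotations need adjusting.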

Fixes #28153

Bounding boxes

In the previous processing logic, there were two possible scenarios:

If do_normalize=False, no action is needed, because the output bboxes are not in relative format.
If do_normalize=True, the relative coordinates of the bbox need to be updated to account for the additional pixels added by padding.

This PR:

Adds a new argument, do_convert_annotations, which lets the user control whether the bboxes are converted, independently of do_normalize. This is useful because 1) the normalization of bounding boxes is independent of the normalization of pixel values, and 2) the current normalize_annotations logic both normalizes the boxes AND converts them to a different bbox format ((x0, y0, x1, y1) -> (center_x, center_y, width, height)).
Updates the bounding boxes with respect to the padded images only if do_convert_annotations=True. If do_convert_annotations=False, the boxes are not in relative format, so no update is necessary.
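
To illustrate why relative boxes need updating after padding, here is a small sketch. It assumes padding is added only on the bottom and right, so the image's top-left corner stays at the origin, and that boxes are in relative (center_x, center_y, width, height) format. `rescale_relative_box` is a hypothetical name used for illustration, not the function added in this PR.

```python
def rescale_relative_box(box, input_size, padded_size):
    """Re-express a relative (cx, cy, w, h) box with respect to the padded image.

    Sizes are (height, width). With bottom/right padding the absolute pixel
    coordinates do not move, so each relative value only needs scaling by
    input_dim / padded_dim along its axis.
    """
    in_h, in_w = input_size
    out_h, out_w = padded_size
    cx, cy, w, h = box
    return (cx * in_w / out_w, cy * in_h / out_h, w * in_w / out_w, h * in_h / out_h)


# A box covering the whole unpadded 400x500 image...
box = (0.5, 0.5, 1.0, 1.0)
# ...covers only the top-left quarter once the image is padded to 800x1000.
rescaled = rescale_relative_box(box, input_size=(400, 500), padded_size=(800, 1000))
print(rescaled)  # (0.25, 0.25, 0.5, 0.5)
```

Without this rescaling, the box would still claim to cover the entire padded image, including the padding.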

Here we see the input and output images, and their bbox annotations.

On main:

On this branch:

Masks

Masks are updated so they have the same padding as their corresponding image.
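
As a rough sketch of what this means, assume the image is padded on the bottom and right with zeros and that the mask is a 2D grid of 0/1 values. Plain Python lists stand in for arrays here, and `pad_mask_bottom_right` is a hypothetical name, not the library's function.

```python
def pad_mask_bottom_right(mask, padded_size, fill=0):
    """Pad a 2D mask (list of rows) on the bottom and right to (height, width)."""
    out_h, out_w = padded_size
    in_h, in_w = len(mask), len(mask[0])
    if out_h < in_h or out_w < in_w:
        raise ValueError("padded_size must be at least as large as the mask")
    # Extend each existing row on the right, then append new rows at the bottom.
    rows = [list(row) + [fill] * (out_w - in_w) for row in mask]
    rows += [[fill] * out_w for _ in range(out_h - in_h)]
    return rows


mask = [[1, 1],
        [1, 0]]
padded_mask = pad_mask_bottom_right(mask, padded_size=(3, 4))
print(padded_mask)  # [[1, 1, 0, 0], [1, 0, 0, 0], [0, 0, 0, 0]]
```

With NumPy arrays, the same effect would be obtained with `np.pad(mask, ((0, pad_h), (0, pad_w)))`, where `pad_h` and `pad_w` are the number of rows and columns to add.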

Here are the input and output images and masks:

On main:

On this branch:

On main:

On this branch:

Before submitting

This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
Did you read the contributor guideline, Pull Request section?
Was this discussed/approved via a Github issue or the forum? Please add a link to it if that's the case.
Did you make sure to update the documentation with your changes? Here are the documentation guidelines, and here are tips on formatting docstrings.
Did you write any new necessary tests?

HuggingFaceDocBuilderDev · 2024-02-01T17:47:23Z

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

ydshieh

Thank you @amyeroberts , LGTM.

I would suggest making the docstring in normalize_annotation more precise, i.e. mention that it also changes the coordinates to relative ones.

    def normalize_annotation(self, annotation: Dict, image_size: Tuple[int, int]) -> Dict:
        """
        Normalize the boxes in the annotation from `[top_left_x, top_left_y, bottom_right_x, bottom_right_y]` to
        `[center_x, center_y, width, height]` format.
        """
        return normalize_annotation(annotation, image_size=image_size)

And also make explicit the fact that padding is done on the bottom and right.

Both of these could be done in follow-up PRs.
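
For reference, the two things the docstring comment refers to (the format change and the switch to relative coordinates) can be sketched as below. This is an illustration under stated assumptions, not the library's normalize_annotation: `corners_to_normalized_center` is a hypothetical name, and `image_size` is assumed to be (height, width).

```python
def corners_to_normalized_center(box, image_size):
    """Convert an absolute (x0, y0, x1, y1) box to relative (cx, cy, w, h).

    Step 1 changes the format (corners -> center and size).
    Step 2 divides by the image width/height so values fall in [0, 1].
    """
    height, width = image_size
    x0, y0, x1, y1 = box
    center_x, center_y = (x0 + x1) / 2, (y0 + y1) / 2
    box_w, box_h = x1 - x0, y1 - y0
    return (center_x / width, center_y / height, box_w / width, box_h / height)


converted = corners_to_normalized_center((100, 50, 300, 150), image_size=(200, 400))
print(converted)  # (0.5, 0.5, 0.5, 0.5)
```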

…ding (huggingface#28363) * Update the processing so bbox coords are adjusted for padding * Just pad masks * Tidy up, add tests * Better tests * Fix yolos and mark as slow for pycocotols * Fix yolos - return_tensors * Clarify padding and normalization behaviour

amyeroberts force-pushed the fix-detr-bbox-processing branch from 7482b3d to 3c819b0 Compare February 1, 2024 17:27

amyeroberts force-pushed the fix-detr-bbox-processing branch from af6fdbf to e41cbc2 Compare February 2, 2024 17:45

amyeroberts changed the title ~~Update the processing so bbox coords are adjusted for padding~~ [DETR] Update the processing to adapt masks to reflect padding Feb 8, 2024

amyeroberts added 3 commits February 8, 2024 21:56

Update the processing so bbox coords are adjusted for padding

a47fd5b

Just pad masks

d0e8662

Tidy up, add tests

bb22bc6

amyeroberts force-pushed the fix-detr-bbox-processing branch from 7d5e944 to bb22bc6 Compare February 8, 2024 21:56

amyeroberts marked this pull request as ready for review February 8, 2024 21:57

Better tests

370148b

amyeroberts changed the title ~~[DETR] Update the processing to adapt masks to reflect padding~~ [DETR] Update the processing to adapt masks & bboxes to reflect padding Feb 9, 2024

amyeroberts added 2 commits February 9, 2024 18:19

Fix yolos and mark as slow for pycocotols

d9ee41d

Fix yolos - return_tensors

dae9f93

amyeroberts requested a review from ydshieh February 9, 2024 18:47

ydshieh approved these changes Feb 13, 2024

Clarify padding and normalization behaviour

9418d8a

amyeroberts merged commit bd4b83e into huggingface:main Feb 13, 2024
17 of 18 checks passed

amyeroberts deleted the fix-detr-bbox-processing branch February 13, 2024 18:27

amyeroberts mentioned this pull request Feb 15, 2024

Fix - don't return pixel mask for yolos #29048

Closed

Dexterp37 mentioned this pull request Feb 23, 2024

YolosImageProcessor.preprocess drops annotations when padding #29239

Closed

4 tasks

amyeroberts mentioned this pull request Feb 26, 2024

[YOLOS] Fix - return padded annotations #29300

Merged

5 tasks
