Normalize floating point cast #27249

amyeroberts · 2023-11-02T18:15:56Z

What does this PR do?

Normalize image: cast input images to float32 if they are not already of a floating type. Issues can occur when do_rescale=False is set in an image processor. In that case, the image passed to the normalize call is of type uint8, because resize goes through the PIL image library and casts the image on the way. As the mean and std values are cast to match the image dtype, the fractional constants are truncated to 0, and dividing by them causes NaNs and infs to appear in the normalized image.
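The failure mode described above can be reproduced with plain NumPy. This is a minimal sketch, not the library's code: the pixel values and the (0.5, 0.5, 0.5) constants are illustrative assumptions, and the variable names are made up for the example.

```python
import numpy as np

# A uint8 image, as left behind when do_rescale=False keeps pixels in [0, 255].
image = np.array([[[0, 128, 255]]], dtype=np.uint8)

# Illustrative float normalization constants (assumed values).
mean = np.array([0.5, 0.5, 0.5])
std = np.array([0.5, 0.5, 0.5])

# Casting the constants to the image dtype truncates 0.5 to 0.
mean_cast = mean.astype(image.dtype)
std_cast = std.astype(image.dtype)

# Dividing by the zeroed std gives NaN for 0 / 0 and inf for x / 0.
with np.errstate(divide="ignore", invalid="ignore"):
    normalized = (image - mean_cast) / std_cast

print(std_cast)    # the truncated constants
print(normalized)  # contains NaN and inf entries
```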

The mean and std values are cast because they were previously set as float32 by default. With that default, normalizing a float16 input image would upcast the result to float32 too.
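The upcasting behaviour that motivated the original cast can be seen with a small NumPy sketch. The array shape and constants are assumptions chosen for illustration only.

```python
import numpy as np

# A float16 image (illustrative shape, channels last).
image = np.random.rand(4, 4, 3).astype(np.float16)

# Constants stored as float32, as in the earlier default described above.
mean32 = np.array([0.5, 0.5, 0.5], dtype=np.float32)
std32 = np.array([0.5, 0.5, 0.5], dtype=np.float32)

# Mixing float16 and float32 arrays promotes the result to float32.
upcast = (image - mean32) / std32

# Casting the constants to the image dtype keeps the result in float16.
kept = (image - mean32.astype(image.dtype)) / std32.astype(image.dtype)

print(upcast.dtype, kept.dtype)
```

Casting the constants to the image dtype is therefore harmless for floating inputs; it only goes wrong when the image dtype is an integer type.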

Fixes # (issue)

Before submitting

This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
Did you read the contributor guideline, Pull Request section?
Was this discussed/approved via a GitHub issue or the forum? Please add a link to it if that's the case.
Did you make sure to update the documentation with your changes? Here are the documentation guidelines, and here are tips on formatting docstrings.
Did you write any new necessary tests?

HuggingFaceDocBuilderDev · 2023-11-02T18:41:15Z

The documentation is not available anymore as the PR was closed or merged.

rafaelpadilla

Wow! Nice finding! :)

Thank you for fixing it.

ydshieh · 2023-11-03T08:45:35Z

tests/test_image_transforms.py

@@ -278,7 +278,7 @@ def test_resize(self):
        self.assertEqual(resized_image.shape, (4, 30, 40))

    def test_normalize(self):
-        image = np.random.randint(0, 256, (224, 224, 3)) / 255
+        image = np.random.randint(0, 256, (224, 224, 3)).astype(np.float32) / 255


Why does this need a cast to fp32?
(We have a test case below for it, IIUC.)
If we cast here, we no longer test with Python float numbers, right?

This was just laziness on my part, keeping it in float32, because otherwise it's float64. It all works if I remove the cast!

oh, I didn't realize this ...
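For reference, the dtype difference discussed in this thread can be checked directly. This is a small sketch using the same expressions as the test; the variable names are made up for the example.

```python
import numpy as np

ints = np.random.randint(0, 256, (224, 224, 3))

# True division of an integer array by a Python int yields float64.
as_float64 = ints / 255

# Casting first keeps the image in float32 after the division.
as_float32 = ints.astype(np.float32) / 255

print(as_float64.dtype, as_float32.dtype)
```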

ydshieh · 2023-11-03T08:46:27Z

tests/test_image_transforms.py


        # Test image with 4 channels is normalized correctly
        image = np.random.randint(0, 256, (224, 224, 4)) / 255
        mean = (0.5, 0.6, 0.7, 0.8)
        std = (0.1, 0.2, 0.3, 0.4)
-        expected_image = (image - mean) / std
+        expected_image = (image.astype(np.float32) - mean) / std


same question

ydshieh · 2023-11-03T08:47:03Z

tests/test_image_transforms.py

        )

+        # Test float32 image input keeps float32 dtype


This is what I meant previously: we have this test case, as mentioned above (?)

ydshieh · 2023-11-03T08:47:49Z

as the floating values being used to divide the image are now set to 0.

could you explain this a bit more?

amyeroberts · 2023-11-09T17:21:22Z

as the floating values being used to divide the image are now set to 0.

could you explain this a bit more?

Sure! For a lot of the image processors, the default normalization constants are floats, e.g. (0.5, 0.5, 0.5). In the current normalization implementation, mean and std are cast to the dtype of the image. If the input image is e.g. int32, the normalization constants are cast to int and become e.g. (0, 0, 0). So when normalizing with (img - mean) / std, we end up dividing by zero, which is where the NaNs and infs come from.
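The approach this PR describes (cast the image to float32 when it is not of a floating type, then match the constants to the image dtype) can be sketched roughly as follows. `normalize_sketch` is a hypothetical helper written for illustration, assuming a channels-last NumPy array; it is not the function in the library.

```python
import numpy as np

def normalize_sketch(image, mean, std):
    """Hypothetical helper: normalize a channels-last image without integer truncation."""
    # Cast integer (or other non-floating) images to float32 first.
    if not np.issubdtype(image.dtype, np.floating):
        image = image.astype(np.float32)
    # Now matching the constants to the image dtype cannot truncate them to 0,
    # and a float16 image is still not upcast to float32.
    mean = np.asarray(mean, dtype=image.dtype)
    std = np.asarray(std, dtype=image.dtype)
    return (image - mean) / std

int_image = np.array([[[0, 128, 255]]], dtype=np.int32)
out = normalize_sketch(int_image, (0.5, 0.5, 0.5), (0.5, 0.5, 0.5))

half_image = np.ones((2, 2, 3), dtype=np.float16)
half_out = normalize_sketch(half_image, (0.5, 0.5, 0.5), (0.5, 0.5, 0.5))

print(out.dtype, half_out.dtype)
```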

ydshieh · 2023-11-09T21:05:06Z

I see, thank you a lot for the detail.

ydshieh

Great!

* Normalize image - cast input images to float32

This is done if the input image isn't of floating type. Issues can occur when do_rescale=False is set in an image processor. When this happens, the image passed to the call is of type uint8 because of the type casting that happens in resize because of the PIL image library. As the mean and std values are cast to match the image dtype, this can cause NaNs and infs to appear in the normalized image, as the floating values being used to divide the image are now set to 0.

The reason the mean and std values are cast is because previously they were set as float32 by default. However, if the input image was of type float16, the normalization would result in the image being upcast to float32 too.

* Add tests
* Remove float32 cast

amyeroberts requested review from ydshieh and rafaelpadilla November 2, 2023 18:16

amyeroberts mentioned this pull request Nov 2, 2023

fix CLIPImageProcessor returns NaNs/Infs when input is a float tensor… #26542

Closed

5 tasks

rafaelpadilla approved these changes Nov 2, 2023

ydshieh reviewed Nov 3, 2023

ydshieh approved these changes Nov 9, 2023

amyeroberts added 3 commits November 10, 2023 13:47

Add tests

d4d1638

Remove float32 cast

eb1a1bb

amyeroberts force-pushed the normalize-floating-point-cast branch from 6cf3713 to eb1a1bb Compare November 10, 2023 13:47

amyeroberts merged commit ed115b3 into huggingface:main Nov 10, 2023
21 checks passed

amyeroberts deleted the normalize-floating-point-cast branch November 10, 2023 15:35

amyeroberts mentioned this pull request Nov 10, 2023

Add DINOv2 depth estimation #26092

Merged

4 tasks
