Normalize floating point cast #27249
Wow! Nice finding! :)
Thank you for fixing it.
tests/test_image_transforms.py
Outdated
@@ -278,7 +278,7 @@ def test_resize(self):
         self.assertEqual(resized_image.shape, (4, 30, 40))

     def test_normalize(self):
-        image = np.random.randint(0, 256, (224, 224, 3)) / 255
+        image = np.random.randint(0, 256, (224, 224, 3)).astype(np.float32) / 255
Why does this place need a cast to fp32?
(We have a test case below for it, IIUC.)
If we cast here, we no longer test with Python float numbers, right?
This was just laziness on my part to keep it in float32. Without the cast the image is float64, but everything still works if I remove the cast!
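For readers following the dtype point in this exchange, here is a minimal sketch of the difference the cast makes. The variable names are illustrative and not taken from the test file; only the two expressions come from the diff above.

```python
import numpy as np

# Without the cast: an integer array divided by a Python int gives float64.
image_default = np.random.randint(0, 256, (224, 224, 3)) / 255

# With the cast: a float32 array divided by a Python int stays float32.
image_fp32 = np.random.randint(0, 256, (224, 224, 3)).astype(np.float32) / 255

print(image_default.dtype)
print(image_fp32.dtype)
```

So removing the cast means the test exercises a float64 input rather than a float32 one.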
oh, I didn't realize this ...
tests/test_image_transforms.py
Outdated

         # Test image with 4 channels is normalized correctly
         image = np.random.randint(0, 256, (224, 224, 4)) / 255
         mean = (0.5, 0.6, 0.7, 0.8)
         std = (0.1, 0.2, 0.3, 0.4)
-        expected_image = (image - mean) / std
+        expected_image = (image.astype(np.float32) - mean) / std
same question
         )

         # Test float32 image input keeps float32 dtype
This is what I meant previously: we already have this test case above (?)
Could you explain this a bit more?
Sure! So for a lot of the image processors, the default normalization constants are floats e.g.
I see, thank you a lot for the detail.
Great!
6cf3713
to
eb1a1bb
* Normalize image - cast input images to float32
* Add tests
* Remove float32 cast
What does this PR do?
Casts input images to float32 in normalize. This is done only if the input image isn't already of a floating type.
Issues can occur when do_rescale=False is set in an image processor. When this happens, the image passed to the call is of type uint8 because of the type casting that happens in resize, which relies on the PIL image library. As the mean and std values are cast to match the image dtype, fractional values such as 0.5 are truncated to 0. Dividing the image by a std of 0 then causes NaNs and infs to appear in the normalized image.
The mean and std values are cast because they were previously set as float32 by default. With that default, a float16 input image would be upcast to float32 by the normalization.
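The failure mode described above can be reproduced with plain NumPy. The snippet below is a minimal sketch, not the library's actual code: `normalize_sketch` is a hypothetical helper written for illustration, and it assumes the fix amounts to "cast non-floating inputs to float32, then cast mean and std to the image dtype".

```python
import numpy as np

# A tiny uint8 "image" with one zero and two non-zero channel values.
image = np.array([[[0, 128, 255]]], dtype=np.uint8)
mean = (0.5, 0.5, 0.5)
std = (0.5, 0.5, 0.5)

# Casting the constants to the image dtype truncates 0.5 to 0.
mean_u8 = np.array(mean).astype(image.dtype)
std_u8 = np.array(std).astype(image.dtype)

# Dividing by a zero std yields NaN (0 / 0) and inf (non-zero / 0).
with np.errstate(divide="ignore", invalid="ignore"):
    broken = (image - mean_u8) / std_u8


def normalize_sketch(image, mean, std):
    """Hypothetical helper illustrating the idea behind the fix."""
    # Only non-floating inputs are cast, so float16 inputs keep their dtype.
    if not np.issubdtype(image.dtype, np.floating):
        image = image.astype(np.float32)
    mean = np.asarray(mean, dtype=image.dtype)
    std = np.asarray(std, dtype=image.dtype)
    return (image - mean) / std


fixed = normalize_sketch(image, mean, std)
half = normalize_sketch(image.astype(np.float16) / 255, mean, std)

print(broken)
print(fixed.dtype, half.dtype)
```

Casting only when the input is not already floating keeps the earlier goal intact: a float16 image is not silently upcast to float32.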