```python
# random crop
width, height = video_data[0].size
f = random.uniform(0.5, 1)
i, j, h, w = RandomCrop.get_params(video_data[0], output_size=(int(height*f), int(width*f)))
video_data = [s.crop(box=(j, i, w, h)) for s in video_data]
```
In core/dataset.py#L158-L162 you pass the arguments (left, upper, width, height) into Image.crop(), but it expects (left, upper, right, lower). The result is that the training crops are smaller than intended.
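To see the effect concretely, here is a small sketch (the numeric values are made up for illustration) comparing the box that the current code builds with the correctly converted one. `Image.crop` interprets the box as `(left, upper, right, lower)`, so its output size is `(right - left, lower - upper)`:

```python
# Hypothetical output of RandomCrop.get_params: top i, left j, height h, width w.
i, j, h, w = 10, 20, 100, 150

buggy_box = (j, i, w, h)          # passes width/height where right/lower belong
fixed_box = (j, i, j + w, i + h)  # correct (left, upper, right, lower) conversion

buggy_size = (buggy_box[2] - buggy_box[0], buggy_box[3] - buggy_box[1])
fixed_size = (fixed_box[2] - fixed_box[0], fixed_box[3] - fixed_box[1])
print(buggy_size)  # (130, 90) – smaller than intended
print(fixed_size)  # (150, 100) – the requested w x h
```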
PyTorch's implementation is the following:

```python
def crop(img: Image.Image, top: int, left: int, height: int, width: int) -> Image.Image:
    if not _is_pil_image(img):
        raise TypeError('img should be PIL Image. Got {}'.format(type(img)))
    return img.crop((left, top, left + width, top + height))
```
where (top, left, height, width) is the output of RandomCrop.get_params(). I would recommend using the following instead, to avoid argument-conversion mistakes:
```python
# random crop
width, height = video_data[0].size
f = random.uniform(0.5, 1)
crop_module = RandomCrop(size=(int(height*f), int(width*f)))
video_data = [crop_module(img) for img in video_data]
```
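If you would rather keep the get_params-based version, the key point is that its return value is not a crop box. A stdlib-only sketch of what get_params does (an assumption based on torchvision's documented behavior, not its actual source) makes the required conversion explicit:

```python
import random

def get_params_sketch(img_size, output_size):
    # Sketch of RandomCrop.get_params: pick a random (top, left) offset such
    # that an output_size crop still fits inside the image, and return it
    # together with the crop size itself.
    width, height = img_size          # PIL's Image.size order is (width, height)
    th, tw = output_size              # torchvision uses (height, width) order
    top = random.randint(0, height - th)
    left = random.randint(0, width - tw)
    return top, left, th, tw

# The (top, left, height, width) tuple must be converted before Image.crop:
i, j, h, w = get_params_sketch((432, 240), (124, 224))
box = (j, i, j + w, i + h)            # (left, upper, right, lower)
assert box[2] - box[0] == w and box[3] - box[1] == h
```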
Visualization (first existing code, then fixed code):
Note: the exact position of the crop should be ignored. Only the image size is relevant. Also notice how the old pictures are generally not square crops.
- f == 0.98: negligible
- f == 0.52: significant
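The gap between these two cases can be checked with a quick worst-case computation. The buggy crop box (j, i, w, h) produces an image of size (w - j, h - i), so the larger the sampled offsets, the more the crop shrinks (the 432x240 frame size below is an assumption for illustration):

```python
# Worst-case size of the buggy crop vs the intended size.
width, height = 432, 240

for f in (0.98, 0.52):
    w, h = int(width * f), int(height * f)      # intended crop size
    j_max, i_max = width - w, height - h        # largest offsets get_params can return
    buggy_w, buggy_h = w - j_max, h - i_max     # size of crop(box=(j, i, w, h))
    print(f, (w, h), (buggy_w, buggy_h))
# f=0.98: intended (423, 235), worst-case buggy (414, 230) – negligible
# f=0.52: intended (224, 124), worst-case buggy (16, 8)    – significant
```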
I haven't run your code with this fix, so I don't know how much the results would improve (if at all).
Thanks, we also discovered this bug recently. The effect on performance is minimal, but after fixing it the model does converge faster (it needs only 70-80 epochs). I'll post the updated code this week.