Noticed bug in STE data augmentation (core/dataset.py#L158-L162) #19

Open
btamm12 opened this issue May 11, 2021 · 1 comment
btamm12 commented May 11, 2021

In core/dataset.py#L158-L162

# random crop
width, height = video_data[0].size
f = random.uniform(0.5, 1)
i, j, h, w = RandomCrop.get_params(video_data[0], output_size=(int(height*f), int(width*f)))
video_data = [s.crop(box=(j, i, w, h)) for s in video_data]

[Source]

you pass the arguments (left, upper, width, height) into Image.crop() when it should be (left, upper, right, lower). The result is that the training crops are smaller than intended.
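To make the size difference concrete, here is a minimal sketch (with made-up crop parameters, not values from the repo) showing that passing (w, h) where PIL expects (right, lower) shrinks the crop:

```python
from PIL import Image

img = Image.new("RGB", (200, 100))

# suppose get_params returned top=10, left=20, h=50, w=100
i, j, h, w = 10, 20, 50, 100

# buggy: box = (left, upper, w, h) — PIL reads w and h as right and lower
buggy = img.crop(box=(j, i, w, h))
# fixed: box = (left, upper, right, lower)
fixed = img.crop(box=(j, i, j + w, i + h))

print(buggy.size)  # (80, 40) — smaller than intended
print(fixed.size)  # (100, 50) — the intended (w, h)
```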

Torchvision's implementation of crop() is the following:

def crop(img: Image.Image, top: int, left: int, height: int, width: int) -> Image.Image:
    if not _is_pil_image(img):
        raise TypeError('img should be PIL Image. Got {}'.format(type(img)))

    return img.crop((left, top, left + width, top + height))

[Source]

where (top, left, height, width) is the output of RandomCrop.get_params(). I would recommend the following instead, which avoids the argument conversion entirely.

# random crop
width, height = video_data[0].size
f = random.uniform(0.5, 1)
crop_module = RandomCrop(size=(int(height*f), int(width*f)))
video_data = [crop_module.forward(img) for img in video_data]

Visualization (first existing code, then fixed code)

Note: the exact position of the crop should be ignored. Only the image size is relevant. Also notice how the old pictures are generally not square crops.

f == 0.98 : negligible difference
[images: old-f98, new-f98]

f == 0.52 : significant difference
[images: old-f52, new-f52]

I haven't run your code with this fix, so I don't know how much the results would improve (if at all).

fuankarion (Owner) commented May 11, 2021

Hi,

Thanks, we also discovered this bug recently. The effect on performance is minimal, but after fixing it the model does converge faster (it needs only 70-80 epochs). I'll post the updated code this week.
