
Can I inform the initial bounding box visibilities? #1767

Open
gui-miotto opened this issue Jun 3, 2024 · 3 comments
Labels: Need more info, question

Comments


gui-miotto commented Jun 3, 2024

My Question

Is it possible to tell a transform what the initial visibility of each bounding box is?

As far as I know, the transforms always assume that the objects are 100% visible before the transformation is applied. But in real life, that is not always the case.

Additional Context

I work with object detection on very high-resolution images. As a preprocessing step, the images of the training dataset have to be sliced before they can be used by the model. During this preprocessing, the visibility of many bounding boxes becomes less than 100%. Of course, I can calculate those values, but is there a way to use them with Albumentations?
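For illustration, this is roughly the calculation I do during slicing. It's a minimal sketch of my own preprocessing, not part of Albumentations; `box_visibility_in_patch` is a hypothetical helper, and boxes are assumed to be in pixel `(x_min, y_min, x_max, y_max)` format:

```python
def box_visibility_in_patch(box, patch):
    """Fraction of the original box area that falls inside the patch."""
    bx1, by1, bx2, by2 = box
    px1, py1, px2, py2 = patch
    # Intersection of the box with the patch window.
    ix1, iy1 = max(bx1, px1), max(by1, py1)
    ix2, iy2 = min(bx2, px2), min(by2, py2)
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area = (bx2 - bx1) * (by2 - by1)
    return inter / area if area > 0 else 0.0

# An object cut in half at a patch border starts at 50% visibility:
print(box_visibility_in_patch((400, 100, 600, 200), (0, 0, 500, 500)))  # 0.5
```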

gui-miotto added the question label Jun 3, 2024
gui-miotto (Author)

If this is not possible, a workaround could be achieved if Albumentations returned the "perceived" visibility after the transformation. That way I could calculate the "real" visibility as the product of the initial and perceived visibilities.
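Purely to illustrate the arithmetic (none of these values are available from Albumentations today; the names are hypothetical):

```python
initial_visibility = 0.5    # known from my slicing step (object cut in half)
perceived_visibility = 0.8  # what the transform would hypothetically report
real_visibility = initial_visibility * perceived_visibility  # 0.4
```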


ternaus (Collaborator) commented Jun 3, 2024

I do not understand the question yet.

As I understand it, you crop parts from the image, and bounding boxes that are not 100% contained in the crop get truncated, right? And this becomes an issue.

Or not?

Could you provide some code?


gui-miotto commented Jun 4, 2024

Hi @ternaus, thanks for the reply.

Yes, you are correct: they get truncated, so their visibility is not 100% to start with.

Unfortunately, I don't think providing code will make things any clearer, because this is more of a workflow problem. So let me give a hypothetical situation:

1 - Imagine my dataset has 2000x2000 px images.
2 - My model only works with 500x500 images.
3 - Since I need the full resolution to identify the objects, I should not shrink the images. What I do instead is slice the full-res image (2000x2000) into 16 non-overlapping 500x500 patches.
4 - Now, imagine that there is a 2000x2000 image with two objects. During the slicing process, one object gets cut in half. The other stays fully visible in a single patch.

Everything up to this point happens before training the model. It's dataset pre-processing and has nothing to do with Albumentations.

5 - Now I'll start training a model and use Albumentations. Then comes the question: given that I want to work with a minimal visibility of 40%, which value of min_visibility should I give to Albumentations? (See the sketch after the list below.)

  • If I use 40%, the object that was cut in half may end up being only 20% visible (40% of 50%).
  • If I use a higher value, that would be too conservative for the object that stayed fully visible.
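For reference, here is how I set the threshold today, as far as I understand the API: a single global min_visibility in BboxParams, applied as if every box started out fully visible. A minimal sketch (the crop transform and the 0.4 value are just the ones from the scenario above):

```python
import albumentations as A

transform = A.Compose(
    [A.RandomCrop(height=250, width=250)],  # stand-in for any crop-style transform
    bbox_params=A.BboxParams(
        format="pascal_voc",
        min_visibility=0.4,  # one global threshold for all boxes
        label_fields=["labels"],
    ),
)
```

What I'm missing is a way to make this threshold account for a per-box initial visibility.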
