position, colour, and background colour of text labels in draw_bounding_boxes #8317

Open
carandraug opened this issue Mar 14, 2024 · 6 comments

Comments

@carandraug

carandraug commented Mar 14, 2024

🚀 The feature

Text labels from torchvision.utils.draw_bounding_boxes are currently always inside the box with origin at the top left corner of the box, without a background colour, and the same colour as the bounding box itself. These are three things that would be nice to control.

Motivation, pitch

The problem with the current implementation is that it makes the label hard to read, particularly when the bounding box is filled (because the text has the same colour as the fill and is placed inside the box).

For example, this is the result from the current implementation:

[Image: intro-detection-R52854-JRL231711104-coco]

Moving the label to outside the box already makes things better:

[Image: intro-detection-R52854-JRL231711104]

But by controlling those three things (placement of the label, background colour behind the label, and text colour), one could adapt the output to whatever they have. For what it's worth, in the original issue for this feature, the only example image had labels outside the box, text coloured differently from the box (black), and a background of the same colour as the box. See #2556 (comment)

I'm happy to contribute this but want to know if this will be accepted and with what interface.

@NicolasHug
Member

Thanks for opening this issue @carandraug. I think the proposal is reasonable: the current position of the labels does make them difficult to read.

In terms of API / functionality, what exactly would you have in mind?

@Nika-St

Nika-St commented Jul 16, 2024

Hi, I'd have the same request!

  • Option 1
    How about adding a 'text_position' argument to torchvision.utils.draw_bounding_boxes, with options 'inside', 'above', 'below', 'left', 'right', and 'auto' (with auto taking one of the options as default but falling back to different ones if text would end up outside of the image or on top of other text). I guess, auto would be a next iteration :)

  • Option 2
    The same 'text_position' argument could instead be a function returning the x, y coordinates where the text should start, relative to the bbox
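
To make the two options concrete, here is a minimal sketch of how a label origin could be computed from such an argument. The helper `label_origin` and the `text_position` parameter are hypothetical and not part of torchvision; the sketch assumes boxes are `(x_min, y_min, x_max, y_max)` in pixels with y growing downwards, and it leaves out the 'auto' option.

```python
def label_origin(box, text_size, text_position="inside"):
    """Return the top-left (x, y) at which to start drawing the label.

    Hypothetical helper, not torchvision API.
    box: (x_min, y_min, x_max, y_max) in pixels, y growing downwards.
    text_size: (width, height) of the rendered label in pixels.
    text_position: one of the strings below (Option 1) or a callable
        taking (box, text_size) and returning (x, y) (Option 2).
    """
    if callable(text_position):
        # Option 2: the caller decides where the text starts.
        return text_position(box, text_size)

    x_min, y_min, x_max, y_max = box
    text_w, text_h = text_size
    origins = {
        "inside": (x_min, y_min),          # top-left corner, inside the box
        "above": (x_min, y_min - text_h),  # sitting on top of the box
        "below": (x_min, y_max),           # hanging under the box
        "left": (x_min - text_w, y_min),   # to the left of the box
        "right": (x_max, y_min),           # to the right of the box
    }
    if text_position not in origins:
        raise ValueError(f"unknown text_position: {text_position!r}")
    return origins[text_position]
```

Note that 'above' and 'left' can return negative coordinates for boxes near the image border, which is the case an 'auto' mode would have to handle.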

@carandraug
Author

In terms of API / functionality, what exactly would you have in mind?

For the colours, I think something like draw_label_kwargs could then be passed to draw.text. This exposes all the flexibility that PIL provides us.
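
A rough sketch of what that pass-through could look like, assuming `draw` is a `PIL.ImageDraw.ImageDraw` object. The helper `draw_label` and the `draw_label_kwargs` name are illustrative only; the default keeps the current behaviour of using the box colour for the text.

```python
def draw_label(draw, xy, text, box_color, draw_label_kwargs=None):
    """Draw `text` at `xy`, forwarding extra options to `draw.text`.

    Hypothetical helper. `draw` is expected to behave like
    PIL.ImageDraw.ImageDraw; `draw_label_kwargs` is an assumed name.
    """
    # Default to the box colour, which matches the current behaviour.
    kwargs = {"fill": box_color}
    # Anything the caller passes overrides the default.
    kwargs.update(draw_label_kwargs or {})
    draw.text(xy, text, **kwargs)
```

With this shape, a call such as `draw_label(draw, (10, 12), "cat", "red", {"fill": "black", "stroke_width": 1})` would give black text regardless of the box colour. `ImageDraw.text` has no parameter for a background colour, so the background would still need separate handling, for example by drawing a filled rectangle behind the text first.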

For label positioning, there are a lot of things going on. There is the position relative to the box, and then whether the label goes outside or inside the box. I think we could just copy matplotlib's syntax for placement of the legend box. Effectively, a string defines the location ("upper right" or "center left") and a tuple defines a sort of offset to that (see bbox_to_anchor). This stackoverflow answer explains it in detail: https://stackoverflow.com/a/43439132
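
As an illustration of the string-plus-offset idea, the sketch below maps a matplotlib-style location string to a point on the box and then applies an offset. The function `anchor_point` is hypothetical, and it simplifies matplotlib's semantics: the offset here is a plain pixel shift, which is not how `bbox_to_anchor` works in matplotlib.

```python
# Fractions along the box's width and height for each location keyword.
# Image coordinates grow downwards, so "upper" maps to y_min.
_X_FRACTION = {"left": 0.0, "center": 0.5, "right": 1.0}
_Y_FRACTION = {"upper": 0.0, "center": 0.5, "lower": 1.0}


def anchor_point(box, loc="upper left", offset=(0, 0)):
    """Return the (x, y) point on `box` named by `loc`, shifted by `offset`.

    Hypothetical helper. `loc` follows the "<vertical> <horizontal>" form of
    matplotlib's legend strings ("upper right", "center left", "lower center")
    plus the single word "center". `offset` is in pixels (an assumption).
    """
    x_min, y_min, x_max, y_max = box
    if loc == "center":
        vertical, horizontal = "center", "center"
    else:
        vertical, horizontal = loc.split()
    x = x_min + _X_FRACTION[horizontal] * (x_max - x_min) + offset[0]
    y = y_min + _Y_FRACTION[vertical] * (y_max - y_min) + offset[1]
    return (x, y)
```

This only picks the point on the box; which corner of the label is attached to that point (and therefore whether the label ends up inside or outside the box) is a separate decision that the interface would still need to define.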

@NicolasHug
Member

That SGTM, thanks for the details.
Provided that the licensing terms of matplotlib allow it, I hope there exists a piece of code that we can just copy/paste from matplotlib to get the label position based on the user-defined parameter and on the current bbox position. It would be a lot more complex if we had to implement all of that logic ourselves.

@carandraug
Author

That SGTM, thanks for the details.

Can you confirm that both (text colour/font/etc. and label positioning) look good? I just want to make sure that you're not only referring to the label positioning. If so, I will prepare a PR for text colour/etc. first and then another for the label positioning.

Provided that the licensing terms of matplotlib allow it, I hope there exists a piece of code that we can just copy/paste from matplotlib to get the label position based on the user-defined parameter and on the current bbox position. It would be a lot more complex if we had to implement all of that logic ourselves.

I'll take a look at the matplotlib license and logic, but we probably can't use it as is. At the very least, we need to handle the case where placing the label outside the region places the label outside the image. With matplotlib, the plot has margins but we do not. I'll experiment and see what I think is the most reasonable behaviour when I start coding, but I think it will be to try to place the label as close as possible to the desired position while ensuring that the text stays inside the image.
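
One simple reading of "closest to the desired position while staying inside the image" is to clamp the label's top-left corner per axis. The sketch below shows that idea only; `clamp_label_origin` is a hypothetical helper, and the behaviour finally chosen may well differ.

```python
def clamp_label_origin(origin, text_size, image_size):
    """Move the label's top-left corner so the text stays inside the image.

    Hypothetical helper. All arguments are (x, y) or (width, height) pairs
    in pixels. If the text is larger than the image along an axis, the
    origin is pinned to 0 on that axis and the text overflows at the far edge.
    """
    x, y = origin
    text_w, text_h = text_size
    image_w, image_h = image_size
    # Largest origin that still fits the text, then never below 0.
    x = max(0, min(x, image_w - text_w))
    y = max(0, min(y, image_h - text_h))
    return (x, y)
```

For a box touching the top border, a label requested above the box would get a negative y and be pushed down to y = 0, which places it over the top of the box instead.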

@NicolasHug
Member

NicolasHug commented Jul 29, 2024

If so, I will prepare a PR for text colour/etc. first and then another for the label positioning

Yes, that sounds good. Happy to consider better defaults for these. Thank you!

3 participants