[OWLv2] Add method #27003

NielsRogge · 2023-10-23T07:58:29Z

What does this PR do?

This PR adds a new image_guided_detection_v2 method which, unlike the original method, uses the objectness head to select the top predicted object in the query image.

It also fixes the documentation example, which has bad threshold values.

Fixes #26920
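
To make the objectness-based selection described above concrete, here is a minimal sketch. It assumes that "top predicted object" means the query-image patch with the highest objectness score, and that this patch's class embedding is then used as the query. The function name and the plain-list inputs are hypothetical and chosen for illustration; the PR's actual code operates on model tensors and is not reproduced in this thread.

```python
from typing import List, Sequence, Tuple


def select_top_objectness_patch(
    objectness_logits: Sequence[float],
    class_embeddings: Sequence[Sequence[float]],
) -> Tuple[int, List[float]]:
    """Return (index, embedding) of the patch with the highest objectness.

    Hypothetical helper: assumes one objectness logit and one class
    embedding per query-image patch, in the same order.
    """
    if len(objectness_logits) == 0:
        raise ValueError("need at least one patch")
    if len(objectness_logits) != len(class_embeddings):
        raise ValueError("expected one embedding per objectness logit")

    # Argmax over patches: the patch the objectness head rates most object-like.
    best_index = max(range(len(objectness_logits)), key=lambda i: objectness_logits[i])
    return best_index, list(class_embeddings[best_index])


# Toy input: three patches with 2-dimensional embeddings.
index, query_embedding = select_top_objectness_patch(
    [0.1, 2.5, -1.0],
    [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]],
)
```

In this toy input the second patch has the largest logit, so its embedding becomes the query.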

HuggingFaceDocBuilderDev · 2023-10-23T08:22:03Z

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint.

amyeroberts

Thanks for adding this!

A few notes:

I'm slightly unclear on why this method is being added. I can see a difference between the two approaches; however, this would benefit from more detail in the PR and code, i.e. how does this relate to the original model? In particular, details should be added to docstrings.

It would be better to have a single image_guided_detection method which can be configured with flags. If necessary, we can break this down into private methods e.g. _image_guided_detection_objectness that are called. xxx_v2 isn't a descriptive method name.

We shouldn't commit notebooks to this repo.

amyeroberts · 2023-10-31T20:39:50Z

src/transformers/models/owlv2/test.ipynb

We shouldn't be uploading notebooks to this repo.

amyeroberts · 2023-11-01T12:29:23Z

src/transformers/models/owlv2/modeling_owlv2.py

+        output_hidden_states: Optional[bool] = None,
+        return_dict: Optional[bool] = None,
+    ) -> Owlv2ImageGuidedObjectDetectionOutput:
+        r"""


The docstring should detail what the method does and how it differs from image_guided_detection.

amyeroberts · 2023-11-01T12:32:14Z

src/transformers/models/owlv2/modeling_owlv2.py

+        query_pred_boxes = self.box_predictor(query_image_feats, feature_map=query_feature_map)
+        query_class_embeddings = self.class_predictor(query_image_feats)[1]
+
+        # v2 differs from v1 in that we use the objectness head to predict the objectness of the patches of the query image


Is this the naming that the model uses, i.e. does the paper use "image_guided_detection_v1" and "image_guided_detection_v2"?

Since so much of this logic is shared, couldn't we control this with a flag in the original image_guided_detection method?
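
A minimal sketch of the flag-based layout raised in this review, assuming a single public method that dispatches to private helpers. The helper name _image_guided_detection_objectness comes from the earlier review note; the class ToyDetector, the flag name use_objectness, and the helper _image_guided_detection_default are hypothetical, and the bodies are placeholders rather than model code.

```python
class ToyDetector:
    """Stand-in for the model class; not the real transformers implementation."""

    def image_guided_detection(self, query_patches, use_objectness=False):
        # `use_objectness` is a hypothetical flag name chosen for this sketch.
        # Shared preprocessing would live here so both paths reuse it.
        if use_objectness:
            return self._image_guided_detection_objectness(query_patches)
        return self._image_guided_detection_default(query_patches)

    def _image_guided_detection_objectness(self, query_patches):
        # Placeholder for the path that selects the query via the objectness head.
        return "objectness"

    def _image_guided_detection_default(self, query_patches):
        # Placeholder for the original selection logic.
        return "default"


detector = ToyDetector()
default_path = detector.image_guided_detection([])
objectness_path = detector.image_guided_detection([], use_objectness=True)
```

Keeping the flag off by default would leave existing callers on the original behavior, which is one reason to prefer this shape over a separately named _v2 method.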

github-actions · 2023-11-26T08:03:37Z

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

NielsRogge added 3 commits October 30, 2023 20:42

Add v2 method

c514a71

Add method

3927e0e

Improve code

950c82f

NielsRogge force-pushed the fix_owlv2_image_guided branch from 26dfef1 to 950c82f Compare October 30, 2023 19:42

NielsRogge mentioned this pull request Oct 31, 2023

OWLv2 with Input box image guided detection NielsRogge/Transformers-Tutorials#364

Closed

NielsRogge requested a review from amyeroberts October 31, 2023 18:03

amyeroberts reviewed Nov 1, 2023

ArthurZucker mentioned this pull request Nov 6, 2023

Current implementation for DynamicNTKScalingRotaryEmbedding in modeling_llama.py does not update cos, sin correctly. #27226

Closed

github-actions bot closed this Dec 4, 2023
