
Ported Dinov2 to flax #25579

Closed · wants to merge 11 commits
Conversation

@ifeherva (Author):

Ported the Dinov2 model to jax/flax

This PR adds the DINOv2 model in Flax. It is based on the ViT Flax port but uses the existing PyTorch DINOv2 implementation as its base.

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you read the contributor guideline,
    Pull Request section?
  • Was this discussed/approved via a GitHub issue or the forum? Please add a link
    to it if that's the case.
  • Did you make sure to update the documentation with your changes? Here are the
    documentation guidelines, and
    here are tips on formatting docstrings.
  • Did you write any new necessary tests?

Who can review?

@sanchit-gandhi @amyeroberts

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint.

@amyeroberts (Collaborator) left a comment:
Thanks for adding this model!

Very nice and easy-to-read PR. Mostly just nits. The main comments are to add the # Copied from statements and integration tests.

tests/models/dinov2/test_modeling_flax_dinov2.py (outdated)
Collaborator:

There should also be integration tests for the model e.g. like these for beit.

Contributor:

+1 on this!

src/transformers/models/dinov2/modeling_flax_dinov2.py (outdated)
src/transformers/models/dinov2/modeling_flax_dinov2.py (outdated)
    def setup(self):
        out_features = self.config.hidden_size
        hidden_features = int(self.config.hidden_size * self.config.mlp_ratio)
        hidden_features = (int(hidden_features * 2 / 3) + 7) // 8 * 8
Collaborator:

Where are these numbers - 2/3, 7, 8 * 8 - coming from?

Author:

From the PyTorch implementation, which seems to be copying from the original repo: https://github.com/facebookresearch/dinov2/blob/main/dinov2/layers/swiglu_ffn.py#L57
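For context, a minimal sketch of that arithmetic (assuming hidden_size=768 and mlp_ratio=4 as in dinov2-base; the helper name is hypothetical):

def swiglu_hidden_features(hidden_size: int, mlp_ratio: float) -> int:
    # SwiGLU uses two projection halves, so the width is scaled by 2/3 to keep
    # the parameter count comparable to a plain MLP of width hidden_size * mlp_ratio.
    hidden_features = int(hidden_size * mlp_ratio)
    hidden_features = int(hidden_features * 2 / 3)
    # (x + 7) // 8 * 8 rounds up to the next multiple of 8 for hardware-friendly shapes.
    return (hidden_features + 7) // 8 * 8

# swiglu_hidden_features(768, 4) == 2048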

src/transformers/models/dinov2/modeling_flax_dinov2.py (outdated)
@sanchit-gandhi (Contributor) left a comment:

Looking great already! Thanks for such a clean PR @ifeherva 🙌 Echoing @amyeroberts's points about using # Copied from statements where possible, and adding a few slow integration tests to check that we get the same outputs as the PyTorch model when using real checkpoints (we just need to assert that the values we get out match an expected array, where the expected array is the same as the PyTorch outputs).
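A slow integration test in the style of the BEiT Flax tests could look roughly like this (a sketch only; FlaxDinov2Model is the class added in this PR, the expected_slice values are placeholders to be filled in with the actual PyTorch outputs, and the expected shape assumes the 224x224 preprocessing of facebook/dinov2-base):

import unittest

import numpy as np
import requests
from PIL import Image

from transformers import AutoImageProcessor, FlaxDinov2Model
from transformers.testing_utils import require_flax, require_vision, slow


@require_flax
@require_vision
class FlaxDinov2ModelIntegrationTest(unittest.TestCase):
    @slow
    def test_inference_no_head(self):
        model = FlaxDinov2Model.from_pretrained("facebook/dinov2-base")
        image_processor = AutoImageProcessor.from_pretrained("facebook/dinov2-base")

        url = "http://images.cocodataset.org/val2017/000000039769.jpg"
        image = Image.open(requests.get(url, stream=True).raw)
        inputs = image_processor(images=image, return_tensors="np")

        outputs = model(**inputs)

        # 256 patches + 1 CLS token, hidden size 768 for the base checkpoint
        self.assertEqual(outputs.last_hidden_state.shape, (1, 257, 768))

        # Placeholder values: replace with the slice produced by the PyTorch model.
        expected_slice = np.array([[0.0, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0]])
        self.assertTrue(np.allclose(np.array(outputs.last_hidden_state[0, :3, :3]), expected_slice, atol=1e-4))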



DINOV2_PRETRAINED_MODEL_ARCHIVE_LIST = [
    "facebook/dinov2-base",
Contributor:

Have you pushed the Flax weights to a pull request on this repo? It would be nice to do this in tandem with this PR!

Author:

Not yet, I still need to do that I guess :)


    def setup(self):
        self.lambda1 = self.param(
            "lambda1", jax.nn.initializers.constant(self.config.layerscale_value), (self.config.hidden_size,)
Contributor:

Should this parameter always be in float32 precision, or should it respect the dtype of the model? Usually, we cast everything to the dtype of the model, such that the forward computation is done in the specified dtype.

Otherwise, we might upcast inadvertently to a higher dtype than the model dtype during the forward pass.

Suggested change
"lambda1", jax.nn.initializers.constant(self.config.layerscale_value), (self.config.hidden_size,)
"lambda1", jax.nn.initializers.constant(self.config.layerscale_value, dtype=self.dtype), (self.config.hidden_size,)

Author:

I am not sure that is intended; there is even an explicit test checking that all params are initialized to float32 (test_default_params_dtype). If I add the proposed line above, the test will fail.

@sanchit-gandhi (Contributor), Aug 29, 2023:

It shouldn't fail if the attribute self.dtype is float32, no? Then it'll be initialised in float32. Currently, in the PyTorch version, if we send the model to bfloat16, then this parameter lambda1 is also in bfloat16. To have equivalence in Flax, we need to pass the dtype attribute to this param, such that we can put the Flax weights in bfloat16 as required.

Author:

In that case the unit test might be broken. If I pass self.dtype, it actually initializes the param in float16 in the unit test and then complains that it is not float32 :)

Contributor:

That's strange - could you check that the attribute dtype is being passed down correctly from the top level modules to the lower level ones? (e.g. you could just print out self.dtype for all of the modules and see where it goes from fp32 -> fp16)

Author:

The dtype is passed down correctly; however, this test expects the params to be float32 even when you initialize the model with float16: https://github.com/huggingface/transformers/blob/main/tests/test_modeling_flax_common.py#L789
I'm not sure I really understand the logic here.

Contributor:

Looked into this more - you're correct in that we should not change the dtype of the params! We should only change the dtype of the computation. I'll close this thread and suggest the appropriate fix

src/transformers/models/dinov2/modeling_flax_dinov2.py (outdated)
>>> image = Image.open(requests.get(url, stream=True).raw)

>>> image_processor = AutoImageProcessor.from_pretrained("facebook/dinov2-base-patch16-224")
>>> model = FlaxDinov2ForImageClassification.from_pretrained("facebook/dinov2-base-patch16-224")
Contributor:

Let's make sure that this checkpoint has flax weights uploaded

Author:

This doesn't exist so I will rework this description. Thanks for flagging it.

Contributor:

Feel free to open a PR on the Hub to add the Flax weights. You can load them with from_pretrained and from_pt=True:

model = FlaxDinov2ForImageClassification.from_pretrained("facebook/dinov2-base-patch16-224", from_pt=True)

And then push the weights to the Hub with:

model.push_to_hub("facebook/dinov2-base-patch16-224", create_pr=True)
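Putting those two steps together for the checkpoint that currently exists, the conversion could look roughly like this (a sketch; it uses the headless base model via the FlaxDinov2Model class added in this PR):

from transformers import FlaxDinov2Model

# Load the PyTorch checkpoint and convert the weights to Flax on the fly.
model = FlaxDinov2Model.from_pretrained("facebook/dinov2-base", from_pt=True)

# Open a pull request on the Hub that adds the converted Flax weights.
model.push_to_hub("facebook/dinov2-base", create_pr=True)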

Author:

Right now only the base model has weights. I converted that and opened a PR on the hub: https://huggingface.co/facebook/dinov2-base/discussions/5

Contributor:

Sure! Thanks for opening a PR to upload them to the Hub, that's great! It would be cool to make sure the fine-tuned version has weights before we merge this PR

Author:

I am not sure such weights are in the public domain. At least I couldn't find them... :(

Contributor:

In that case let's just update the code snippet to use the model variant that has the weights - could you use the same checkpoint that is used in the PyTorch dinov2 code possibly?

Author:

Yeah, I changed the code yesterday to use the ..-base model for now. Once this PR gets merged, I can convert the other (larger) ones as well and push them to the Hub.

Contributor:

Cool - sounds good :)

tests/models/dinov2/test_modeling_flax_dinov2.py (outdated)
Contributor:

+1 on this!

@sanchit-gandhi (Contributor) left a comment:

Thanks for iterating @ifeherva - a few pending points but otherwise looking good!


    def setup(self):
        self.lambda1 = self.param(
            "lambda1", jax.nn.initializers.constant(self.config.layerscale_value), (self.config.hidden_size,)
Contributor:

Looked into this more - you're correct in that we should not change the dtype of the params! We should only change the dtype of the computation. I'll close this thread and suggest the appropriate fix

        )

    def __call__(self, hidden_state):
        return hidden_state * self.lambda1
Contributor:

Suggested change
return hidden_state * self.lambda1
hidden_state = hidden_state * self.lambda1
return hidden_state.astype(self.dtype)
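Putting the thread's conclusion together, the layer-scale module would then look roughly like this (a sketch only; the class name is assumed to mirror the PyTorch Dinov2LayerScale, the params stay in float32, and only the computation is cast):

import flax.linen as nn
import jax
import jax.numpy as jnp

from transformers import Dinov2Config


class FlaxDinov2LayerScale(nn.Module):
    config: Dinov2Config
    dtype: jnp.dtype = jnp.float32  # dtype of the computation, not of the params

    def setup(self):
        # The parameter itself is always initialized in float32.
        self.lambda1 = self.param(
            "lambda1",
            jax.nn.initializers.constant(self.config.layerscale_value),
            (self.config.hidden_size,),
        )

    def __call__(self, hidden_state):
        # Only the computation is cast to the requested dtype.
        hidden_state = hidden_state * self.lambda1
        return hidden_state.astype(self.dtype)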


>>> # model predicts one of the 1000 ImageNet classes
>>> predicted_class_idx = jax.numpy.argmax(logits, axis=-1)
>>> print("Predicted class:", model.config.id2label[predicted_class_idx.item()])
Contributor:

Still pending! If you could address this in a similar way to the PyTorch code:

>>> list(feature_maps[-1].shape)
[1, 768, 16, 16]

Suggested change
>>> print("Predicted class:", model.config.id2label[predicted_class_idx.item()])
>>> model.config.id2label[predicted_class_idx.item()]
# put the predicted class here (without the hash symbol)
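For the headless base checkpoint that does have weights, a docstring example in that style could look roughly like this (a sketch; the expected shape assumes the default 224x224 preprocessing of facebook/dinov2-base and should be verified against the real outputs):

>>> from transformers import AutoImageProcessor, FlaxDinov2Model
>>> from PIL import Image
>>> import requests

>>> url = "http://images.cocodataset.org/val2017/000000039769.jpg"
>>> image = Image.open(requests.get(url, stream=True).raw)

>>> image_processor = AutoImageProcessor.from_pretrained("facebook/dinov2-base")
>>> model = FlaxDinov2Model.from_pretrained("facebook/dinov2-base")

>>> inputs = image_processor(images=image, return_tensors="np")
>>> outputs = model(**inputs)
>>> list(outputs.last_hidden_state.shape)
[1, 257, 768]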

@github-actions

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

@github-actions bot closed this Oct 20, 2023
@sanchit-gandhi (Contributor):

Hey @ifeherva! Given you made a great start on this PR, it would be super nice if you were able to see it through to completion! I'm on hand to help with any questions/queries 🤗 Of course, if you are busy, there's no pressure to finish this. In that case, we can open it up to the community to see if anyone is able to finish the integration so that this work is merged into main.

@MHRDYN7 mentioned this pull request Jul 14, 2024