Add Support for Electra #400
Conversation
Hey @amitkumarj441, thanks a lot for working on this! While there are still a few tests failing, I did a partial review of your changes and left some comments. All in all, it looks very good.
Besides fixing the missing tests, please have a look at our contribution guide for the required documentation steps for a new model. Let me know if anything is unclear or you need any assistance from our side!
self.add_prediction_head(head, overwrite_ok=overwrite_ok)


class ElectraModelWithHeads(ElectraAdapterModel):
The model classes of the form XModelWithHeads are deprecated, so we don't want to add those classes for newly supported architectures. ElectraAdapterModel should be used for all cases. Please remove this class.
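For illustration, a minimal sketch of how the flex-head model is used once ElectraAdapterModel is in place, assuming the usual adapter-transformers API (add_adapter, add_classification_head, set_active_adapters); the checkpoint name and task name are only examples:

from transformers.adapters import ElectraAdapterModel

# Load the base model; prediction heads are added dynamically instead of
# instantiating a dedicated ...ModelWithHeads class.
model = ElectraAdapterModel.from_pretrained("google/electra-base-discriminator")

# Add a task adapter plus a matching classification head and activate them.
model.add_adapter("sst2")
model.add_classification_head("sst2", num_labels=2)
model.set_active_adapters("sst2")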
@@ -401,6 +401,16 @@
    },
    "layers": {"classifier"},
},
# Electra
Since Electra also provides other task-specific model classes (e.g. ElectraForTokenClassification) in its modeling file, it would be great to also have conversions for those here.
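A hedged sketch of what one such additional entry might look like, mirroring the structure of the map shown in the diff above; the head_type value and layer names are assumptions and would need to be checked against ElectraForTokenClassification:

# Assumed entry, following the pattern of the existing conversion entries
"ElectraForTokenClassification": {
    "config": {
        "head_type": "tagging",
    },
    "layers": {"classifier"},
},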
from ..model_mixin import InvertibleAdaptersMixin, ModelAdaptersMixin


# For backwards compatibility, ElectraSelfOutput inherits directly from AdapterLayer
# For backwards compatibility, ElectraSelfOutput inherits directly from AdapterLayer
super().__init__("mh_adapter", None)


# For backwards compatibility, ElectraOutput inherits directly from AdapterLayer
# For backwards compatibility, ElectraOutput inherits directly from AdapterLayer
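Taken together, the two mixins would presumably follow the BERT pattern in this code base; below is a sketch under that assumption (class names and the import path are guesses, not taken from the PR):

from ..layer import AdapterLayer


# For backwards compatibility, ElectraSelfOutput inherits directly from AdapterLayer
class ElectraSelfOutputAdaptersMixin(AdapterLayer):
    """Adds adapters to the ElectraSelfOutput module."""

    def __init__(self):
        super().__init__("mh_adapter", None)


# For backwards compatibility, ElectraOutput inherits directly from AdapterLayer
class ElectraOutputAdaptersMixin(AdapterLayer):
    """Adds adapters to the ElectraOutput module."""

    def __init__(self):
        super().__init__("output_adapter", None)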
Thanks @calpt for reviewing this PR. I will make changes as suggested soon.
Any update soon?
self.value = nn.Linear(config.hidden_size, self.all_head_size)
self.query = LoRALinear(config.hidden_size, self.all_head_size, "selfattn", config)
self.key = LoRALinear(config.hidden_size, self.all_head_size, "selfattn", config)
self.value = LoRALinear(config.hidden_size, self.all_head_size, "selfattn", config)
In BERT, there is the attn_key param:
self.query = LoRALinear(config.hidden_size, self.all_head_size, "selfattn", config, attn_key="q")
self.key = LoRALinear(config.hidden_size, self.all_head_size, "selfattn", config, attn_key="k")
self.value = LoRALinear(config.hidden_size, self.all_head_size, "selfattn", config, attn_key="v")
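For context (an assumption about the purpose of attn_key, not stated in this thread): the key tells each LoRALinear which attention projection it wraps, so LoRA weights are only injected into the matrices selected in the adapter config, e.g.:

from transformers.adapters import ElectraAdapterModel, LoRAConfig

model = ElectraAdapterModel.from_pretrained("google/electra-base-discriminator")
# Only the query and value projections receive LoRA weights here; the attn_key
# set on each LoRALinear is what makes this selection possible.
model.add_adapter("lora_sst2", config=LoRAConfig(r=8, alpha=16, attn_matrices=["q", "v"]))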
@@ -295,6 +306,7 @@ def forward(
    # if encoder bi-directional self-attention `past_key_value` is always `None`
    past_key_value = (key_layer, value_layer)

key_layer, value_layer, attention_mask = self.prefix_tuning(key_layer, value_layer, attention_mask)
key_layer, value_layer, attention_mask = self.prefix_tuning(key_layer, value_layer, hidden_states, attention_mask)
Missing hidden_states param, maybe?
Closing in favor of #583.
Supports the architecture proposed in ELECTRA: Pre-training Text Encoders as Discriminators Rather Than Generators.