diff --git a/docs/embeddings.md b/docs/embeddings.md index 1714615071..5699f11897 100644 --- a/docs/embeddings.md +++ b/docs/embeddings.md @@ -1,7 +1,7 @@ # Embeddings With `adapters`, we support dynamically adding, loading, and deleting of `Embeddings`. This section -will give you an overview of these features. +will give you an overview of these features. A toy example is illustrated in this [notebook](https://colab.research.google.com/github/Adapter-Hub/adapters/blob/main/notebooks/Adapter_With_Embeddings.ipynb). ## Adding and Deleting Embeddings The methods for handling embeddings are similar to the ones handling adapters. To add new embeddings we call @@ -12,13 +12,12 @@ is currently active, the `active_embeddings` property contains the currently act ```python model.add_embeddings('name', tokenizer, reference_embedding='default', reference_tokenizer=reference_tokenizer) -embedding_name = model.active_embeddings ``` The original embedding of the transformers model is always available under the name `"default"`. To set it as the active embedding simply call the `set_active_embedding('name')` method. ```python -model.set_active_embeddings("default") +model.set_active_embeddings('name') ``` Similarly, all other embeddings can be set as active by passing their name to the `set_active_embedding` method. @@ -29,6 +28,13 @@ model.delete_embeddings('name') ``` Please note, that if the active embedding is deleted the default embedding is set as the active embedding. +## Training Embeddings +Embeddings can only be trained with an adapter. To freeze all weights except for the embedding and the adapter: +```python +model.train_adapter('adapter_name', train_embeddings=True) +``` +Except for the `train_embeddings` flag, the training is the same as for just training an adapter (see [Adapter Training](training.md)). 
+ ## Saving and Loading Embeddings You can save the embeddings by calling `save_embeddings('path/to/dir', 'name')` and load them with `load_embeddings('path/to/dir', 'name')`. diff --git a/notebooks/Adapter_With_Embeddings.ipynb b/notebooks/Adapter_With_Embeddings.ipynb new file mode 100644 index 0000000000..badf9ed266 --- /dev/null +++ b/notebooks/Adapter_With_Embeddings.ipynb @@ -0,0 +1,784 @@ +{ + "cells": [ + { + "attachments": {}, + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Adapter Training with Embeddings\n", + "\n", + "The `adapters` library also allows you to train the embeddings together with your adapter. This also works with a completely different tokenizer, which can be beneficial, e.g., if the model's tokenizer is not well suited to the language you are working with.\n", + "\n", + "This notebook shows how to train embeddings for a new tokenizer using an example case. (Note that this is only an illustrative example that trains for a small number of steps, so the difference in performance between the original and the new embeddings is very small.)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "colab": { + "base_uri": "https://localhost:8080/" + }, + "id": "_CTO-c0uBrA7", + "outputId": "390048c1-4423-4229-cc46-52bd2d39110a" + }, + "outputs": [], + "source": [ + "! pip install -U adapters\n", + "! pip install -q datasets\n", + "! pip install -q accelerate" + ] + }, + { + "attachments": {}, + "cell_type": "markdown", + "metadata": { + "id": "GUSfPj1jFtBF" + }, + "source": [ + "Adding embeddings follows the same structure as adding adapters. 
Simply call `add_embeddings` and provide a new name for the embeddings and the tokenizer that the embeddings should work with.\n", + "\n", + "To copy the embeddings of tokens that are shared with another tokenizer, provide the name of the source embeddings as `reference_embedding` (or `default` if you want to use the original embeddings of the loaded model) and the `reference_tokenizer` corresponding to the reference embeddings." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "colab": { + "base_uri": "https://localhost:8080/" + }, + "id": "eo6klvA3Bsf4", + "outputId": "6f0ceb41-7fa8-4037-ca67-4bd22eacdddb" + }, + "outputs": [], + "source": [ + "from adapters import AutoAdapterModel\n", + "from transformers import AutoTokenizer\n", + "\n", + "model_name = \"roberta-base\"\n", + "\n", + "tokenizer = AutoTokenizer.from_pretrained(\"google-bert/bert-base-chinese\")\n", + "\n", + "chinese_tokenizer = AutoTokenizer.from_pretrained(model_name)\n", + "\n", + "model = AutoAdapterModel.from_pretrained(model_name)\n", + "model.add_adapter(\"a\")\n", + "model.add_embeddings(\"a\", chinese_tokenizer, reference_embedding=\"default\", reference_tokenizer=tokenizer)\n", + "model.add_classification_head(\"a\", num_labels=2)" + ] + }, + { + "attachments": {}, + "cell_type": "markdown", + "metadata": { + "id": "OXwk1v7NHCgV" + }, + "source": [ + "To set the active embeddings, call `set_active_embeddings` and pass the name of the embeddings you want to set as active." + ] + }, + { + "cell_type": "code", + "execution_count": 3, + "metadata": { + "id": "Dt0VVFFFCS0T" + }, + "outputs": [], + "source": [ + "model.set_active_embeddings(\"a\")" + ] + }, + { + "attachments": {}, + "cell_type": "markdown", + "metadata": { + "id": "2sedLcurHLlk" + }, + "source": [ + "To train the embeddings, set the `train_embeddings` argument to `True` when calling the `train_adapter` method. 
This will set the passed adapter setup as active and freeze all weights except for the adapter weights and the embedding weights (make sure the correct embedding is activated with `set_active_embeddings`)." + ] + }, + { + "cell_type": "code", + "execution_count": 4, + "metadata": { + "id": "TynU-4B1FQ10" + }, + "outputs": [], + "source": [ + "model.train_adapter(\"a\", train_embeddings=True)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "G49N4LTnOKuf" + }, + "source": [ + "Next, we load and preprocess the dataset." + ] + }, + { + "cell_type": "code", + "execution_count": 6, + "metadata": { + "colab": { + "base_uri": "https://localhost:8080/" + }, + "id": "g0Ay4cK5Cdu1", + "outputId": "a0de5a1d-b90e-4950-d5da-974e1cf8bb8b" + }, + "outputs": [ + { + "data": { + "text/plain": [ + "DatasetDict({\n", + " train: Dataset({\n", + " features: ['sentence1', 'sentence2', 'label'],\n", + " num_rows: 62477\n", + " })\n", + " validation: Dataset({\n", + " features: ['sentence1', 'sentence2', 'label'],\n", + " num_rows: 20000\n", + " })\n", + " test: Dataset({\n", + " features: ['sentence1', 'sentence2', 'label'],\n", + " num_rows: 20000\n", + " })\n", + "})" + ] + }, + "execution_count": 6, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "from datasets import load_dataset\n", + "\n", + "dataset = load_dataset(\"shibing624/nli_zh\", \"ATEC\")\n", + "dataset" + ] + }, + { + "cell_type": "code", + "execution_count": 7, + "metadata": { + "colab": { + "base_uri": "https://localhost:8080/", + "height": 49, + "referenced_widgets": [ + "98c24a189b24490aac44830482c18ff5", + "49d5550b9d1c464a93ea012c17fcb903", + "08fb5b4d88aa46cf9ac9ae48d752ae35", + "80fd53eb99ed469d9ff7c35efaf71849", + "47d9afe6304c4c6da9357a0f08439a91", + "11d79c476f1949f0b944eeb42a0d7e4c", + "fa2d8938a1f04fbd984b5621ef74fa27", + "949b410645704312be1043287b9686ad", + "7ebf93d6ab3843f9b98b52da2fc34def", + "f3c836272a8a40a98fb7258269dac760", + 
"71a73c610e0c48c3881c5329fb9f0e75" + ] + }, + "id": "ARMSP7k_Dbsv", + "outputId": "a05ddef4-33dd-487c-f88f-2363103a2971" + }, + "outputs": [ + { + "data": { + "application/vnd.jupyter.widget-view+json": { + "model_id": "98c24a189b24490aac44830482c18ff5", + "version_major": 2, + "version_minor": 0 + }, + "text/plain": [ + "Map: 0%| | 0/20000 [00:00\n", + " \n", + " \n", + " [625/625 01:29]\n", + " \n", + " " + ], + "text/plain": [ + "" + ] + }, + "metadata": {}, + "output_type": "display_data" + }, + { + "data": { + "text/plain": [ + "{'eval_loss': 0.4556490182876587,\n", + " 'eval_acc': 0.81615,\n", + " 'eval_runtime': 89.8272,\n", + " 'eval_samples_per_second': 222.65,\n", + " 'eval_steps_per_second': 6.958,\n", + " 'epoch': 2.56}" + ] + }, + "execution_count": 10, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "trainer.evaluate()" + ] + }, + { + "attachments": {}, + "cell_type": "markdown", + "metadata": { + "id": "IPu6q0_NOWpX" + }, + "source": [ + "You can dynamically change the embeddings. For instance, to evaluate with the original embedding you can simply do the following:" + ] + }, + { + "cell_type": "code", + "execution_count": 11, + "metadata": { + "id": "4pcHbes-JDba" + }, + "outputs": [], + "source": [ + "model.set_active_embeddings(\"default\")" + ] + }, + { + "cell_type": "code", + "execution_count": 12, + "metadata": { + "colab": { + "base_uri": "https://localhost:8080/", + "height": 147 + }, + "id": "luyH4MIXLck-", + "outputId": "1743e066-f868-499f-b687-9c9df47c5ab7" + }, + "outputs": [ + { + "data": { + "text/html": [ + "\n", + "
\n", + " \n", + " \n", + " [625/625 02:58]\n", + "
\n", + " " + ], + "text/plain": [ + "" + ] + }, + "metadata": {}, + "output_type": "display_data" + }, + { + "data": { + "text/plain": [ + "{'eval_loss': 0.5493476390838623,\n", + " 'eval_acc': 0.81535,\n", + " 'eval_runtime': 88.7409,\n", + " 'eval_samples_per_second': 225.375,\n", + " 'eval_steps_per_second': 7.043,\n", + " 'epoch': 2.56}" + ] + }, + "execution_count": 12, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "trainer.evaluate()" + ] + }, + { + "attachments": {}, + "cell_type": "markdown", + "metadata": {}, + "source": [ + "This notebook provides a toy example of how to add, train, and switch embeddings. For more information, see our [documentation](https://docs.adapterhub.ml/embeddings.html) and the [EmbeddingMixin](https://docs.adapterhub.ml/classes/model_mixins.html#embeddingadaptersmixin)." + ] + } + ], + "metadata": { + "accelerator": "GPU", + "colab": { + "gpuType": "T4", + "provenance": [] + }, + "kernelspec": { + "display_name": "test_env", + "language": "python", + "name": "python3" + }, + "language_info": { + "name": "python", + "version": "3.8.18 (default, Sep 11 2023, 08:17:16) \n[Clang 14.0.6 ]" + }, + "vscode": { + "interpreter": { + "hash": "bc73c86fbf8de5a71ff9cca63348d7fa7cfa59fe04f3885030a826622402fe3d" + } + }, + "widgets": { + "application/vnd.jupyter.widget-state+json": { + "08fb5b4d88aa46cf9ac9ae48d752ae35": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "FloatProgressModel", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "FloatProgressModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + "_view_name": "ProgressView", + "bar_style": "success", + "description": "", + "description_tooltip": null, + "layout": "IPY_MODEL_949b410645704312be1043287b9686ad", + "max": 20000, + "min": 0, + "orientation": 
"horizontal", + "style": "IPY_MODEL_7ebf93d6ab3843f9b98b52da2fc34def", + "value": 20000 + } + }, + "11d79c476f1949f0b944eeb42a0d7e4c": { + "model_module": "@jupyter-widgets/base", + "model_module_version": "1.2.0", + "model_name": "LayoutModel", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null + } + }, + "47d9afe6304c4c6da9357a0f08439a91": { + "model_module": "@jupyter-widgets/base", + "model_module_version": "1.2.0", + "model_name": "LayoutModel", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + 
"grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null + } + }, + "49d5550b9d1c464a93ea012c17fcb903": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "HTMLModel", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "HTMLModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + "_view_name": "HTMLView", + "description": "", + "description_tooltip": null, + "layout": "IPY_MODEL_11d79c476f1949f0b944eeb42a0d7e4c", + "placeholder": "​", + "style": "IPY_MODEL_fa2d8938a1f04fbd984b5621ef74fa27", + "value": "Map: 100%" + } + }, + "71a73c610e0c48c3881c5329fb9f0e75": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "DescriptionStyleModel", + "state": { + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "DescriptionStyleModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "StyleView", + "description_width": "" + } + }, + "7ebf93d6ab3843f9b98b52da2fc34def": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "ProgressStyleModel", + "state": { + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": 
"ProgressStyleModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "StyleView", + "bar_color": null, + "description_width": "" + } + }, + "80fd53eb99ed469d9ff7c35efaf71849": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "HTMLModel", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "HTMLModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + "_view_name": "HTMLView", + "description": "", + "description_tooltip": null, + "layout": "IPY_MODEL_f3c836272a8a40a98fb7258269dac760", + "placeholder": "​", + "style": "IPY_MODEL_71a73c610e0c48c3881c5329fb9f0e75", + "value": " 20000/20000 [00:02<00:00, 5495.99 examples/s]" + } + }, + "949b410645704312be1043287b9686ad": { + "model_module": "@jupyter-widgets/base", + "model_module_version": "1.2.0", + "model_name": "LayoutModel", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": 
null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null + } + }, + "98c24a189b24490aac44830482c18ff5": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "HBoxModel", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "HBoxModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + "_view_name": "HBoxView", + "box_style": "", + "children": [ + "IPY_MODEL_49d5550b9d1c464a93ea012c17fcb903", + "IPY_MODEL_08fb5b4d88aa46cf9ac9ae48d752ae35", + "IPY_MODEL_80fd53eb99ed469d9ff7c35efaf71849" + ], + "layout": "IPY_MODEL_47d9afe6304c4c6da9357a0f08439a91" + } + }, + "f3c836272a8a40a98fb7258269dac760": { + "model_module": "@jupyter-widgets/base", + "model_module_version": "1.2.0", + "model_name": "LayoutModel", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + 
"right": null, + "top": null, + "visibility": null, + "width": null + } + }, + "fa2d8938a1f04fbd984b5621ef74fa27": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "DescriptionStyleModel", + "state": { + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "DescriptionStyleModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "StyleView", + "description_width": "" + } + } + } + } + }, + "nbformat": 4, + "nbformat_minor": 0 +}