From 3613eeb7349ab9f38b6fadac6004a104bb79de5c Mon Sep 17 00:00:00 2001 From: BenoitDherin Date: Thu, 14 Sep 2023 23:04:36 +0000 Subject: [PATCH 1/7] initial import --- .../solutions/vertex_llm_tuning.ipynb | 1014 +++++++++++++++++ 1 file changed, 1014 insertions(+) create mode 100644 notebooks/vertex_genai/solutions/vertex_llm_tuning.ipynb diff --git a/notebooks/vertex_genai/solutions/vertex_llm_tuning.ipynb b/notebooks/vertex_genai/solutions/vertex_llm_tuning.ipynb new file mode 100644 index 00000000..b28e4b74 --- /dev/null +++ b/notebooks/vertex_genai/solutions/vertex_llm_tuning.ipynb @@ -0,0 +1,1014 @@ +{ + "cells": [ + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "ur8xi4C7S06n" + }, + "outputs": [], + "source": [ + "# Copyright 2023 Google LLC\n", + "#\n", + "# Licensed under the Apache License, Version 2.0 (the \"License\");\n", + "# you may not use this file except in compliance with the License.\n", + "# You may obtain a copy of the License at\n", + "#\n", + "# https://www.apache.org/licenses/LICENSE-2.0\n", + "#\n", + "# Unless required by applicable law or agreed to in writing, software\n", + "# distributed under the License is distributed on an \"AS IS\" BASIS,\n", + "# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n", + "# See the License for the specific language governing permissions and\n", + "# limitations under the License." + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "tvgnzT1CKxrO" + }, + "source": [ + "# Tuning and deploy a foundation model\n", + "\n", + "\n", + " \n", + " \n", + " \n", + "
\n", + " \n", + " \"Google
Run in Colab\n", + "
\n", + "
\n", + " \n", + " \"GitHub
View on GitHub\n", + "
\n", + "
\n", + " \n", + " \"Vertex
Open in Vertex AI Workbench\n", + "
\n", + "
\n" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "JAPoU8Sm5E6e" + }, + "source": [ + "Creating an LLM requires massive amounts of data, significant computing resources, and specialized skills. On Vertex AI, tuning allows you to customize a foundation model for more specific tasks or knowledge domains.\n", + "\n", + "While the prompt design is excellent for quick experimentation, if training data is available, you can achieve higher quality by tuning the model. Tuning a model enables you to customize the model response based on examples of the task you want the model to perform.\n", + "\n", + "For more details on tuning have a look at the [official documentation](https://cloud.google.com/vertex-ai/docs/generative-ai/models/tune-models)." + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "d975e698c9a4" + }, + "source": [ + "### Objective\n", + "\n", + "This tutorial teaches you how to tune a foundational model on new unseen data and you will use the following Google Cloud products:\n", + "\n", + "- Vertex AI Generative AI Studio\n", + "- Vertex AI Pipelines\n", + "- Vertex AI Model Registry\n", + "- Vertex AI Endpoints\n", + "\n", + "The steps performed include:\n", + "\n", + "- Get training data from BQ and generate a JSONL file\n", + "- Upload training data\n", + "- Create a pipeline job\n", + "- Inspect your model on Vertex AI Model Registry\n", + "- Get predictions from your tuned model" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "6CZvFRbIaalF" + }, + "source": [ + "### Quota\n", + "**important**: Tuning the text-bison@001 model uses the tpu-v3-8 training resources and the accompanying quotas from your Google Cloud project. Each project has a default quota of eight v3-8 cores, which allows for one to two concurrent tuning jobs. If you want to run more concurrent jobs you need to request additional quota via the [Quotas page](https://console.cloud.google.com/iam-admin/quotas)." + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "6q2bKpVjaalF" + }, + "source": [ + "### Costs\n", + "This tutorial uses billable components of Google Cloud:\n", + "\n", + "* Vertex AI Generative AI Studio\n", + "\n", + "Learn about [Vertex AI pricing](https://cloud.google.com/vertex-ai/pricing),\n", + "and use the [Pricing Calculator](https://cloud.google.com/products/calculator/)\n", + "to generate a cost estimate based on your projected usage." + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "acBlvcGFaalF" + }, + "source": [ + "### Install Vertex AI SDK" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "BEtR1xyRaalG" + }, + "outputs": [], + "source": [ + "!pip install google-cloud-aiplatform google-cloud-bigquery sequence-evaluate sentence-transformers rouge --upgrade --user" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "qAMVnZC9aalG" + }, + "source": [ + "**Colab only:** Uncomment the following cell to restart the kernel or use the restart button. For Vertex AI Workbench you can restart the terminal using the button on top." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "MdQC6wcuaalG" + }, + "outputs": [], + "source": [ + "# Automatically restart kernel after installs so that your environment can access the new packages\n", + "# import IPython\n", + "\n", + "# app = IPython.Application.instance()\n", + "# app.kernel.do_shutdown(True)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "2LlxsZrWaalG" + }, + "source": [ + "### Authenticating your notebook environment\n", + "* If you are using **Colab** to run this notebook, uncomment the cell below and continue.\n", + "* If you are using **Vertex AI Workbench**, check out the setup instructions [here](https://github.com/GoogleCloudPlatform/generative-ai/tree/main/setup-env)." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "oh-QANoIaalG" + }, + "outputs": [], + "source": [ + "# from google.colab import auth\n", + "# auth.authenticate_user()" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "qW8qtGsmaalG" + }, + "source": [ + "### BigQuery IAM\n", + "Now you need to add permissions to the service account:\n", + "- Go to the [IAM page](https://console.cloud.google.com/iam-admin/) in the console\n", + "- Look for the default compute service account. It should look something like this: `-compute@developer.gserviceaccount.com`\n", + "- Assign the default compute service account with `bigquery.user`" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "ZmhnHOjlaalH" + }, + "source": [ + "### Set your project ID\n", + "\n", + "**If you don't know your project ID**, you may be able to get your project ID using `gcloud`. Otherwise, check the support page: Locate the [project ID](https://support.google.com/googleapi/answer/7014113). Please update `PROJECT_ID` below." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "colab": { + "base_uri": "https://localhost:8080/" + }, + "id": "j8nXkkYxaalH", + "outputId": "6726fc18-7cd3-4dd8-afef-aeeebd9aa0e5" + }, + "outputs": [], + "source": [ + "PROJECT_ID = \"\" # @param {type:\"string\"}\n", + "\n", + "# Set the project id\n", + "! gcloud config set project {PROJECT_ID}" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "PrsmSjICaalH" + }, + "source": [ + "### Create a bucket\n", + "Now you have to create a bucket that we will use to store our tuning data. To avoid name collisions between users on resources created, you generate a UUID for each instance session and append it to the name of the resources you create in this tutorial." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "LiKRZOgqaalH" + }, + "outputs": [], + "source": [ + "import random\n", + "import string\n", + "\n", + "\n", + "# Generate a uuid of a specifed length(default=8)\n", + "def generate_uuid(length: int = 8) -> str:\n", + " return \"\".join(random.choices(string.ascii_lowercase + string.digits, k=length))\n", + "\n", + "\n", + "UUID = generate_uuid()" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "-D28-KrtaalH" + }, + "source": [ + "Choose a bucket name and update the `BUCKET_NAME` parameter." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "pxRSNVCYaalH" + }, + "outputs": [], + "source": [ + "BUCKET_NAME = \"\" # @param {type:\"string\"}\n", + "BUCKET_URI = f\"gs://{BUCKET_NAME}\"\n", + "REGION = \"us-central1\" # @param {type: \"string\"}" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "ZpjqMRc-aalH" + }, + "outputs": [], + "source": [ + "if BUCKET_NAME == \"\" or BUCKET_NAME is None or BUCKET_NAME == \"\":\n", + " BUCKET_NAME = \"vertex-\" + UUID\n", + " BUCKET_URI = f\"gs://{BUCKET_NAME}\"" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "WtJg8ILPaalH" + }, + "source": [ + "Only if your bucket doesn't already exist: Run the following cell to create your Cloud Storage bucket." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "colab": { + "base_uri": "https://localhost:8080/" + }, + "id": "NSRiXkavaalH", + "outputId": "8b752c8a-d575-4982-85f8-5a40317c8ac3" + }, + "outputs": [], + "source": [ + "! gsutil mb -l $REGION -p $PROJECT_ID $BUCKET_URI" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "jNL0oqUJaalH" + }, + "source": [ + "Finally, validate access to your Cloud Storage bucket by examining its contents:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "leJFL5oIaalH" + }, + "outputs": [], + "source": [ + "! gsutil ls -al $BUCKET_URI" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "XoEqT2Y4DJmf" + }, + "source": [ + "### Import libraries" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "**Colab only**: Uncomment the following cell to initialize the Vertex AI SDK. For Vertex AI Workbench, you don't need to run this." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "# import vertexai\n", + "\n", + "# PROJECT_ID = \"[your-project-id]\" # @param {type:\"string\"}\n", + "# vertexai.init(project=PROJECT_ID, location=\"us-central1\")" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "pRUOFELefqf1" + }, + "outputs": [], + "source": [ + "from typing import Union\n", + "\n", + "import pandas as pd\n", + "from sklearn.model_selection import train_test_split\n", + "import numpy as np\n", + "\n", + "from vertexai.preview.language_models import TextGenerationModel\n", + "from google.cloud import aiplatform\n", + "from google.cloud import bigquery" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "WdtNETYxaalH" + }, + "source": [ + "## Tune your Model\n", + "\n", + "Now it's time for you to create a tuning job. Tune a foundation model by creating a pipeline job using Generative AI Studio, cURL, or the Python SDK. In this notebook, we will be using the Python SDK. You will be using a Q&A with a context dataset in JSON format.\n", + "\n", + "### Training Data\n", + "💾 Your model tuning dataset must be in a JSONL format where each line contains a single training example. You must make sure that you include instructions.\n", + "\n", + "You will use the StackOverflow data on BigQuery Public Datasets, limiting to questions with the `python` tag, and accepted answers for answers since 2020-01-01." + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "Puc3jl8QaalI" + }, + "source": [ + "First create a helper function to let you easily query BigQuery and return the results as a Pandas DataFrame." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "Eg60aUgvaalI" + }, + "outputs": [], + "source": [ + "def run_bq_query(sql: str) -> Union[str, pd.DataFrame]:\n", + " \"\"\"\n", + " Run a BigQuery query and return the job ID or result as a DataFrame\n", + " Args:\n", + " sql: SQL query, as a string, to execute in BigQuery\n", + " Returns:\n", + " df: DataFrame of results from query, or error, if any\n", + " \"\"\"\n", + "\n", + " bq_client = bigquery.Client()\n", + "\n", + " # Try dry run before executing query to catch any errors\n", + " job_config = bigquery.QueryJobConfig(dry_run=True, use_query_cache=False)\n", + " bq_client.query(sql, job_config=job_config)\n", + "\n", + " # If dry run succeeds without errors, proceed to run query\n", + " job_config = bigquery.QueryJobConfig()\n", + " client_result = bq_client.query(sql, job_config=job_config)\n", + "\n", + " job_id = client_result.job_id\n", + "\n", + " # Wait for query/job to finish running. then get & return data frame\n", + " df = client_result.result().to_arrow().to_pandas()\n", + " print(f\"Finished job_id: {job_id}\")\n", + "\n", + " return df" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "1BydoFfTaalI" + }, + "source": [ + "Next define the query." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "9VTaovLtaalI" + }, + "outputs": [], + "source": [ + "df = run_bq_query(\n", + " \"\"\"SELECT\n", + " CONCAT(q.title, q.body) as input_text,\n", + " a.body AS output_text\n", + "FROM\n", + " `bigquery-public-data.stackoverflow.posts_questions` q\n", + "JOIN\n", + " `bigquery-public-data.stackoverflow.posts_answers` a\n", + "ON\n", + " q.accepted_answer_id = a.id\n", + "WHERE\n", + " q.accepted_answer_id IS NOT NULL AND\n", + " REGEXP_CONTAINS(q.tags, \"python\") AND\n", + " a.creation_date >= \"2020-01-01\"\n", + "LIMIT\n", + " 10000\n", + "\"\"\"\n", + ")\n", + "\n", + "df.head()" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "qYUg8cBbaalJ" + }, + "source": [ + "There should be 10k questions and answers." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "6FqbVHoeaalJ" + }, + "outputs": [], + "source": [ + "print(len(df))" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "OftmoPZ6aalJ" + }, + "source": [ + "Lets split the data into training and evalation. For Extractive Q&A tasks we advise 100+ training examples. In this case you will use 800." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "aXqBwSwaaalJ" + }, + "outputs": [], + "source": [ + "# split is set to 80/20\n", + "train, evaluation = train_test_split(df, test_size=0.2)\n", + "print(len(train))" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "nf-q8TpnaalJ" + }, + "source": [ + "For tuning, the training data first needs to be converted into a JSONL format." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "FqRbOwzEaalJ" + }, + "outputs": [], + "source": [ + "tune_jsonl = train.to_json(orient=\"records\", lines=True)\n", + "\n", + "print(f\"Length: {len(tune_jsonl)}\")\n", + "print(tune_jsonl[0:100])" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "r04PWISCaalJ" + }, + "source": [ + "Next, you can write it to a local JSONL before transferring it to Google Cloud Storage (GCS)." 
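+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "Each line of the JSONL file is a standalone JSON object with the `input_text` and `output_text` keys produced by the query above. Before writing the file, you can optionally sanity-check the first record. The snippet below is a small sketch of that check (the printed values depend on your query results):\n",
+ "\n",
+ "```python\n",
+ "import json\n",
+ "\n",
+ "# Parse the first line of the JSONL string generated above to confirm its structure.\n",
+ "first_record = json.loads(tune_jsonl.splitlines()[0])\n",
+ "print(list(first_record.keys()))  # expected: ['input_text', 'output_text']\n",
+ "print(first_record[\"input_text\"][:200])  # preview of the question text\n",
+ "```"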
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "vXVV9c0HaalJ" + }, + "outputs": [], + "source": [ + "training_data_filename = \"tune_data_stack_overflow_python_qa.jsonl\"\n", + "\n", + "with open(training_data_filename, \"w\") as f:\n", + " f.write(tune_jsonl)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "FV8Wxz7JaalN" + }, + "source": [ + "You can then export the local file to GCS, so that it can be used by Vertex AI for the tuning job." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "vDDLHac5aalN" + }, + "outputs": [], + "source": [ + "! gsutil cp $training_data_filename $BUCKET_URI" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "Ff68wmzoaalN" + }, + "source": [ + "You can check to make sure that the file successfully transferred to your Google Cloud Storage bucket:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "2-DnKpYlaalN" + }, + "outputs": [], + "source": [ + "! gsutil ls -al $BUCKET_URI" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "8wE9P7OFaalN" + }, + "outputs": [], + "source": [ + "TRAINING_DATA_URI = f\"{BUCKET_URI}/{training_data_filename}\"" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "-mW7K57BaalN", + "tags": [] + }, + "source": [ + "### Model Tuning\n", + "Now it's time to start to tune a model. You will use the Vertex AI SDK to submit our tuning job.\n", + "\n", + "#### Recommended Tuning Configurations\n", + "✅ Here are some recommended configurations for tuning a foundation model based on the task, in this example Q&A. You can find more in the [documentation](https://cloud.google.com/vertex-ai/docs/generative-ai/models/tune-models).\n", + "\n", + "Extractive QA:\n", + "- Make sure that your train dataset size is 100+\n", + "- Training steps [100-500]. You can try more than one value to get the best performance on a particular dataset (e.g. 100, 200, 500)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "26HRfld3aalN" + }, + "outputs": [], + "source": [ + "MODEL_NAME = f\"genai-workshop-tuned-model-{UUID}\"" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "on4baTh5aalN" + }, + "outputs": [], + "source": [ + "# Function that starts the tuning job\n", + "def tuned_model(\n", + " project_id: str,\n", + " location: str,\n", + " training_data: str,\n", + " model_display_name: str,\n", + " train_steps=100,\n", + "):\n", + " \"\"\"Prompt-tune a new model, based on a prompt-response data.\n", + "\n", + " \"training_data\" can be either the GCS URI of a file formatted in JSONL format\n", + " (for example: training_data=f'gs://{bucket}/{filename}.jsonl'), or a pandas\n", + " DataFrame. 
Each training example should be JSONL record with two keys, for\n", + " example:\n", + " {\n", + " \"input_text\": ,\n", + " \"output_text\": \n", + " },\n", + "\n", + " Args:\n", + " project_id: GCP Project ID, used to initialize aiplatform\n", + " location: GCP Region, used to initialize aiplatform\n", + " training_data: GCS URI of training file or pandas dataframe of training data\n", + " model_display_name: Name for your model.\n", + " train_steps: Number of training steps to use when tuning the model\n", + " \"\"\"\n", + "\n", + " aiplatform.init(project=project_id, location=location)\n", + " model = TextGenerationModel.from_pretrained(\"text-bison@001\")\n", + "\n", + " model.tune_model(\n", + " training_data=training_data,\n", + " model_display_name=model_display_name,\n", + " train_steps=train_steps,\n", + " # Tuning can only happen in the \"europe-west4\" location\n", + " tuning_job_location=\"europe-west4\",\n", + " # Model can only be deployed in the \"us-central1\" location\n", + " tuned_model_location=\"us-central1\",\n", + " )\n", + "\n", + " # Test the tuned model:\n", + " print(\n", + " model.predict(\n", + " \"Can you provide me with a Python implementation of BERT with Tensorflow? Example: \"\n", + " )\n", + " )\n", + "\n", + " return model" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "o0XNL9ojaalN" + }, + "source": [ + "Next it's time to start your tuning job. **Disclaimer:** tuning and deploying a model takes time." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "sYoG5UazaalN" + }, + "outputs": [], + "source": [ + "# This will start the tuning job and output a URL where you can monitor the pipeline execution.\n", + "model = tuned_model(PROJECT_ID, REGION, TRAINING_DATA_URI, MODEL_NAME)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "PRCkdxXvaalO" + }, + "source": [ + "Following the link above, you can view your pipeline run. As you can see in the screenshot below, it will execute the following steps:\n", + "\n", + "- Validation\n", + "- Export managed dataset\n", + "- Convert JSONL to TFRecord\n", + "- Large language model tuning\n", + "- Upload LLM Model" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "O6JC8XplaalO" + }, + "source": [ + "## View your tuned foundational model on Vertex AI Model registry\n", + "When your tuning job is finished, your model will be available on Vertex AI Model Registry. The following Python SDK sample shows you how to list tuned models." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "GPWX0ITCaalO" + }, + "outputs": [], + "source": [ + "def list_tuned_models(project_id, location):\n", + " aiplatform.init(project=project_id, location=location)\n", + " model = TextGenerationModel.from_pretrained(\"text-bison@001\")\n", + " tuned_model_names = model.list_tuned_model_names()\n", + " print(tuned_model_names)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "bAIwCGYJaalO" + }, + "outputs": [], + "source": [ + "list_tuned_models(PROJECT_ID, REGION)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "ZriyF0V-aalO" + }, + "source": [ + "You can also use the Google Cloud Console UI to view all of your model in [Vertex AI Model Registry](https://console.cloud.google.com/vertex-ai/models?e=13802955&jsmode=O&mods=-ai_platform_fake_service&project=cloud-llm-preview1). Below you can see an example of a tuned foundational model available on Vertex AI Model Registry." 
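+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "If you prefer to check from the notebook instead of the Console, one possible approach is to filter the Model Registry by display name. This is only a sketch, and it assumes the tuning pipeline has completed and registered the model under the `MODEL_NAME` you set earlier:\n",
+ "\n",
+ "```python\n",
+ "# List Model Registry entries whose display name matches the tuned model's name.\n",
+ "# Assumes MODEL_NAME is the display name passed to the tuning job above.\n",
+ "for m in aiplatform.Model.list(filter=f'display_name=\"{MODEL_NAME}\"'):\n",
+ "    print(m.display_name, m.resource_name)\n",
+ "```"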
+ ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "cFftY6-EaalO" + }, + "source": [ + "## Use your tuned model to get predictions\n", + "Now it's time to get predictions. First you need to get the latest tuned model from the Vertex AI Model registry." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "vU-K3EIkaalO" + }, + "outputs": [], + "source": [ + "def fetch_model(project_id, location):\n", + " aiplatform.init(project=project_id, location=location)\n", + " model = TextGenerationModel.from_pretrained(\"text-bison@001\")\n", + " list_tuned_models = model.list_tuned_model_names()\n", + " tuned_model = list_tuned_models[0]\n", + "\n", + " return tuned_model" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "j66dr12taalO" + }, + "outputs": [], + "source": [ + "deployed_model = fetch_model(PROJECT_ID, REGION)\n", + "deployed_model = TextGenerationModel.get_tuned_model(deployed_model)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "xDOueoptaalO" + }, + "source": [ + "Now you can start send a prompt to the API. Feel free to update the following prompt." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "2ERbfPJPaalO" + }, + "outputs": [], + "source": [ + "PROMPT = \"\"\"\n", + "How can I store my TensorFlow checkpoint on Google Cloud Storage?\n", + "\n", + "Python example:\n", + "\n", + "\"\"\"" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "trzon4EyaalO" + }, + "outputs": [], + "source": [ + "print(deployed_model.predict(PROMPT))" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "qtYr_KNPaalO" + }, + "source": [ + "## Evaulation\n", + "It's essential to evaluate your model to understand its performance. Evaluation can be done in an automated way using evaluation metrics like F1 or Rouge. You can also leverage human evaluation methods. Human evaluation methods involve asking humans to rate the quality of the LLM's answers. This can be done through crowdsourcing or by having experts evaluate the responses. Some standard human evaluation metrics include fluency, coherence, relevance, and informativeness. Often you want to choose a mix of evaluation metrics to get a good understanding of your model performance. Below you will find an example of how you can do the evaluation.\n", + "\n", + "In this example you will be using [sequence-evaluate](https://pypi.org/project/sequence-evaluate/) to evaluation the tuned model." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "9856CuicaalO" + }, + "outputs": [], + "source": [ + "from seq_eval import SeqEval\n", + "\n", + "evaluator = SeqEval()" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "AS10ybdraalO" + }, + "source": [ + "Earlier in the notebook, you created a train and eval dataset. Now it's time to take some of the eval data. You will use the questions to get a response from our tuned model, and the answers we will use as a reference:\n", + "\n", + "- **Candidates**: Answers generated by the tuned model.\n", + "- **References**: Original answers that we will use to compare." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "LKMmIH0XaalO" + }, + "outputs": [], + "source": [ + "evaluation = evaluation.head(10) # you can change the number of rows you want to use\n", + "evaluation_question = evaluation[\"input_text\"]\n", + "evaluation_answer = evaluation[\"output_text\"]" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "jx-g2molaalP" + }, + "source": [ + "Now you can go ahead and generate candidates using the tuned model based on the questions you took from the eval dataset." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "e5DqVXvEaalP" + }, + "outputs": [], + "source": [ + "candidates = []\n", + "\n", + "for i in evaluation_question:\n", + " response = deployed_model.predict(i)\n", + " candidates.append(response.text)\n", + "\n", + "len(candidates)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "oftLTb0maalP" + }, + "source": [ + "You will also have to create a list of our references. These will you use to evaluate the model's performance." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "y7zN70CJaalP" + }, + "outputs": [], + "source": [ + "references = evaluation_answer.tolist()\n", + "\n", + "len(references)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Next you will generate the evaluation metrics. `evaluator.evaluate` will return a few eval metrics. Some of the important ones are:\n", + "- [Blue](https://en.wikipedia.org/wiki/BLEU): The BLEU evaluation metric is a measure of the similarity between a machine-generated text and a human-written reference text.\n", + "- [Rouge](https://en.wikipedia.org/wiki/ROUGE_(metric)): The ROUGE evaluation metric is a measure of the overlap between a machine-generated text and a human-written reference text." 
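+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "To get a feel for these scores before evaluating the tuned model, you can first run the evaluator on a tiny hand-made pair. The sentences below are toy examples, not taken from the dataset:\n",
+ "\n",
+ "```python\n",
+ "# Near-identical sentences should score high; unrelated sentences should score low.\n",
+ "toy_candidates = [\"the cat sat on the mat\"]\n",
+ "toy_references = [\"the cat lay on the mat\"]\n",
+ "print(evaluator.evaluate(toy_candidates, toy_references, verbose=False))\n",
+ "```"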
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "B828sNxUaalP" + }, + "outputs": [], + "source": [ + "scores = evaluator.evaluate(candidates, references, verbose=False)\n", + "print(scores)" + ] + } + ], + "metadata": { + "colab": { + "provenance": [], + "toc_visible": true + }, + "environment": { + "kernel": "python3", + "name": "tf2-gpu.2-11.m108", + "type": "gcloud", + "uri": "gcr.io/deeplearning-platform-release/tf2-gpu.2-11:m108" + }, + "kernelspec": { + "display_name": "Python 3 (ipykernel)", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.10.10" + } + }, + "nbformat": 4, + "nbformat_minor": 4 +} From 342d594d4979290f9cc85dde69326545798331ff Mon Sep 17 00:00:00 2001 From: BenoitDherin Date: Thu, 14 Sep 2023 23:11:18 +0000 Subject: [PATCH 2/7] precommit --- .../solutions/vertex_llm_tuning.ipynb | 22 ++++++++++++------- 1 file changed, 14 insertions(+), 8 deletions(-) diff --git a/notebooks/vertex_genai/solutions/vertex_llm_tuning.ipynb b/notebooks/vertex_genai/solutions/vertex_llm_tuning.ipynb index b28e4b74..142473bb 100644 --- a/notebooks/vertex_genai/solutions/vertex_llm_tuning.ipynb +++ b/notebooks/vertex_genai/solutions/vertex_llm_tuning.ipynb @@ -246,7 +246,9 @@ "\n", "# Generate a uuid of a specifed length(default=8)\n", "def generate_uuid(length: int = 8) -> str:\n", - " return \"\".join(random.choices(string.ascii_lowercase + string.digits, k=length))\n", + " return \"\".join(\n", + " random.choices(string.ascii_lowercase + string.digits, k=length)\n", + " )\n", "\n", "\n", "UUID = generate_uuid()" @@ -282,7 +284,11 @@ }, "outputs": [], "source": [ - "if BUCKET_NAME == \"\" or BUCKET_NAME is None or BUCKET_NAME == \"\":\n", + "if (\n", + " BUCKET_NAME == \"\"\n", + " or BUCKET_NAME is None\n", + " or BUCKET_NAME == \"\"\n", + "):\n", " BUCKET_NAME = \"vertex-\" + UUID\n", " BUCKET_URI = f\"gs://{BUCKET_NAME}\"" ] @@ -369,13 +375,11 @@ "source": [ "from typing import Union\n", "\n", + "import numpy as np\n", "import pandas as pd\n", + "from google.cloud import aiplatform, bigquery\n", "from sklearn.model_selection import train_test_split\n", - "import numpy as np\n", - "\n", - "from vertexai.preview.language_models import TextGenerationModel\n", - "from google.cloud import aiplatform\n", - "from google.cloud import bigquery" + "from vertexai.preview.language_models import TextGenerationModel" ] }, { @@ -905,7 +909,9 @@ }, "outputs": [], "source": [ - "evaluation = evaluation.head(10) # you can change the number of rows you want to use\n", + "evaluation = evaluation.head(\n", + " 10\n", + ") # you can change the number of rows you want to use\n", "evaluation_question = evaluation[\"input_text\"]\n", "evaluation_answer = evaluation[\"output_text\"]" ] From da16ce22157d5cd98003d1d3e9aaab9097fb3ae5 Mon Sep 17 00:00:00 2001 From: BenoitDherin Date: Tue, 19 Sep 2023 21:59:12 +0000 Subject: [PATCH 3/7] precommit --- .../solutions/vertex_llm_tuning.ipynb | 757 +++++++----------- 1 file changed, 307 insertions(+), 450 deletions(-) diff --git a/notebooks/vertex_genai/solutions/vertex_llm_tuning.ipynb b/notebooks/vertex_genai/solutions/vertex_llm_tuning.ipynb index 142473bb..3eb3b634 100644 --- a/notebooks/vertex_genai/solutions/vertex_llm_tuning.ipynb +++ 
b/notebooks/vertex_genai/solutions/vertex_llm_tuning.ipynb @@ -1,66 +1,12 @@ { "cells": [ - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "id": "ur8xi4C7S06n" - }, - "outputs": [], - "source": [ - "# Copyright 2023 Google LLC\n", - "#\n", - "# Licensed under the Apache License, Version 2.0 (the \"License\");\n", - "# you may not use this file except in compliance with the License.\n", - "# You may obtain a copy of the License at\n", - "#\n", - "# https://www.apache.org/licenses/LICENSE-2.0\n", - "#\n", - "# Unless required by applicable law or agreed to in writing, software\n", - "# distributed under the License is distributed on an \"AS IS\" BASIS,\n", - "# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n", - "# See the License for the specific language governing permissions and\n", - "# limitations under the License." - ] - }, { "cell_type": "markdown", "metadata": { "id": "tvgnzT1CKxrO" }, "source": [ - "# Tuning and deploy a foundation model\n", - "\n", - "\n", - " \n", - " \n", - " \n", - "
\n", - " \n", - " \"Google
Run in Colab\n", - "
\n", - "
\n", - " \n", - " \"GitHub
View on GitHub\n", - "
\n", - "
\n", - " \n", - " \"Vertex
Open in Vertex AI Workbench\n", - "
\n", - "
\n" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "JAPoU8Sm5E6e" - }, - "source": [ - "Creating an LLM requires massive amounts of data, significant computing resources, and specialized skills. On Vertex AI, tuning allows you to customize a foundation model for more specific tasks or knowledge domains.\n", - "\n", - "While the prompt design is excellent for quick experimentation, if training data is available, you can achieve higher quality by tuning the model. Tuning a model enables you to customize the model response based on examples of the task you want the model to perform.\n", - "\n", - "For more details on tuning have a look at the [official documentation](https://cloud.google.com/vertex-ai/docs/generative-ai/models/tune-models)." + "# Tuning and deploy a foundation model\n" ] }, { @@ -69,7 +15,7 @@ "id": "d975e698c9a4" }, "source": [ - "### Objective\n", + "### Learning Objective\n", "\n", "This tutorial teaches you how to tune a foundational model on new unseen data and you will use the following Google Cloud products:\n", "\n", @@ -90,24 +36,19 @@ { "cell_type": "markdown", "metadata": { - "id": "6CZvFRbIaalF" - }, - "source": [ - "### Quota\n", - "**important**: Tuning the text-bison@001 model uses the tpu-v3-8 training resources and the accompanying quotas from your Google Cloud project. Each project has a default quota of eight v3-8 cores, which allows for one to two concurrent tuning jobs. If you want to run more concurrent jobs you need to request additional quota via the [Quotas page](https://console.cloud.google.com/iam-admin/quotas)." - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "6q2bKpVjaalF" + "id": "JAPoU8Sm5E6e", + "tags": [] }, "source": [ - "### Costs\n", - "This tutorial uses billable components of Google Cloud:\n", + "Creating an LLM requires massive amounts of data, significant computing resources, and specialized skills. On Vertex AI, tuning allows you to customize a foundation model for more specific tasks or knowledge domains.\n", + "\n", + "While the prompt design is excellent for quick experimentation, if training data is available, you can achieve higher quality by tuning the model. Tuning a model enables you to customize the model response based on examples of the task you want the model to perform.\n", "\n", - "* Vertex AI Generative AI Studio\n", + "For more details on tuning have a look at the [official documentation](https://cloud.google.com/vertex-ai/docs/generative-ai/models/tune-models).\n", "\n", + "**Quota**: Tuning the text-bison@001 model uses the tpu-v3-8 training resources and the accompanying quotas from your Google Cloud project. Each project has a default quota of eight v3-8 cores, which allows for one to two concurrent tuning jobs. If you want to run more concurrent jobs you need to request additional quota via the [Quotas page](https://console.cloud.google.com/iam-admin/quotas).\n", + "\n", + "**Costs:** This tutorial uses billable a component of Google Cloud `Vertex AI Generative AI Studio`.\n", "Learn about [Vertex AI pricing](https://cloud.google.com/vertex-ai/pricing),\n", "and use the [Pricing Calculator](https://cloud.google.com/products/calculator/)\n", "to generate a cost estimate based on your projected usage." 
@@ -115,11 +56,9 @@ }, { "cell_type": "markdown", - "metadata": { - "id": "acBlvcGFaalF" - }, + "metadata": {}, "source": [ - "### Install Vertex AI SDK" + "## Setup" ] }, { @@ -130,181 +69,44 @@ }, "outputs": [], "source": [ - "!pip install google-cloud-aiplatform google-cloud-bigquery sequence-evaluate sentence-transformers rouge --upgrade --user" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "qAMVnZC9aalG" - }, - "source": [ - "**Colab only:** Uncomment the following cell to restart the kernel or use the restart button. For Vertex AI Workbench you can restart the terminal using the button on top." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "id": "MdQC6wcuaalG" - }, - "outputs": [], - "source": [ - "# Automatically restart kernel after installs so that your environment can access the new packages\n", - "# import IPython\n", - "\n", - "# app = IPython.Application.instance()\n", - "# app.kernel.do_shutdown(True)" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "2LlxsZrWaalG" - }, - "source": [ - "### Authenticating your notebook environment\n", - "* If you are using **Colab** to run this notebook, uncomment the cell below and continue.\n", - "* If you are using **Vertex AI Workbench**, check out the setup instructions [here](https://github.com/GoogleCloudPlatform/generative-ai/tree/main/setup-env)." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "id": "oh-QANoIaalG" - }, - "outputs": [], - "source": [ - "# from google.colab import auth\n", - "# auth.authenticate_user()" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "qW8qtGsmaalG" - }, - "source": [ - "### BigQuery IAM\n", - "Now you need to add permissions to the service account:\n", - "- Go to the [IAM page](https://console.cloud.google.com/iam-admin/) in the console\n", - "- Look for the default compute service account. It should look something like this: `-compute@developer.gserviceaccount.com`\n", - "- Assign the default compute service account with `bigquery.user`" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "ZmhnHOjlaalH" - }, - "source": [ - "### Set your project ID\n", - "\n", - "**If you don't know your project ID**, you may be able to get your project ID using `gcloud`. Otherwise, check the support page: Locate the [project ID](https://support.google.com/googleapi/answer/7014113). Please update `PROJECT_ID` below." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "colab": { - "base_uri": "https://localhost:8080/" - }, - "id": "j8nXkkYxaalH", - "outputId": "6726fc18-7cd3-4dd8-afef-aeeebd9aa0e5" - }, - "outputs": [], - "source": [ - "PROJECT_ID = \"\" # @param {type:\"string\"}\n", - "\n", - "# Set the project id\n", - "! gcloud config set project {PROJECT_ID}" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "PrsmSjICaalH" - }, - "source": [ - "### Create a bucket\n", - "Now you have to create a bucket that we will use to store our tuning data. To avoid name collisions between users on resources created, you generate a UUID for each instance session and append it to the name of the resources you create in this tutorial." 
+ "#!pip install sequence-evaluate sentence-transformers rouge --upgrade --user" ] }, { "cell_type": "code", - "execution_count": null, - "metadata": { - "id": "LiKRZOgqaalH" - }, + "execution_count": 42, + "metadata": {}, "outputs": [], "source": [ "import random\n", "import string\n", + "import time\n", + "from typing import Union\n", "\n", - "\n", - "# Generate a uuid of a specifed length(default=8)\n", - "def generate_uuid(length: int = 8) -> str:\n", - " return \"\".join(\n", - " random.choices(string.ascii_lowercase + string.digits, k=length)\n", - " )\n", - "\n", - "\n", - "UUID = generate_uuid()" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "-D28-KrtaalH" - }, - "source": [ - "Choose a bucket name and update the `BUCKET_NAME` parameter." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "id": "pxRSNVCYaalH" - }, - "outputs": [], - "source": [ - "BUCKET_NAME = \"\" # @param {type:\"string\"}\n", - "BUCKET_URI = f\"gs://{BUCKET_NAME}\"\n", - "REGION = \"us-central1\" # @param {type: \"string\"}" + "import numpy as np\n", + "import pandas as pd\n", + "from google.cloud import aiplatform, bigquery\n", + "from sklearn.model_selection import train_test_split\n", + "from vertexai.preview.language_models import TextGenerationModel" ] }, { "cell_type": "code", - "execution_count": null, - "metadata": { - "id": "ZpjqMRc-aalH" - }, + "execution_count": 5, + "metadata": {}, "outputs": [], "source": [ - "if (\n", - " BUCKET_NAME == \"\"\n", - " or BUCKET_NAME is None\n", - " or BUCKET_NAME == \"\"\n", - "):\n", - " BUCKET_NAME = \"vertex-\" + UUID\n", - " BUCKET_URI = f\"gs://{BUCKET_NAME}\"" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "WtJg8ILPaalH" - }, - "source": [ - "Only if your bucket doesn't already exist: Run the following cell to create your Cloud Storage bucket." + "REGION = \"us-central1\"\n", + "PROJECT_ID = !(gcloud config get-value project)\n", + "PROJECT_ID = PROJECT_ID[0]\n", + "\n", + "BUCKET_NAME = PROJECT_ID\n", + "BUCKET_URI = f\"gs://{BUCKET_NAME}\"" ] }, { "cell_type": "code", - "execution_count": null, + "execution_count": 6, "metadata": { "colab": { "base_uri": "https://localhost:8080/" @@ -312,74 +114,34 @@ "id": "NSRiXkavaalH", "outputId": "8b752c8a-d575-4982-85f8-5a40317c8ac3" }, - "outputs": [], - "source": [ - "! gsutil mb -l $REGION -p $PROJECT_ID $BUCKET_URI" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "jNL0oqUJaalH" - }, - "source": [ - "Finally, validate access to your Cloud Storage bucket by examining its contents:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "id": "leJFL5oIaalH" - }, - "outputs": [], - "source": [ - "! gsutil ls -al $BUCKET_URI" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "XoEqT2Y4DJmf" - }, - "source": [ - "### Import libraries" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "**Colab only**: Uncomment the following cell to initialize the Vertex AI SDK. For Vertex AI Workbench, you don't need to run this." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# import vertexai\n", - "\n", - "# PROJECT_ID = \"[your-project-id]\" # @param {type:\"string\"}\n", - "# vertexai.init(project=PROJECT_ID, location=\"us-central1\")" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "id": "pRUOFELefqf1" - }, - "outputs": [], - "source": [ - "from typing import Union\n", - "\n", - "import numpy as np\n", - "import pandas as pd\n", - "from google.cloud import aiplatform, bigquery\n", - "from sklearn.model_selection import train_test_split\n", - "from vertexai.preview.language_models import TextGenerationModel" + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "gs://dherin-dev/cord19_embeddings.json\n", + "gs://dherin-dev/salads.csv\n", + "gs://dherin-dev/115851500182/\n", + "gs://dherin-dev/7737964263322419200-616112577574862848/\n", + "gs://dherin-dev/babyweight/\n", + "gs://dherin-dev/babyweight_220707_021136/\n", + "gs://dherin-dev/babyweight_220707_021151/\n", + "gs://dherin-dev/babyweight_220707_021154/\n", + "gs://dherin-dev/car_damage_lab_images/\n", + "gs://dherin-dev/classification-bert-20230411003650/\n", + "gs://dherin-dev/contextual_bandit_checkpoints/\n", + "gs://dherin-dev/covertype/\n", + "gs://dherin-dev/models/\n", + "gs://dherin-dev/movies/\n", + "gs://dherin-dev/staging/\n", + "gs://dherin-dev/taxifare-20230710171207/\n", + "gs://dherin-dev/taxifare-20230710191151/\n", + "gs://dherin-dev/taxifare/\n" + ] + } + ], + "source": [ + "!gsutil ls $BUCKET_URI || gsutil mb -l $REGION -p $PROJECT_ID $BUCKET_URI" ] }, { @@ -409,21 +171,13 @@ }, { "cell_type": "code", - "execution_count": null, + "execution_count": 8, "metadata": { "id": "Eg60aUgvaalI" }, "outputs": [], "source": [ - "def run_bq_query(sql: str) -> Union[str, pd.DataFrame]:\n", - " \"\"\"\n", - " Run a BigQuery query and return the job ID or result as a DataFrame\n", - " Args:\n", - " sql: SQL query, as a string, to execute in BigQuery\n", - " Returns:\n", - " df: DataFrame of results from query, or error, if any\n", - " \"\"\"\n", - "\n", + "def run_bq_query(sql):\n", " bq_client = bigquery.Client()\n", "\n", " # Try dry run before executing query to catch any errors\n", @@ -454,16 +208,12 @@ }, { "cell_type": "code", - "execution_count": null, - "metadata": { - "id": "9VTaovLtaalI" - }, + "execution_count": 14, + "metadata": {}, "outputs": [], "source": [ - "df = run_bq_query(\n", - " \"\"\"SELECT\n", - " CONCAT(q.title, q.body) as input_text,\n", - " a.body AS output_text\n", + "query = \"\"\"\n", + "SELECT CONCAT(q.title, q.body) as input_text, a.body AS output_text\n", "FROM\n", " `bigquery-public-data.stackoverflow.posts_questions` q\n", "JOIN\n", @@ -476,9 +226,101 @@ " a.creation_date >= \"2020-01-01\"\n", "LIMIT\n", " 10000\n", - "\"\"\"\n", - ")\n", - "\n", + "\"\"\"" + ] + }, + { + "cell_type": "code", + "execution_count": 15, + "metadata": { + "id": "9VTaovLtaalI" + }, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Finished job_id: a7538c94-f47b-4a06-ac1e-6d76d0b55223\n" + ] + }, + { + "data": { + "text/html": [ + "
\n", + "\n", + "\n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + "
input_textoutput_text
0How to solve the alpha component ignored in ki...<p>I don't have a whole picture of your projec...
1SQLAlchemy with_variant() for MySQL and MariaD...<p>What happened is you tried to specify two <...
2Python imap to take multiple functions<p>I am ...<p>If I understand you correctly you can use <...
3Django: Table doesn't exist( python manage.py ...<p>I think what happend is that you lost sync ...
4split pandas rows into columns and comma separ...<p>Use <a href=\"https://pandas.pydata.org/docs...
\n", + "
" + ], + "text/plain": [ + " input_text \\\n", + "0 How to solve the alpha component ignored in ki... \n", + "1 SQLAlchemy with_variant() for MySQL and MariaD... \n", + "2 Python imap to take multiple functions

I am ... \n", + "3 Django: Table doesn't exist( python manage.py ... \n", + "4 split pandas rows into columns and comma separ... \n", + "\n", + " output_text \n", + "0

I don't have a whole picture of your projec... \n", + "1

What happened is you tried to specify two <... \n", + "2

If I understand you correctly you can use <... \n", + "3

I think what happend is that you lost sync ... \n", + "4

Use I am scraping a data from WhatsApp chat backup (chat.txt). It looks like this :<\\/p>\\n

7\\/21\\/20, 1:31 PM - mark: Can we look google  \\n7\\/21\\/20, 1:31 PM - elon: No  \\n7\\/21\\/20, 1:31 PM - mark: Can we smile ?  \\n7\\/21\\/20, 1:31 PM - elon: Ya\\n<\\/code><\\/pre>\\n

While I used line by line extraction<\\/p>\\n

with open ('chat.txt','rb') as file:\\n    for line in file:\\n        print(str(line.strip()))\\n<\\/code><\\/pre>\\n

I got this:<\\/p>\\n

b'7\\/21\\/20, 7:37 AM - mark: Can we look google\\\\xf0\\\\x9f\\\\xa4\\\\xa9\\\\xf0\\\\x9f\\\\x98\\\\x82\\\\xf0\\\\x9f\\\\x98\\\\x82'\\nb'7\\/21\\/20, 7:37 AM - elon: No'\\nb'7\\/21\\/20, 1:31 PM - mark: Can we smile ?'\\nb'7\\/21\\/20, 7:37 AM - elon: Ya\\\\xf0\\\\x9f\\\\x98\\\\x82'\\n<\\/code><\\/pre>\\n
    \\n
  1. How can we git rid of b''<\\/code> ? ( I tried .decode('utf-8')<\\/code>, but it didn't work)<\\/p>\\n<\\/li>\\n

  2. How can I convert<\\/p>\\n

    Can we look google\\\\xf0\\\\x9f\\\\xa4\\\\xa9\\\\xf0\\\\x9f\\\\x98\\\\x82\\\\xf0\\\\x9f\\\\x98\\\\x82\\n<\\/code><\\/pre>\\n

    to<\\/p>\\n

    Can we look google?\\n<\\/code><\\/pre>\\n<\\/li>\\n<\\/ol>\",\"output_text\":\"

    Open the file with the right encoding, not binary mode:<\\/p>\\n

    with open ('chat.txt', encoding='utf8') as file:\\n    for line in file:\\n        print(line, end='')\\n<\\/code><\\/pre>\\n

    How well this works depends on your execution environment. You need a terminal\\/IDE and font that support printing<\\/em> the code points for print<\\/code> to be successful, but that is not a Python issue.<\\/p>\"}\n" + ] + } + ], "source": [ - "training_data_filename = \"tune_data_stack_overflow_python_qa.jsonl\"\n", - "\n", - "with open(training_data_filename, \"w\") as f:\n", - " f.write(tune_jsonl)" + "!head -n 1 $training_data_filename" ] }, { @@ -581,11 +429,21 @@ }, { "cell_type": "code", - "execution_count": null, + "execution_count": 30, "metadata": { "id": "vDDLHac5aalN" }, - "outputs": [], + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Copying file://tune_data_stack_overflow_python_qa.jsonl [Content-Type=application/octet-stream]...\n", + "/ [1 files][ 22.5 MiB/ 22.5 MiB] \n", + "Operation completed over 1 objects/22.5 MiB. \n" + ] + } + ], "source": [ "! gsutil cp $training_data_filename $BUCKET_URI" ] @@ -601,24 +459,23 @@ }, { "cell_type": "code", - "execution_count": null, + "execution_count": 32, "metadata": { "id": "2-DnKpYlaalN" }, - "outputs": [], + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + " 23637845 2023-09-19T21:11:04Z gs://dherin-dev/tune_data_stack_overflow_python_qa.jsonl#1695157864253156 metageneration=1\n", + "TOTAL: 1 objects, 23637845 bytes (22.54 MiB)\n" + ] + } + ], "source": [ - "! gsutil ls -al $BUCKET_URI" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "id": "8wE9P7OFaalN" - }, - "outputs": [], - "source": [ - "TRAINING_DATA_URI = f\"{BUCKET_URI}/{training_data_filename}\"" + "TRAINING_DATA_URI = f\"{BUCKET_URI}/{training_data_filename}\"\n", + "! gsutil ls -al $TRAINING_DATA_URI" ] }, { @@ -641,92 +498,86 @@ }, { "cell_type": "code", - "execution_count": null, - "metadata": { - "id": "26HRfld3aalN" - }, + "execution_count": 43, + "metadata": {}, "outputs": [], "source": [ - "MODEL_NAME = f\"genai-workshop-tuned-model-{UUID}\"" + "aiplatform.init(project=PROJECT_ID, location=REGION)\n", + "\n", + "model = TextGenerationModel.from_pretrained(\"text-bison@001\")" ] }, { - "cell_type": "code", - "execution_count": null, + "cell_type": "markdown", "metadata": { - "id": "on4baTh5aalN" + "id": "o0XNL9ojaalN" }, - "outputs": [], "source": [ - "# Function that starts the tuning job\n", - "def tuned_model(\n", - " project_id: str,\n", - " location: str,\n", - " training_data: str,\n", - " model_display_name: str,\n", - " train_steps=100,\n", - "):\n", - " \"\"\"Prompt-tune a new model, based on a prompt-response data.\n", - "\n", - " \"training_data\" can be either the GCS URI of a file formatted in JSONL format\n", - " (for example: training_data=f'gs://{bucket}/{filename}.jsonl'), or a pandas\n", - " DataFrame. Each training example should be JSONL record with two keys, for\n", - " example:\n", - " {\n", - " \"input_text\": ,\n", - " \"output_text\": \n", - " },\n", - "\n", - " Args:\n", - " project_id: GCP Project ID, used to initialize aiplatform\n", - " location: GCP Region, used to initialize aiplatform\n", - " training_data: GCS URI of training file or pandas dataframe of training data\n", - " model_display_name: Name for your model.\n", - " train_steps: Number of training steps to use when tuning the model\n", - " \"\"\"\n", + "Next it's time to start your tuning job. 
\n", "\n", - " aiplatform.init(project=project_id, location=location)\n", - " model = TextGenerationModel.from_pretrained(\"text-bison@001\")\n", - "\n", - " model.tune_model(\n", - " training_data=training_data,\n", - " model_display_name=model_display_name,\n", - " train_steps=train_steps,\n", - " # Tuning can only happen in the \"europe-west4\" location\n", - " tuning_job_location=\"europe-west4\",\n", - " # Model can only be deployed in the \"us-central1\" location\n", - " tuned_model_location=\"us-central1\",\n", - " )\n", - "\n", - " # Test the tuned model:\n", - " print(\n", - " model.predict(\n", - " \"Can you provide me with a Python implementation of BERT with Tensorflow? Example: \"\n", - " )\n", - " )\n", - "\n", - " return model" + "**Disclaimer:** tuning and deploying a model takes time." ] }, { - "cell_type": "markdown", + "cell_type": "code", + "execution_count": null, "metadata": { - "id": "o0XNL9ojaalN" + "id": "on4baTh5aalN" }, - "source": [ - "Next it's time to start your tuning job. **Disclaimer:** tuning and deploying a model takes time." + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Model name: genai-workshop-tuned-model-1695160329.9899337\n", + "Creating PipelineJob\n", + "PipelineJob created. Resource name: projects/115851500182/locations/europe-west4/pipelineJobs/tune-large-model-20230919215211\n", + "To use this PipelineJob in another session:\n", + "pipeline_job = aiplatform.PipelineJob.get('projects/115851500182/locations/europe-west4/pipelineJobs/tune-large-model-20230919215211')\n", + "View Pipeline Job:\n", + "https://console.cloud.google.com/vertex-ai/locations/europe-west4/pipelines/runs/tune-large-model-20230919215211?project=115851500182\n", + "PipelineJob projects/115851500182/locations/europe-west4/pipelineJobs/tune-large-model-20230919215211 current state:\n", + "PipelineState.PIPELINE_STATE_PENDING\n", + "PipelineJob projects/115851500182/locations/europe-west4/pipelineJobs/tune-large-model-20230919215211 current state:\n", + "PipelineState.PIPELINE_STATE_RUNNING\n", + "PipelineJob projects/115851500182/locations/europe-west4/pipelineJobs/tune-large-model-20230919215211 current state:\n", + "PipelineState.PIPELINE_STATE_RUNNING\n", + "PipelineJob projects/115851500182/locations/europe-west4/pipelineJobs/tune-large-model-20230919215211 current state:\n", + "PipelineState.PIPELINE_STATE_RUNNING\n", + "PipelineJob projects/115851500182/locations/europe-west4/pipelineJobs/tune-large-model-20230919215211 current state:\n", + "PipelineState.PIPELINE_STATE_RUNNING\n", + "PipelineJob projects/115851500182/locations/europe-west4/pipelineJobs/tune-large-model-20230919215211 current state:\n", + "PipelineState.PIPELINE_STATE_RUNNING\n" + ] + } + ], + "source": [ + "TRAIN_STEPS = 100\n", + "MODEL_NAME = f\"genai-workshop-tuned-model-{time.time()}\"\n", + "print(\"Model name:\", MODEL_NAME)\n", + "\n", + "model.tune_model(\n", + " training_data=TRAINING_DATA_URI,\n", + " model_display_name=MODEL_NAME,\n", + " train_steps=TRAIN_STEPS,\n", + " # Tuning can only happen in the \"europe-west4\" location\n", + " tuning_job_location=\"europe-west4\",\n", + " # Model can only be deployed in the \"us-central1\" location\n", + " tuned_model_location=\"us-central1\",\n", + ")" ] }, { "cell_type": "code", "execution_count": null, - "metadata": { - "id": "sYoG5UazaalN" - }, + "metadata": {}, "outputs": [], "source": [ - "# This will start the tuning job and output a URL where you can monitor the pipeline execution.\n", - "model = 
tuned_model(PROJECT_ID, REGION, TRAINING_DATA_URI, MODEL_NAME)" + "print(\n", + " model.predict(\n", + " \"Can you provide me with a Python implementation of BERT with Tensorflow? Example: \"\n", + " )\n", + ")" ] }, { @@ -762,22 +613,7 @@ }, "outputs": [], "source": [ - "def list_tuned_models(project_id, location):\n", - " aiplatform.init(project=project_id, location=location)\n", - " model = TextGenerationModel.from_pretrained(\"text-bison@001\")\n", - " tuned_model_names = model.list_tuned_model_names()\n", - " print(tuned_model_names)" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "id": "bAIwCGYJaalO" - }, - "outputs": [], - "source": [ - "list_tuned_models(PROJECT_ID, REGION)" + "model.list_tuned_model_names()" ] }, { @@ -984,6 +820,27 @@ "scores = evaluator.evaluate(candidates, references, verbose=False)\n", "print(scores)" ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "ur8xi4C7S06n" + }, + "source": [ + "Copyright 2023 Google LLC\n", + "\n", + "Licensed under the Apache License, Version 2.0 (the \"License\");\n", + "you may not use this file except in compliance with the License.\n", + "You may obtain a copy of the License at\n", + "\n", + " https://www.apache.org/licenses/LICENSE-2.0\n", + "\n", + "Unless required by applicable law or agreed to in writing, software\n", + "distributed under the License is distributed on an \"AS IS\" BASIS,\n", + "WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n", + "See the License for the specific language governing permissions and\n", + "limitations under the License." + ] } ], "metadata": { @@ -993,12 +850,12 @@ }, "environment": { "kernel": "python3", - "name": "tf2-gpu.2-11.m108", + "name": "tf2-gpu.2-12.m111", "type": "gcloud", - "uri": "gcr.io/deeplearning-platform-release/tf2-gpu.2-11:m108" + "uri": "gcr.io/deeplearning-platform-release/tf2-gpu.2-12:m111" }, "kernelspec": { - "display_name": "Python 3 (ipykernel)", + "display_name": "Python 3", "language": "python", "name": "python3" }, @@ -1012,7 +869,7 @@ "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", - "version": "3.10.10" + "version": "3.10.12" } }, "nbformat": 4, From 5b4e1d53addbbadc3d8e646208e50899c1c21652 Mon Sep 17 00:00:00 2001 From: BenoitDherin Date: Wed, 20 Sep 2023 00:48:51 +0000 Subject: [PATCH 4/7] precommit --- .../solutions/vertex_llm_tuning.ipynb | 471 ++++++++++-------- 1 file changed, 252 insertions(+), 219 deletions(-) diff --git a/notebooks/vertex_genai/solutions/vertex_llm_tuning.ipynb b/notebooks/vertex_genai/solutions/vertex_llm_tuning.ipynb index 3eb3b634..a7422a28 100644 --- a/notebooks/vertex_genai/solutions/vertex_llm_tuning.ipynb +++ b/notebooks/vertex_genai/solutions/vertex_llm_tuning.ipynb @@ -15,22 +15,11 @@ "id": "d975e698c9a4" }, "source": [ - "### Learning Objective\n", + "**Learning Objective**\n", "\n", - "This tutorial teaches you how to tune a foundational model on new unseen data and you will use the following Google Cloud products:\n", - "\n", - "- Vertex AI Generative AI Studio\n", - "- Vertex AI Pipelines\n", - "- Vertex AI Model Registry\n", - "- Vertex AI Endpoints\n", - "\n", - "The steps performed include:\n", - "\n", - "- Get training data from BQ and generate a JSONL file\n", - "- Upload training data\n", - "- Create a pipeline job\n", - "- Inspect your model on Vertex AI Model Registry\n", - "- Get predictions from your tuned model" + "1. Learn how to generate a JSONL file for PaLM tuning\n", + "1. 
Learn how to launch a tuning job on Vertex Pipeline\n", + "1. Learn how to query you tuned LLM and evaluate it" ] }, { @@ -40,13 +29,13 @@ "tags": [] }, "source": [ - "Creating an LLM requires massive amounts of data, significant computing resources, and specialized skills. On Vertex AI, tuning allows you to customize a foundation model for more specific tasks or knowledge domains.\n", + "Creating an LLM requires massive amounts of data, significant computing resources, and specialized skills. In this notebook, you'll learn how tuning allows you to customize a PaLM foundation model on Vertex Generative AI studio for more specific tasks or knowledge domains.\n", "\n", "While the prompt design is excellent for quick experimentation, if training data is available, you can achieve higher quality by tuning the model. Tuning a model enables you to customize the model response based on examples of the task you want the model to perform.\n", "\n", "For more details on tuning have a look at the [official documentation](https://cloud.google.com/vertex-ai/docs/generative-ai/models/tune-models).\n", "\n", - "**Quota**: Tuning the text-bison@001 model uses the tpu-v3-8 training resources and the accompanying quotas from your Google Cloud project. Each project has a default quota of eight v3-8 cores, which allows for one to two concurrent tuning jobs. If you want to run more concurrent jobs you need to request additional quota via the [Quotas page](https://console.cloud.google.com/iam-admin/quotas).\n", + "**Quota**: Tuning the `text-bison@001` model uses the `tpu-v3-8` training resources and the accompanying quotas from your Google Cloud project. Each project has a default quota of eight v3-8 cores, which allows for one to two concurrent tuning jobs. If you want to run more concurrent jobs you need to request additional quota via the [Quotas page](https://console.cloud.google.com/iam-admin/quotas).\n", "\n", "**Costs:** This tutorial uses billable a component of Google Cloud `Vertex AI Generative AI Studio`.\n", "Learn about [Vertex AI pricing](https://cloud.google.com/vertex-ai/pricing),\n", @@ -64,49 +53,52 @@ { "cell_type": "code", "execution_count": null, - "metadata": { - "id": "BEtR1xyRaalG" - }, + "metadata": {}, "outputs": [], "source": [ - "#!pip install sequence-evaluate sentence-transformers rouge --upgrade --user" + "import IPython\n", + "\n", + "# The version of google-cloud-aiplatform needs to be >= 1.33.0\n", + "!pip install --upgrade --user \\\n", + " google-cloud-aiplatform \\\n", + " sequence-evaluate sentence-transformers \\\n", + " rouge\n", + "\n", + "app = IPython.Application.instance()\n", + "app.kernel.do_shutdown(True)" ] }, { "cell_type": "code", - "execution_count": 42, + "execution_count": 45, "metadata": {}, "outputs": [], "source": [ - "import random\n", - "import string\n", "import time\n", - "from typing import Union\n", "\n", - "import numpy as np\n", "import pandas as pd\n", "from google.cloud import aiplatform, bigquery\n", + "from seq_eval import SeqEval\n", "from sklearn.model_selection import train_test_split\n", "from vertexai.preview.language_models import TextGenerationModel" ] }, { "cell_type": "code", - "execution_count": 5, + "execution_count": 29, "metadata": {}, "outputs": [], "source": [ "REGION = \"us-central1\"\n", "PROJECT_ID = !(gcloud config get-value project)\n", "PROJECT_ID = PROJECT_ID[0]\n", - "\n", "BUCKET_NAME = PROJECT_ID\n", "BUCKET_URI = f\"gs://{BUCKET_NAME}\"" ] }, { "cell_type": "code", - "execution_count": 6, + "execution_count": 4, 
"metadata": { "colab": { "base_uri": "https://localhost:8080/" @@ -121,6 +113,7 @@ "text": [ "gs://dherin-dev/cord19_embeddings.json\n", "gs://dherin-dev/salads.csv\n", + "gs://dherin-dev/tune_data_stack_overflow_python_qa.jsonl\n", "gs://dherin-dev/115851500182/\n", "gs://dherin-dev/7737964263322419200-616112577574862848/\n", "gs://dherin-dev/babyweight/\n", @@ -171,7 +164,7 @@ }, { "cell_type": "code", - "execution_count": 8, + "execution_count": 4, "metadata": { "id": "Eg60aUgvaalI" }, @@ -208,7 +201,7 @@ }, { "cell_type": "code", - "execution_count": 14, + "execution_count": 56, "metadata": {}, "outputs": [], "source": [ @@ -225,13 +218,13 @@ " REGEXP_CONTAINS(q.tags, \"python\") AND\n", " a.creation_date >= \"2020-01-01\"\n", "LIMIT\n", - " 10000\n", + " 1000\n", "\"\"\"" ] }, { "cell_type": "code", - "execution_count": 15, + "execution_count": 57, "metadata": { "id": "9VTaovLtaalI" }, @@ -240,7 +233,7 @@ "name": "stdout", "output_type": "stream", "text": [ - "Finished job_id: a7538c94-f47b-4a06-ac1e-6d76d0b55223\n" + "Finished job_id: 439a8a5f-91d6-477d-8a6d-4d13d2555b36\n" ] }, { @@ -271,28 +264,28 @@ " \n", " \n", " 0\n", - " How to solve the alpha component ignored in ki...\n", - " <p>I don't have a whole picture of your projec...\n", + " append dataframe in nested loop<p>I have the f...\n", + " <p>I am not entirely sure if I understand your...\n", " \n", " \n", " 1\n", - " SQLAlchemy with_variant() for MySQL and MariaD...\n", - " <p>What happened is you tried to specify two <...\n", + " Python pandas find element of one column in li...\n", + " <p>You can do <code>apply</code>:</p>\\n<pre><c...\n", " \n", " \n", " 2\n", - " Python imap to take multiple functions<p>I am ...\n", - " <p>If I understand you correctly you can use <...\n", + " How to add a minimum value constraint in Pyomo...\n", + " <p>figured it out. The two methods I described...\n", " \n", " \n", " 3\n", - " Django: Table doesn't exist( python manage.py ...\n", - " <p>I think what happend is that you lost sync ...\n", + " Producing Buffer Radius Polygons - Possible Pr...\n", + " <p>This is apparently an issue with <code>geov...\n", " \n", " \n", " 4\n", - " split pandas rows into columns and comma separ...\n", - " <p>Use <a href=\"https://pandas.pydata.org/docs...\n", + " SMOTE for balancing data<p>I am trying to trai...\n", + " <p>You haven't given enough of your code or da...\n", " \n", " \n", "\n", @@ -300,21 +293,21 @@ ], "text/plain": [ " input_text \\\n", - "0 How to solve the alpha component ignored in ki... \n", - "1 SQLAlchemy with_variant() for MySQL and MariaD... \n", - "2 Python imap to take multiple functions

    I am ... \n", - "3 Django: Table doesn't exist( python manage.py ... \n", - "4 split pandas rows into columns and comma separ... \n", + "0 append dataframe in nested loop

    I have the f... \n", + "1 Python pandas find element of one column in li... \n", + "2 How to add a minimum value constraint in Pyomo... \n", + "3 Producing Buffer Radius Polygons - Possible Pr... \n", + "4 SMOTE for balancing data

    I am trying to trai... \n", "\n", " output_text \n", - "0

    I don't have a whole picture of your projec... \n", - "1

    What happened is you tried to specify two <... \n", - "2

    If I understand you correctly you can use <... \n", - "3

    I think what happend is that you lost sync ... \n", - "4

    Use I am not entirely sure if I understand your... \n", + "1

    You can do apply:

    \\n
    figured it out. The two methods I described...  \n",
    +       "3  

    This is apparently an issue with geov... \n", + "4

    You haven't given enough of your code or da... " ] }, - "execution_count": 15, + "execution_count": 57, "metadata": {}, "output_type": "execute_result" } @@ -330,12 +323,12 @@ "id": "qYUg8cBbaalJ" }, "source": [ - "There should be 10k questions and answers." + "There should be 1000 questions and answers." ] }, { "cell_type": "code", - "execution_count": 16, + "execution_count": 58, "metadata": { "id": "6FqbVHoeaalJ" }, @@ -344,7 +337,7 @@ "name": "stdout", "output_type": "stream", "text": [ - "10000\n" + "1000\n" ] } ], @@ -363,7 +356,7 @@ }, { "cell_type": "code", - "execution_count": 17, + "execution_count": 59, "metadata": { "id": "aXqBwSwaaalJ" }, @@ -372,7 +365,7 @@ "name": "stdout", "output_type": "stream", "text": [ - "8000\n" + "800\n" ] } ], @@ -393,7 +386,7 @@ }, { "cell_type": "code", - "execution_count": 26, + "execution_count": 60, "metadata": {}, "outputs": [], "source": [ @@ -403,14 +396,18 @@ }, { "cell_type": "code", - "execution_count": 29, + "execution_count": 65, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ - "{\"input_text\":\"How to convert Bytes (UTF-8) embeded emoji in a string

    I am scraping a data from WhatsApp chat backup (chat.txt). It looks like this :<\\/p>\\n

    7\\/21\\/20, 1:31 PM - mark: Can we look google  \\n7\\/21\\/20, 1:31 PM - elon: No  \\n7\\/21\\/20, 1:31 PM - mark: Can we smile ?  \\n7\\/21\\/20, 1:31 PM - elon: Ya\\n<\\/code><\\/pre>\\n

    While I used line by line extraction<\\/p>\\n

    with open ('chat.txt','rb') as file:\\n    for line in file:\\n        print(str(line.strip()))\\n<\\/code><\\/pre>\\n

    I got this:<\\/p>\\n

    b'7\\/21\\/20, 7:37 AM - mark: Can we look google\\\\xf0\\\\x9f\\\\xa4\\\\xa9\\\\xf0\\\\x9f\\\\x98\\\\x82\\\\xf0\\\\x9f\\\\x98\\\\x82'\\nb'7\\/21\\/20, 7:37 AM - elon: No'\\nb'7\\/21\\/20, 1:31 PM - mark: Can we smile ?'\\nb'7\\/21\\/20, 7:37 AM - elon: Ya\\\\xf0\\\\x9f\\\\x98\\\\x82'\\n<\\/code><\\/pre>\\n
      \\n
    1. How can we git rid of b''<\\/code> ? ( I tried .decode('utf-8')<\\/code>, but it didn't work)<\\/p>\\n<\\/li>\\n

    2. How can I convert<\\/p>\\n

      Can we look google\\\\xf0\\\\x9f\\\\xa4\\\\xa9\\\\xf0\\\\x9f\\\\x98\\\\x82\\\\xf0\\\\x9f\\\\x98\\\\x82\\n<\\/code><\\/pre>\\n

      to<\\/p>\\n

      Can we look google?\\n<\\/code><\\/pre>\\n<\\/li>\\n<\\/ol>\",\"output_text\":\"

      Open the file with the right encoding, not binary mode:<\\/p>\\n

      with open ('chat.txt', encoding='utf8') as file:\\n    for line in file:\\n        print(line, end='')\\n<\\/code><\\/pre>\\n

      How well this works depends on your execution environment. You need a terminal\\/IDE and font that support printing<\\/em> the code points for print<\\/code> to be successful, but that is not a Python issue.<\\/p>\"}\n" + "huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...\n", + "To disable this warning, you can either:\n", + "\t- Avoid using `tokenizers` before the fork if possible\n", + "\t- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)\n", + "{\"input_text\":\"Assignment operator overloading in python Abstract Syntax Trees

      I want to overload assignment operator in python on the fly using Abstract Syntax Trees<\\/a><\\/p>\\n

      import ast\\nimport astunparse\\n\\nclass OverloadAssignments(ast.NodeTransformer):\\n    def visit_Assign(self, node):\\n        if isinstance(node, ast.Assign) and node.targets:\\n            funcs = node.targets[0]\\n            slot_name_candidate = astunparse.unparse(funcs).strip()\\n            if isinstance(funcs, ast.Name) and "_slot" in slot_name_candidate:\\n                slot_name = ast.Constant(value=slot_name_candidate)\\n                context_variable = ast.Constant(value=astunparse.unparse(node.value).strip())\\n                return ast.Expr([ast.Call(func=ast.Name(id='copy_variable_value', ctx=ast.Load),\\n                                          args=[slot_name, context_variable], keywords=[])])\\n            else:\\n                return node\\n        return node\\n\\nassignment_overloader = OverloadAssignments()\\ncode_chunk = "town_slot=cxt.my_town"\\ntree = ast.parse(code_chunk)\\ntree = assignment_overloader.visit(tree)\\n<\\/code><\\/pre>\\n

      I use parseprint<\\/code> function for pretty printing code tree structure from here\\nhttps:\\/\\/bitbucket.org\\/takluyver\\/greentreesnakes\\/src\\/master\\/astpp.py<\\/a><\\/p>\\n

      http:\\/\\/alexleone.blogspot.co.uk\\/2010\\/01\\/python-ast-pretty-printer.html<\\/a><\\/p>\\n

      which gives me the result<\\/p>\\n

      parseprint(tree)\\n\\nModule(body=[\\n    Expr(value=[\\n        Call(func=Name(id='copy_variable_value', ctx=<class 'ast.Load'>), args=[\\n            Constant(value='town_slot', kind=None),\\n            Constant(value='cxt.my_town', kind=None),\\n          ], keywords=[]),\\n      ]),\\n  ], type_ignores=[])\\n\\n<\\/code><\\/pre>\\n

      Than I need to unparse code to string. I do it with another python package:<\\/p>\\n

      astunparse.unparse(tree)\\n\\nAttributeError: 'Unparser' object has no attribute '_str'\\n<\\/code><\\/pre>\\n

      which fails.<\\/p>\\n

      What does cause astunparse to fail in this case?<\\/p>\\n

      How do I correctly unparse the above code?<\\/p>\\n

      I expect astunparse<\\/code> to produce the following code chunk:<\\/p>\\n

      copy_variable_value("town_slot", "cxt.my_town")<\\/code><\\/p>\",\"output_text\":\"

      You do not need to use astunparse<\\/code>, the ast<\\/code> module includes an unparse<\\/code> method:<\\/p>\\n

      import ast\\nclass AssignOverload(ast.NodeTransformer):\\n   def visit_Assign(self, node):\\n      return ast.Call(func=ast.Name(id='copy_variable_value'), \\n         args=[ast.Constant(value=ast.unparse(i)) for i in [*node.targets, node.value]], \\n         keywords=[])\\n\\ncode_chunk = "town_slot=cxt.my_town"\\na = AssignOverload()\\nresult = a.visit(ast.parse(code_chunk))\\nprint(ast.unparse(result))\\n<\\/code><\\/pre>\\n

      Output:<\\/p>\\n

      copy_variable_value('town_slot', 'cxt.my_town')\\n<\\/code><\\/pre>\"}\n"
            ]
           }
          ],
      @@ -429,7 +426,7 @@
         },
         {
          "cell_type": "code",
      -   "execution_count": 30,
      +   "execution_count": 66,
          "metadata": {
           "id": "vDDLHac5aalN"
          },
      @@ -438,14 +435,18 @@
            "name": "stdout",
            "output_type": "stream",
            "text": [
      +      "huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...\n",
      +      "To disable this warning, you can either:\n",
      +      "\t- Avoid using `tokenizers` before the fork if possible\n",
      +      "\t- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)\n",
             "Copying file://tune_data_stack_overflow_python_qa.jsonl [Content-Type=application/octet-stream]...\n",
      -      "/ [1 files][ 22.5 MiB/ 22.5 MiB]                                                \n",
      -      "Operation completed over 1 objects/22.5 MiB.                                     \n"
      +      "/ [1 files][  2.3 MiB/  2.3 MiB]                                                \n",
      +      "Operation completed over 1 objects/2.3 MiB.                                      \n"
            ]
           }
          ],
          "source": [
      -    "! gsutil cp $training_data_filename $BUCKET_URI"
      +    "!gsutil cp $training_data_filename $BUCKET_URI"
          ]
         },
         {
      @@ -459,7 +460,7 @@
         },
         {
          "cell_type": "code",
      -   "execution_count": 32,
      +   "execution_count": 67,
          "metadata": {
           "id": "2-DnKpYlaalN"
          },
      @@ -468,14 +469,18 @@
            "name": "stdout",
            "output_type": "stream",
            "text": [
      -      "  23637845  2023-09-19T21:11:04Z  gs://dherin-dev/tune_data_stack_overflow_python_qa.jsonl#1695157864253156  metageneration=1\n",
      -      "TOTAL: 1 objects, 23637845 bytes (22.54 MiB)\n"
      +      "huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...\n",
      +      "To disable this warning, you can either:\n",
      +      "\t- Avoid using `tokenizers` before the fork if possible\n",
      +      "\t- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)\n",
      +      "   2410384  2023-09-20T00:36:52Z  gs://dherin-dev/tune_data_stack_overflow_python_qa.jsonl#1695170212151253  metageneration=1\n",
      +      "TOTAL: 1 objects, 2410384 bytes (2.3 MiB)\n"
            ]
           }
          ],
          "source": [
           "TRAINING_DATA_URI = f\"{BUCKET_URI}/{training_data_filename}\"\n",
      -    "! gsutil ls -al $TRAINING_DATA_URI"
      +    "!gsutil ls -al $TRAINING_DATA_URI"
          ]
         },
         {
      @@ -498,7 +503,7 @@
         },
         {
          "cell_type": "code",
      -   "execution_count": 43,
      +   "execution_count": 68,
          "metadata": {},
          "outputs": [],
          "source": [
      @@ -529,31 +534,31 @@
            "name": "stdout",
            "output_type": "stream",
            "text": [
      -      "Model name: genai-workshop-tuned-model-1695160329.9899337\n",
      +      "Model name: asl-palm-text-tuned-model-1695170250.4494693\n",
             "Creating PipelineJob\n",
      -      "PipelineJob created. Resource name: projects/115851500182/locations/europe-west4/pipelineJobs/tune-large-model-20230919215211\n",
      +      "PipelineJob created. Resource name: projects/115851500182/locations/europe-west4/pipelineJobs/tune-large-model-20230920003730\n",
             "To use this PipelineJob in another session:\n",
      -      "pipeline_job = aiplatform.PipelineJob.get('projects/115851500182/locations/europe-west4/pipelineJobs/tune-large-model-20230919215211')\n",
      +      "pipeline_job = aiplatform.PipelineJob.get('projects/115851500182/locations/europe-west4/pipelineJobs/tune-large-model-20230920003730')\n",
             "View Pipeline Job:\n",
      -      "https://console.cloud.google.com/vertex-ai/locations/europe-west4/pipelines/runs/tune-large-model-20230919215211?project=115851500182\n",
      -      "PipelineJob projects/115851500182/locations/europe-west4/pipelineJobs/tune-large-model-20230919215211 current state:\n",
      -      "PipelineState.PIPELINE_STATE_PENDING\n",
      -      "PipelineJob projects/115851500182/locations/europe-west4/pipelineJobs/tune-large-model-20230919215211 current state:\n",
      +      "https://console.cloud.google.com/vertex-ai/locations/europe-west4/pipelines/runs/tune-large-model-20230920003730?project=115851500182\n",
      +      "PipelineJob projects/115851500182/locations/europe-west4/pipelineJobs/tune-large-model-20230920003730 current state:\n",
      +      "PipelineState.PIPELINE_STATE_RUNNING\n",
      +      "PipelineJob projects/115851500182/locations/europe-west4/pipelineJobs/tune-large-model-20230920003730 current state:\n",
             "PipelineState.PIPELINE_STATE_RUNNING\n",
      -      "PipelineJob projects/115851500182/locations/europe-west4/pipelineJobs/tune-large-model-20230919215211 current state:\n",
      +      "PipelineJob projects/115851500182/locations/europe-west4/pipelineJobs/tune-large-model-20230920003730 current state:\n",
             "PipelineState.PIPELINE_STATE_RUNNING\n",
      -      "PipelineJob projects/115851500182/locations/europe-west4/pipelineJobs/tune-large-model-20230919215211 current state:\n",
      +      "PipelineJob projects/115851500182/locations/europe-west4/pipelineJobs/tune-large-model-20230920003730 current state:\n",
             "PipelineState.PIPELINE_STATE_RUNNING\n",
      -      "PipelineJob projects/115851500182/locations/europe-west4/pipelineJobs/tune-large-model-20230919215211 current state:\n",
      +      "PipelineJob projects/115851500182/locations/europe-west4/pipelineJobs/tune-large-model-20230920003730 current state:\n",
             "PipelineState.PIPELINE_STATE_RUNNING\n",
      -      "PipelineJob projects/115851500182/locations/europe-west4/pipelineJobs/tune-large-model-20230919215211 current state:\n",
      +      "PipelineJob projects/115851500182/locations/europe-west4/pipelineJobs/tune-large-model-20230920003730 current state:\n",
             "PipelineState.PIPELINE_STATE_RUNNING\n"
            ]
           }
          ],
          "source": [
      -    "TRAIN_STEPS = 100\n",
      -    "MODEL_NAME = f\"genai-workshop-tuned-model-{time.time()}\"\n",
      +    "TRAIN_STEPS = 500\n",
      +    "MODEL_NAME = f\"asl-palm-text-tuned-model-{time.time()}\"\n",
           "print(\"Model name:\", MODEL_NAME)\n",
           "\n",
           "model.tune_model(\n",
      @@ -567,52 +572,35 @@
           ")"
          ]
         },
      -  {
      -   "cell_type": "code",
      -   "execution_count": null,
      -   "metadata": {},
      -   "outputs": [],
      -   "source": [
      -    "print(\n",
      -    "    model.predict(\n",
      -    "        \"Can you provide me with a Python implementation of BERT with Tensorflow? Example: \"\n",
      -    "    )\n",
      -    ")"
      -   ]
      -  },
      -  {
      -   "cell_type": "markdown",
      -   "metadata": {
      -    "id": "PRCkdxXvaalO"
      -   },
      -   "source": [
      -    "Following the link above, you can view your pipeline run. As you can see in the screenshot below, it will execute the following steps:\n",
      -    "\n",
      -    "- Validation\n",
      -    "- Export managed dataset\n",
      -    "- Convert JSONL to TFRecord\n",
      -    "- Large language model tuning\n",
      -    "- Upload LLM Model"
      -   ]
      -  },
         {
          "cell_type": "markdown",
          "metadata": {
           "id": "O6JC8XplaalO"
          },
          "source": [
      -    "## View your tuned foundational model on Vertex AI Model registry\n",
      +    "## Retrieve the foundational model from Vertex AI Model registry\n",
      +    "\n",
           "When your tuning job is finished, your model will be available on Vertex AI Model Registry. The following Python SDK sample shows you how to list tuned models."
          ]
         },
         {
          "cell_type": "code",
      -   "execution_count": null,
      -   "metadata": {
      -    "id": "GPWX0ITCaalO"
      -   },
      -   "outputs": [],
      +   "execution_count": 50,
      +   "metadata": {},
      +   "outputs": [
      +    {
      +     "data": {
      +      "text/plain": [
      +       "['projects/115851500182/locations/us-central1/models/7558911543518167040']"
      +      ]
      +     },
      +     "execution_count": 50,
      +     "metadata": {},
      +     "output_type": "execute_result"
      +    }
      +   ],
          "source": [
      +    "model = TextGenerationModel.from_pretrained(\"text-bison@001\")\n",
           "model.list_tuned_model_names()"
          ]
         },
      @@ -622,46 +610,22 @@
           "id": "ZriyF0V-aalO"
          },
          "source": [
      -    "You can also use the Google Cloud Console UI to view all of your model in [Vertex AI Model Registry](https://console.cloud.google.com/vertex-ai/models?e=13802955&jsmode=O&mods=-ai_platform_fake_service&project=cloud-llm-preview1). Below you can see an example of a tuned foundational model available on Vertex AI Model Registry."
      -   ]
      -  },
      -  {
      -   "cell_type": "markdown",
      -   "metadata": {
      -    "id": "cFftY6-EaalO"
      -   },
      -   "source": [
      -    "## Use your tuned model to get predictions\n",
      -    "Now it's time to get predictions. First you need to get the latest tuned model from the Vertex AI Model registry."
      -   ]
      -  },
      -  {
      -   "cell_type": "code",
      -   "execution_count": null,
      -   "metadata": {
      -    "id": "vU-K3EIkaalO"
      -   },
      -   "outputs": [],
      -   "source": [
      -    "def fetch_model(project_id, location):\n",
      -    "    aiplatform.init(project=project_id, location=location)\n",
      -    "    model = TextGenerationModel.from_pretrained(\"text-bison@001\")\n",
      -    "    list_tuned_models = model.list_tuned_model_names()\n",
      -    "    tuned_model = list_tuned_models[0]\n",
      +    "You can also use the Google Cloud Console UI to view all of your model in [Vertex AI Model Registry](https://console.cloud.google.com/vertex-ai/models?). Below you can see an example of a tuned foundational model available on Vertex AI Model Registry.\n",
           "\n",
      -    "    return tuned_model"
      +    "Now it's time to get predictions. First you need to get the latest tuned model from the Vertex AI Model registry."
          ]
         },
         {
          "cell_type": "code",
      -   "execution_count": null,
      +   "execution_count": 19,
          "metadata": {
           "id": "j66dr12taalO"
          },
          "outputs": [],
          "source": [
      -    "deployed_model = fetch_model(PROJECT_ID, REGION)\n",
      -    "deployed_model = TextGenerationModel.get_tuned_model(deployed_model)"
      +    "deployed_model = TextGenerationModel.get_tuned_model(\n",
      +    "    model.list_tuned_model_names()[0]\n",
      +    ")"
          ]
         },
         {
      @@ -675,56 +639,70 @@
         },
         {
          "cell_type": "code",
      -   "execution_count": null,
      +   "execution_count": 26,
          "metadata": {
           "id": "2ERbfPJPaalO"
          },
      -   "outputs": [],
      +   "outputs": [
      +    {
      +     "name": "stdout",
      +     "output_type": "stream",
      +     "text": [
      +      "```python\n",
      +      "import tensorflow as tf\n",
      +      "\n",
      +      "# Create a GCS bucket\n",
      +      "bucket = tf.gfile.GFile('gs://my-bucket/', 'w')\n",
      +      "\n",
      +      "# Create a checkpoint directory\n",
      +      "checkpoint_dir = 'gs://my-bucket/checkpoints/'\n",
      +      "\n",
      +      "# Create a checkpoint file\n",
      +      "checkpoint_file = os.path.join(checkpoint_dir, 'checkpoint')\n",
      +      "\n",
      +      "# Create a saver\n",
      +      "saver = tf.train.Saver()\n",
      +      "\n",
      +      "# Save the checkpoint\n",
      +      "saver.save(sess, checkpoint_file)\n",
      +      "\n",
      +      "# Restore the\n"
      +     ]
      +    }
      +   ],
          "source": [
           "PROMPT = \"\"\"\n",
           "How can I store my TensorFlow checkpoint on Google Cloud Storage?\n",
           "\n",
           "Python example:\n",
           "\n",
      -    "\"\"\""
      +    "\"\"\"\n",
      +    "\n",
      +    "print(deployed_model.predict(PROMPT))"
          ]
         },
         {
      -   "cell_type": "code",
      -   "execution_count": null,
      -   "metadata": {
      -    "id": "trzon4EyaalO"
      -   },
      -   "outputs": [],
      +   "cell_type": "markdown",
      +   "metadata": {},
          "source": [
      -    "print(deployed_model.predict(PROMPT))"
      +    "Next you will generate the evaluation metrics. `evaluator.evaluate` will return a few eval metrics. Some of the important ones are:\n",
      +    "- [Blue](https://en.wikipedia.org/wiki/BLEU): The BLEU evaluation metric is a measure of the similarity between a machine-generated text and a human-written reference text.\n",
      +    "- [Rouge](https://en.wikipedia.org/wiki/ROUGE_(metric)): The ROUGE evaluation metric is a measure of the overlap between a machine-generated text and a human-written reference text."
          ]
         },
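To make the metrics above concrete, here is a minimal, self-contained sketch of calling `SeqEval` directly on made-up strings; the candidate and reference sentences below are invented purely for illustration, whereas the actual evaluation in this notebook uses the tuned model's answers and the held-out StackOverflow answers.

```python
from seq_eval import SeqEval

# Toy illustration only: both strings are invented for this example.
candidates = ["You can reverse a list with my_list[::-1]."]
references = ["Use slicing, my_list[::-1], to reverse a list in Python."]

evaluator = SeqEval()
# Returns a dict of scores: bleu_1..bleu_4, rouge_1/rouge_2/rouge_l
# precision/recall/F1, distinct-n diversity scores, and a semantic similarity score.
scores = evaluator.evaluate(candidates, references, verbose=False)
print(scores)
```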
         {
          "cell_type": "markdown",
          "metadata": {
      -    "id": "qtYr_KNPaalO"
      +    "id": "qtYr_KNPaalO",
      +    "tags": []
          },
          "source": [
      -    "## Evaulation\n",
      +    "## Evaluation\n",
           "It's essential to evaluate your model to understand its performance. Evaluation can be done in an automated way using evaluation metrics like F1 or Rouge. You can also leverage human evaluation methods. Human evaluation methods involve asking humans to rate the quality of the LLM's answers. This can be done through crowdsourcing or by having experts evaluate the responses. Some standard human evaluation metrics include fluency, coherence, relevance, and informativeness. Often you want to choose a mix of evaluation metrics to get a good understanding of your model performance. Below you will find an example of how you can do the evaluation.\n",
           "\n",
           "In this example you will be using [sequence-evaluate](https://pypi.org/project/sequence-evaluate/) to evaluation the tuned model."
          ]
         },
      -  {
      -   "cell_type": "code",
      -   "execution_count": null,
      -   "metadata": {
      -    "id": "9856CuicaalO"
      -   },
      -   "outputs": [],
      -   "source": [
      -    "from seq_eval import SeqEval\n",
      -    "\n",
      -    "evaluator = SeqEval()"
      -   ]
      -  },
         {
          "cell_type": "markdown",
          "metadata": {
      @@ -739,86 +717,141 @@
         },
         {
          "cell_type": "code",
      -   "execution_count": null,
      +   "execution_count": 52,
          "metadata": {
           "id": "LKMmIH0XaalO"
          },
          "outputs": [],
          "source": [
      -    "evaluation = evaluation.head(\n",
      -    "    10\n",
      -    ")  # you can change the number of rows you want to use\n",
      -    "evaluation_question = evaluation[\"input_text\"]\n",
      -    "evaluation_answer = evaluation[\"output_text\"]"
      +    "# you can change the number of rows you want to use\n",
      +    "EVAL_ROWS = 60\n",
      +    "\n",
      +    "evaluation = evaluation.head(EVAL_ROWS)\n",
      +    "evaluation_question = evaluation.input_text\n",
      +    "evaluation_answer = evaluation.output_text"
      +   ]
      +  },
      +  {
      +   "cell_type": "code",
      +   "execution_count": 53,
      +   "metadata": {},
      +   "outputs": [],
      +   "source": [
      +    "def evaluate_model(model, eval_input, eval_output):\n",
      +    "    candidates = []\n",
      +    "\n",
      +    "    for i in eval_input:\n",
      +    "        response = model.predict(i)\n",
      +    "        candidates.append(response.text)\n",
      +    "    references = eval_output.tolist()\n",
      +    "\n",
      +    "    evaluator = SeqEval()\n",
      +    "    return evaluator.evaluate(candidates, references, verbose=False)"
          ]
         },
         {
          "cell_type": "markdown",
      -   "metadata": {
      -    "id": "jx-g2molaalP"
      -   },
      +   "metadata": {},
          "source": [
      -    "Now you can go ahead and generate candidates using the tuned model based on the questions you took from the eval dataset."
      +    "Now we can evaluate the tunned model"
          ]
         },
         {
          "cell_type": "code",
      -   "execution_count": null,
      -   "metadata": {
      -    "id": "e5DqVXvEaalP"
      -   },
      -   "outputs": [],
      +   "execution_count": 54,
      +   "metadata": {},
      +   "outputs": [
      +    {
      +     "data": {
      +      "text/plain": [
      +       "{'bleu_1': 0.04047520756830512,\n",
      +       " 'bleu_2': 0.015100714783626129,\n",
      +       " 'bleu_3': 0.008332257944719989,\n",
      +       " 'bleu_4': 0.004868503649911386,\n",
      +       " 'rouge_1_precision': 0.2226664727278332,\n",
      +       " 'rouge_1_recall': 0.08248341451392938,\n",
      +       " 'rouge_1_f1': 0.11105924745988842,\n",
      +       " 'rouge_2_precision': 0.02592229901067698,\n",
      +       " 'rouge_2_recall': 0.01139208073925231,\n",
      +       " 'rouge_2_f1': 0.01428915384614036,\n",
      +       " 'rouge_l_precision': 0.20558055140278145,\n",
      +       " 'rouge_l_recall': 0.07492196202502902,\n",
      +       " 'rouge_l_f1': 0.10178188164640203,\n",
      +       " 'inter_dist1': 0.02047382269530938,\n",
      +       " 'inter_dist2': 0.1481372832263503,\n",
      +       " 'intra_dist1': 0.11618918174622357,\n",
      +       " 'intra_dist2': 0.4187750753268838,\n",
      +       " 'semantic_textual_similarity': 0.4033529758453369}"
      +      ]
      +     },
      +     "execution_count": 54,
      +     "metadata": {},
      +     "output_type": "execute_result"
      +    }
      +   ],
          "source": [
      -    "candidates = []\n",
      -    "\n",
      -    "for i in evaluation_question:\n",
      -    "    response = deployed_model.predict(i)\n",
      -    "    candidates.append(response.text)\n",
      -    "\n",
      -    "len(candidates)"
      +    "evaluate_model(deployed_model, evaluation_question, evaluation_answer)"
          ]
         },
         {
          "cell_type": "markdown",
      -   "metadata": {
      -    "id": "oftLTb0maalP"
      -   },
      +   "metadata": {},
          "source": [
      -    "You will also have to create a list of our references. These will you use to evaluate the model's performance."
      +    "And we can also compare it to the untuned model:"
          ]
         },
         {
          "cell_type": "code",
      -   "execution_count": null,
      -   "metadata": {
      -    "id": "y7zN70CJaalP"
      -   },
      -   "outputs": [],
      +   "execution_count": 55,
      +   "metadata": {},
      +   "outputs": [
      +    {
      +     "data": {
      +      "text/plain": [
      +       "{'bleu_1': 0.1003128560268671,\n",
      +       " 'bleu_2': 0.05564918804413014,\n",
      +       " 'bleu_3': 0.04236881534645315,\n",
      +       " 'bleu_4': 0.034599527052774505,\n",
      +       " 'rouge_1_precision': 0.267516380123697,\n",
      +       " 'rouge_1_recall': 0.1558227088368697,\n",
      +       " 'rouge_1_f1': 0.17846678760532284,\n",
      +       " 'rouge_2_precision': 0.07045565520237056,\n",
      +       " 'rouge_2_recall': 0.04442757694288905,\n",
      +       " 'rouge_2_f1': 0.04805766823189456,\n",
      +       " 'rouge_l_precision': 0.2548800164873334,\n",
      +       " 'rouge_l_recall': 0.14979437731505252,\n",
      +       " 'rouge_l_f1': 0.17098167425295888,\n",
      +       " 'inter_dist1': 0.016247833586984978,\n",
      +       " 'inter_dist2': 0.12961354726527693,\n",
      +       " 'intra_dist1': 0.09521129488653601,\n",
      +       " 'intra_dist2': 0.3913011373479328,\n",
      +       " 'semantic_textual_similarity': 0.5161213874816895}"
      +      ]
      +     },
      +     "execution_count": 55,
      +     "metadata": {},
      +     "output_type": "execute_result"
      +    }
      +   ],
          "source": [
      -    "references = evaluation_answer.tolist()\n",
      -    "\n",
      -    "len(references)"
      +    "evaluate_model(model, evaluation_question, evaluation_answer)"
          ]
         },
         {
          "cell_type": "markdown",
          "metadata": {},
          "source": [
      -    "Next you will generate the evaluation metrics. `evaluator.evaluate` will return a few eval metrics. Some of the important ones are:\n",
      -    "- [Blue](https://en.wikipedia.org/wiki/BLEU): The BLEU evaluation metric is a measure of the similarity between a machine-generated text and a human-written reference text.\n",
      -    "- [Rouge](https://en.wikipedia.org/wiki/ROUGE_(metric)): The ROUGE evaluation metric is a measure of the overlap between a machine-generated text and a human-written reference text."
      +    "If the score for the tunned model are lower than the original foundation model, you'll need to increase the size the of tuning set, and possibly modify the number of steps you are using for tuning."
          ]
         },
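One way to act on this advice, sketched with names already used in this notebook: re-run the BigQuery cell above with a larger `LIMIT`, re-split the data, then regenerate and re-upload the JSONL file before launching a new tuning job (possibly with a different `TRAIN_STEPS`). The snippet assumes the refreshed training split is again stored in `train`.

```python
# Sketch only: assumes the BigQuery cell above was re-run with a larger LIMIT
# and the refreshed training split is again stored in `train`.
train.to_json(training_data_filename, orient="records", lines=True)

# Re-upload the larger JSONL file to the same Cloud Storage location used before.
!gsutil cp $training_data_filename $BUCKET_URI
```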
         {
      -   "cell_type": "code",
      -   "execution_count": null,
      -   "metadata": {
      -    "id": "B828sNxUaalP"
      -   },
      -   "outputs": [],
      +   "cell_type": "markdown",
      +   "metadata": {},
          "source": [
      -    "scores = evaluator.evaluate(candidates, references, verbose=False)\n",
      -    "print(scores)"
      +    "## Acknowledgement \n",
      +    "\n",
      +    "This notebook is adapted from a [tutorial](https://github.com/GoogleCloudPlatform/generative-ai/blob/main/language/tuning/getting_started_tuning.ipynb)\n",
      +    "written by Polong Lin."
          ]
         },
         {
      
      From d2b97c3fab566a5aa331d1e649a17b005a0f07a7 Mon Sep 17 00:00:00 2001
      From: BenoitDherin 
      Date: Wed, 20 Sep 2023 23:12:45 +0000
      Subject: [PATCH 5/7] precommit
      
      ---
       .../solutions/vertex_llm_tuning.ipynb         | 531 ++++++++++--------
       1 file changed, 311 insertions(+), 220 deletions(-)
      
      diff --git a/notebooks/vertex_genai/solutions/vertex_llm_tuning.ipynb b/notebooks/vertex_genai/solutions/vertex_llm_tuning.ipynb
      index a7422a28..48d301f1 100644
      --- a/notebooks/vertex_genai/solutions/vertex_llm_tuning.ipynb
      +++ b/notebooks/vertex_genai/solutions/vertex_llm_tuning.ipynb
      @@ -19,7 +19,8 @@
           "\n",
           "1. Learn how to generate a JSONL file for PaLM tuning\n",
           "1. Learn how to launch a tuning job on Vertex Pipeline\n",
      -    "1. Learn how to query you tuned LLM and evaluate it"
      +    "1. Learn how to deploy and query a tuned LLM\n",
      +    "1. Learn how to evaluate a tuned LLM\n"
          ]
         },
         {
      @@ -30,17 +31,16 @@
          },
          "source": [
           "Creating an LLM requires massive amounts of data, significant computing resources, and specialized skills. In this notebook, you'll learn how tuning allows you to customize a PaLM foundation model on Vertex Generative AI studio for more specific tasks or knowledge domains.\n",
      -    "\n",
           "While the prompt design is excellent for quick experimentation, if training data is available, you can achieve higher quality by tuning the model. Tuning a model enables you to customize the model response based on examples of the task you want the model to perform.\n",
           "\n",
           "For more details on tuning have a look at the [official documentation](https://cloud.google.com/vertex-ai/docs/generative-ai/models/tune-models).\n",
           "\n",
           "**Quota**: Tuning the `text-bison@001`  model uses the `tpu-v3-8` training resources and the accompanying quotas from your Google Cloud project. Each project has a default quota of eight v3-8 cores, which allows for one to two concurrent tuning jobs. If you want to run more concurrent jobs you need to request additional quota via the [Quotas page](https://console.cloud.google.com/iam-admin/quotas).\n",
           "\n",
      -    "**Costs:** This tutorial uses billable a component of Google Cloud `Vertex AI Generative AI Studio`.\n",
      +    "**Costs:** This tutorial uses a billable component of Google Cloud `Vertex AI Generative AI Studio`.\n",
           "Learn about [Vertex AI pricing](https://cloud.google.com/vertex-ai/pricing),\n",
           "and use the [Pricing Calculator](https://cloud.google.com/products/calculator/)\n",
      -    "to generate a cost estimate based on your projected usage."
      +    "to generate a cost estimate based on your projected usage.\n"
          ]
         },
         {
      @@ -52,25 +52,51 @@
         },
         {
          "cell_type": "code",
      -   "execution_count": null,
      +   "execution_count": 1,
          "metadata": {},
      -   "outputs": [],
      +   "outputs": [
      +    {
      +     "data": {
      +      "text/plain": [
      +       "{'status': 'ok', 'restart': True}"
      +      ]
      +     },
      +     "execution_count": 1,
      +     "metadata": {},
      +     "output_type": "execute_result"
      +    }
      +   ],
          "source": [
           "import IPython\n",
           "\n",
           "# The version of google-cloud-aiplatform needs to be >= 1.33.0\n",
      -    "!pip install --upgrade --user \\\n",
      +    "!pip install -q --upgrade --user \\\n",
           "    google-cloud-aiplatform \\\n",
      -    "    sequence-evaluate sentence-transformers \\\n",
      +    "    sequence-evaluate \\\n",
      +    "    sentence-transformers \\\n",
           "    rouge\n",
           "\n",
      +    "# Restart the kernel\n",
           "app = IPython.Application.instance()\n",
           "app.kernel.do_shutdown(True)"
          ]
         },
         {
          "cell_type": "code",
      -   "execution_count": 45,
      +   "execution_count": 1,
      +   "metadata": {},
      +   "outputs": [],
      +   "source": [
      +    "import os\n",
      +    "import warnings\n",
      +    "\n",
      +    "warnings.filterwarnings(\"ignore\")\n",
      +    "os.environ[\"TF_CPP_MIN_LOG_LEVEL\"] = \"2\""
      +   ]
      +  },
      +  {
      +   "cell_type": "code",
      +   "execution_count": 2,
          "metadata": {},
          "outputs": [],
          "source": [
      @@ -85,7 +111,7 @@
         },
         {
          "cell_type": "code",
      -   "execution_count": 29,
      +   "execution_count": 3,
          "metadata": {},
          "outputs": [],
          "source": [
      @@ -98,7 +124,7 @@
         },
         {
          "cell_type": "code",
      -   "execution_count": 4,
      +   "execution_count": null,
          "metadata": {
           "colab": {
            "base_uri": "https://localhost:8080/"
      @@ -106,33 +132,7 @@
           "id": "NSRiXkavaalH",
           "outputId": "8b752c8a-d575-4982-85f8-5a40317c8ac3"
          },
      -   "outputs": [
      -    {
      -     "name": "stdout",
      -     "output_type": "stream",
      -     "text": [
      -      "gs://dherin-dev/cord19_embeddings.json\n",
      -      "gs://dherin-dev/salads.csv\n",
      -      "gs://dherin-dev/tune_data_stack_overflow_python_qa.jsonl\n",
      -      "gs://dherin-dev/115851500182/\n",
      -      "gs://dherin-dev/7737964263322419200-616112577574862848/\n",
      -      "gs://dherin-dev/babyweight/\n",
      -      "gs://dherin-dev/babyweight_220707_021136/\n",
      -      "gs://dherin-dev/babyweight_220707_021151/\n",
      -      "gs://dherin-dev/babyweight_220707_021154/\n",
      -      "gs://dherin-dev/car_damage_lab_images/\n",
      -      "gs://dherin-dev/classification-bert-20230411003650/\n",
      -      "gs://dherin-dev/contextual_bandit_checkpoints/\n",
      -      "gs://dherin-dev/covertype/\n",
      -      "gs://dherin-dev/models/\n",
      -      "gs://dherin-dev/movies/\n",
      -      "gs://dherin-dev/staging/\n",
      -      "gs://dherin-dev/taxifare-20230710171207/\n",
      -      "gs://dherin-dev/taxifare-20230710191151/\n",
      -      "gs://dherin-dev/taxifare/\n"
      -     ]
      -    }
      -   ],
      +   "outputs": [],
          "source": [
           "!gsutil ls $BUCKET_URI || gsutil mb -l $REGION -p $PROJECT_ID $BUCKET_URI"
          ]
      @@ -143,14 +143,14 @@
           "id": "WdtNETYxaalH"
          },
          "source": [
      -    "## Tune your Model\n",
      +    "## Training Data\n",
           "\n",
      -    "Now it's time for you to create a tuning job. Tune a foundation model by creating a pipeline job using Generative AI Studio, cURL, or the Python SDK. In this notebook, we will be using the Python SDK. You will be using a Q&A with a context dataset in JSON format.\n",
           "\n",
      -    "### Training Data\n",
      -    "💾 Your model tuning dataset must be in a JSONL format where each line contains a single training example. You must make sure that you include instructions.\n",
      +    "In this notebook, we will be tuning the Vertex PaLM vertex using the Python SDK on a questions & answers dataset  from StackOverflow. \n",
      +    "Our first step will be to query the StackOverflow data on BigQuery Public Datasets, limiting to questions with the `python` tag, and `accepted` answers from 2020-01-01 only. \n",
           "\n",
      -    "You will use the StackOverflow data on BigQuery Public Datasets, limiting to questions with the `python` tag, and accepted answers for answers since 2020-01-01."
      +    "We will limit the dataset to 1000 samples, 800 of which will be used to tune the LLM and the rest for evaluating the tuned model.\n",
      +    "The second step will be to convert the dataset into a JSONL format, with one example per line, so that the tuning job can consume it.\n"
          ]
         },
         {
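Since the tuning job consumes one JSON object per line, a quick way to see the expected layout is to write a toy DataFrame with the same two columns used in this notebook (`input_text` for the question, `output_text` for the accepted answer). The rows below are invented purely for illustration.

```python
import pandas as pd

# Invented rows, used only to show the JSONL layout the tuning job expects.
toy = pd.DataFrame(
    {
        "input_text": ["How do I reverse a list in Python?"],
        "output_text": ["Use slicing: my_list[::-1], or list(reversed(my_list))."],
    }
)

# orient="records", lines=True writes one JSON object per line, e.g.
# {"input_text":"How do I reverse a list in Python?","output_text":"Use slicing: ..."}
toy.to_json("toy_tune_data.jsonl", orient="records", lines=True)
print(open("toy_tune_data.jsonl").read())
```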
      @@ -159,7 +159,7 @@
           "id": "Puc3jl8QaalI"
          },
          "source": [
      -    "First create a helper function to let you easily query BigQuery and return the results as a Pandas DataFrame."
      +    "The cell below contains  a helper function that lets you easily query BigQuery and return the results as a Pandas DataFrame:"
          ]
         },
         {
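The helper cell itself is unchanged by this patch and therefore does not appear in the diff. As a rough idea of what such a helper can look like (the function name and exact behavior here are assumptions, not the notebook's actual code):

```python
from google.cloud import bigquery
import pandas as pd


def run_bq_query(sql: str) -> pd.DataFrame:
    """Run a BigQuery SQL query and return the result as a Pandas DataFrame.

    Hypothetical sketch of the helper the markdown cell refers to.
    """
    client = bigquery.Client()
    query_job = client.query(sql)
    df = query_job.result().to_dataframe()  # blocks until the job finishes
    print(f"Finished job_id: {query_job.job_id}")
    return df
```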
      @@ -201,7 +201,7 @@
         },
         {
          "cell_type": "code",
      -   "execution_count": 56,
      +   "execution_count": 5,
          "metadata": {},
          "outputs": [],
          "source": [
      @@ -224,7 +224,7 @@
         },
         {
          "cell_type": "code",
      -   "execution_count": 57,
      +   "execution_count": 6,
          "metadata": {
           "id": "9VTaovLtaalI"
          },
      @@ -233,7 +233,7 @@
            "name": "stdout",
            "output_type": "stream",
            "text": [
      -      "Finished job_id: 439a8a5f-91d6-477d-8a6d-4d13d2555b36\n"
      +      "Finished job_id: 14864704-b4b7-4f0b-ae64-a6b704033c3c\n"
            ]
           },
           {
      @@ -307,7 +307,7 @@
              "4  

      You haven't given enough of your code or da... " ] }, - "execution_count": 57, + "execution_count": 6, "metadata": {}, "output_type": "execute_result" } @@ -328,7 +328,7 @@ }, { "cell_type": "code", - "execution_count": 58, + "execution_count": 7, "metadata": { "id": "6FqbVHoeaalJ" }, @@ -351,12 +351,12 @@ "id": "OftmoPZ6aalJ" }, "source": [ - "Lets split the data into training and evalation. For Extractive Q&A tasks we advise 100+ training examples. In this case you will use 800." + "Let's split the data into training and evaluation. To tune PaLM for a Q&A task we advise 100+ training examples. In this case you will use 800." ] }, { "cell_type": "code", - "execution_count": 59, + "execution_count": 8, "metadata": { "id": "aXqBwSwaaalJ" }, @@ -381,36 +381,32 @@ "id": "nf-q8TpnaalJ" }, "source": [ - "For tuning, the training data first needs to be converted into a JSONL format." + "For tuning, the training data first needs to be converted into a JSONL format, which is very easy in Pandas:" ] }, { "cell_type": "code", - "execution_count": 60, + "execution_count": 11, "metadata": {}, "outputs": [], "source": [ "training_data_filename = \"tune_data_stack_overflow_python_qa.jsonl\"\n", + "\n", "train.to_json(training_data_filename, orient=\"records\", lines=True)" ] }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Let's inspect the first line of the JSONL file we just created:" + ] + }, { "cell_type": "code", - "execution_count": 65, + "execution_count": null, "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...\n", - "To disable this warning, you can either:\n", - "\t- Avoid using `tokenizers` before the fork if possible\n", - "\t- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)\n", - "{\"input_text\":\"Assignment operator overloading in python Abstract Syntax Trees

      I want to overload assignment operator in python on the fly using Abstract Syntax Trees<\\/a><\\/p>\\n

      import ast\\nimport astunparse\\n\\nclass OverloadAssignments(ast.NodeTransformer):\\n    def visit_Assign(self, node):\\n        if isinstance(node, ast.Assign) and node.targets:\\n            funcs = node.targets[0]\\n            slot_name_candidate = astunparse.unparse(funcs).strip()\\n            if isinstance(funcs, ast.Name) and "_slot" in slot_name_candidate:\\n                slot_name = ast.Constant(value=slot_name_candidate)\\n                context_variable = ast.Constant(value=astunparse.unparse(node.value).strip())\\n                return ast.Expr([ast.Call(func=ast.Name(id='copy_variable_value', ctx=ast.Load),\\n                                          args=[slot_name, context_variable], keywords=[])])\\n            else:\\n                return node\\n        return node\\n\\nassignment_overloader = OverloadAssignments()\\ncode_chunk = "town_slot=cxt.my_town"\\ntree = ast.parse(code_chunk)\\ntree = assignment_overloader.visit(tree)\\n<\\/code><\\/pre>\\n

      I use parseprint<\\/code> function for pretty printing code tree structure from here\\nhttps:\\/\\/bitbucket.org\\/takluyver\\/greentreesnakes\\/src\\/master\\/astpp.py<\\/a><\\/p>\\n

      http:\\/\\/alexleone.blogspot.co.uk\\/2010\\/01\\/python-ast-pretty-printer.html<\\/a><\\/p>\\n

      which gives me the result<\\/p>\\n

      parseprint(tree)\\n\\nModule(body=[\\n    Expr(value=[\\n        Call(func=Name(id='copy_variable_value', ctx=<class 'ast.Load'>), args=[\\n            Constant(value='town_slot', kind=None),\\n            Constant(value='cxt.my_town', kind=None),\\n          ], keywords=[]),\\n      ]),\\n  ], type_ignores=[])\\n\\n<\\/code><\\/pre>\\n

      Than I need to unparse code to string. I do it with another python package:<\\/p>\\n

      astunparse.unparse(tree)\\n\\nAttributeError: 'Unparser' object has no attribute '_str'\\n<\\/code><\\/pre>\\n

      which fails.<\\/p>\\n

      What does cause astunparse to fail in this case?<\\/p>\\n

      How do I correctly unparse the above code?<\\/p>\\n

      I expect astunparse<\\/code> to produce the following code chunk:<\\/p>\\n

      copy_variable_value("town_slot", "cxt.my_town")<\\/code><\\/p>\",\"output_text\":\"

      You do not need to use astunparse<\\/code>, the ast<\\/code> module includes an unparse<\\/code> method:<\\/p>\\n

      import ast\\nclass AssignOverload(ast.NodeTransformer):\\n   def visit_Assign(self, node):\\n      return ast.Call(func=ast.Name(id='copy_variable_value'), \\n         args=[ast.Constant(value=ast.unparse(i)) for i in [*node.targets, node.value]], \\n         keywords=[])\\n\\ncode_chunk = "town_slot=cxt.my_town"\\na = AssignOverload()\\nresult = a.visit(ast.parse(code_chunk))\\nprint(ast.unparse(result))\\n<\\/code><\\/pre>\\n

      Output:<\\/p>\\n

      copy_variable_value('town_slot', 'cxt.my_town')\\n<\\/code><\\/pre>\"}\n"
      -     ]
      -    }
      -   ],
      +   "outputs": [],
          "source": [
           "!head -n 1 $training_data_filename"
          ]
      @@ -426,25 +422,11 @@
         },
         {
          "cell_type": "code",
      -   "execution_count": 66,
      +   "execution_count": null,
          "metadata": {
           "id": "vDDLHac5aalN"
          },
      -   "outputs": [
      -    {
      -     "name": "stdout",
      -     "output_type": "stream",
      -     "text": [
      -      "huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...\n",
      -      "To disable this warning, you can either:\n",
      -      "\t- Avoid using `tokenizers` before the fork if possible\n",
      -      "\t- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)\n",
      -      "Copying file://tune_data_stack_overflow_python_qa.jsonl [Content-Type=application/octet-stream]...\n",
      -      "/ [1 files][  2.3 MiB/  2.3 MiB]                                                \n",
      -      "Operation completed over 1 objects/2.3 MiB.                                      \n"
      -     ]
      -    }
      -   ],
      +   "outputs": [],
          "source": [
           "!gsutil cp $training_data_filename $BUCKET_URI"
          ]
      @@ -460,26 +442,14 @@
         },
         {
          "cell_type": "code",
      -   "execution_count": 67,
      +   "execution_count": null,
          "metadata": {
           "id": "2-DnKpYlaalN"
          },
      -   "outputs": [
      -    {
      -     "name": "stdout",
      -     "output_type": "stream",
      -     "text": [
      -      "huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...\n",
      -      "To disable this warning, you can either:\n",
      -      "\t- Avoid using `tokenizers` before the fork if possible\n",
      -      "\t- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)\n",
      -      "   2410384  2023-09-20T00:36:52Z  gs://dherin-dev/tune_data_stack_overflow_python_qa.jsonl#1695170212151253  metageneration=1\n",
      -      "TOTAL: 1 objects, 2410384 bytes (2.3 MiB)\n"
      -     ]
      -    }
      -   ],
      +   "outputs": [],
          "source": [
           "TRAINING_DATA_URI = f\"{BUCKET_URI}/{training_data_filename}\"\n",
      +    "\n",
           "!gsutil ls -al $TRAINING_DATA_URI"
          ]
         },
      @@ -496,9 +466,9 @@
           "#### Recommended Tuning Configurations\n",
           "✅ Here are some recommended configurations for tuning a foundation model based on the task, in this example Q&A. You can find more in the [documentation](https://cloud.google.com/vertex-ai/docs/generative-ai/models/tune-models).\n",
           "\n",
      -    "Extractive QA:\n",
      +    "Question Answering task:\n",
           "- Make sure that your train dataset size is 100+\n",
      -    "- Training steps [100-500]. You can try more than one value to get the best performance on a particular dataset (e.g. 100, 200, 500)"
      +    "- Choose your training steps in the range 100-500. You can try more than one value to get the best performance on a particular dataset (e.g. 100, 200, 500)"
          ]
         },
         {
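If you want to compare several of the suggested step counts, one possible pattern is a small sweep. This is only a sketch: each `tune_model` call launches its own Vertex AI tuning pipeline, so the default TPU quota (one to two concurrent jobs per project) may force you to run them sequentially.

```python
# Sketch of a small sweep over the recommended step counts.
# Each tune_model call starts a separate tuning pipeline run.
for steps in [100, 200, 500]:
    sweep_model = TextGenerationModel.from_pretrained("text-bison@001")
    sweep_model.tune_model(
        training_data=TRAINING_DATA_URI,
        model_display_name=f"asl-palm-text-tuned-model-{steps}-steps-{time.time()}",
        train_steps=steps,
        tuning_job_location="europe-west4",
        tuned_model_location="us-central1",
    )
```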
      @@ -529,47 +499,22 @@
          "metadata": {
           "id": "on4baTh5aalN"
          },
      -   "outputs": [
      -    {
      -     "name": "stdout",
      -     "output_type": "stream",
      -     "text": [
      -      "Model name: asl-palm-text-tuned-model-1695170250.4494693\n",
      -      "Creating PipelineJob\n",
      -      "PipelineJob created. Resource name: projects/115851500182/locations/europe-west4/pipelineJobs/tune-large-model-20230920003730\n",
      -      "To use this PipelineJob in another session:\n",
      -      "pipeline_job = aiplatform.PipelineJob.get('projects/115851500182/locations/europe-west4/pipelineJobs/tune-large-model-20230920003730')\n",
      -      "View Pipeline Job:\n",
      -      "https://console.cloud.google.com/vertex-ai/locations/europe-west4/pipelines/runs/tune-large-model-20230920003730?project=115851500182\n",
      -      "PipelineJob projects/115851500182/locations/europe-west4/pipelineJobs/tune-large-model-20230920003730 current state:\n",
      -      "PipelineState.PIPELINE_STATE_RUNNING\n",
      -      "PipelineJob projects/115851500182/locations/europe-west4/pipelineJobs/tune-large-model-20230920003730 current state:\n",
      -      "PipelineState.PIPELINE_STATE_RUNNING\n",
      -      "PipelineJob projects/115851500182/locations/europe-west4/pipelineJobs/tune-large-model-20230920003730 current state:\n",
      -      "PipelineState.PIPELINE_STATE_RUNNING\n",
      -      "PipelineJob projects/115851500182/locations/europe-west4/pipelineJobs/tune-large-model-20230920003730 current state:\n",
      -      "PipelineState.PIPELINE_STATE_RUNNING\n",
      -      "PipelineJob projects/115851500182/locations/europe-west4/pipelineJobs/tune-large-model-20230920003730 current state:\n",
      -      "PipelineState.PIPELINE_STATE_RUNNING\n",
      -      "PipelineJob projects/115851500182/locations/europe-west4/pipelineJobs/tune-large-model-20230920003730 current state:\n",
      -      "PipelineState.PIPELINE_STATE_RUNNING\n"
      -     ]
      -    }
      -   ],
      +   "outputs": [],
          "source": [
           "TRAIN_STEPS = 500\n",
           "MODEL_NAME = f\"asl-palm-text-tuned-model-{time.time()}\"\n",
      -    "print(\"Model name:\", MODEL_NAME)\n",
           "\n",
           "model.tune_model(\n",
           "    training_data=TRAINING_DATA_URI,\n",
           "    model_display_name=MODEL_NAME,\n",
           "    train_steps=TRAIN_STEPS,\n",
      -    "    # Tuning can only happen in the \"europe-west4\" location\n",
      +    "    # Tuning can only happen in the \"europe-west4\" location for now\n",
           "    tuning_job_location=\"europe-west4\",\n",
      -    "    # Model can only be deployed in the \"us-central1\" location\n",
      +    "    # Model can only be deployed in the \"us-central1\" location for now\n",
           "    tuned_model_location=\"us-central1\",\n",
      -    ")"
      +    ")\n",
      +    "\n",
      +    "print(\"Model name:\", MODEL_NAME)"
          ]
         },
         {
      @@ -578,23 +523,25 @@
           "id": "O6JC8XplaalO"
          },
          "source": [
      -    "## Retrieve the foundational model from Vertex AI Model registry\n",
      +    "## Retrieve the tuned model from your Vertex AI Model registry\n",
      +    "\n",
           "\n",
      -    "When your tuning job is finished, your model will be available on Vertex AI Model Registry. The following Python SDK sample shows you how to list tuned models."
      +    "When your tuning job is finished, your model will be available on Vertex AI Model Registry. The next cell shows you how to list tuned models."
          ]
         },
         {
          "cell_type": "code",
      -   "execution_count": 50,
      +   "execution_count": 9,
          "metadata": {},
          "outputs": [
           {
            "data": {
             "text/plain": [
      -       "['projects/115851500182/locations/us-central1/models/7558911543518167040']"
      +       "['projects/115851500182/locations/us-central1/models/4267906115817177088',\n",
      +       " 'projects/115851500182/locations/us-central1/models/7558911543518167040']"
             ]
            },
      -     "execution_count": 50,
      +     "execution_count": 9,
            "metadata": {},
            "output_type": "execute_result"
           }
      @@ -610,21 +557,21 @@
           "id": "ZriyF0V-aalO"
          },
          "source": [
      -    "You can also use the Google Cloud Console UI to view all of your model in [Vertex AI Model Registry](https://console.cloud.google.com/vertex-ai/models?). Below you can see an example of a tuned foundational model available on Vertex AI Model Registry.\n",
      +    "You can also use the Google Cloud Console UI to view all of your models in [Vertex AI Model Registry](https://console.cloud.google.com/vertex-ai/models?). \n",
           "\n",
      -    "Now it's time to get predictions. First you need to get the latest tuned model from the Vertex AI Model registry."
      +    "It's time to get predictions. First you need to get the latest tuned model from the Vertex AI Model registry."
          ]
         },
         {
          "cell_type": "code",
      -   "execution_count": 19,
      +   "execution_count": 11,
          "metadata": {
           "id": "j66dr12taalO"
          },
          "outputs": [],
          "source": [
      -    "deployed_model = TextGenerationModel.get_tuned_model(\n",
      -    "    model.list_tuned_model_names()[0]\n",
      +    "tuned_model = TextGenerationModel.get_tuned_model(\n",
      +    "    model.list_tuned_model_names()[-1]\n",
           ")"
          ]
         },
      @@ -634,12 +581,12 @@
           "id": "xDOueoptaalO"
          },
          "source": [
      -    "Now you can start send a prompt to the API. Feel free to update the following prompt."
      +    "Now you can start sending a prompt to the API. Feel free to update the following prompt:"
          ]
         },
         {
          "cell_type": "code",
      -   "execution_count": 26,
      +   "execution_count": 12,
          "metadata": {
           "id": "2ERbfPJPaalO"
          },
      @@ -678,16 +625,7 @@
           "\n",
           "\"\"\"\n",
           "\n",
      -    "print(deployed_model.predict(PROMPT))"
      -   ]
      -  },
      -  {
      -   "cell_type": "markdown",
      -   "metadata": {},
      -   "source": [
      -    "Next you will generate the evaluation metrics. `evaluator.evaluate` will return a few eval metrics. Some of the important ones are:\n",
      -    "- [Blue](https://en.wikipedia.org/wiki/BLEU): The BLEU evaluation metric is a measure of the similarity between a machine-generated text and a human-written reference text.\n",
      -    "- [Rouge](https://en.wikipedia.org/wiki/ROUGE_(metric)): The ROUGE evaluation metric is a measure of the overlap between a machine-generated text and a human-written reference text."
      +    "print(tuned_model.predict(PROMPT))"
          ]
         },
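The cell above calls `predict` with the SDK defaults. If you want more control over the generated text, the same call also accepts decoding parameters; this is a sketch with illustrative values, assuming the `TextGenerationModel.predict` signature of the SDK version used in this notebook.

```python
# Same prompt, with explicit decoding parameters (values are illustrative only).
response = tuned_model.predict(
    PROMPT,
    max_output_tokens=256,  # allow a longer completion than the default
    temperature=0.2,        # lower temperature -> more deterministic output
    top_k=40,
    top_p=0.95,
)
print(response.text)
```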
         {
      @@ -698,37 +636,125 @@
          },
          "source": [
           "## Evaluation\n",
      -    "It's essential to evaluate your model to understand its performance. Evaluation can be done in an automated way using evaluation metrics like F1 or Rouge. You can also leverage human evaluation methods. Human evaluation methods involve asking humans to rate the quality of the LLM's answers. This can be done through crowdsourcing or by having experts evaluate the responses. Some standard human evaluation metrics include fluency, coherence, relevance, and informativeness. Often you want to choose a mix of evaluation metrics to get a good understanding of your model performance. Below you will find an example of how you can do the evaluation.\n",
           "\n",
      -    "In this example you will be using [sequence-evaluate](https://pypi.org/project/sequence-evaluate/) to evaluation the tuned model."
      +    "\n",
      +    "It's essential to evaluate your model to understand its performance. Evaluation can be done in an automated way using evaluation metrics like F1, Bleu, or Rouge. You can also leverage human evaluation methods. Human evaluation methods involve asking humans to rate the quality of the LLM's answers. This can be done through crowdsourcing or by having experts evaluate the responses. Some standard human evaluation metrics include fluency, coherence, relevance, and informativeness. Often you want to choose a mix of evaluation metrics to get a good understanding of your model performance. \n",
      +    "\n",
      +    "\n",
      +    "Among other metrics we will compute the following two metrics that provide crude measures albeit automated of how two texts may have the same meaning: \n",
      +    "- [Blue](https://en.wikipedia.org/wiki/BLEU): The BLEU evaluation metric is a measure of the similarity between a machine-generated text and a human-written reference text.\n",
      +    "- [Rouge](https://en.wikipedia.org/wiki/ROUGE_(metric)): The ROUGE evaluation metric is a measure of the overlap between a machine-generated text and a human-written reference text.\n",
      +    "\n",
      +    "\n",
      +    "We will use  [sequence-evaluate](https://pypi.org/project/sequence-evaluate/) to to compute the scores.\n",
      +    "Earlier in the notebook, you created a train and eval dataset. Now it's time to take some of the eval data. You will use the questions to get a response from our tuned model, and the answers we will use as a reference:\n",
      +    "- **Candidates**: Answers generated by the tuned model.\n",
      +    "- **References**: Original answers that we will use to compare\n"
          ]
         },
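To build intuition for what these scores measure, here is a deliberately simplified, pure-Python sketch of unigram overlap between a candidate answer and a reference answer. It is not the real BLEU or ROUGE implementation (no brevity penalty, no higher-order n-grams, no stemming); the scores reported in this notebook come from the evaluation library used below.

```python
# Toy illustration of the precision/recall intuition behind BLEU-1 and ROUGE-1.
from collections import Counter


def unigram_overlap(candidate: str, reference: str) -> dict:
    cand = candidate.lower().split()
    ref = reference.lower().split()
    # Clipped unigram matches between the two token lists.
    matches = sum((Counter(cand) & Counter(ref)).values())
    return {
        "precision": matches / len(cand) if cand else 0.0,  # BLEU-1-like
        "recall": matches / len(ref) if ref else 0.0,  # ROUGE-1-recall-like
    }


print(unigram_overlap(
    "You can use loc to change values in a row",
    "You may want to try this usage of loc to update the row",
))
```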
         {
          "cell_type": "markdown",
      -   "metadata": {
      -    "id": "AS10ybdraalO"
      -   },
      +   "metadata": {},
          "source": [
      -    "Earlier in the notebook, you created a train and eval dataset. Now it's time to take some of the eval data. You will use the questions to get a response from our tuned model, and the answers we will use as a reference:\n",
      -    "\n",
      -    "- **Candidates**: Answers generated by the tuned model.\n",
      -    "- **References**: Original answers that we will use to compare."
      +    "Let us first select a sample of our evaluation set:"
          ]
         },
         {
          "cell_type": "code",
      -   "execution_count": 52,
      +   "execution_count": 13,
          "metadata": {
           "id": "LKMmIH0XaalO"
          },
      -   "outputs": [],
      +   "outputs": [
      +    {
      +     "data": {
      +      "text/html": [
      +       "
      \n", + "\n", + "\n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + "
      input_textoutput_text
      787Changing some values in a row of pd.DataFrame ...<p>You may want to try this usage of <code>loc...
      254Split Large csv File into multiple files depen...<p>Here is one approach.</p>\\n<pre><code>fn = ...
      458When to use the bitwise and operator (&)?<p>I ...<p>As the comments mentioned, <code>num&amp;1<...
      142Pandas can't select index range as string date...<p>The index dates were strings instead of dat...
      42Scale with Kivy ScatterLayout Doesn't Behave a...<p>You have to scale the size of the <code>Sca...
      \n", + "
      " + ], + "text/plain": [ + " input_text \\\n", + "787 Changing some values in a row of pd.DataFrame ... \n", + "254 Split Large csv File into multiple files depen... \n", + "458 When to use the bitwise and operator (&)?

      I ... \n", + "142 Pandas can't select index range as string date... \n", + "42 Scale with Kivy ScatterLayout Doesn't Behave a... \n", + "\n", + " output_text \n", + "787

      You may want to try this usage of loc... \n", + "254

      Here is one approach.

      \\n
      fn = ...  \n",
      +       "458  

      As the comments mentioned, num&1<... \n", + "142

      The index dates were strings instead of dat... \n", + "42

      You have to scale the size of the Sca... " + ] + }, + "execution_count": 13, + "metadata": {}, + "output_type": "execute_result" + } + ], "source": [ "# you can change the number of rows you want to use\n", "EVAL_ROWS = 60\n", "\n", "evaluation = evaluation.head(EVAL_ROWS)\n", - "evaluation_question = evaluation.input_text\n", - "evaluation_answer = evaluation.output_text" + "evaluation.head()" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "The function in the cell below will query our tuned model using the `evaluation.input_text` and store the ground truth in `evaluation.output_text` in a DataFrame next to the model answers:" ] }, { @@ -737,111 +763,176 @@ "metadata": {}, "outputs": [], "source": [ - "def evaluate_model(model, eval_input, eval_output):\n", - " candidates = []\n", + "def create_eval_data(model, evaluation):\n", + " model_answers = []\n", "\n", - " for i in eval_input:\n", - " response = model.predict(i)\n", - " candidates.append(response.text)\n", - " references = eval_output.tolist()\n", + " for prompt in evaluation.input_text:\n", + " response = model.predict(prompt)\n", + " model_answers.append(response.text)\n", "\n", - " evaluator = SeqEval()\n", - " return evaluator.evaluate(candidates, references, verbose=False)" + " eval_df = pd.DataFrame(\n", + " {\"candidate\": model_answers, \"reference\": evaluation.output_text}\n", + " )\n", + " mask = eval_df.candidate == \"\"\n", + " return eval_df[~mask]" ] }, { - "cell_type": "markdown", + "cell_type": "code", + "execution_count": 54, "metadata": {}, + "outputs": [], "source": [ - "Now we can evaluate the tunned model" + "eval_df = create_eval_data(model, evaluation)" ] }, { "cell_type": "code", - "execution_count": 54, + "execution_count": 55, "metadata": {}, "outputs": [ { "data": { + "text/html": [ + "

      \n", + "\n", + "\n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + "
      candidatereference
      787The error is because you are trying to change ...<p>You may want to try this usage of <code>loc...
      254```python\\nimport csv\\n\\ndef split_csv(input_f...<p>Here is one approach.</p>\\n<pre><code>fn = ...
      458The bitwise and operator (&) is used to perfor...<p>As the comments mentioned, <code>num&amp;1<...
      142The problem is that the index is a `DatetimeIn...<p>The index dates were strings instead of dat...
      42The problem is that you are not setting the <c...<p>You have to scale the size of the <code>Sca...
      \n", + "
      " + ], "text/plain": [ - "{'bleu_1': 0.04047520756830512,\n", - " 'bleu_2': 0.015100714783626129,\n", - " 'bleu_3': 0.008332257944719989,\n", - " 'bleu_4': 0.004868503649911386,\n", - " 'rouge_1_precision': 0.2226664727278332,\n", - " 'rouge_1_recall': 0.08248341451392938,\n", - " 'rouge_1_f1': 0.11105924745988842,\n", - " 'rouge_2_precision': 0.02592229901067698,\n", - " 'rouge_2_recall': 0.01139208073925231,\n", - " 'rouge_2_f1': 0.01428915384614036,\n", - " 'rouge_l_precision': 0.20558055140278145,\n", - " 'rouge_l_recall': 0.07492196202502902,\n", - " 'rouge_l_f1': 0.10178188164640203,\n", - " 'inter_dist1': 0.02047382269530938,\n", - " 'inter_dist2': 0.1481372832263503,\n", - " 'intra_dist1': 0.11618918174622357,\n", - " 'intra_dist2': 0.4187750753268838,\n", - " 'semantic_textual_similarity': 0.4033529758453369}" + " candidate \\\n", + "787 The error is because you are trying to change ... \n", + "254 ```python\\nimport csv\\n\\ndef split_csv(input_f... \n", + "458 The bitwise and operator (&) is used to perfor... \n", + "142 The problem is that the index is a `DatetimeIn... \n", + "42 The problem is that you are not setting the You may want to try this usage of loc... \n", + "254

      Here is one approach.

      \\n
      fn = ...  \n",
      +       "458  

      As the comments mentioned, num&1<... \n", + "142

      The index dates were strings instead of dat... \n", + "42

      You have to scale the size of the Sca... " ] }, - "execution_count": 54, + "execution_count": 55, "metadata": {}, "output_type": "execute_result" } ], "source": [ - "evaluate_model(deployed_model, evaluation_question, evaluation_answer)" + "eval_df.head()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ - "And we can also compare it to the untuned model:" + "The function in the next cell computes a number of metrics (Rouge, Blue, etc.) useful to indicate whether two texts have the same meaning. It averages these scores over all the reference answers and those generated by our tuned model, giving scores that can serve as performance metrics for our model." ] }, { "cell_type": "code", - "execution_count": 55, + "execution_count": 56, + "metadata": {}, + "outputs": [], + "source": [ + "def compute_scores(eval_data):\n", + " evaluator = SeqEval()\n", + " reference = eval_data.reference.tolist()\n", + " candidate = eval_data.candidate.tolist()\n", + " return evaluator.evaluate(reference, candidate, verbose=False)" + ] + }, + { + "cell_type": "code", + "execution_count": 57, "metadata": {}, "outputs": [ { "data": { "text/plain": [ - "{'bleu_1': 0.1003128560268671,\n", - " 'bleu_2': 0.05564918804413014,\n", - " 'bleu_3': 0.04236881534645315,\n", - " 'bleu_4': 0.034599527052774505,\n", - " 'rouge_1_precision': 0.267516380123697,\n", - " 'rouge_1_recall': 0.1558227088368697,\n", - " 'rouge_1_f1': 0.17846678760532284,\n", - " 'rouge_2_precision': 0.07045565520237056,\n", - " 'rouge_2_recall': 0.04442757694288905,\n", - " 'rouge_2_f1': 0.04805766823189456,\n", - " 'rouge_l_precision': 0.2548800164873334,\n", - " 'rouge_l_recall': 0.14979437731505252,\n", - " 'rouge_l_f1': 0.17098167425295888,\n", - " 'inter_dist1': 0.016247833586984978,\n", - " 'inter_dist2': 0.12961354726527693,\n", - " 'intra_dist1': 0.09521129488653601,\n", - " 'intra_dist2': 0.3913011373479328,\n", - " 'semantic_textual_similarity': 0.5161213874816895}" + "{'bleu_1': 0.11048626784959818,\n", + " 'bleu_2': 0.049981933337872736,\n", + " 'bleu_3': 0.029993136777644553,\n", + " 'bleu_4': 0.019459490228121507,\n", + " 'rouge_1_precision': 0.14123178063031,\n", + " 'rouge_1_recall': 0.24180495796915416,\n", + " 'rouge_1_f1': 0.15766779691203167,\n", + " 'rouge_2_precision': 0.03205610888282679,\n", + " 'rouge_2_recall': 0.057176760321351904,\n", + " 'rouge_2_f1': 0.03629933742380242,\n", + " 'rouge_l_precision': 0.12927872916655078,\n", + " 'rouge_l_recall': 0.22340454177548244,\n", + " 'rouge_l_f1': 0.14484886963077176,\n", + " 'inter_dist1': 0.0015513048048969688,\n", + " 'inter_dist2': 0.03906219163510438,\n", + " 'intra_dist1': 0.07823671070794777,\n", + " 'intra_dist2': 0.34727942840194825,\n", + " 'semantic_textual_similarity': 0.5892551027495285}" ] }, - "execution_count": 55, + "execution_count": 57, "metadata": {}, "output_type": "execute_result" } ], "source": [ - "evaluate_model(model, evaluation_question, evaluation_answer)" + "compute_scores(eval_df)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ - "If the score for the tunned model are lower than the original foundation model, you'll need to increase the size the of tuning set, and possibly modify the number of steps you are using for tuning." + "Given two versions of the model (possibly tuned with a different amount of data or training steps), you can now compare the scores to decide which one is the best. However, remember that these automated metrics are very crude proxy of human assessment. 
" ] }, { From afe4dcd14f814d4f07bbb41756d858668453c2d7 Mon Sep 17 00:00:00 2001 From: BenoitDherin Date: Thu, 21 Sep 2023 22:10:16 +0000 Subject: [PATCH 6/7] precommit --- .../solutions/vertex_llm_tuning.ipynb | 544 ++++-------------- 1 file changed, 102 insertions(+), 442 deletions(-) diff --git a/notebooks/vertex_genai/solutions/vertex_llm_tuning.ipynb b/notebooks/vertex_genai/solutions/vertex_llm_tuning.ipynb index 48d301f1..204a5e6b 100644 --- a/notebooks/vertex_genai/solutions/vertex_llm_tuning.ipynb +++ b/notebooks/vertex_genai/solutions/vertex_llm_tuning.ipynb @@ -50,40 +50,60 @@ "## Setup" ] }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "If you have your version of `google-cloud-aiplatform` is lower than `1.33.0`, please run the next cell:" + ] + }, { "cell_type": "code", - "execution_count": 1, + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "!pip install -q --upgrade --user google-cloud-aiplatform" + ] + }, + { + "cell_type": "markdown", "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "{'status': 'ok', 'restart': True}" - ] - }, - "execution_count": 1, - "metadata": {}, - "output_type": "execute_result" - } - ], + "source": [ + "We will also need the [evaluate library](https://github.com/huggingface/evaluate/tree/main) to assess the performance of our tuned model:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "!pip install -q --upgrade --user evaluate rouge-score" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "The next cell will now restart the kernel to load the previously installed libraties" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], "source": [ "import IPython\n", "\n", - "# The version of google-cloud-aiplatform needs to be >= 1.33.0\n", - "!pip install -q --upgrade --user \\\n", - " google-cloud-aiplatform \\\n", - " sequence-evaluate \\\n", - " sentence-transformers \\\n", - " rouge\n", - "\n", - "# Restart the kernel\n", "app = IPython.Application.instance()\n", "app.kernel.do_shutdown(True)" ] }, { "cell_type": "code", - "execution_count": 1, + "execution_count": null, "metadata": {}, "outputs": [], "source": [ @@ -96,22 +116,22 @@ }, { "cell_type": "code", - "execution_count": 2, + "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import time\n", "\n", + "import evaluate\n", "import pandas as pd\n", "from google.cloud import aiplatform, bigquery\n", - "from seq_eval import SeqEval\n", "from sklearn.model_selection import train_test_split\n", "from vertexai.preview.language_models import TextGenerationModel" ] }, { "cell_type": "code", - "execution_count": 3, + "execution_count": null, "metadata": {}, "outputs": [], "source": [ @@ -153,59 +173,23 @@ "The second step will be to convert the dataset into a JSONL format, with one example per line, so that the tuning job can consume it.\n" ] }, - { - "cell_type": "markdown", - "metadata": { - "id": "Puc3jl8QaalI" - }, - "source": [ - "The cell below contains a helper function that lets you easily query BigQuery and return the results as a Pandas DataFrame:" - ] - }, - { - "cell_type": "code", - "execution_count": 4, - "metadata": { - "id": "Eg60aUgvaalI" - }, - "outputs": [], - "source": [ - "def run_bq_query(sql):\n", - " bq_client = bigquery.Client()\n", - "\n", - " # Try dry run before executing query to catch any errors\n", - " job_config = bigquery.QueryJobConfig(dry_run=True, 
use_query_cache=False)\n", - " bq_client.query(sql, job_config=job_config)\n", - "\n", - " # If dry run succeeds without errors, proceed to run query\n", - " job_config = bigquery.QueryJobConfig()\n", - " client_result = bq_client.query(sql, job_config=job_config)\n", - "\n", - " job_id = client_result.job_id\n", - "\n", - " # Wait for query/job to finish running. then get & return data frame\n", - " df = client_result.result().to_arrow().to_pandas()\n", - " print(f\"Finished job_id: {job_id}\")\n", - "\n", - " return df" - ] - }, { "cell_type": "markdown", "metadata": { "id": "1BydoFfTaalI" }, "source": [ - "Next define the query." + "Next let us run the query to assemble our dataset into the DataFrame `df`:" ] }, { "cell_type": "code", - "execution_count": 5, + "execution_count": null, "metadata": {}, "outputs": [], "source": [ - "query = \"\"\"\n", + "%%bigquery df\n", + "\n", "SELECT CONCAT(q.title, q.body) as input_text, a.body AS output_text\n", "FROM\n", " `bigquery-public-data.stackoverflow.posts_questions` q\n", @@ -218,157 +202,42 @@ " REGEXP_CONTAINS(q.tags, \"python\") AND\n", " a.creation_date >= \"2020-01-01\"\n", "LIMIT\n", - " 1000\n", - "\"\"\"" + " 1000" ] }, { "cell_type": "code", - "execution_count": 6, + "execution_count": null, "metadata": { "id": "9VTaovLtaalI" }, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "Finished job_id: 14864704-b4b7-4f0b-ae64-a6b704033c3c\n" - ] - }, - { - "data": { - "text/html": [ - "

      \n", - "\n", - "\n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - "
      input_textoutput_text
      0append dataframe in nested loop<p>I have the f...<p>I am not entirely sure if I understand your...
      1Python pandas find element of one column in li...<p>You can do <code>apply</code>:</p>\\n<pre><c...
      2How to add a minimum value constraint in Pyomo...<p>figured it out. The two methods I described...
      3Producing Buffer Radius Polygons - Possible Pr...<p>This is apparently an issue with <code>geov...
      4SMOTE for balancing data<p>I am trying to trai...<p>You haven't given enough of your code or da...
      \n", - "
      " - ], - "text/plain": [ - " input_text \\\n", - "0 append dataframe in nested loop

      I have the f... \n", - "1 Python pandas find element of one column in li... \n", - "2 How to add a minimum value constraint in Pyomo... \n", - "3 Producing Buffer Radius Polygons - Possible Pr... \n", - "4 SMOTE for balancing data

      I am trying to trai... \n", - "\n", - " output_text \n", - "0

      I am not entirely sure if I understand your... \n", - "1

      You can do apply:

      \\n
      figured it out. The two methods I described...  \n",
      -       "3  

      This is apparently an issue with geov... \n", - "4

      You haven't given enough of your code or da... " - ] - }, - "execution_count": 6, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "df = run_bq_query(query)\n", - "df.head()" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "qYUg8cBbaalJ" - }, - "source": [ - "There should be 1000 questions and answers." - ] - }, - { - "cell_type": "code", - "execution_count": 7, - "metadata": { - "id": "6FqbVHoeaalJ" - }, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "1000\n" - ] - } - ], + "outputs": [], "source": [ - "print(len(df))" + "df.head()" ] }, { "cell_type": "markdown", - "metadata": { - "id": "OftmoPZ6aalJ" - }, + "metadata": {}, "source": [ - "Let's split the data into training and evaluation. To tune PaLM for a Q&A task we advise 100+ training examples. In this case you will use 800." + "The column `input_text` corresponds to the actual questions asked by the StackOverflow users, while the `output_text` column corresponds to the correct answers. From this dataset of 1000 questions-answers pairs, we will now need to generate a JSONL file with one example per line in the format:\n", + "\n", + "```python\n", + "{'input_text': , 'output_text': }\n", + "```\n", + "\n", + "This is the format we need to tune our LLM model.\n", + "\n", + "Let's first split the data into training and evaluation. To tune PaLM for a Q&A task we advise 100+ training examples. In this case you will use 800.\n" ] }, { "cell_type": "code", - "execution_count": 8, + "execution_count": null, "metadata": { "id": "aXqBwSwaaalJ" }, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "800\n" - ] - } - ], + "outputs": [], "source": [ "# split is set to 80/20\n", "train, evaluation = train_test_split(df, test_size=0.2)\n", @@ -386,7 +255,7 @@ }, { "cell_type": "code", - "execution_count": 11, + "execution_count": null, "metadata": {}, "outputs": [], "source": [ @@ -473,7 +342,7 @@ }, { "cell_type": "code", - "execution_count": 68, + "execution_count": null, "metadata": {}, "outputs": [], "source": [ @@ -490,7 +359,8 @@ "source": [ "Next it's time to start your tuning job. \n", "\n", - "**Disclaimer:** tuning and deploying a model takes time." + "**Disclaimer:** Tuning and deploying a LLM model takes time. \n", + "For 100 train steps, it takes around 1h. For 500 train steps, it takes around 3h30." 
] }, { @@ -501,7 +371,7 @@ }, "outputs": [], "source": [ - "TRAIN_STEPS = 500\n", + "TRAIN_STEPS = 100\n", "MODEL_NAME = f\"asl-palm-text-tuned-model-{time.time()}\"\n", "\n", "model.tune_model(\n", @@ -531,21 +401,9 @@ }, { "cell_type": "code", - "execution_count": 9, + "execution_count": null, "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "['projects/115851500182/locations/us-central1/models/4267906115817177088',\n", - " 'projects/115851500182/locations/us-central1/models/7558911543518167040']" - ] - }, - "execution_count": 9, - "metadata": {}, - "output_type": "execute_result" - } - ], + "outputs": [], "source": [ "model = TextGenerationModel.from_pretrained(\"text-bison@001\")\n", "model.list_tuned_model_names()" @@ -564,7 +422,7 @@ }, { "cell_type": "code", - "execution_count": 11, + "execution_count": null, "metadata": { "id": "j66dr12taalO" }, @@ -586,37 +444,11 @@ }, { "cell_type": "code", - "execution_count": 12, + "execution_count": null, "metadata": { "id": "2ERbfPJPaalO" }, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "```python\n", - "import tensorflow as tf\n", - "\n", - "# Create a GCS bucket\n", - "bucket = tf.gfile.GFile('gs://my-bucket/', 'w')\n", - "\n", - "# Create a checkpoint directory\n", - "checkpoint_dir = 'gs://my-bucket/checkpoints/'\n", - "\n", - "# Create a checkpoint file\n", - "checkpoint_file = os.path.join(checkpoint_dir, 'checkpoint')\n", - "\n", - "# Create a saver\n", - "saver = tf.train.Saver()\n", - "\n", - "# Save the checkpoint\n", - "saver.save(sess, checkpoint_file)\n", - "\n", - "# Restore the\n" - ] - } - ], + "outputs": [], "source": [ "PROMPT = \"\"\"\n", "How can I store my TensorFlow checkpoint on Google Cloud Storage?\n", @@ -642,14 +474,16 @@ "\n", "\n", "Among other metrics we will compute the following two metrics that provide crude measures albeit automated of how two texts may have the same meaning: \n", - "- [Blue](https://en.wikipedia.org/wiki/BLEU): The BLEU evaluation metric is a measure of the similarity between a machine-generated text and a human-written reference text.\n", - "- [Rouge](https://en.wikipedia.org/wiki/ROUGE_(metric)): The ROUGE evaluation metric is a measure of the overlap between a machine-generated text and a human-written reference text.\n", + "- The [BLEU](https://en.wikipedia.org/wiki/BLEU) evaluation metric is a sort of **precision** metric, measuring the proportion of $n$-grams in the generated sentence matching $n$-grams in the reference sentence. It goes from 0 to 1 with a higher score for more similar sentences. BLEU1 considers uni-grams only, while BLUE2 considers bi-grams. \n", "\n", + "- The [ROUGE](https://en.wikipedia.org/wiki/ROUGE_(metric)) evaluation metric is a sort of **recall** metric, measuring the proportion of $n$-grams in the reference sentence that are matched by $n$-grams in the generated sentence. It goes from 0 to 1 with a higher score for more similar sentences. ROUGE1 considers uni-grams only, while ROUGE2 considers bi-grams.\n", "\n", - "We will use [sequence-evaluate](https://pypi.org/project/sequence-evaluate/) to to compute the scores.\n", + "\n", + "We will use [evaluate](https://github.com/huggingface/evaluate/tree/main) to to compute the scores.\n", "Earlier in the notebook, you created a train and eval dataset. Now it's time to take some of the eval data. 
You will use the questions to get a response from our tuned model, and the answers we will use as a reference:\n", "- **Candidates**: Answers generated by the tuned model.\n", - "- **References**: Original answers that we will use to compare\n" + "- **References**: Original answers that we will use to compare\n", + "\n" ] }, { @@ -661,91 +495,16 @@ }, { "cell_type": "code", - "execution_count": 13, + "execution_count": null, "metadata": { "id": "LKMmIH0XaalO" }, - "outputs": [ - { - "data": { - "text/html": [ - "

      \n", - "\n", - "\n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - "
      input_textoutput_text
      787Changing some values in a row of pd.DataFrame ...<p>You may want to try this usage of <code>loc...
      254Split Large csv File into multiple files depen...<p>Here is one approach.</p>\\n<pre><code>fn = ...
      458When to use the bitwise and operator (&)?<p>I ...<p>As the comments mentioned, <code>num&amp;1<...
      142Pandas can't select index range as string date...<p>The index dates were strings instead of dat...
      42Scale with Kivy ScatterLayout Doesn't Behave a...<p>You have to scale the size of the <code>Sca...
      \n", - "
      " - ], - "text/plain": [ - " input_text \\\n", - "787 Changing some values in a row of pd.DataFrame ... \n", - "254 Split Large csv File into multiple files depen... \n", - "458 When to use the bitwise and operator (&)?

      I ... \n", - "142 Pandas can't select index range as string date... \n", - "42 Scale with Kivy ScatterLayout Doesn't Behave a... \n", - "\n", - " output_text \n", - "787

      You may want to try this usage of loc... \n", - "254

      Here is one approach.

      \\n
      fn = ...  \n",
      -       "458  

      As the comments mentioned, num&1<... \n", - "142

      The index dates were strings instead of dat... \n", - "42

      You have to scale the size of the Sca... " - ] - }, - "execution_count": 13, - "metadata": {}, - "output_type": "execute_result" - } - ], + "outputs": [], "source": [ "# you can change the number of rows you want to use\n", "EVAL_ROWS = 60\n", - "\n", + "INPUT_LIMIT = 10000 # characters\n", + "evaluation = evaluation[evaluation.input_text.apply(len) <= INPUT_LIMIT]\n", "evaluation = evaluation.head(EVAL_ROWS)\n", "evaluation.head()" ] @@ -759,7 +518,7 @@ }, { "cell_type": "code", - "execution_count": 53, + "execution_count": null, "metadata": {}, "outputs": [], "source": [ @@ -779,7 +538,7 @@ }, { "cell_type": "code", - "execution_count": 54, + "execution_count": null, "metadata": {}, "outputs": [], "source": [ @@ -788,85 +547,9 @@ }, { "cell_type": "code", - "execution_count": 55, + "execution_count": null, "metadata": {}, - "outputs": [ - { - "data": { - "text/html": [ - "

      \n", - "\n", - "\n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - "
      candidatereference
      787The error is because you are trying to change ...<p>You may want to try this usage of <code>loc...
      254```python\\nimport csv\\n\\ndef split_csv(input_f...<p>Here is one approach.</p>\\n<pre><code>fn = ...
      458The bitwise and operator (&) is used to perfor...<p>As the comments mentioned, <code>num&amp;1<...
      142The problem is that the index is a `DatetimeIn...<p>The index dates were strings instead of dat...
      42The problem is that you are not setting the <c...<p>You have to scale the size of the <code>Sca...
      \n", - "
      " - ], - "text/plain": [ - " candidate \\\n", - "787 The error is because you are trying to change ... \n", - "254 ```python\\nimport csv\\n\\ndef split_csv(input_f... \n", - "458 The bitwise and operator (&) is used to perfor... \n", - "142 The problem is that the index is a `DatetimeIn... \n", - "42 The problem is that you are not setting the You may want to try this usage of loc... \n", - "254

      Here is one approach.

      \\n
      fn = ...  \n",
      -       "458  

      As the comments mentioned, num&1<... \n", - "142

      The index dates were strings instead of dat... \n", - "42

      You have to scale the size of the Sca... " - ] - }, - "execution_count": 55, - "metadata": {}, - "output_type": "execute_result" - } - ], + "outputs": [], "source": [ "eval_df.head()" ] @@ -875,55 +558,32 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "The function in the next cell computes a number of metrics (Rouge, Blue, etc.) useful to indicate whether two texts have the same meaning. It averages these scores over all the reference answers and those generated by our tuned model, giving scores that can serve as performance metrics for our model." + "The function in the next cell computes the uni-gram BLUE and ROUGE scores. It averages these scores over all the reference answers and those generated by our tuned model, giving scores that can serve as performance metrics for our model." ] }, { "cell_type": "code", - "execution_count": 56, + "execution_count": null, "metadata": {}, "outputs": [], "source": [ "def compute_scores(eval_data):\n", - " evaluator = SeqEval()\n", - " reference = eval_data.reference.tolist()\n", - " candidate = eval_data.candidate.tolist()\n", - " return evaluator.evaluate(reference, candidate, verbose=False)" + " predictions = eval_data.candidate.tolist()\n", + " references = eval_data.reference.tolist()\n", + " rouge_value = rouge.compute(predictions=predictions, references=references)[\n", + " \"rouge1\"\n", + " ]\n", + " bleu_value = blue.compute(predictions=predictions, references=references)[\n", + " \"bleu\"\n", + " ]\n", + " return {\"rouge\": rouge_value, \"blue\": bleu_value}" ] }, { "cell_type": "code", - "execution_count": 57, + "execution_count": null, "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "{'bleu_1': 0.11048626784959818,\n", - " 'bleu_2': 0.049981933337872736,\n", - " 'bleu_3': 0.029993136777644553,\n", - " 'bleu_4': 0.019459490228121507,\n", - " 'rouge_1_precision': 0.14123178063031,\n", - " 'rouge_1_recall': 0.24180495796915416,\n", - " 'rouge_1_f1': 0.15766779691203167,\n", - " 'rouge_2_precision': 0.03205610888282679,\n", - " 'rouge_2_recall': 0.057176760321351904,\n", - " 'rouge_2_f1': 0.03629933742380242,\n", - " 'rouge_l_precision': 0.12927872916655078,\n", - " 'rouge_l_recall': 0.22340454177548244,\n", - " 'rouge_l_f1': 0.14484886963077176,\n", - " 'inter_dist1': 0.0015513048048969688,\n", - " 'inter_dist2': 0.03906219163510438,\n", - " 'intra_dist1': 0.07823671070794777,\n", - " 'intra_dist2': 0.34727942840194825,\n", - " 'semantic_textual_similarity': 0.5892551027495285}" - ] - }, - "execution_count": 57, - "metadata": {}, - "output_type": "execute_result" - } - ], + "outputs": [], "source": [ "compute_scores(eval_df)" ] From 884d21e05fe1886a3dd44526025108b00a96fba2 Mon Sep 17 00:00:00 2001 From: BenoitDherin Date: Fri, 22 Sep 2023 17:48:49 +0000 Subject: [PATCH 7/7] precommit --- Makefile | 1 + .../solutions/vertex_llm_tuning.ipynb | 89 +++++-------------- requirements-without-deps.txt | 11 +++ requirements.txt | 2 +- 4 files changed, 33 insertions(+), 70 deletions(-) create mode 100644 requirements-without-deps.txt diff --git a/Makefile b/Makefile index 23b36076..02f27d51 100644 --- a/Makefile +++ b/Makefile @@ -32,6 +32,7 @@ install: @pip install --user -U pip @pip install --user "Cython<3" @pip install --user -r requirements.txt + @pip install --user --no-deps -r requirements-without-deps.txt @./scripts/setup_on_jupyterlab.sh @pre-commit install @sudo apt-get -y install graphviz diff --git a/notebooks/vertex_genai/solutions/vertex_llm_tuning.ipynb 
b/notebooks/vertex_genai/solutions/vertex_llm_tuning.ipynb index 204a5e6b..ec038808 100644 --- a/notebooks/vertex_genai/solutions/vertex_llm_tuning.ipynb +++ b/notebooks/vertex_genai/solutions/vertex_llm_tuning.ipynb @@ -50,57 +50,6 @@ "## Setup" ] }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "If you have your version of `google-cloud-aiplatform` is lower than `1.33.0`, please run the next cell:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "!pip install -q --upgrade --user google-cloud-aiplatform" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "We will also need the [evaluate library](https://github.com/huggingface/evaluate/tree/main) to assess the performance of our tuned model:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "!pip install -q --upgrade --user evaluate rouge-score" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "The next cell will now restart the kernel to load the previously installed libraties" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "import IPython\n", - "\n", - "app = IPython.Application.instance()\n", - "app.kernel.do_shutdown(True)" - ] - }, { "cell_type": "code", "execution_count": null, @@ -116,7 +65,7 @@ }, { "cell_type": "code", - "execution_count": null, + "execution_count": 2, "metadata": {}, "outputs": [], "source": [ @@ -131,7 +80,7 @@ }, { "cell_type": "code", - "execution_count": null, + "execution_count": 3, "metadata": {}, "outputs": [], "source": [ @@ -184,7 +133,7 @@ }, { "cell_type": "code", - "execution_count": null, + "execution_count": 4, "metadata": {}, "outputs": [], "source": [ @@ -207,7 +156,7 @@ }, { "cell_type": "code", - "execution_count": null, + "execution_count": 5, "metadata": { "id": "9VTaovLtaalI" }, @@ -233,7 +182,7 @@ }, { "cell_type": "code", - "execution_count": null, + "execution_count": 6, "metadata": { "id": "aXqBwSwaaalJ" }, @@ -401,7 +350,7 @@ }, { "cell_type": "code", - "execution_count": null, + "execution_count": 7, "metadata": {}, "outputs": [], "source": [ @@ -422,7 +371,7 @@ }, { "cell_type": "code", - "execution_count": null, + "execution_count": 8, "metadata": { "id": "j66dr12taalO" }, @@ -444,7 +393,7 @@ }, { "cell_type": "code", - "execution_count": null, + "execution_count": 9, "metadata": { "id": "2ERbfPJPaalO" }, @@ -474,7 +423,7 @@ "\n", "\n", "Among other metrics we will compute the following two metrics that provide crude measures albeit automated of how two texts may have the same meaning: \n", - "- The [BLEU](https://en.wikipedia.org/wiki/BLEU) evaluation metric is a sort of **precision** metric, measuring the proportion of $n$-grams in the generated sentence matching $n$-grams in the reference sentence. It goes from 0 to 1 with a higher score for more similar sentences. BLEU1 considers uni-grams only, while BLUE2 considers bi-grams. \n", + "- The [BLEU](https://en.wikipedia.org/wiki/BLEU) evaluation metric is a sort of **precision** metric, measuring the proportion of $n$-grams in the generated sentence matching $n$-grams in the reference sentence. It goes from 0 to 1 with a higher score for more similar sentences. BLEU1 considers uni-grams only, while BLEU2 considers bi-grams. 
\n", "\n", "- The [ROUGE](https://en.wikipedia.org/wiki/ROUGE_(metric)) evaluation metric is a sort of **recall** metric, measuring the proportion of $n$-grams in the reference sentence that are matched by $n$-grams in the generated sentence. It goes from 0 to 1 with a higher score for more similar sentences. ROUGE1 considers uni-grams only, while ROUGE2 considers bi-grams.\n", "\n", @@ -495,7 +444,7 @@ }, { "cell_type": "code", - "execution_count": null, + "execution_count": 10, "metadata": { "id": "LKMmIH0XaalO" }, @@ -518,7 +467,7 @@ }, { "cell_type": "code", - "execution_count": null, + "execution_count": 11, "metadata": {}, "outputs": [], "source": [ @@ -538,7 +487,7 @@ }, { "cell_type": "code", - "execution_count": null, + "execution_count": 12, "metadata": {}, "outputs": [], "source": [ @@ -547,7 +496,7 @@ }, { "cell_type": "code", - "execution_count": null, + "execution_count": 13, "metadata": {}, "outputs": [], "source": [ @@ -558,30 +507,32 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "The function in the next cell computes the uni-gram BLUE and ROUGE scores. It averages these scores over all the reference answers and those generated by our tuned model, giving scores that can serve as performance metrics for our model." + "The function in the next cell computes the uni-gram BLEU and ROUGE scores. It averages these scores over all the reference answers and those generated by our tuned model, giving scores that can serve as performance metrics for our model." ] }, { "cell_type": "code", - "execution_count": null, + "execution_count": 14, "metadata": {}, "outputs": [], "source": [ "def compute_scores(eval_data):\n", " predictions = eval_data.candidate.tolist()\n", " references = eval_data.reference.tolist()\n", + " rouge = evaluate.load(\"rouge\")\n", + " bleu = evaluate.load(\"bleu\")\n", " rouge_value = rouge.compute(predictions=predictions, references=references)[\n", " \"rouge1\"\n", " ]\n", - " bleu_value = blue.compute(predictions=predictions, references=references)[\n", + " bleu_value = bleu.compute(predictions=predictions, references=references)[\n", " \"bleu\"\n", " ]\n", - " return {\"rouge\": rouge_value, \"blue\": bleu_value}" + " return {\"rouge\": rouge_value, \"bleu\": bleu_value}" ] }, { "cell_type": "code", - "execution_count": null, + "execution_count": 15, "metadata": {}, "outputs": [], "source": [ diff --git a/requirements-without-deps.txt b/requirements-without-deps.txt new file mode 100644 index 00000000..65a4f59f --- /dev/null +++ b/requirements-without-deps.txt @@ -0,0 +1,11 @@ +# Frameworks to evaluate text generation +# as well as their dependencies +evaluate==0.4.0 +rouge-score==0.1.2 +nltk==3.8.1 +rouge-score==0.1.2 +datasets==2.14.5 +huggingface-hub==0.17.2 +multiprocess==0.70.15 +responses==0.18.0 +xxhash==3.3.0 diff --git a/requirements.txt b/requirements.txt index f6535f62..fb1c9aa4 100644 --- a/requirements.txt +++ b/requirements.txt @@ -1,5 +1,5 @@ # Requirements for asl-ml-immersion repository -google-cloud-aiplatform==1.26.0 +google-cloud-aiplatform==1.33.1 google-cloud-pipeline-components==1.0.44 pyyaml==5.3.1 kfp==1.8.22