Tuning Palm on Vertex AI #395
Conversation
`google-cloud-aiplatform` is included in requirements.txt, but newer versions are required? If so, please add information about the version requirement in the markdown. (Or I think it's OK to update requirements.txt itself if it doesn't affect other dependencies.)
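For illustration, a minimal version guard along these lines could sit at the top of the notebook; this is a sketch, with the `1.33.1` floor taken from the PR description below and the exact requirement otherwise an assumption:

```python
# Hypothetical guard: fail early if the installed SDK predates the tuning API.
# The minimum version below is assumed from the PR description.
from importlib.metadata import version
from packaging.version import Version  # `packaging` is assumed to be available

installed = Version(version("google-cloud-aiplatform"))
if installed < Version("1.33.1"):
    raise RuntimeError(
        f"google-cloud-aiplatform {installed} found; tuning needs >= 1.33.1. "
        "Run: pip install -U google-cloud-aiplatform"
    )
```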
And it seems `sequence-evaluate` is not very widely used (6 stars). How about switching to a more popular one like the Hugging Face `evaluate` library?

https://github.com/huggingface/evaluate/tree/main

https://medium.com/@sthanikamsanthosh1994/understanding-bleu-and-rouge-score-for-nlp-evaluation-1ab334ecadcb
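For reference, a minimal sketch of scoring BLEU and ROUGE with `evaluate` (the prediction/reference strings are placeholders):

```python
import evaluate  # pip install evaluate

predictions = ["the tuned model produced this answer"]         # placeholder output
references = [["the tuned model should produce this answer"]]  # placeholder target

bleu = evaluate.load("bleu")
print(bleu.compute(predictions=predictions, references=references))

rouge = evaluate.load("rouge")
# ROUGE accepts one reference string per prediction.
print(rouge.compute(predictions=predictions,
                    references=[r[0] for r in references]))
```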
Good catch! Done!
I think the `%%bigquery` cell magic is a simpler way to do the same thing:

`%%bigquery df`

https://notebook.community/GoogleCloudPlatform/python-docs-samples/notebooks/tutorials/bigquery/BigQuery%20query%20magic
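For illustration, after `%load_ext google.cloud.bigquery`, a cell like this runs the query and stores the result in a pandas DataFrame named `df` (the table name is a placeholder):

```python
%%bigquery df
SELECT input_text, output_text
FROM `my-project.my_dataset.my_table`  -- placeholder table
LIMIT 1000
```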
Great! Done!
Could you include an explanation of the key name conventions (`input_text` and `output_text`), since it seems they are fixed names?
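For reference, the tuning dataset is JSONL with exactly these keys, one example per line; a minimal sketch of writing such a file (the example strings are placeholders):

```python
import json

examples = [
    {
        "input_text": "Question: What is Vertex AI?",    # placeholder prompt
        "output_text": "A managed ML platform on GCP.",  # placeholder target
    },
]

# One JSON object per line, as the tuning job expects.
with open("tune_data.jsonl", "w") as f:
    for example in examples:
        f.write(json.dumps(example) + "\n")
```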
Done! I also added a description of the JSONL format needed to tune.
Maybe better to include the expected training time. (More than a couple of hours?)

Also, adding a note that the job runs on Vertex AI Pipelines would be helpful.
Good catch! For 100 steps it takes about 1 hour; for 500 steps, about 3:30 for me. Done!
It seems `us-central1` is also another option now (but it uses a different environment):

https://cloud.google.com/vertex-ai/docs/generative-ai/models/tune-text-models

Also, if the training is in `europe-west4`, maybe the bucket should be in the same region as a best practice? (Although the data is small in this case.)
Great! It seems to work with everything in us-central1. Done!
Unfortunately, the LLM tuning part of the pipeline just failed. It doesn't seem to work yet unless the tuning job is set to europe-west4 (as opposed to us-central1). This is certainly temporary, but I'll keep it this way for now.
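For context, a hedged sketch of how that region split looks with the SDK at the time (project, bucket path, and model version are placeholders; parameter names assume the 1.33.x `vertexai` API):

```python
import vertexai
from vertexai.language_models import TextGenerationModel

vertexai.init(project="my-project", location="us-central1")  # placeholder project

model = TextGenerationModel.from_pretrained("text-bison@001")
model.tune_model(
    training_data="gs://my-bucket/tune_data.jsonl",  # placeholder GCS path
    train_steps=100,
    tuning_job_location="europe-west4",  # the tuning pipeline only ran here at the time
    tuned_model_location="us-central1",  # the tuned model is served from here
)
```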
Blue -> BLEU (usually in upper case)

Rouge -> ROUGE

I think BLEU and ROUGE should be elaborated a bit more. The descriptions look almost the same.
Good catch! Done!
The same comment as the one at the top: I think it's better to use a more popular library (which will usually be well maintained).
Totally agree. Good catch. Done!
It seems the original notebook specifies 100 training steps, but do you think we should use 500? Given that tuning takes a lot of time, I think it's better to keep it small if that still gets a reasonable result.
Reverted to 100. Agreed. Done!
Tuning Vertex PaLM requires the newest SDK, so I bumped `google-cloud-aiplatform` from `1.26.0` to `1.33.1`. This has had no side effect on any of the already installed packages (see the `pip list` diff below).

The notebook also requires the `evaluate` library from Hugging Face to evaluate the tuned model's output. Installing it the standard way pulls in new packages (harmless), but it also upgrades the `dill` library to a version that's incompatible with the version Apache Beam requires, which is an issue. This holds for both the latest version of `evaluate` and the previous one, because `dill` is not pinned in their setup.py.

To solve that, I created a separate `requirements-without-deps.txt` that lets us install packages without installing or upgrading their dependencies (which also means we have to list those dependencies by hand in that same file). `make install` now installs both what's in `requirements.txt` (with dependencies) and what's in `requirements-without-deps.txt` (without dependencies).

To test the notebook, run `make install` from this branch. You shouldn't see any conflict.

The diff between our previous setup and the new one is given by the diff below: