Add an example of importing a CSV to the tuning cookbook #293

Open
pcoet opened this issue Sep 17, 2024 · 2 comments
Labels
good first issue (Good for newcomers); status:triaged (Issue/PR triaged to the corresponding sub-team); type:feature request (New feature request/enhancement)

Comments

@pcoet
Member

pcoet commented Sep 17, 2024

Description of the feature request:

Add an example of importing a CSV using pandas.read_csv to the tuning cookbook: https://github.com/google-gemini/cookbook/blob/main/quickstarts/Tuning.ipynb

AI Studio accepts CSV files as datasets for tuning jobs. The tuning cookbook currently demonstrates creating a tuned model from a list of dictionaries. This is a request for an additional sample that imports a training dataset from a local CSV file using pandas.read_csv, mirroring the CSV import available in AI Studio.
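
To make the request concrete, here is a rough sketch of turning a local CSV into the list-of-dictionaries shape the notebook already uses. It assumes the CSV has columns named text_input and output (the column names, file layout, and both helper function names are illustrative assumptions, not taken from the notebook). The pandas variant reads every cell as a string so that numeric-looking examples such as "1" are not converted to integers.

```python
import csv


def load_examples_csv(path):
    """Read a tuning CSV with the standard library.

    Assumes (hypothetically) two columns named 'text_input' and 'output'.
    """
    with open(path, newline="", encoding="utf-8") as f:
        return [
            {"text_input": row["text_input"], "output": row["output"]}
            for row in csv.DictReader(f)
        ]


def load_examples_pandas(path):
    """Same result via pandas.read_csv; pandas is imported lazily."""
    import pandas as pd

    # dtype=str keeps values like "1" as text; keep_default_na=False
    # stops empty cells or strings such as "NA" from becoming NaN.
    df = pd.read_csv(path, dtype=str, keep_default_na=False)
    return df[["text_input", "output"]].to_dict(orient="records")
```

Either list could then be passed as training_data in place of the inline list of dictionaries shown in the notebook.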

What problem are you trying to solve with this feature?

Help users understand how to tune a model using data from a CSV.

Any other information you'd like to share?

No response

@pcoet
Member Author

pcoet commented Sep 17, 2024

This could also be a new notebook based on the existing tuning notebook.

@markmcd
Member

markmcd commented Sep 18, 2024

I don't think this necessarily needs a new guide, at least not for Python. genai.create_tuned_model() can take a CSV file handle or string filename as the training_data arg.

So this

genai.create_tuned_model(
    training_data=[
        { ... }, { ... }, etc
    ], ...
)

becomes

genai.create_tuned_model(
    training_data='my_file.csv', ...
)

It supports JSON files too.
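
A slightly fuller sketch of the filename approach, for anyone picking this up. It assumes the CSV header uses text_input and output, which I believe are the defaults for the input_key and output_key parameters; check this against the SDK before relying on it. The helper names check_tuning_csv and tune_from_csv are hypothetical, and the source model name and tuned model ID are left as parameters because they depend on what your account can tune.

```python
import csv
from pathlib import Path

# Assumed default column names (input_key / output_key); verify in the SDK.
REQUIRED_COLUMNS = ("text_input", "output")


def check_tuning_csv(path, required=REQUIRED_COLUMNS):
    """Return the number of data rows; raise ValueError on a missing column."""
    with Path(path).open(newline="", encoding="utf-8") as f:
        reader = csv.DictReader(f)
        missing = [c for c in required if c not in (reader.fieldnames or [])]
        if missing:
            raise ValueError(f"{path} is missing columns: {missing}")
        return sum(1 for _ in reader)


def tune_from_csv(csv_path, source_model, tuned_model_id):
    """Start a tuning job straight from a CSV filename.

    Requires the google-generativeai package and a prior
    genai.configure(api_key=...) call; imported lazily so the
    validation helper above works without it.
    """
    import google.generativeai as genai

    check_tuning_csv(csv_path)
    return genai.create_tuned_model(
        source_model=source_model,      # caller-supplied tunable base model
        training_data=str(csv_path),    # filename instead of a list of dicts
        id=tuned_model_id,
    )
```

Validating the header first is optional; it just fails fast locally instead of after the request is sent.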

@markmcd added the good first issue, type:feature request, and status:triaged labels on Sep 18, 2024
2 participants