Enable gemini context caching #3207

yeounoh · 2024-07-24T23:05:35Z

Why are these changes needed?

The Gemini model API introduced a new context caching feature that caches the prompt prefix. This PR enables this feature in GeminiClient to help reduce the cost of using the latest Gemini models. Note that this is a Gemini-specific feature used for caching the prompt prefix, not the agent's input and output.
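To illustrate why caching the prompt prefix can reduce cost, here is a minimal sketch of a cost estimate that splits prompt tokens into cached and uncached portions. It assumes cached tokens are billed at a lower rate than regular input tokens and ignores any cache storage charge. The names `HypotheticalRates` and `estimate_cost` and all prices are made up for illustration; they are not the code in this PR and not real Gemini pricing.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HypotheticalRates:
    """Illustrative per-million-token prices; NOT real Gemini pricing."""

    input_per_million: float = 1.00
    cached_input_per_million: float = 0.25
    output_per_million: float = 2.00


def estimate_cost(prompt_tokens, cached_tokens, output_tokens, rates=HypotheticalRates()):
    """Estimate one request's cost when part of the prompt comes from a cache."""
    if not 0 <= cached_tokens <= prompt_tokens:
        raise ValueError("cached_tokens must be between 0 and prompt_tokens")
    # Only the non-cached part of the prompt is billed at the full input rate.
    uncached_tokens = prompt_tokens - cached_tokens
    total = (
        uncached_tokens * rates.input_per_million
        + cached_tokens * rates.cached_input_per_million
        + output_tokens * rates.output_per_million
    )
    return total / 1_000_000


# Same request with and without a cached prefix (hypothetical numbers).
without_cache = estimate_cost(prompt_tokens=1_000_000, cached_tokens=0, output_tokens=0)
with_cache = estimate_cost(prompt_tokens=1_000_000, cached_tokens=800_000, output_tokens=0)
```

Under these assumed rates, the larger the shared prefix relative to the rest of the prompt, the larger the saving per request.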

Related issue number

Addresses/closes #3038

Checks

I've included any doc changes needed for https://microsoft.github.io/autogen/. See https://microsoft.github.io/autogen/docs/Contribute#documentation to build and test documentation locally.
I've added tests (if relevant) corresponding to the changes introduced in this PR.
I've made sure all auto checks have passed.

yeounoh · 2024-07-24T23:07:25Z

@yeounoh please read the following Contributor License Agreement(CLA). If you agree with the CLA, please reply with the following information.
@microsoft-github-policy-service agree [company="{your company}"]
Options:

(default - no company specified) I have sole ownership of intellectual property rights to my Submissions and I am not making Submissions in the course of work for my employer.
@microsoft-github-policy-service agree
(when company given) I am making Submissions in the course of work for my employer (or my employer has intellectual property rights in my Submissions by contract or applicable law). I have permission from my employer to make Submissions and enter into this Agreement on behalf of my employer. By signing below, the defined term “You” includes me and my employer.
@microsoft-github-policy-service agree company="Microsoft"
Contributor License Agreement

@microsoft-github-policy-service agree

codecov-commenter · 2024-07-26T01:04:36Z

Codecov Report

Attention: Patch coverage is 43.47826% with 26 lines in your changes missing coverage. Please review.

Project coverage is 13.82%. Comparing base (6279247) to head (37cd8b5).
Report is 47 commits behind head on main.

Files	Patch %	Lines
autogen/oai/gemini.py	43.47%	26 Missing ⚠️

Additional details and impacted files

@@             Coverage Diff             @@
##             main    #3207       +/-   ##
===========================================
- Coverage   32.90%   13.82%   -19.09%     
===========================================
  Files          94       97        +3     
  Lines       10235    10849      +614     
  Branches     2193     2488      +295     
===========================================
- Hits         3368     1500     -1868     
- Misses       6580     9313     +2733     
+ Partials      287       36      -251

Flag	Coverage Δ
unittests	`13.78% <43.47%> (-19.12%)`	⬇️

Flags with carried forward coverage won't be shown.

yeounoh · 2024-07-26T17:12:41Z

The build is failing because the google package is missing. I will add a notebook to demonstrate the usage before review.

BeibinLi · 2024-07-28T17:20:39Z

@yeounoh In your test case, you can move the "import" line into the existing "try...except..." clause.
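A minimal sketch of that pattern, assuming the missing dependency is the `google.generativeai` module: the import is guarded so the test module still loads when the package is absent, and the affected test is skipped instead of breaking the build. The names `skip_gemini`, `GeminiCacheTest`, and `test_sdk_is_importable` are hypothetical and not taken from this PR.

```python
import unittest

try:
    # Optional dependency: guarded so a missing package does not fail at import time.
    import google.generativeai as genai
except ImportError:
    genai = None

# True when the optional package is unavailable in this environment.
skip_gemini = genai is None


class GeminiCacheTest(unittest.TestCase):
    @unittest.skipIf(skip_gemini, "google package not installed")
    def test_sdk_is_importable(self):
        # Runs only when the import above succeeded.
        self.assertIsNotNone(genai)
```

With this layout, environments without the package report the test as skipped rather than as an error.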

ekzhu · 2024-10-01T04:50:06Z

@yeounoh Is this PR ready to be reviewed?

rysweet · 2024-10-12T00:56:11Z

Hi @yeounoh - we've rebased and updated this for you. There are still a couple of conflicts. If you think this is ready for review, please resolve the conflicts and then we will review.

yeounoh added 4 commits July 24, 2024 10:48

Refactoring GeminiClient.

7e38951

Add GeminiContextCache

49df911

Create gemini model with context caching

3207461

Add a unit test for cost calculation

491b67e

yeounoh marked this pull request as draft July 24, 2024 23:05

Add another unit test

d38df67

yeounoh force-pushed the main branch from 051f1b6 to d38df67 Compare July 25, 2024 07:17

yeounoh marked this pull request as ready for review July 25, 2024 07:18

Hk669 requested a review from BeibinLi July 25, 2024 13:26

sonichi requested review from HongleiZhuang and joshkyh July 25, 2024 18:04

sonichi and others added 3 commits July 25, 2024 11:09

Merge branch 'main' into main

11cc2e7

Linting

aef4231

Merge branch 'main' into main

37cd8b5

qingyun-wu mentioned this pull request Jul 26, 2024

[Roadmap]: Google Integrations #3147

Open

yeounoh marked this pull request as draft July 26, 2024 17:12

qingyun-wu had a problem deploying to openai1 August 25, 2024 01:00 — with GitHub Actions Failure

ekzhu changed the base branch from main to 0.2 October 2, 2024 18:27

jackgerrits added the 0.2 label (Issues which were filed before re-arch to 0.4) Oct 4, 2024

rysweet added the awaiting-op-response label (Issue or pr has been triaged or responded to and is now awaiting a reply from the original poster) Oct 10, 2024
