
Rate Limit being hit consistently #123

Closed
cfieandres opened this issue May 17, 2024 · 9 comments
Labels
enhancement New feature or request good first issue Good for newcomers

Comments

@cfieandres

After updating to 1.0.0, Plandex has consistently been hitting OpenAI's rate limit, which forces me to run `plandex c` frequently to resume.
Is there a way to have Plandex wait until it is no longer rate limited and continue by itself?

```
🚨 Server error
→ Error starting reply stream
→ Error, status code: 429, message: Rate limit reached for gpt-4o in organization xxxxxxxxxx on tokens per min (TPM): Limit 30000, Used 23193, Requested 15447. Please try again in 17.28s. Visit https://platform.openai.com/account/rate-limits to learn more.
```

@danenania
Contributor

Parsing the rate limit error and waiting accordingly is a good idea. I'll look into it.

@benrosenblum

Worth noting that after you have spent $50 total on your account, the limit increases to 450k TPM.

@atljoseph

Same here. How can I tell the amount of usage by each model in the model set? I checked my OpenAI usage and nothing looked alarming there.

@atljoseph

atljoseph commented May 19, 2024

If you are running it locally, you can try changing the retry waitBackoff func to add 5 additional seconds for each numRetry.

I got a lot fewer rate limit errors that way. For heavier tasks, you might need 10 seconds.

Lastly, this might be a great use case for balancing between Anthropic and OpenAI "agents".

@danenania
Contributor

danenania commented May 19, 2024

@atljoseph I decreased the backoff a bit in the last release, so I may need to revert that or make it configurable. Or just parse the error message and wait accordingly, as @cfieandres suggested. My token limit is quite high from building/testing Plandex, so I'm not getting any of these errors. It's helpful to know what backoff is working for you at a lower limit; thanks.

@danenania danenania added enhancement New feature or request good first issue Good for newcomers labels May 19, 2024
@atljoseph

atljoseph commented May 19, 2024 via email

@atljoseph

atljoseph commented May 19, 2024

Well, it does sometimes hit this error. I noticed it after a few retried 429 logs in the server, immediately after the subsequent successful retry. No idea whether these events are connected.

I'm getting `listenStream - Stream chunk missing function call.` in the build line nums step, using GPT-4o. Maybe a network thing? IDK, but it burnt a few dollars LOL (it was a tall order I asked of it).

Then it leads to `Could not find replacement in original file` while viewing the changes, and then it exits. After that point, something goes wrong with my terminal and I can't see anything that is typed (though hitting enter still executes it). I've run into this at least 5 times.

@danenania
Contributor

As of server/1.0.1 (already deployed on cloud), when hitting OpenAI rate limits, Plandex will now parse error messages that include a recommended wait time and automatically wait that long before retrying, up to 30 seconds.

@appreciated
Contributor

@danenania Would you mind increasing the limit to 60 seconds?
I hit the limit about 5 times today with recommended waits of ~45 seconds. Or maybe allow it to be configured?
