fix: increased max_tokens to avoid truncating responses #87

Conversation

@gentlementlegen (Member) commented Aug 20, 2024

@gentlementlegen marked this pull request as ready for review August 20, 2024 01:57
@gentlementlegen marked this pull request as draft August 20, 2024 02:00
@gentlementlegen marked this pull request as ready for review August 20, 2024 02:05
```typescript
async _evaluateComments(
  specification: string,
  comments: { id: number; comment: string }[]
): Promise<RelevancesByOpenAi> {
  const prompt = this._generatePrompt(specification, comments);
  const maxTokens = await this._calculateMaxTokens(prompt);
```

Member commented:

You need to make sure to also include a buffer on top, because the number of tokens you send in is always less than the total that ends up being used.

What comes back has additional tokens that count against the quota of the whole transaction. Basically, the JSON response itself costs additional tokens.
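
Something along these lines, as a rough sketch only (not code from this PR): it assumes the tiktoken npm package is used for counting, and the MODEL_CONTEXT_WINDOW and RESPONSE_BUFFER names and values are illustrative placeholders.

```typescript
import { get_encoding } from "tiktoken";

const MODEL_CONTEXT_WINDOW = 4096; // budget shared by prompt and completion (assumed)
const RESPONSE_BUFFER = 512; // rough allowance for the JSON that comes back (assumed)

function calculateMaxTokensWithBuffer(prompt: string): number {
  const encoder = get_encoding("cl100k_base");
  const promptTokens = encoder.encode(prompt).length;
  encoder.free();
  // Leave headroom so the response itself does not blow past the quota.
  return Math.max(0, MODEL_CONTEXT_WINDOW - promptTokens - RESPONSE_BUFFER);
}
```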

Member Author (gentlementlegen) replied:

When you say "include a buffer on top", does that mean making max_tokens smaller than it currently is?

Member replied:

If I am understanding your code correctly, this reminds me of my first go-around at the implementation. If I recall correctly, I counted the tokens of what I was sending out. However, when you tell ChatGPT that we are using 1000 tokens max, its response also counts against that token limit when it replies. We can estimate how many tokens the response will add, but I don't recall a way to know exactly how many.

Remember that billing covers what comes out as well, not just what goes in, which is why these limits make sense for most ChatGPT use cases. You will need to estimate how many extra tokens will be included in the output, possibly by counting the number of comments and tokenizing a dummy array.

Member Author (gentlementlegen) replied:

I checked v1 and it seems to be a fixed number that lives in the configuration:
https://github.com/ubiquity/ubiquibot/blob/626cae8fcf5b60ec1ccb49df8dd2f250df046ff9/src/types/configuration-types.ts#L117
From what I understand, max_tokens reflects the model supporting up to 4096 tokens, which indeed covers the prompt + response. So the bigger the prompt, the fewer tokens remain available for the response.

So the calculation is: 4096 - prompt_tokens = max_tokens.

Technically, this argument can be omitted, in which case it defaults to the maximum all the time.

https://community.openai.com/t/can-i-set-max-tokens-for-chatgpt-turbo/81207
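
For example, roughly like this; it is only a sketch, assuming the official openai npm package, and the MODEL_TOKEN_LIMIT constant and evaluate helper are placeholders rather than this plugin's actual code.

```typescript
import OpenAI from "openai";

const openai = new OpenAI();
const MODEL_TOKEN_LIMIT = 4096; // assumed context limit from the discussion above

async function evaluate(prompt: string, promptTokens: number) {
  return openai.chat.completions.create({
    model: "gpt-4o-2024-08-06",
    messages: [{ role: "user", content: prompt }],
    // 4096 - prompt_tokens = max_tokens; omitting the field instead lets the
    // API use whatever remains of the context window for the completion.
    max_tokens: MODEL_TOKEN_LIMIT - promptTokens,
    response_format: { type: "json_object" },
  });
}
```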

Member replied:

> I checked v1 and it seems to be a fixed number that lives in the configuration: https://github.com/ubiquity/ubiquibot/blob/626cae8fcf5b60ec1ccb49df8dd2f250df046ff9/src/types/configuration-types.ts#L117 From what I understand, max_tokens reflects the model supporting up to 4096 tokens, which indeed covers the prompt + response. So the bigger the prompt, the fewer tokens remain available for the response.
>
> So the calculation is: 4096 - prompt_tokens = max_tokens.
>
> Technically, this argument can be omitted, in which case it defaults to the maximum all the time.
>
> https://community.openai.com/t/can-i-set-max-tokens-for-chatgpt-turbo/81207

My concern is briefly discussed there. It has been a long time since I worked on this problem, and the models are always evolving, so I am not confident in my knowledge. However, if we set the max length, the model may try to pad its reply to fill the full response. I would be most confident if we pre-calculated the token length by encoding a dummy array, unless there is definitive proof that setting the max length will not cause problems.

> However, when you tell ChatGPT that we are using 1000 tokens max, its response also counts against that token limit when it replies.

I realize I didn't quite finish explaining clearly, but we start with 1000 in this example, and the result might be 1200 (200 in the response).

Member Author (gentlementlegen) replied:

If we undershoot, the result will be truncated and the error will occur again. Should we just omit this parameter so it is always maxed out? I do not know how the padding occurs either, and I do not know what the best solution is. According to ChatGPT itself, this is the way to calculate max_tokens.

@0x4007 (Member) replied Aug 20, 2024:

Generate a dummy response and encode it to estimate the length of the response. I am not sure what the maximum number of significant digits per element in the array is, but you should estimate with a worst-case scenario.
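
Something like this rough sketch, again assuming tiktoken; the dummy shape (comment id keys mapped to relevance scores) and the worst-case precision are assumptions, not taken from this PR.

```typescript
import { get_encoding } from "tiktoken";

// Build a worst-case dummy response (one entry per comment, full-precision
// scores) and tokenize it to estimate how many output tokens to reserve.
function estimateResponseTokens(commentIds: number[]): number {
  const dummy: Record<string, number> = {};
  for (const id of commentIds) {
    // Assume every relevance score uses the longest plausible representation.
    dummy[String(id)] = 0.123456789;
  }
  const encoder = get_encoding("cl100k_base");
  const tokens = encoder.encode(JSON.stringify(dummy)).length;
  encoder.free();
  return tokens;
}
```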

@gentlementlegen marked this pull request as draft August 20, 2024 07:54
@gentlementlegen marked this pull request as ready for review August 20, 2024 15:21

@0x4007 (Member) left a review:

Definitely use gpt-4o-2024-08-06

Two review threads on src/parser/content-evaluator-module.ts were marked outdated and resolved.

@gentlementlegen merged commit 31cb560 into ubiquity-os-marketplace:development Aug 20, 2024
6 checks passed
@gentlementlegen deleted the fix/truncated-response branch August 20, 2024 16:34

@Keyrxng (Member) commented Aug 20, 2024:

> Definitely use gpt-4o-2024-08-06

I'm certain that the way they operate, the base model name always points to the most up-to-date version, i.e. gpt-4o is whatever the latest version is. This was the case when I was working on AI tasks a while back, and the docs confirm it is still the case, but it has been a while, so this may be outdated.

> Continuous model upgrades
> gpt-4o, gpt-4o-mini, gpt-4-turbo, gpt-4, and gpt-3.5-turbo point to their respective latest model version.

@gentlementlegen (Member Author) replied:

@Keyrxng I think this is true only if you update the package as well.

@Keyrxng (Member) commented Aug 21, 2024:

> @Keyrxng I think this is true only if you update the package as well.

I was also going to suggest using latest for the package, but that might be too dangerous; I feel they move pretty quickly over there and break things often.

Successfully merging this pull request may close these issues: Unexpected end of JSON input error