Incomplete response with streaming #251
Comments
Thanks for sharing on here! I don't have gpt-4 access and I don't know what your system prompt is; with gpt-3.5 I'm not able to reproduce this so far.
Are you able to print out all chunks and see if they are coming through correctly?
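For anyone else debugging this, here is a minimal, self-contained sketch of the kind of chunk logging being asked for. The chunk shape and the `proc { |chunk, _bytesize| ... }` callback signature are assumptions based on the snippets later in this thread; the fake chunks below stand in for a real API response.

```ruby
# Sketch: collect and print every streamed chunk so missing deltas are
# visible. The fake chunks simulate what the streaming callback receives;
# in real use this proc would be passed as the `stream:` parameter.
chunks = [
  { "choices" => [{ "delta" => { "content" => "Hel" } }] },
  { "choices" => [{ "delta" => { "content" => "lo" } }] },
  { "choices" => [{ "delta" => {} }] } # final chunk carries no content
]

received = []
stream = proc do |chunk, _bytesize|
  content = chunk.dig("choices", 0, "delta", "content")
  received << content if content
  puts content.inspect # log each raw delta to spot gaps in the stream
end

chunks.each { |c| stream.call(c, 0) }
```

Comparing the joined deltas (here `received.join`) against a non-streamed response for the same prompt makes it obvious whether chunks are being dropped before they reach your code.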
Hopefully you get access soon @alexrudall. In the meantime I've added a spec and tested this scenario in #252.
@rmontgomery429 thanks, that's really helpful! Seems like you couldn't reproduce the issue either?
@alexrudall Correct.
@deikka can you print all the chunks you're getting and share them here? We can't reproduce this, it's a strange one.
It's super strange... I'm receiving incomplete chunks. Perhaps there's an installed gem causing interference? This is how the response appears:

> According to applications innovative services human, public safety resource management,, and example, an provide information and contact analyze data from mobile and to understand behavior patterns include community linking mobile social.
Hey folks, I was just about to throw in the towel when I tried commenting out the Helicone part. Now everything is working like a charm!

```ruby
require "openai"

OpenAI.configure do |config|
  config.access_token = Rails.application.credentials.dig(:openai, :access_token)
  config.organization_id = Rails.application.credentials.dig(:openai, :organization_id)
  # config.uri_base = "https://oai.hconeai.com/" # Optional
  config.request_timeout = 240 # Optional
end
```
Legend, that's really good to know. Will try to reproduce this in a test and then we can feed back to Helicone if the issue is on their end. Or do you fancy doing it in your PR @rmontgomery429? 😎
@alexrudall my suggestion would be to merge that PR, given that it confirms GPT-4 streaming is working and there are no specs for that now, and to address any changes required to accommodate Helicone in a separate PR.
Maybe @chitalian @ScottMktn have some insight they can share that would help troubleshoot this issue. |
Never mind @alexrudall, I didn't realize that Helicone was in the README, so it's less orthogonal than I thought. I tried to reproduce the issue but to no avail. I've updated my PR to include those request specs as well.
Thanks for your efforts! Will try also :)
This is a simple test case. When I use streaming with the prompt "please output 1 to 100", the content returned by ruby-openai loses some data. To ensure the data is not overwritten incorrectly, I used Kredis lists to maintain ordering:

```ruby
def stream_proc(message:)
  answer = Kredis.list "message-#{message.id}"
  proc do |chunk, _bytesize|
    new_content = chunk.dig("choices", 0, "delta", "content")
    answer << new_content if new_content
    message.update(content: answer.elements.join("")) if new_content
  end
end
```
I can also contribute to "it's Helicone" from my experience of this bug. I'm using Alex's example from the README; commenting/uncommenting the `uri_base` line toggles the issue:

```ruby
OpenAI.configure do |config|
  # config.uri_base = "https://oai.hconeai.com/"
  config.access_token = ENV.fetch("OPENAI_ACCESS_TOKEN")
  config.organization_id = ENV.fetch("OPENAI_ORGANIZATION")
  config.extra_headers = {
    "Helicone-Auth" => "Bearer #{ENV.fetch("HELICONE_API_KEY")}",
    "Helicone-Cache-Enabled" => "true"
  }
end

task scratch: :environment do
  client = OpenAI::Client.new
  client.chat(
    parameters: {
      model: "gpt-3.5-turbo", # Required.
      messages: [{ role: "user", content: "Describe a character called Anna!" }], # Required.
      temperature: 0.7,
      stream: proc do |chunk, _bytesize|
        print chunk.dig("choices", 0, "delta", "content")
      end
    }
  )
end
```
Hi! I was able to recreate it locally and fixed it by setting this header: `"helicone-stream-force-format" => "true"`. Let me know if this works; I will look into a fix that does not require that header.
@colegottdank thanks, that's very helpful! @ScotterC does that work for you?
Yup. It works both in a small-scale test and in a larger implementation. I'd love to better understand what's happening here @colegottdank.
Hey family, adding the header works perfectly for me too. I think we can consider this issue resolved, don't you think?
It looks like the fix is here. Seems it queues chunks into an array? It seems like it either fixes or mostly fixes it. If you join the Helicone Discord and then go to this thread, you can see any further discussion that comes out of this or request updates. I've also added a note to our README to use this flag with Helicone for now. But yeah, our work here is done. Thanks for raising this @deikka + others, and for the fix @colegottdank!

```ruby
client = OpenAI::Client.new(
  access_token: "access_token_goes_here",
  uri_base: "https://oai.hconeai.com/",
  request_timeout: 240,
  extra_headers: {
    "Helicone-Auth": "Bearer HELICONE_API_KEY", # For https://docs.helicone.ai/getting-started/integration-method/openai-proxy
    "helicone-stream-force-format" => "true" # Use this with Helicone, otherwise streaming drops chunks: https://github.com/alexrudall/ruby-openai/issues/251
  }
)
```
The reason this issue manifests itself when accessing OpenAI through Helicone (or any other proxy-like intermediary) could be that the completion JSON chunks from OpenAI are being buffered/joined/split at non-JSON boundaries during transit. #332 should fix it.
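To illustrate that hypothesis, here is a hedged sketch (not the gem's actual parser): a handler that JSON-parses each network chunk as it arrives silently drops data when a proxy re-splits the stream mid-JSON, while one that buffers bytes and only parses complete `data:` events survives the re-chunking.

```ruby
require "json"

# Two complete server-sent events, as OpenAI would emit them.
full = %(data: {"content":"12345"}\n\ndata: {"content":"67890"}\n\n)

# A proxy re-chunks the stream at an arbitrary byte boundary, mid-JSON.
proxy_chunks = [full[0...15], full[15..]]

# Naive: treat every network chunk as containing complete "data: {json}" lines.
naive = []
proxy_chunks.each do |chunk|
  chunk.scan(/^data: (.*)$/).each do |(payload)|
    begin
      naive << JSON.parse(payload)["content"]
    rescue JSON::ParserError
      # Partial JSON is silently dropped -> incomplete response.
    end
  end
end

# Buffered: accumulate bytes and only parse once a full "\n\n"-terminated
# event is present (roughly what forcing the stream format restores).
buffer = +""
buffered = []
proxy_chunks.each do |chunk|
  buffer << chunk
  while (idx = buffer.index("\n\n"))
    event = buffer.slice!(0..idx + 1)
    if (m = event.match(/^data: (.*)$/))
      buffered << JSON.parse(m[1])["content"]
    end
  end
end
```

With the split above, the naive parser loses the entire first event while the buffered parser recovers both, which matches the "incomplete response" symptom in this thread.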
I have a chat system working perfectly. When I updated to version 4 of the gem, everything went well (it still works normally). However, when I add the `stream` option to the OpenAI API call, the content of the response is incomplete.
This code works normally:
However, when adding the streaming option:
Screenshot with responses:
Any clue as to why it could happen?