Incomplete response with streaming #251
Comments
Thanks for sharing on here! I don't have gpt-4 access and I don't know what your system prompt is; with gpt-3.5 I'm not able to reproduce this so far.
Are you able to print out all chunks and see if they are coming through correctly?
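For anyone else debugging this, here is a minimal, self-contained sketch of the kind of chunk logging being asked for. The chunk shape and the `proc { |chunk, _bytesize| ... }` callback signature are assumptions based on the snippets later in this thread; the fake chunks below stand in for a real API response.

```ruby
# Sketch: collect and print every streamed chunk so missing deltas are
# visible. The fake chunks simulate what the streaming callback receives;
# in real use this proc would be passed as the `stream:` parameter.
chunks = [
  { "choices" => [{ "delta" => { "content" => "Hel" } }] },
  { "choices" => [{ "delta" => { "content" => "lo" } }] },
  { "choices" => [{ "delta" => {} }] } # final chunk carries no content
]

received = []
stream = proc do |chunk, _bytesize|
  content = chunk.dig("choices", 0, "delta", "content")
  received << content if content
  puts content.inspect # log each raw delta to spot gaps in the stream
end

chunks.each { |c| stream.call(c, 0) }
```

Comparing the joined deltas (here `received.join`) against a non-streamed response for the same prompt makes it obvious whether chunks are being dropped before they reach your code.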
Hopefully you get access soon @alexrudall. In the meantime I've added a spec and tested this scenario in #252.
@rmontgomery429 thanks, that's really helpful! Seems like you couldn't reproduce the issue either?
@alexrudall Correct.
@deikka can you print all the chunks you're getting and share them here? We can't reproduce this, it's a strange one.
It's super strange... I'm receiving incomplete chunks. Perhaps there's an installed gem causing interference? This is how the response appears:

> According to applications innovative services human, public safety resource management,, and example, an provide information and contact analyze data from mobile and to understand behavior patterns include community linking mobile social.
Hey folks, I was just about to throw in the towel when I tried commenting out the Helicone part. Now everything is working like a charm!

```ruby
require "openai"

OpenAI.configure do |config|
  config.access_token = Rails.application.credentials.dig(:openai, :access_token)
  config.organization_id = Rails.application.credentials.dig(:openai, :organization_id)
  # config.uri_base = "https://oai.hconeai.com/" # Optional
  config.request_timeout = 240 # Optional
end
```
Legend, that's really good to know. Will try to reproduce this in a test and then we can feed back to Helicone if the issue is on their end. Or do you fancy doing it in your PR @rmontgomery429? 😎
@alexrudall my suggestion would be to merge that PR, given that it confirms GPT-4 streaming is working and there are no specs for that now, and to address any changes required to accommodate Helicone in a separate PR.
Maybe @chitalian @ScottMktn have some insight they can share that would help troubleshoot this issue. |
Never mind @alexrudall, I didn't realize that Helicone was in the README, so it's less orthogonal than I thought. I tried to reproduce the issue but to no avail. I've updated my PR to include those request specs as well.
Thanks for your efforts! Will try also :)
This is a simple test case. When I use streaming with the prompt "please output 1 to 100", the content returned by ruby-openai loses some data. To ensure the data is not overwritten incorrectly, I used Kredis lists to maintain ordering:

```ruby
def stream_proc(message:)
  answer = Kredis.list "message-#{message.id}"
  proc do |chunk, _bytesize|
    new_content = chunk.dig("choices", 0, "delta", "content")
    answer << new_content if new_content
    message.update(content: answer.elements.join("")) if new_content
  end
end
```
I can also contribute to "it's Helicone" from my experience of this bug. I'm using Alex's example from the README; commenting/uncommenting the `uri_base` line toggles the issue:

```ruby
OpenAI.configure do |config|
  # config.uri_base = "https://oai.hconeai.com/"
  config.access_token = ENV.fetch("OPENAI_ACCESS_TOKEN")
  config.organization_id = ENV.fetch("OPENAI_ORGANIZATION")
  config.extra_headers = {
    "Helicone-Auth" => "Bearer #{ENV.fetch("HELICONE_API_KEY")}",
    "Helicone-Cache-Enabled" => "true"
  }
end

task scratch: :environment do
  client = OpenAI::Client.new
  client.chat(
    parameters: {
      model: "gpt-3.5-turbo", # Required.
      messages: [{ role: "user", content: "Describe a character called Anna!" }], # Required.
      temperature: 0.7,
      stream: proc do |chunk, _bytesize|
        print chunk.dig("choices", 0, "delta", "content")
      end
    }
  )
end
```
Hi! I was able to recreate it locally and fixed it by setting this header: `"helicone-stream-force-format" => "true"`. Let me know if this works; I will look into a fix that does not require that header.
@colegottdank thanks, that's very helpful! @ScotterC does that work for you?
Yup. It works both in a small-scale test and in a larger implementation. I'd love to better understand what's happening here @colegottdank.
Hey family, adding the header works perfectly for me too. I think we can consider this issue resolved, don't you think?
It looks like the fix is here. Seems it queues chunks into an array? It seems like it either fixes or mostly fixes it. If you join the Helicone Discord and then go to this thread, you can see any further discussion that comes out of this or request updates. I've also added a note to our README to use this flag with Helicone for now. But yeah, our work here is done. Thanks for raising this @deikka + others, and for the fix @colegottdank!

```ruby
client = OpenAI::Client.new(
  access_token: "access_token_goes_here",
  uri_base: "https://oai.hconeai.com/",
  request_timeout: 240,
  extra_headers: {
    "Helicone-Auth": "Bearer HELICONE_API_KEY", # For https://docs.helicone.ai/getting-started/integration-method/openai-proxy
    "helicone-stream-force-format" => "true" # Use this with Helicone, otherwise streaming drops chunks: https://github.com/alexrudall/ruby-openai/issues/251
  }
)
```
The reason this issue manifests itself when accessing OpenAI through Helicone (or any other proxy-like intermediary) could be that the completion JSON chunks from OpenAI are being buffered/joined/split at non-JSON boundaries during transit. #332 should fix it.
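To illustrate that hypothesis, here is a hedged sketch (not the gem's actual parser): a handler that JSON-parses each network chunk as it arrives silently drops data when a proxy re-splits the stream mid-JSON, while one that buffers bytes and only parses complete `data:` events survives the re-chunking.

```ruby
require "json"

# Two complete server-sent events, as OpenAI would emit them.
full = %(data: {"content":"12345"}\n\ndata: {"content":"67890"}\n\n)

# A proxy re-chunks the stream at an arbitrary byte boundary, mid-JSON.
proxy_chunks = [full[0...15], full[15..]]

# Naive: treat every network chunk as containing complete "data: {json}" lines.
naive = []
proxy_chunks.each do |chunk|
  chunk.scan(/^data: (.*)$/).each do |(payload)|
    begin
      naive << JSON.parse(payload)["content"]
    rescue JSON::ParserError
      # Partial JSON is silently dropped -> incomplete response.
    end
  end
end

# Buffered: accumulate bytes and only parse once a full "\n\n"-terminated
# event is present (roughly what forcing the stream format restores).
buffer = +""
buffered = []
proxy_chunks.each do |chunk|
  buffer << chunk
  while (idx = buffer.index("\n\n"))
    event = buffer.slice!(0..idx + 1)
    if (m = event.match(/^data: (.*)$/))
      buffered << JSON.parse(m[1])["content"]
    end
  end
end
```

With the split above, the naive parser loses the entire first event while the buffered parser recovers both, which matches the "incomplete response" symptom in this thread.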
I have a chat system working perfectly. When I updated to version 4 of the gem, everything went well (it still works normally). However, when I add the `stream` option to the OpenAI API call, the content of the response is incomplete.
This code works normally:
However, when adding the streaming option:
Screenshot with responses:
Any clue as to why it could happen?