feat: Update OpenAI spec to include image url in message content #113

bhimrazy · 2024-05-23T17:26:24Z

Before submitting

Was this discussed/agreed via a Github issue? (no need for typos and docs improvements)
Did you read the contributor guideline, Pull Request section?
Did you make sure to update the docs?
Did you write any new necessary tests?

What does this PR do?

Fixes #107.

Add support for images to be included in chat messages, similar to gpt-4o.

lantiga · 2024-05-23T17:39:24Z

Thanks for the PR @bhimrazy!
Can you add a test and a minimal end to end example on the readme?

williamFalcon · 2024-05-23T18:49:09Z

@bhimrazy sick! super excited to try this.

@lantiga do we have a guide or something to show how to add the test and example?

lantiga · 2024-05-23T19:10:51Z

no, good point /cc @aniketmaurya

@bhimrazy for now you can take inspiration from:

https://github.com/Lightning-AI/LitServe/blob/main/tests/test_specs.py for unit tests
https://github.com/Lightning-AI/LitServe/blob/main/tests/e2e/test_e2e.py#L90 for the end to end test
https://github.com/Lightning-AI/LitServe/blob/main/README.md?plain=1#L548 for the README examples

aniketmaurya · 2024-05-23T19:15:38Z

thank you for the PR @bhimrazy! as Luca mentioned, you can take inspiration from the existing LitSpec test cases.

Maybe you can try sending a request with image content to the server and check that it parses the request without breaking:

{
  "model": "lit",
  "messages": [
    {
      "role": "user",
      "content": [
        {"type": "text", "text": "What's in this image?"},
        {
          "type": "image_url",
          "image_url": "https://upload.wikimedia.org/wikipedia/commons/thumb/d/dd/Gfp-wisconsin-madison-the-nature-boardwalk.jpg/2560px-Gfp-wisconsin-madison-the-nature-boardwalk.jpg"
        }
      ]
    }
  ]
}
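
The parsing check suggested above can be tried without a running server by validating the payload shape directly. The sketch below uses only the standard library. `check_messages` is a hypothetical helper, not part of LitServe, and it assumes `image_url` is a plain string as in the payload above. The image URL is a placeholder.

```python
def check_messages(messages):
    """Raise ValueError unless every message has a role and a supported content shape."""
    for i, message in enumerate(messages):
        if "role" not in message:
            raise ValueError(f"message {i}: missing role")
        content = message.get("content")
        if isinstance(content, str):
            continue  # plain text message
        if not isinstance(content, list):
            raise ValueError(f"message {i}: content must be a string or a list")
        for part in content:
            kind = part.get("type") if isinstance(part, dict) else None
            if kind == "text" and isinstance(part.get("text"), str):
                continue
            if kind == "image_url" and isinstance(part.get("image_url"), str):
                continue
            raise ValueError(f"message {i}: unsupported content part {part!r}")
    return True


payload = {
    "model": "lit",
    "messages": [
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "What's in this image?"},
                # Placeholder URL, used for illustration only.
                {"type": "image_url", "image_url": "https://example.com/boardwalk.jpg"},
            ],
        }
    ],
}

check_messages(payload["messages"])
```

A real test would still post the payload to the server and assert on the status code. This only pins down the shape that request is expected to have.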

bhimrazy · 2024-05-23T19:30:39Z

Thanks, @lantiga, @williamFalcon, and @aniketmaurya for all the feedback.
I will go through the given examples and add the test cases and e2e example.

aniketmaurya · 2024-05-23T19:38:08Z

awesome @bhimrazy!! don't hesitate to reach out if you need any help.

bhimrazy · 2024-05-24T09:22:18Z

Hi @aniketmaurya @lantiga, could you please help me with adding the end-to-end documentation to the README?
I have prepared a draft version of it, included below.

LitServe's OpenAISpec can also handle images in the input. Below is an example of how to set this up.

import litserve as ls
from litserve.specs.openai import ChatMessage

class OpenAISpecLitAPI(ls.LitAPI):
    def setup(self, device):
        self.model = None

    def predict(self, x):
        yield {"role": "assistant", "content": "This is a generated output"}

    def encode_response(self, output: dict) -> ChatMessage:
        yield ChatMessage(role="assistant", content="This is a custom encoded output")


if __name__ == "__main__":
    server = ls.LitServer(OpenAISpecLitAPI(), spec=ls.OpenAISpec())
    server.run(port=8000)

In this case, predict is expected to take an input with the following shape:

Text Input Example:

[{"role": "system", "content": "You are a helpful assistant."},
 {"role": "user", "content": "Hello there"},
 {"role": "assistant", "content": "Hello, how can I help?"},
 {"role": "user", "content": "What is the capital of Australia?"}]

Mixed Text and Image Input Example:

[{"role": "system", "content": "You are a helpful assistant."},
 {"role": "user",
  "content": [
      {"type": "text", "text": "What's in this image?"},
      {"type": "image_url",
       "image_url": "https://upload.wikimedia.org/wikipedia/commons/thumb/d/dd/Gfp-wisconsin-madison-the-nature-boardwalk.jpg/2560px-Gfp-wisconsin-madison-the-nature-boardwalk.jpg"},
  ]},
 {"role": "assistant", "content": "A wooden boardwalk through a green field under a blue sky."},
 {"role": "user", "content": "How is the weather depicted in the image?"}]
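
Since `content` can be either a plain string or a list of parts, code that consumes these messages has to handle both shapes. The sketch below shows one way to normalize them, assuming the messages arrive as plain dictionaries shaped like the two examples above. `split_content` is a hypothetical helper, not part of LitServe, and the image URL is a placeholder.

```python
def split_content(content):
    """Return (text, image_urls) for a message's content, whether string or list of parts."""
    if isinstance(content, str):
        return content, []
    texts, image_urls = [], []
    for part in content:
        # Select parts by their "type" field rather than by position in the list.
        if part.get("type") == "text":
            texts.append(part["text"])
        elif part.get("type") == "image_url":
            image_urls.append(part["image_url"])
    return " ".join(texts), image_urls


messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {
        "role": "user",
        "content": [
            {"type": "text", "text": "What's in this image?"},
            # Placeholder URL, used for illustration only.
            {"type": "image_url", "image_url": "https://example.com/boardwalk.jpg"},
        ],
    },
]

for message in messages:
    text, image_urls = split_content(message["content"])
    print(message["role"], "|", text, "|", len(image_urls), "image(s)")
```

Selecting parts by `type` keeps the code working when a message carries several images or puts the image before the text.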

The above server can be queried with a plain HTTP request, here using requests:

import requests

response = requests.post("http://127.0.0.1:8000/v1/chat/completions", json={
    "model": "my-gpt2",
    "stream": False,  # Set to True to stream the response in chunks
    "messages": [
        {
            "role": "system",
            "content": "You are a helpful assistant."
        },
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "What's in this image?"},
                {
                    "type": "image_url",
                    "image_url": "https://upload.wikimedia.org/wikipedia/commons/thumb/d/dd/Gfp-wisconsin-madison-the-nature-boardwalk.jpg/2560px-Gfp-wisconsin-madison-the-nature-boardwalk.jpg",
                },
            ]
        }
    ]
})

aniketmaurya · 2024-05-24T10:19:59Z

looks good @bhimrazy! would be nice if you could show that the image content can be processed in the predict or decode_request step. For example:

    def predict(self, x):
        # x is the list of chat messages from the draft above; inspect the most recent one
        content = x[-1]["content"]
        if isinstance(content, list):
            # do something with the image url
            image_url = content[1]["image_url"]
            yield {"role": "assistant", "content": "the image describes nature and bla bla..."}
        else:
            yield {"role": "assistant", "content": "This is a generated output."}

bhimrazy · 2024-05-24T11:17:42Z

Sure, thanks!

bhimrazy · 2024-05-24T17:58:14Z

Hi @lantiga, the PR is ready for review.
Thank you!

lantiga · 2024-05-24T20:06:02Z

Awesome @bhimrazy, reviewing now!

lantiga

Looks great!

src/litserve/specs/openai.py

README.md

lantiga · 2024-05-24T20:48:31Z

Awesome job @bhimrazy, let's see what CI thinks and then we're ready to merge!

lantiga · 2024-05-24T20:58:05Z

Let's goo! Merged 🚀

williamFalcon · 2024-05-25T00:33:31Z

congrats @bhimrazy!
solid contribution

feat: Update OpenAI spec to include image url in message content

a1a6f07

bhimrazy requested a review from lantiga as a code owner May 23, 2024 17:26

bhimrazy added 5 commits May 24, 2024 12:56

Merge branch 'main' into feat/add-image-input-support-in-openai-spec

bc6f7b9

feat: Add fixture with support for image URL in OpenAI message content

74e2aa9

feat: Add test for OpenAI spec with image input

0bc9bbb

feat: Add test for OpenAI spec with image input

fcf20ca

fix: update model name

491d425

feat: Adds e2e example for the OpenAI spec with image input

2f27ef8

lantiga approved these changes May 24, 2024

lantiga added 6 commits May 24, 2024 16:47

Update src/litserve/specs/openai.py

f00b54d

Update src/litserve/specs/openai.py

f942839

Update src/litserve/specs/openai.py

f962bea

Update README.md

c5d5ca4

Update README.md

ec035cc

Update README.md

258147d

lantiga merged commit 0e8915c into Lightning-AI:main May 24, 2024
17 checks passed

bhimrazy deleted the feat/add-image-input-support-in-openai-spec branch May 25, 2024 01:51
