OpenAI v1 Chat Completions API #171

Merged 8 commits into main from chat-completions on Jan 10, 2024

Conversation

@tgaddair tgaddair commented Jan 10, 2024

Closes #145.

Usage:

Python:

from openai import OpenAI

openai_api_key = "EMPTY"
openai_api_base = "http://127.0.0.1:8080/v1"
client = OpenAI(
    api_key=openai_api_key,
    base_url=openai_api_base,
)

resp = client.chat.completions.create(
    model="",
    messages=[
        {
            "role": "system",
            "content": "You are a friendly chatbot who always responds in the style of a pirate",
        },
        {"role": "user", "content": "How many helicopters can a human eat in one sitting?"},
    ],
    max_tokens=100,
)
print(resp)
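To pull just the assistant's reply out of the response object, a small helper like this can be used (the helper name `reply_text` is hypothetical, not part of this PR; the attribute path follows the OpenAI v1 chat completion response shape):

```python
# Hypothetical helper (not part of this PR): extract the assistant's
# reply from an OpenAI v1 chat completion response object.
def reply_text(resp):
    # The generated text lives on the first choice's message.
    return resp.choices[0].message.content
```

Then `print(reply_text(resp))` prints only the generated text instead of the full response object.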

Streaming:

messages = client.chat.completions.create(
    model="",
    messages=[
        {
            "role": "system",
            "content": "You are a friendly chatbot who always responds in the style of a pirate",
        },
        {"role": "user", "content": "How many helicopters can a human eat in one sitting?"},
    ],
    max_tokens=100,
    stream=True,
)

for message in messages:
    print(message)
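Each streamed chunk carries an incremental delta rather than a full message. A minimal sketch for joining the deltas into one string (the helper name `collect_stream_text` is hypothetical; it assumes the OpenAI v1 streaming shape, where text arrives at `chunk.choices[0].delta.content` and may be `None`):

```python
# Hypothetical helper (not part of this PR): join the incremental text
# deltas from a streamed OpenAI v1 chat completion into one string.
def collect_stream_text(chunks):
    parts = []
    for chunk in chunks:
        # delta.content may be None (e.g. the initial role-only chunk).
        delta = chunk.choices[0].delta.content
        if delta:
            parts.append(delta)
    return "".join(parts)
```

Replacing the print loop above with `text = collect_stream_text(messages)` yields the full reply as a single string.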

REST:

curl http://127.0.0.1:8080/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{
  "model": "",
  "messages": [
  {
      "role": "system",
      "content": "You are a friendly chatbot who always responds in the style of a pirate"
  },
  {
      "role": "user",
      "content": "How many helicopters can a human eat in one sitting?"
  }
  ],
  "max_tokens": 100
}'
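The same request body can also be built in Python and posted with any HTTP client; a minimal sketch, assuming the server from the examples above is listening on 127.0.0.1:8080:

```python
import json

# The same request body as the curl call above.
payload = {
    "model": "",
    "messages": [
        {
            "role": "system",
            "content": "You are a friendly chatbot who always responds in the style of a pirate",
        },
        {"role": "user", "content": "How many helicopters can a human eat in one sitting?"},
    ],
    "max_tokens": 100,
}
body = json.dumps(payload)
# POST `body` to http://127.0.0.1:8080/v1/chat/completions with
# Content-Type: application/json (e.g. via urllib.request or requests).
```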

Finally, if the LoRA adapter has its own tokenizer and chat template, it will be used instead of the base model's chat template:

resp = client.chat.completions.create(
    model="alignment-handbook/zephyr-7b-dpo-lora",
    messages=[
        {
            "role": "system",
            "content": "You are a friendly chatbot who always responds in the style of a pirate",
        },
        {"role": "user", "content": "How many helicopters can a human eat in one sitting?"},
    ],
    max_tokens=100,
)
print("Response:", resp.choices[0].message.content)

@tgaddair tgaddair marked this pull request as ready for review January 10, 2024 06:08
@tgaddair tgaddair merged commit a90d443 into main Jan 10, 2024
1 check passed
@tgaddair tgaddair deleted the chat-completions branch January 10, 2024 18:35
@tgaddair tgaddair restored the chat-completions branch January 10, 2024 19:01
@tgaddair tgaddair deleted the chat-completions branch January 10, 2024 19:09
@prd-tuong-nguyen

Hi @tgaddair, does this work for local adapters?
With the original endpoint, I can pass {"adapter_source": "local"} to load an adapter from a local directory.
How can I do that via the OpenAI API?

Linked issue: OpenAI compatible API