feat: gpt-4 and gpt-4 32k services (#456)
* feat: gpt-4 and gpt-4 32k services

* fix: add to universal

* fix: add params
dilyararimovna authored May 12, 2023
1 parent b6ad4e0 commit 1f3ff38
Showing 11 changed files with 193 additions and 7 deletions.
16 changes: 9 additions & 7 deletions MODELS.md
@@ -2,10 +2,12 @@

Here you may find a list of models that are currently available for use in Generative Assistants.

| model name | container name | model link | open-source? | size (billion parameters) | GPU usage | max tokens (prompt + response) | description |
|---------------------------|--------------------------|----------------------------------------------------------------------|--------------------------|---------------------------|---------------------------|--------------------------------|---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| BLOOMZ 7B | transformers-lm-bloomz7b | [link](https://huggingface.co/bigscience/bloomz-7b1) | yes | 7.1B | 33GB | 2,048 tokens | An open-source multilingual instruction-based large language model (46 languages). NB: free of charge. This model is up and running on our servers and can be used for free. |
| GPT-J 6B | transformers-lm-gptj | [link](https://huggingface.co/EleutherAI/gpt-j-6b) | yes | 6B | 25GB | 2,048 tokens | An open-source English-only large language model which is NOT fine-tuned for instruction following and NOT capable of code generation. NB: free of charge. This model is up and running on our servers and can be used for free. |
| GPT-3.5                    | openai-api-davinci3      | [link](https://platform.openai.com/docs/models/gpt-3-5)              | no (paid access via API) | supposedly, 175B          | - (cannot be run locally) | 4,097 tokens                   | A multilingual instruction-based large language model which is capable of code generation. Unlike ChatGPT, not optimized for chat. NB: paid. You must provide your OpenAI API key to use the model. Your OpenAI account will be charged according to your usage.                                                                             |
| ChatGPT | openai-api-chatgpt | [link](https://platform.openai.com/docs/models/gpt-3-5) | no (paid access via API) | supposedly, 175B | - (cannot be run locally) | 4,096 tokens | Based on gpt-3.5-turbo -- the most capable of the entire GPT-3/GPT-3.5 models family. Optimized for chat. Able to understand and generate code. NB: paid. You must provide your OpenAI API key to use the model. Your OpenAI account will be charged according to your usage. |
| Open-Assistant SFT-1 12B | transformers-lm-oasst12b | [link](https://huggingface.co/OpenAssistant/oasst-sft-1-pythia-12b) | yes | 12B | 26GB (half-precision) | 5,120 tokens | An open-source English-only instruction-based large language model which is NOT good at answering math and coding questions. NB: free of charge. This model is up and running on our servers and can be used for free. |
| GPT-4 | openai-api-gpt4 | [link](https://platform.openai.com/docs/models/gpt-4) | no (paid access via API) | supposedly, 175B | - (cannot be run locally) | 8,192 tokens | A multilingual instruction-based large language model which is capable of code generation and other complex tasks. More capable than any GPT-3.5 model, able to do more complex tasks, and optimized for chat. NB: paid. You must provide your OpenAI API key to use the model. Your OpenAI account will be charged according to your usage. |
| GPT-4 32K                  | openai-api-gpt4-32k      | [link](https://platform.openai.com/docs/models/gpt-4)                | no (paid access via API) | supposedly, 175B          | - (cannot be run locally) | 32,768 tokens                  | A multilingual instruction-based large language model which is capable of code generation and other complex tasks. Same capabilities as the base gpt-4 model but with 4x the context length. NB: paid. You must provide your OpenAI API key to use the model. Your OpenAI account will be charged according to your usage.                   |
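
Both new rows are backed by the shared openai_api_lm service, which wraps OpenAI's chat completions API. For orientation, here is a minimal sketch of the underlying call using the pre-1.0 `openai` Python package that was current at the time of this commit; the prompt and generation parameters are illustrative only and are not the service's actual code.

```python
import os

import openai  # pre-1.0 interface, e.g. openai==0.27.x

# How credentials reach the service is handled elsewhere in the stack;
# reading them from environment variables here is an assumption of this sketch.
openai.api_key = os.environ["OPENAI_API_KEY"]
openai.organization = os.environ.get("OPENAI_ORGANIZATION")

response = openai.ChatCompletion.create(
    model="gpt-4",  # or "gpt-4-32k" for the 32,768-token context window
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Summarize the plot of Hamlet in two sentences."},
    ],
    max_tokens=256,
    temperature=0.7,
)
print(response["choices"][0]["message"]["content"])
```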
12 changes: 12 additions & 0 deletions assistant_dists/universal_prompted_assistant/dev.yml
@@ -54,6 +54,18 @@ services:
- "./common:/src/common"
ports:
- 8131:8131
openai-api-gpt4:
volumes:
- "./services/openai_api_lm:/src"
- "./common:/src/common"
ports:
- 8159:8159
openai-api-gpt4-32k:
volumes:
- "./services/openai_api_lm:/src"
- "./common:/src/common"
ports:
- 8160:8160
dff-universal-prompted-skill:
volumes:
- "./skills/dff_universal_prompted_skill:/src"
41 changes: 41 additions & 0 deletions assistant_dists/universal_prompted_assistant/docker-compose.override.yml
@@ -5,6 +5,7 @@ services:
WAIT_HOSTS: "sentseg:8011, ranking-based-response-selector:8002, combined-classification:8087,
sentence-ranker:8128,
transformers-lm-gptj:8130, transformers-lm-oasst12b:8158, openai-api-chatgpt:8145, openai-api-davinci3:8131,
openai-api-gpt4:8159, openai-api-gpt4-32k:8160,
dff-universal-prompted-skill:8147"
WAIT_HOSTS_TIMEOUT: ${WAIT_TIMEOUT:-1000}

@@ -164,6 +165,46 @@ services:
reservations:
memory: 100M

openai-api-gpt4:
env_file: [ .env ]
build:
args:
SERVICE_PORT: 8159
SERVICE_NAME: openai_api_gpt4
PRETRAINED_MODEL_NAME_OR_PATH: gpt-4
context: .
dockerfile: ./services/openai_api_lm/Dockerfile
command: flask run -h 0.0.0.0 -p 8159
environment:
- CUDA_VISIBLE_DEVICES=0
- FLASK_APP=server
deploy:
resources:
limits:
memory: 100M
reservations:
memory: 100M

openai-api-gpt4-32k:
env_file: [ .env ]
build:
args:
SERVICE_PORT: 8160
SERVICE_NAME: openai_api_gpt4_32k
PRETRAINED_MODEL_NAME_OR_PATH: gpt-4-32k
context: .
dockerfile: ./services/openai_api_lm/Dockerfile
command: flask run -h 0.0.0.0 -p 8160
environment:
- CUDA_VISIBLE_DEVICES=0
- FLASK_APP=server
deploy:
resources:
limits:
memory: 100M
reservations:
memory: 100M

dff-universal-prompted-skill:
env_file: [ .env ]
build:
28 changes: 28 additions & 0 deletions components/jkdhfgkhgodfiugpojwrnkjnlg.yml
@@ -0,0 +1,28 @@
name: openai_api_gpt4
display_name: GPT-4
component_type: Generative
model_type: NN-based
is_customizable: false
author: [email protected]
description: A multilingual instruction-based large language model
which is capable of code generation and other complex tasks.
More capable than any GPT-3.5 model, able to do more complex tasks,
and optimized for chat. Paid.
You must provide your OpenAI API key to use the model.
Your OpenAI account will be charged according to your usage.
ram_usage: 100M
gpu_usage: null
group: services
connector:
protocol: http
timeout: 20.0
url: http://openai-api-gpt4:8159/respond
dialog_formatter: null
response_formatter: null
previous_services: null
required_previous_services: null
state_manager_method: null
tags: null
endpoint: respond
service: services/openai_api_lm/service_configs/openai-api-gpt4
date_created: '2023-04-16T09:45:32'
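
The component's connector points at the service's /respond endpoint on port 8159 with a 20-second timeout. Below is a rough sketch of calling that endpoint directly from inside the Docker network; the /respond payload schema is not part of this diff, so every field name in the request body is an assumption rather than the service's documented interface.

```python
import requests

# Hypothetical payload: field names are guesses for illustration only.
payload = {
    "dialog_contexts": [["Hi!", "Hello! How can I help?", "Write a haiku about spring."]],
    "prompts": ["Respond like a friendly assistant."],
    "configs": [None],              # let the service fall back to its default generative config
    "openai_api_keys": ["sk-..."],  # placeholder, never commit a real key
    "openai_api_organizations": [None],
}
resp = requests.post("http://openai-api-gpt4:8159/respond", json=payload, timeout=20.0)
print(resp.json())
```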
27 changes: 27 additions & 0 deletions components/oinfjkrbnfmhkfsjdhfsd.yml
@@ -0,0 +1,27 @@
name: openai_api_gpt4_32k
display_name: GPT-4 32k
component_type: Generative
model_type: NN-based
is_customizable: false
author: [email protected]
description: A multilingual instruction-based large language model
which is capable of code generation and other complex tasks.
Same capabilities as the base gpt-4 model but with 4x the context length.
Paid. You must provide your OpenAI API key to use the model.
Your OpenAI account will be charged according to your usage.
ram_usage: 100M
gpu_usage: null
group: services
connector:
protocol: http
timeout: 20.0
url: http://openai-api-gpt4-32k:8160/respond
dialog_formatter: null
response_formatter: null
previous_services: null
required_previous_services: null
state_manager_method: null
tags: null
endpoint: respond
service: services/openai_api_lm/service_configs/openai-api-gpt4-32k
date_created: '2023-04-16T09:45:32'
2 changes: 2 additions & 0 deletions services/openai_api_lm/server.py
@@ -26,6 +26,8 @@
DEFAULT_CONFIGS = {
"text-davinci-003": json.load(open("generative_configs/openai-text-davinci-003.json", "r")),
"gpt-3.5-turbo": json.load(open("generative_configs/openai-chatgpt.json", "r")),
"gpt-4": json.load(open("generative_configs/openai-chatgpt.json", "r")),
"gpt-4-32k": json.load(open("generative_configs/openai-chatgpt.json", "r")),
}
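
The two new entries reuse the ChatGPT generative config, which is consistent with gpt-4 and gpt-4-32k being served through the same chat completions API as gpt-3.5-turbo. A minimal sketch of the kind of lookup this mapping supports follows; it is hypothetical, since the server's actual selection code is outside this hunk.

```python
import os

# Hypothetical fallback lookup: each container is built with
# PRETRAINED_MODEL_NAME_OR_PATH set to gpt-4 or gpt-4-32k (see the compose
# args), so a request carrying no explicit generation config could fall
# back to the matching entry in DEFAULT_CONFIGS.
model_name = os.environ.get("PRETRAINED_MODEL_NAME_OR_PATH", "gpt-4")
generation_config = DEFAULT_CONFIGS.get(model_name, DEFAULT_CONFIGS["gpt-3.5-turbo"])
```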


5 changes: 5 additions & 0 deletions services/openai_api_lm/service_configs/openai-api-gpt4-32k/environment.yml
@@ -0,0 +1,5 @@
SERVICE_PORT: 8160
SERVICE_NAME: openai_api_gpt4_32k
PRETRAINED_MODEL_NAME_OR_PATH: gpt-4-32k
CUDA_VISIBLE_DEVICES: '0'
FLASK_APP: server
31 changes: 31 additions & 0 deletions services/openai_api_lm/service_configs/openai-api-gpt4-32k/service.yml
@@ -0,0 +1,31 @@
name: openai-api-gpt4-32k
endpoints:
- respond
compose:
env_file:
- .env
build:
args:
SERVICE_PORT: 8160
SERVICE_NAME: openai_api_gpt4_32k
PRETRAINED_MODEL_NAME_OR_PATH: gpt-4-32k
CUDA_VISIBLE_DEVICES: '0'
FLASK_APP: server
context: .
dockerfile: ./services/openai_api_lm/Dockerfile
command: flask run -h 0.0.0.0 -p 8160
environment:
- CUDA_VISIBLE_DEVICES=0
- FLASK_APP=server
deploy:
resources:
limits:
memory: 100M
reservations:
memory: 100M
volumes:
- ./services/openai_api_lm:/src
- ./common:/src/common
ports:
- 8160:8160
proxy: null
5 changes: 5 additions & 0 deletions services/openai_api_lm/service_configs/openai-api-gpt4/environment.yml
@@ -0,0 +1,5 @@
SERVICE_PORT: 8159
SERVICE_NAME: openai_api_gpt4
PRETRAINED_MODEL_NAME_OR_PATH: gpt-4
CUDA_VISIBLE_DEVICES: '0'
FLASK_APP: server
31 changes: 31 additions & 0 deletions services/openai_api_lm/service_configs/openai-api-gpt4/service.yml
@@ -0,0 +1,31 @@
name: openai-api-gpt4
endpoints:
- respond
compose:
env_file:
- .env
build:
args:
SERVICE_PORT: 8159
SERVICE_NAME: openai_api_gpt4
PRETRAINED_MODEL_NAME_OR_PATH: gpt-4
CUDA_VISIBLE_DEVICES: '0'
FLASK_APP: server
context: .
dockerfile: ./services/openai_api_lm/Dockerfile
command: flask run -h 0.0.0.0 -p 8159
environment:
- CUDA_VISIBLE_DEVICES=0
- FLASK_APP=server
deploy:
resources:
limits:
memory: 100M
reservations:
memory: 100M
volumes:
- ./services/openai_api_lm:/src
- ./common:/src/common
ports:
- 8159:8159
proxy: null
2 changes: 2 additions & 0 deletions skills/dff_universal_prompted_skill/scenario/response.py
@@ -32,6 +32,8 @@
"http://transformers-lm-oasst12b:8158/respond": [],
"http://openai-api-chatgpt:8145/respond": ["OPENAI_API_KEY", "OPENAI_ORGANIZATION"],
"http://openai-api-davinci3:8131/respond": ["OPENAI_API_KEY", "OPENAI_ORGANIZATION"],
"http://openai-api-gpt4:8159/respond": ["OPENAI_API_KEY", "OPENAI_ORGANIZATION"],
"http://openai-api-gpt4-32k:8160/respond": ["OPENAI_API_KEY", "OPENAI_ORGANIZATION"],
}
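
The new entries tell the skill that both GPT-4 endpoints require the same OpenAI credentials as the ChatGPT and Davinci services. Below is a hedged sketch of how such a mapping is typically consumed; the dict's real name and the skill's helper functions are outside this hunk, so the names used here are stand-ins.

```python
import os

# Stand-in name: the dict's real identifier is not visible in this hunk.
ENVVARS_TO_SEND = {
    "http://openai-api-gpt4:8159/respond": ["OPENAI_API_KEY", "OPENAI_ORGANIZATION"],
    "http://openai-api-gpt4-32k:8160/respond": ["OPENAI_API_KEY", "OPENAI_ORGANIZATION"],
}


def collect_credentials(lm_service_url: str) -> dict:
    """Collect the environment variables a chosen LM service needs.

    Hypothetical helper: the skill's real function and payload keys are not shown here.
    """
    required = ENVVARS_TO_SEND.get(lm_service_url, [])
    missing = [var for var in required if not os.environ.get(var)]
    if missing:
        raise ValueError(f"Set {missing} in .env before using {lm_service_url}")
    return {var.lower(): os.environ[var] for var in required}
```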


