diff --git a/MODELS.md b/MODELS.md
index cff49d98ac..6e69e8ad9a 100644
--- a/MODELS.md
+++ b/MODELS.md
@@ -2,10 +2,12 @@
 Here you may find a list of models that currently available for use in Generative Assistants.
 
-| model name | container name | model link | open-source? | size (billion parameters) | GPU usage | max tokens (prompt + response) | description |
-|------------|----------------|------------|--------------|---------------------------|-----------|--------------------------------|-------------|
-| BLOOMZ 7B | transformers-lm-bloomz7b | [link](https://huggingface.co/bigscience/bloomz-7b1) | yes | 7.1B | 33GB | 2,048 tokens | An open-source multilingual instruction-based large language model (46 languages). NB: free of charge. This model is up and running on our servers and can be used for free. |
-| GPT-J 6B | transformers-lm-gptj | [link](https://huggingface.co/EleutherAI/gpt-j-6b) | yes | 6B | 25GB | 2,048 tokens | An open-source English-only large language model which is NOT fine-tuned for instruction following and NOT capable of code generation. NB: free of charge. This model is up and running on our servers and can be used for free. |
-| GPT-3.5 | openai-api-davinci3 | [link](https://platform.openai.com/docs/models/gpt-3-5) | no (paid access via API) | supposedly, 175B | - (cannot be run locally) | 4,097 tokens | A multulingual instruction-based large language model which is capable of code generation. Unlike ChatGPT, not optimised for chat. NB: paid. You must provide your OpenAI API key to use the model. Your OpenAI account will be charged according to your usage. |
-| ChatGPT | openai-api-chatgpt | [link](https://platform.openai.com/docs/models/gpt-3-5) | no (paid access via API) | supposedly, 175B | - (cannot be run locally) | 4,096 tokens | Based on gpt-3.5-turbo -- the most capable of the entire GPT-3/GPT-3.5 models family. Optimized for chat. Able to understand and generate code. NB: paid. You must provide your OpenAI API key to use the model. Your OpenAI account will be charged according to your usage. |
-| Open-Assistant SFT-1 12B | transformers-lm-oasst12b | [link](https://huggingface.co/OpenAssistant/oasst-sft-1-pythia-12b) | yes | 12B | 26GB (half-precision) | 5,120 tokens | An open-source English-only instruction-based large language model which is NOT good at answering math and coding questions. NB: free of charge. This model is up and running on our servers and can be used for free. |
+| model name | container name | model link | open-source? | size (billion parameters) | GPU usage | max tokens (prompt + response) | description |
+|------------|----------------|------------|--------------|---------------------------|-----------|--------------------------------|-------------|
+| BLOOMZ 7B | transformers-lm-bloomz7b | [link](https://huggingface.co/bigscience/bloomz-7b1) | yes | 7.1B | 33GB | 2,048 tokens | An open-source multilingual instruction-based large language model (46 languages). NB: free of charge. This model is up and running on our servers and can be used for free. |
+| GPT-J 6B | transformers-lm-gptj | [link](https://huggingface.co/EleutherAI/gpt-j-6b) | yes | 6B | 25GB | 2,048 tokens | An open-source English-only large language model which is NOT fine-tuned for instruction following and NOT capable of code generation. NB: free of charge. This model is up and running on our servers and can be used for free. |
+| GPT-3.5 | openai-api-davinci3 | [link](https://platform.openai.com/docs/models/gpt-3-5) | no (paid access via API) | supposedly, 175B | - (cannot be run locally) | 4,097 tokens | A multilingual instruction-based large language model which is capable of code generation. Unlike ChatGPT, not optimised for chat. NB: paid. You must provide your OpenAI API key to use the model. Your OpenAI account will be charged according to your usage. |
+| ChatGPT | openai-api-chatgpt | [link](https://platform.openai.com/docs/models/gpt-3-5) | no (paid access via API) | supposedly, 175B | - (cannot be run locally) | 4,096 tokens | Based on gpt-3.5-turbo -- the most capable of the entire GPT-3/GPT-3.5 models family. Optimized for chat. Able to understand and generate code. NB: paid. You must provide your OpenAI API key to use the model. Your OpenAI account will be charged according to your usage. |
+| Open-Assistant SFT-1 12B | transformers-lm-oasst12b | [link](https://huggingface.co/OpenAssistant/oasst-sft-1-pythia-12b) | yes | 12B | 26GB (half-precision) | 5,120 tokens | An open-source English-only instruction-based large language model which is NOT good at answering math and coding questions. NB: free of charge. This model is up and running on our servers and can be used for free. |
+| GPT-4 | openai-api-gpt4 | [link](https://platform.openai.com/docs/models/gpt-4) | no (paid access via API) | undisclosed | - (cannot be run locally) | 8,192 tokens | A multilingual instruction-based large language model which is capable of code generation and other complex tasks. More capable than any GPT-3.5 model, able to do more complex tasks, and optimized for chat. NB: paid. You must provide your OpenAI API key to use the model. Your OpenAI account will be charged according to your usage. |
+| GPT-4 32K | openai-api-gpt4-32k | [link](https://platform.openai.com/docs/models/gpt-4) | no (paid access via API) | undisclosed | - (cannot be run locally) | 32,768 tokens | A multilingual instruction-based large language model which is capable of code generation and other complex tasks. Same capabilities as the base gpt-4 model but with 4x the context length. NB: paid. You must provide your OpenAI API key to use the model. Your OpenAI account will be charged according to your usage. |
 
diff --git a/assistant_dists/universal_prompted_assistant/dev.yml b/assistant_dists/universal_prompted_assistant/dev.yml
index 6698e3d31d..57314e6a1a 100644
--- a/assistant_dists/universal_prompted_assistant/dev.yml
+++ b/assistant_dists/universal_prompted_assistant/dev.yml
@@ -54,6 +54,18 @@ services:
       - "./common:/src/common"
     ports:
       - 8131:8131
+  openai-api-gpt4:
+    volumes:
+      - "./services/openai_api_lm:/src"
+      - "./common:/src/common"
+    ports:
+      - 8159:8159
+  openai-api-gpt4-32k:
+    volumes:
+      - "./services/openai_api_lm:/src"
+      - "./common:/src/common"
+    ports:
+      - 8160:8160
   dff-universal-prompted-skill:
     volumes:
       - "./skills/dff_universal_prompted_skill:/src"
diff --git a/assistant_dists/universal_prompted_assistant/docker-compose.override.yml b/assistant_dists/universal_prompted_assistant/docker-compose.override.yml
index caa5a70fd1..686ec711db 100644
--- a/assistant_dists/universal_prompted_assistant/docker-compose.override.yml
+++ b/assistant_dists/universal_prompted_assistant/docker-compose.override.yml
@@ -5,6 +5,7 @@ services:
       WAIT_HOSTS: "sentseg:8011, ranking-based-response-selector:8002, combined-classification:8087,
         sentence-ranker:8128, transformers-lm-gptj:8130, transformers-lm-oasst12b:8158,
         openai-api-chatgpt:8145, openai-api-davinci3:8131,
+        openai-api-gpt4:8159, openai-api-gpt4-32k:8160,
         dff-universal-prompted-skill:8147"
       WAIT_HOSTS_TIMEOUT: ${WAIT_TIMEOUT:-1000}
@@ -164,6 +165,46 @@ services:
         reservations:
           memory: 100M
 
+  openai-api-gpt4:
+    env_file: [ .env ]
+    build:
+      args:
+        SERVICE_PORT: 8159
+        SERVICE_NAME: openai_api_gpt4
+        PRETRAINED_MODEL_NAME_OR_PATH: gpt-4
+      context: .
+      dockerfile: ./services/openai_api_lm/Dockerfile
+    command: flask run -h 0.0.0.0 -p 8159
+    environment:
+      - CUDA_VISIBLE_DEVICES=0
+      - FLASK_APP=server
+    deploy:
+      resources:
+        limits:
+          memory: 100M
+        reservations:
+          memory: 100M
+
+  openai-api-gpt4-32k:
+    env_file: [ .env ]
+    build:
+      args:
+        SERVICE_PORT: 8160
+        SERVICE_NAME: openai_api_gpt4_32k
+        PRETRAINED_MODEL_NAME_OR_PATH: gpt-4-32k
+      context: .
+      dockerfile: ./services/openai_api_lm/Dockerfile
+    command: flask run -h 0.0.0.0 -p 8160
+    environment:
+      - CUDA_VISIBLE_DEVICES=0
+      - FLASK_APP=server
+    deploy:
+      resources:
+        limits:
+          memory: 100M
+        reservations:
+          memory: 100M
+
   dff-universal-prompted-skill:
     env_file: [ .env ]
     build:
diff --git a/components/jkdhfgkhgodfiugpojwrnkjnlg.yml b/components/jkdhfgkhgodfiugpojwrnkjnlg.yml
new file mode 100644
index 0000000000..ce20bde833
--- /dev/null
+++ b/components/jkdhfgkhgodfiugpojwrnkjnlg.yml
@@ -0,0 +1,28 @@
+name: openai_api_gpt4
+display_name: GPT-4
+component_type: Generative
+model_type: NN-based
+is_customizable: false
+author: publisher@deeppavlov.ai
+description: A multilingual instruction-based large language model
+  which is capable of code generation and other complex tasks.
+  More capable than any GPT-3.5 model, able to do more complex tasks,
+  and optimized for chat. Paid.
+  You must provide your OpenAI API key to use the model.
+  Your OpenAI account will be charged according to your usage.
+ram_usage: 100M
+gpu_usage: null
+group: services
+connector:
+  protocol: http
+  timeout: 20.0
+  url: http://openai-api-gpt4:8159/respond
+dialog_formatter: null
+response_formatter: null
+previous_services: null
+required_previous_services: null
+state_manager_method: null
+tags: null
+endpoint: respond
+service: services/openai_api_lm/service_configs/openai-api-gpt4
+date_created: '2023-04-16T09:45:32'
diff --git a/components/oinfjkrbnfmhkfsjdhfsd.yml b/components/oinfjkrbnfmhkfsjdhfsd.yml
new file mode 100644
index 0000000000..0d5c44200e
--- /dev/null
+++ b/components/oinfjkrbnfmhkfsjdhfsd.yml
@@ -0,0 +1,27 @@
+name: openai_api_gpt4_32k
+display_name: GPT-4 32k
+component_type: Generative
+model_type: NN-based
+is_customizable: false
+author: publisher@deeppavlov.ai
+description: A multilingual instruction-based large language model
+  which is capable of code generation and other complex tasks.
+  Same capabilities as the base gpt-4 model but with 4x the context length.
+  Paid. You must provide your OpenAI API key to use the model.
+  Your OpenAI account will be charged according to your usage.
+ram_usage: 100M
+gpu_usage: null
+group: services
+connector:
+  protocol: http
+  timeout: 20.0
+  url: http://openai-api-gpt4-32k:8160/respond
+dialog_formatter: null
+response_formatter: null
+previous_services: null
+required_previous_services: null
+state_manager_method: null
+tags: null
+endpoint: respond
+service: services/openai_api_lm/service_configs/openai-api-gpt4-32k
+date_created: '2023-04-16T09:45:32'
diff --git a/services/openai_api_lm/server.py b/services/openai_api_lm/server.py
index ce8b4b7827..cc30279d1b 100644
--- a/services/openai_api_lm/server.py
+++ b/services/openai_api_lm/server.py
@@ -26,6 +26,8 @@
 DEFAULT_CONFIGS = {
     "text-davinci-003": json.load(open("generative_configs/openai-text-davinci-003.json", "r")),
     "gpt-3.5-turbo": json.load(open("generative_configs/openai-chatgpt.json", "r")),
+    "gpt-4": json.load(open("generative_configs/openai-chatgpt.json", "r")),
+    "gpt-4-32k": json.load(open("generative_configs/openai-chatgpt.json", "r")),
 }
diff --git a/services/openai_api_lm/service_configs/openai-api-gpt4-32k/environment.yml b/services/openai_api_lm/service_configs/openai-api-gpt4-32k/environment.yml
new file mode 100644
index 0000000000..ed4954db75
--- /dev/null
+++ b/services/openai_api_lm/service_configs/openai-api-gpt4-32k/environment.yml
@@ -0,0 +1,5 @@
+SERVICE_PORT: 8160
+SERVICE_NAME: openai_api_gpt4_32k
+PRETRAINED_MODEL_NAME_OR_PATH: gpt-4-32k
+CUDA_VISIBLE_DEVICES: '0'
+FLASK_APP: server
diff --git a/services/openai_api_lm/service_configs/openai-api-gpt4-32k/service.yml b/services/openai_api_lm/service_configs/openai-api-gpt4-32k/service.yml
new file mode 100644
index 0000000000..69a3b398c8
--- /dev/null
+++ b/services/openai_api_lm/service_configs/openai-api-gpt4-32k/service.yml
@@ -0,0 +1,31 @@
+name: openai-api-gpt4-32k
+endpoints:
+- respond
+compose:
+  env_file:
+  - .env
+  build:
+    args:
+      SERVICE_PORT: 8160
+      SERVICE_NAME: openai_api_gpt4_32k
+      PRETRAINED_MODEL_NAME_OR_PATH: gpt-4-32k
+      CUDA_VISIBLE_DEVICES: '0'
+      FLASK_APP: server
+    context: .
+    dockerfile: ./services/openai_api_lm/Dockerfile
+  command: flask run -h 0.0.0.0 -p 8160
+  environment:
+  - CUDA_VISIBLE_DEVICES=0
+  - FLASK_APP=server
+  deploy:
+    resources:
+      limits:
+        memory: 100M
+      reservations:
+        memory: 100M
+  volumes:
+  - ./services/openai_api_lm:/src
+  - ./common:/src/common
+  ports:
+  - 8160:8160
+proxy: null
diff --git a/services/openai_api_lm/service_configs/openai-api-gpt4/environment.yml b/services/openai_api_lm/service_configs/openai-api-gpt4/environment.yml
new file mode 100644
index 0000000000..f3cf8147a8
--- /dev/null
+++ b/services/openai_api_lm/service_configs/openai-api-gpt4/environment.yml
@@ -0,0 +1,5 @@
+SERVICE_PORT: 8159
+SERVICE_NAME: openai_api_gpt4
+PRETRAINED_MODEL_NAME_OR_PATH: gpt-4
+CUDA_VISIBLE_DEVICES: '0'
+FLASK_APP: server
diff --git a/services/openai_api_lm/service_configs/openai-api-gpt4/service.yml b/services/openai_api_lm/service_configs/openai-api-gpt4/service.yml
new file mode 100644
index 0000000000..898c870fbc
--- /dev/null
+++ b/services/openai_api_lm/service_configs/openai-api-gpt4/service.yml
@@ -0,0 +1,31 @@
+name: openai-api-gpt4
+endpoints:
+- respond
+compose:
+  env_file:
+  - .env
+  build:
+    args:
+      SERVICE_PORT: 8159
+      SERVICE_NAME: openai_api_gpt4
+      PRETRAINED_MODEL_NAME_OR_PATH: gpt-4
+      CUDA_VISIBLE_DEVICES: '0'
+      FLASK_APP: server
+    context: .
+    dockerfile: ./services/openai_api_lm/Dockerfile
+  command: flask run -h 0.0.0.0 -p 8159
+  environment:
+  - CUDA_VISIBLE_DEVICES=0
+  - FLASK_APP=server
+  deploy:
+    resources:
+      limits:
+        memory: 100M
+      reservations:
+        memory: 100M
+  volumes:
+  - ./services/openai_api_lm:/src
+  - ./common:/src/common
+  ports:
+  - 8159:8159
+proxy: null
diff --git a/skills/dff_universal_prompted_skill/scenario/response.py b/skills/dff_universal_prompted_skill/scenario/response.py
index ee7e20d18f..cddb3d82f8 100644
--- a/skills/dff_universal_prompted_skill/scenario/response.py
+++ b/skills/dff_universal_prompted_skill/scenario/response.py
@@ -32,6 +32,8 @@
     "http://transformers-lm-oasst12b:8158/respond": [],
     "http://openai-api-chatgpt:8145/respond": ["OPENAI_API_KEY", "OPENAI_ORGANIZATION"],
     "http://openai-api-davinci3:8131/respond": ["OPENAI_API_KEY", "OPENAI_ORGANIZATION"],
+    "http://openai-api-gpt4:8159/respond": ["OPENAI_API_KEY", "OPENAI_ORGANIZATION"],
+    "http://openai-api-gpt4-32k:8160/respond": ["OPENAI_API_KEY", "OPENAI_ORGANIZATION"],
 }
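
Once the distribution is up, the two new containers can be smoke-tested directly on their mapped ports (8159 and 8160). The snippet below is only a sketch: the /respond payload field names are an assumption modeled on the existing openai_api_lm services, so verify them against services/openai_api_lm/server.py before relying on it.

```python
# Minimal smoke test for the new GPT-4 services -- a sketch, not part of the patch.
# Assumption: the /respond payload mirrors the other openai_api_lm services
# (batched dialog contexts, prompts, generative configs, and OpenAI credentials).
import os

import requests

ENDPOINTS = {
    "gpt-4": "http://0.0.0.0:8159/respond",
    "gpt-4-32k": "http://0.0.0.0:8160/respond",
}

payload = {
    "dialog_contexts": [["What is the capital of France?"]],
    "prompts": ["Respond like a friendly assistant."],
    "configs": [None],  # None -> the service falls back to its default generative config
    "openai_api_keys": [os.environ["OPENAI_API_KEY"]],
    "openai_api_organizations": [os.getenv("OPENAI_ORGANIZATION")],
}

for model, url in ENDPOINTS.items():
    # 20 s matches the connector timeout declared in the component configs
    response = requests.post(url, json=payload, timeout=20)
    print(model, response.status_code, response.json())
```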