Add helm charts for deploy ChatQnA (#68)
* Add helm charts for deploy ChatQnA

Co-authored-by: Yu1 Lu <[email protected]>
Signed-off-by: Dolpher Du <[email protected]>
yongfengdu and leslieluyu authored Jun 4, 2024
1 parent 782c975 commit 20dce6b
Showing 75 changed files with 2,680 additions and 3 deletions.
6 changes: 3 additions & 3 deletions .github/workflows/scripts/e2e/chart_test.sh
@@ -4,17 +4,17 @@

LOG_PATH=.

USER_ID=$(whoami)
CHART_MOUNT=/home/$USER_ID/charts-mnt
IMAGE_REPO=${OPEA_IMAGE_REPO:-amr-registry.caas.intel.com/aiops}
function init_codegen() {
# executed under path helm-charts/codegen
# init var
USER_ID=$(whoami)
CHART_MOUNT=/home/$USER_ID/charts-mnt
MODELREPO=m-a-p
MODELNAME=OpenCodeInterpreter-DS-6.7B
MODELID=$MODELREPO/$MODELNAME
MODELDOWNLOADID=models--$MODELREPO--$MODELNAME
# IMAGE_REPO is $OPEA_IMAGE_REPO, or else ""
IMAGE_REPO=${OPEA_IMAGE_REPO:-amr-registry.caas.intel.com/aiops}

### PREPARE MODEL
# check if the model is already downloaded
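The `IMAGE_REPO=${OPEA_IMAGE_REPO:-amr-registry.caas.intel.com/aiops}` assignment in this script relies on Bash default-value expansion. A minimal sketch of the behaviour (the override value below is illustrative):

```console
# ${VAR:-default} yields the default when VAR is unset or empty
unset OPEA_IMAGE_REPO
IMAGE_REPO=${OPEA_IMAGE_REPO:-amr-registry.caas.intel.com/aiops}
echo "$IMAGE_REPO"    # amr-registry.caas.intel.com/aiops

OPEA_IMAGE_REPO=myregistry.example.com/opea
IMAGE_REPO=${OPEA_IMAGE_REPO:-amr-registry.caas.intel.com/aiops}
echo "$IMAGE_REPO"    # myregistry.example.com/opea
```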
23 changes: 23 additions & 0 deletions helm-charts/chatqna/.helmignore
@@ -0,0 +1,23 @@
# Patterns to ignore when building packages.
# This supports shell glob matching, relative path matching, and
# negation (prefixed with !). Only one pattern per line.
.DS_Store
# Common VCS dirs
.git/
.gitignore
.bzr/
.bzrignore
.hg/
.hgignore
.svn/
# Common backup files
*.swp
*.bak
*.tmp
*.orig
*~
# Various IDEs
.project
.idea/
*.tmproj
.vscode/
18 changes: 18 additions & 0 deletions helm-charts/chatqna/Chart.yaml
@@ -0,0 +1,18 @@
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0

apiVersion: v2
name: chatqna
description: The Helm chart to deploy ChatQnA
type: application
dependencies:
- name: llm-uservice
version: "0.1.0"
- name: embedding-usvc
version: "0.1.0"
- name: reranking-usvc
version: "0.1.0"
- name: retriever-usvc
version: "0.1.0"
version: 0.1.0
appVersion: "1.0.0"
26 changes: 26 additions & 0 deletions helm-charts/chatqna/README.md
@@ -0,0 +1,26 @@
# ChatQnA

Helm chart for deploying the ChatQnA service.

## Installing the Chart

To install the chart, run the following:

```console
export HFTOKEN="insert-your-huggingface-token-here"
export MODELDIR="/mnt"
export MODELNAME="Intel/neural-chat-7b-v3-3"
helm install chatqna chatqna --set llm-uservice.HUGGINGFACEHUB_API_TOKEN=${HFTOKEN} --set llm-uservice.tgi.volume=${MODELDIR} --set llm-uservice.tgi.LLM_MODEL_ID=${MODELNAME}
# To use Gaudi device
# helm install chatqna chatqna --set llm-uservice.HUGGINGFACEHUB_API_TOKEN=${HFTOKEN} --values chatqna/gaudi-values.yaml
```
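
Once the release is installed, a quick smoke test looks like the sketch below; the service name `chatqna` and the `/v1/chatqna` request path are assumptions based on the chart defaults and the OPEA ChatQnA gateway, so adjust them to your deployment:

```console
kubectl get pods
# wait until all chatqna pods are Running and Ready

# service.port defaults to 8888
kubectl port-forward svc/chatqna 8888:8888 &

# request path and payload are assumed, not taken from this chart
curl http://localhost:8888/v1/chatqna \
    -X POST \
    -H 'Content-Type: application/json' \
    -d '{"messages": "What is Deep Learning?"}'
```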

## Values

| Key | Type | Default | Description |
| ------------------------------------- | ------ | ----------------------------- | ---------------------------------------------------------------------------------------------------------------------------------------- |
| image.repository | string | `"opea/chatqna:latest"` | |
| service.port | string | `"8888"` | |
| llm-uservice.HUGGINGFACEHUB_API_TOKEN | string | `""` | Your own Hugging Face API token |
| llm-uservice.tgi.LLM_MODEL_ID | string | `"Intel/neural-chat-7b-v3-3"` | Model id from https://huggingface.co/, or a pre-downloaded model directory |
| llm-uservice.tgi.volume | string | `"/mnt"` | Cached model directory; tgi will not download the model if it is already cached here. The volume is mounted into the container as the /data directory (see the example below) |
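
Because tgi skips the download when the model already exists under `llm-uservice.tgi.volume`, the cache can be pre-populated on the host before installing. A sketch, assuming the Hugging Face Hub cache layout (`models--Intel--neural-chat-7b-v3-3`) that the e2e test script in this commit also expects; verify the layout against your tgi version:

```console
pip install -U huggingface_hub
huggingface-cli download Intel/neural-chat-7b-v3-3 --cache-dir /mnt

helm install chatqna chatqna \
  --set llm-uservice.HUGGINGFACEHUB_API_TOKEN=${HFTOKEN} \
  --set llm-uservice.tgi.volume=/mnt \
  --set llm-uservice.tgi.LLM_MODEL_ID=Intel/neural-chat-7b-v3-3
```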
23 changes: 23 additions & 0 deletions helm-charts/chatqna/charts/embedding-usvc/.helmignore
@@ -0,0 +1,23 @@
# Patterns to ignore when building packages.
# This supports shell glob matching, relative path matching, and
# negation (prefixed with !). Only one pattern per line.
.DS_Store
# Common VCS dirs
.git/
.gitignore
.bzr/
.bzrignore
.hg/
.hgignore
.svn/
# Common backup files
*.swp
*.bak
*.tmp
*.orig
*~
# Various IDEs
.project
.idea/
*.tmproj
.vscode/
13 changes: 13 additions & 0 deletions helm-charts/chatqna/charts/embedding-usvc/Chart.yaml
@@ -0,0 +1,13 @@
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0

apiVersion: v2
name: embedding-usvc
description: The Helm chart for deploying the embedding microservice
type: application
dependencies:
- name: tei
version: "0.1.0"
version: 0.1.0
# The embedding microservice server version
appVersion: "1.0.0"
25 changes: 25 additions & 0 deletions helm-charts/chatqna/charts/embedding-usvc/README.md
@@ -0,0 +1,25 @@
# embedding-usvc

Helm chart for deploying the embedding microservice.

embedding-usvc depends on TEI; refer to the tei chart for more configuration details.

## Installing the Chart

To install the chart, run the following:

```console
$ export MODELDIR="/mnt"
$ export MODELNAME="BAAI/bge-base-en-v1.5"
$ helm install embedding embedding-usvc --set tei.volume=${MODELDIR} --set tei.EMBEDDING_MODEL_ID=${MODELNAME}
```
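
To check the microservice once its pods are up, something like the following can be used; the service name assumes the default `<release>-<chart>` fullname pattern, and the `/v1/embeddings` path and payload are assumptions based on the OPEA embedding microservice API rather than values defined in this chart:

```console
kubectl get svc
# service.port defaults to 6000
kubectl port-forward svc/embedding-embedding-usvc 6000:6000 &

# request path and payload are assumed, adjust to your deployment
curl http://localhost:6000/v1/embeddings \
    -X POST \
    -H 'Content-Type: application/json' \
    -d '{"text": "What is Deep Learning?"}'
```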

## Values

| Key | Type | Default | Description |
| ---------------------- | ------ | ----------------------------- | ---------------------------------------------------------------------------------------------------------------------------------------- |
| image.repository | string | `"opea/embedding-tei:latest"` | |
| service.port | string | `"6000"` | |
| tei.EMBEDDING_MODEL_ID | string | `"BAAI/bge-base-en-v1.5"` | Model id from https://huggingface.co/, or a pre-downloaded model directory |
| tei.port | string | `"6000"` | Hugging Face Text Embeddings Inference service port |
| tei.volume | string | `"/mnt"` | Cached model directory; tei will not download the model if it is already cached here. The volume is mounted into the container as the /data directory |
23 changes: 23 additions & 0 deletions helm-charts/chatqna/charts/embedding-usvc/charts/tei/.helmignore
@@ -0,0 +1,23 @@
# Patterns to ignore when building packages.
# This supports shell glob matching, relative path matching, and
# negation (prefixed with !). Only one pattern per line.
.DS_Store
# Common VCS dirs
.git/
.gitignore
.bzr/
.bzrignore
.hg/
.hgignore
.svn/
# Common backup files
*.swp
*.bak
*.tmp
*.orig
*~
# Various IDEs
.project
.idea/
*.tmproj
.vscode/
10 changes: 10 additions & 0 deletions helm-charts/chatqna/charts/embedding-usvc/charts/tei/Chart.yaml
@@ -0,0 +1,10 @@
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0

apiVersion: v2
name: tei
description: The Helm chart for the Hugging Face Text Embeddings Inference server
type: application
version: 0.1.0
# The HF TEI version
appVersion: "1.2"
32 changes: 32 additions & 0 deletions helm-charts/chatqna/charts/embedding-usvc/charts/tei/README.md
@@ -0,0 +1,32 @@
# tei

Helm chart for deploying the Hugging Face Text Embeddings Inference (TEI) service.

## Installing the Chart

To install the chart, run the following:

```console
$ cd ${GenAIInfra_repo}/helm-charts/common
$ export MODELDIR=/mnt/model
$ export MODELNAME="BAAI/bge-base-en-v1.5"
$ helm install tei tei --set volume=${MODELDIR} --set EMBEDDING_MODEL_ID=${MODELNAME}
```

By default, the tei service downloads the "BAAI/bge-base-en-v1.5" model, which is about 1.1GB.

If you have already cached the model locally, you can pass it to the container as in this example (see the sketch below):

MODELDIR=/mnt/model

MODELNAME="/data/BAAI/bge-base-en-v1.5"

## Values

| Key | Type | Default | Description |
| ------------------ | ------ | ------------------------------------------------- | ---------------------------------------------------------------------------------------------------------------------------------------- |
| EMBEDDING_MODEL_ID | string | `"BAAI/bge-base-en-v1.5"` | Model id from https://huggingface.co/, or a pre-downloaded model directory |
| volume | string | `"/mnt/model"` | Cached model directory; tei will not download the model if it is already cached here. The volume is mounted into the container as the /data directory |
| image.repository | string | `"ghcr.io/huggingface/text-embeddings-inference"` | |
| image.tag | string | `"cpu-1.2"` | |
| service.port | string | `"6006"` | The service port |
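
After the pod is running, the service can be verified with the same request that the chart prints in its NOTES.txt, assuming a ClusterIP service listening on the default port 6006:

```console
tei_svc_ip=$(kubectl get svc -l app.kubernetes.io/name=tei -o jsonpath="{.items[0].spec.clusterIP}")
curl ${tei_svc_ip}:6006/embed \
    -X POST \
    -d '{"inputs":"What is Deep Learning?"}' \
    -H 'Content-Type: application/json'
```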
helm-charts/chatqna/charts/embedding-usvc/charts/tei/templates/NOTES.txt
@@ -0,0 +1,19 @@
1. Get the application IP or URL by running these commands:
{{- if contains "NodePort" .Values.service.type }}
export NODE_PORT=$(kubectl get --namespace {{ .Release.Namespace }} -o jsonpath="{.spec.ports[0].nodePort}" services {{ include "tei.fullname" . }})
export NODE_IP=$(kubectl get nodes --namespace {{ .Release.Namespace }} -o jsonpath="{.items[0].status.addresses[0].address}")
echo http://$NODE_IP:$NODE_PORT
{{- else if contains "LoadBalancer" .Values.service.type }}
NOTE: It may take a few minutes for the LoadBalancer IP to be available.
You can watch its status by running 'kubectl get --namespace {{ .Release.Namespace }} svc -w {{ include "tei.fullname" . }}'
export SERVICE_IP=$(kubectl get svc --namespace {{ .Release.Namespace }} {{ include "tei.fullname" . }} --template "{{"{{ range (index .status.loadBalancer.ingress 0) }}{{.}}{{ end }}"}}")
echo http://$SERVICE_IP:{{ .Values.service.port }}
{{- else if contains "ClusterIP" .Values.service.type }}
export tei_svc_ip=$(kubectl get svc --namespace {{ .Release.Namespace }} -l "app.kubernetes.io/name={{ include "tei.name" . }},app.kubernetes.io/instance={{ .Release.Name }}" -o jsonpath="{.items[0].spec.clusterIP}") && echo ${tei_svc_ip}
{{- end }}

2. Use this command to verify the tei service:
curl ${tei_svc_ip}:6006/embed \
-X POST \
-d '{"inputs":"What is Deep Learning?"}' \
-H 'Content-Type: application/json'
helm-charts/chatqna/charts/embedding-usvc/charts/tei/templates/_helpers.tpl
@@ -0,0 +1,62 @@
{{/*
Expand the name of the chart.
*/}}
{{- define "tei.name" -}}
{{- default .Chart.Name .Values.nameOverride | trunc 63 | trimSuffix "-" }}
{{- end }}

{{/*
Create a default fully qualified app name.
We truncate at 63 chars because some Kubernetes name fields are limited to this (by the DNS naming spec).
If release name contains chart name it will be used as a full name.
*/}}
{{- define "tei.fullname" -}}
{{- if .Values.fullnameOverride }}
{{- .Values.fullnameOverride | trunc 63 | trimSuffix "-" }}
{{- else }}
{{- $name := default .Chart.Name .Values.nameOverride }}
{{- if contains $name .Release.Name }}
{{- .Release.Name | trunc 63 | trimSuffix "-" }}
{{- else }}
{{- printf "%s-%s" .Release.Name $name | trunc 63 | trimSuffix "-" }}
{{- end }}
{{- end }}
{{- end }}

{{/*
Create chart name and version as used by the chart label.
*/}}
{{- define "tei.chart" -}}
{{- printf "%s-%s" .Chart.Name .Chart.Version | replace "+" "_" | trunc 63 | trimSuffix "-" }}
{{- end }}

{{/*
Common labels
*/}}
{{- define "tei.labels" -}}
helm.sh/chart: {{ include "tei.chart" . }}
{{ include "tei.selectorLabels" . }}
{{- if .Chart.AppVersion }}
app.kubernetes.io/version: {{ .Chart.AppVersion | quote }}
{{- end }}
app.kubernetes.io/managed-by: {{ .Release.Service }}
{{- end }}

{{/*
Selector labels
*/}}
{{- define "tei.selectorLabels" -}}
app.kubernetes.io/name: {{ include "tei.name" . }}
app.kubernetes.io/instance: {{ .Release.Name }}
{{- end }}

{{/*
Create the name of the service account to use
*/}}
{{- define "tei.serviceAccountName" -}}
{{- if .Values.serviceAccount.create }}
{{- default (include "tei.fullname" .) .Values.serviceAccount.name }}
{{- else }}
{{- default "default" .Values.serviceAccount.name }}
{{- end }}
{{- end }}
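
These helpers only affect generated names and labels. A quick way to see what they render to, without installing anything, is `helm template` (a sketch, run from the directory containing the chart):

```console
# render the manifests locally and inspect the generated metadata
helm template tei tei --set volume=/mnt/model | grep -A 6 "labels:"
```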
helm-charts/chatqna/charts/embedding-usvc/charts/tei/templates/deployment.yaml
@@ -0,0 +1,78 @@
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0

apiVersion: apps/v1
kind: Deployment
metadata:
name: {{ include "tei.fullname" . }}
labels:
{{- include "tei.labels" . | nindent 4 }}
spec:
replicas: {{ .Values.replicaCount }}
selector:
matchLabels:
{{- include "tei.selectorLabels" . | nindent 6 }}
template:
metadata:
{{- with .Values.podAnnotations }}
annotations:
{{- toYaml . | nindent 8 }}
{{- end }}
labels:
{{- include "tei.selectorLabels" . | nindent 8 }}
spec:
{{- with .Values.imagePullSecrets }}
imagePullSecrets:
{{- toYaml . | nindent 8 }}
{{- end }}
securityContext:
{{- toYaml .Values.podSecurityContext | nindent 8 }}
containers:
- name: {{ .Chart.Name }}
env:
- name: MODEL_ID
value: {{ .Values.EMBEDDING_MODEL_ID | quote }}
- name: PORT
value: "80"
- name: http_proxy
value: {{ .Values.global.http_proxy }}
- name: https_proxy
value: {{ .Values.global.https_proxy }}
- name: no_proxy
value: {{ .Values.global.no_proxy }}
securityContext:
{{- toYaml .Values.securityContext | nindent 12 }}
image: "{{ .Values.image.repository }}:{{ .Values.image.tag | default .Chart.AppVersion }}"
imagePullPolicy: {{ .Values.image.pullPolicy }}
volumeMounts:
- mountPath: /data
name: model-volume
- mountPath: /dev/shm
name: shm
ports:
- name: http
containerPort: 80
protocol: TCP
resources:
{{- toYaml .Values.resources | nindent 12 }}
volumes:
- name: model-volume
hostPath:
path: {{ .Values.volume }}
type: Directory
- name: shm
emptyDir:
medium: Memory
sizeLimit: {{ .Values.shmSize }}
{{- with .Values.nodeSelector }}
nodeSelector:
{{- toYaml . | nindent 8 }}
{{- end }}
{{- with .Values.affinity }}
affinity:
{{- toYaml . | nindent 8 }}
{{- end }}
{{- with .Values.tolerations }}
tolerations:
{{- toYaml . | nindent 8 }}
{{- end }}
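
Note that the model volume is a `hostPath` of `type: Directory`, so the path supplied via `.Values.volume` must already exist on any node that can schedule this pod; otherwise the pod is stuck with a mount failure. A minimal sketch (how you reach the node is environment-specific):

```console
# on each candidate node, create the model directory before installing the chart
sudo mkdir -p /mnt/model

helm install tei tei --set volume=/mnt/model
```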