RFC: Add OPEA deployment design #10
base: main
Conversation
Some grammar and spelling edits
Broken link (missing file)
```yaml
opea_micro_services:
  embedding:
```
We need a way to let users define different environment variables. For example, there should be space for the user to choose the model they want to use, etc.
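To make the request concrete, here is a hypothetical sketch of how per-service overrides might look in the pipeline YAML. The `environment` and `model_id` keys are illustrative only and are not part of the current spec:

```yaml
# Hypothetical extension of the pipeline YAML; key names are illustrative only.
opea_micro_services:
  embedding:
    environment:             # user-defined environment variables
      HF_TOKEN: ${HF_TOKEN}  # resolved from the deployment environment
    model_id: some-embedding-model   # let the user choose the model
```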
agree
Jian Feng, Dolpher, Avinash, please track this RFC to ensure it meets our known use cases and to ensure Helm charts for component microservices can meet its needs.
Signed-off-by: Tian, Feng <[email protected]>
Signed-off-by: Tian, Feng <[email protected]>
Signed-off-by: irisdingbj <[email protected]>
Signed-off-by: Tian, Feng <[email protected]>
While a programmatic way to convert the pipeline description into a deployment on Docker or Kubernetes is an option, we should also consider a script or binary that does the same, e.g. `convert -i pipeline-def.{.yaml|.py} -o pipeline.yaml -arg-list`
In the rectangle "Kubernetes Manifests ..", let us also include "yaml", which is generic and covers using the YAML as input to a Kubernetes CRD.
```yaml
    endpoint: /v1/chat/completions
    port: 9000

opea_mega_service:
```
For Kubernetes, line 81 may be more than an adequate representation, given the microservices connector custom resource definition (CRD) to define a pipeline. We do not need to define ports etc. explicitly. Please see https://kserve.github.io/website/latest/modelserving/inference_graph/image_pipeline/#deploy-inferencegraph. This RFC has not touched on conditionals. I was thinking of a language that explicitly calls out a version number and allows specifying a service name and a set of key-value pairs, where keys could be "path", "auth-token", or "temperature", as examples: `{mega-service :version 0.1 {sequence {embedding :path /v1/e :image-d some-registry-path} {retriever :path /v1/r} {llm :api-token hfuilsadty894 :url chatGPT4-url}}}` type thing. This supports invoking external services. If no container image is provided, the system could search its context for Helm charts/YAML for a service with the same name.
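The same example pipeline could equally be rendered in YAML; a hypothetical sketch (all key names illustrative, mapping the s-expression above one-to-one):

```yaml
# Hypothetical YAML rendering of the pipeline language sketched above.
mega-service:
  version: 0.1
  sequence:
    - embedding:
        path: /v1/e
        image-d: some-registry-path
    - retriever:
        path: /v1/r
    - llm:
        api-token: hfuilsadty894
        url: chatGPT4-url
```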
I am wondering if we should make a grammar-like language for pipelines versus a YAML-like language. The former might be friendlier for non-cloud folks. We can always write a tool to convert to YAML and fill in things like version, etc.
**Alternatives Considered**
@irisdingbj it would be good to list, for completeness, that you explored what KServe, ArgoCD, etc. use and the gaps therein, e.g. one covering only inference while RAG has different kinds of services.
```yaml
    endpoint: /generate
    isDownstreamService: true
```
There should be an available `gmconnectors.gmc.opea.io` CR named `chatqna` under the namespace `gmcsample`, as shown below:
Add an expansion for CR (custom resource) and a link to the documentation.
@irisdingbj were we thinking of a tool to ease creating this YAML down the road? If yes, we could mention that here and file a feature-request issue to address later on.
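For illustration, a hypothetical sketch of what a `chatqna` `GMConnector` CR in namespace `gmcsample` might look like. The API version and field names here are assumptions for discussion, not the final CRD schema:

```yaml
# Illustrative sketch only; apiVersion and field names are assumptions.
apiVersion: gmc.opea.io/v1alpha3
kind: GMConnector
metadata:
  name: chatqna
  namespace: gmcsample
spec:
  routerConfig:
    name: router
    serviceName: router-service
  nodes:
    root:
      routerType: Sequence
      steps:
        - name: Embedding
          internalService:
            serviceName: embedding-svc
        - name: Llm
          internalService:
            serviceName: llm-svc
            config:
              endpoint: /generate
            isDownstreamService: true
```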
And the user can access the application pipeline via the value of the `URL` field above.
The whole deployment process illustrated by the diagram below.
s/illustrated/is illustrated/
s/composer/compose in the figure.
We will not create Helm charts or Kubernetes manifests for the GenAI applications, only for the GenAI components.
Signed-off-by: irisdingbj <[email protected]>
Signed-off-by: irisdingbj <[email protected]>
LGTM
@@ -0,0 +1,203 @@
**Author**
Please follow the example RFC template in rfc_template.txt (in the same directory).
Suggested change:
```diff
-**Author**
+# 24-05-17-OPEA-001-Deployment-Design
+## Authors
```
[ftian1](https://github.com/ftian1), [lvliang-intel](https://github.com/lvliang-intel), [hshen14](https://github.com/hshen14), [mkbhanda](https://github.com/mkbhanda), [irisdingbj](https://github.com/irisdingbj), [KfreeZ](https://github.com/kfreez), [zhlsunshine](https://github.com/zhlsunshine) **Edit Here to add your id**
**Status**
Suggested change:
```diff
-**Status**
+## Status
```
Under Review
**Objective**
Suggested change:
```diff
-**Objective**
+## Objective
```
Have a clear and good design for users to deploy their own GenAI applications in a Docker or Kubernetes environment.
**Motivation**
Suggested change:
```diff
-**Motivation**
+## Motivation
```
The proposed OPEA deployment workflow is:
<a target="_blank" href="opea_deploy_workflow.png">
Put all the images into the assets folder, and use normal markdown syntax to reference them instead of raw HTML
Suggested change:
```diff
-<a target="_blank" href="opea_deploy_workflow.png">
+![OPEA deploy workflow](assets/opea_deploy_workflow.png)
```
</a>

**Alternatives Considered**
Suggested change:
```diff
-**Alternatives Considered**
+## Alternatives Considered
```
[Kserve](https://github.com/kserve/kserve): provides [InferenceGraph](https://kserve.github.io/website/0.9/modelserving/inference_graph/); however, it only supports inference services and lacks deployment support.
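For comparison, a KServe InferenceGraph definition looks roughly like the sketch below, based on the linked docs (service names here are placeholders, not from the RFC):

```yaml
# Sketch of a KServe InferenceGraph chaining two inference services.
apiVersion: serving.kserve.io/v1alpha1
kind: InferenceGraph
metadata:
  name: model-chainer
spec:
  nodes:
    root:
      routerType: Sequence
      steps:
        - serviceName: first-model    # output of this step...
          name: first
        - serviceName: second-model   # ...is fed to this one
          name: second
          data: $response
```

Note that every step is an inference service; there is no notion of deploying non-inference components such as retrievers or vector stores, which is the gap called out above.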
**Compatibility**
Suggested change:
```diff
-**Compatibility**
+## Compatibility
```
n/a
**Miscs**
Suggested change:
```diff
-**Miscs**
+## Miscs
```
The whole deployment process illustrated by the diagram below.

<a target="_blank" href="opea_deploy_process.png">
<img src="opea_deploy_process_v1.png" alt="Deployment Process" width=480 height=310>
Put all the images into the assets folder, and use normal markdown syntax to reference them instead of raw HTML
Suggested change:
```diff
-<img src="opea_deploy_process_v1.png" alt="Deployment Process" width=480 height=310>
+![deployment process](assets/opea_deploy_process_v1.png)
```
You've also got extra images for v0 and v2 that aren't referenced, so delete them when you can.
This RFC discusses how to deploy OPEA examples in a cloud environment.