RFC: Add OPEA deployment design #10
base: main
Conversation
Some grammar and spelling edits
Broken link (missing file)
```yaml
opea_micro_services:
  embedding:
```
We need a way to let users define different environment variables. For example, there should be space for the user to choose the model they want to use, etc.
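To make the request concrete, here is a hypothetical sketch of how per-service overrides might look in the pipeline YAML. The `environment` and `model_id` keys are illustrative only and are not part of the current spec:

```yaml
# Hypothetical extension of the pipeline YAML; key names are illustrative only.
opea_micro_services:
  embedding:
    environment:             # user-defined environment variables
      HF_TOKEN: ${HF_TOKEN}  # resolved from the deployment environment
    model_id: some-embedding-model   # let the user choose the model
```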
agree
Jian Feng, Dolpher, Avinash, please track this RFC to ensure it meets our known use cases and to ensure Helm charts for component microservices can meet its needs.
Signed-off-by: Tian, Feng <[email protected]>
Signed-off-by: Tian, Feng <[email protected]>
Signed-off-by: irisdingbj <[email protected]>
Signed-off-by: Tian, Feng <[email protected]>
While a programmatic way to convert the pipeline description into a deployment on Docker or Kubernetes is an option, we should also consider a script or binary that does the same, e.g. `convert -i pipeline-def.{.yaml|.py} -o pipeline.yaml -arg-list`
In the rectangle "Kubernetes Manifests ..", let us also include "yaml", which is generic and covers using the YAML as input to a Kubernetes CRD.
```yaml
    endpoint: /v1/chat/completions
    port: 9000

opea_mega_service:
```
For Kubernetes, line 81 may be more than an adequate representation, given the microservices connector custom resource definition (CRD) to define a pipeline. We do not need to define ports etc. explicitly. Please see https://kserve.github.io/website/latest/modelserving/inference_graph/image_pipeline/#deploy-inferencegraph. This RFC has not touched on conditionals. I was thinking of a language that explicitly calls out a version number and allows specifying a service name and a set of key-value pairs, where keys could be "path", "auth-token", or "temperature", as examples: `{mega-service :version 0.1 {sequence {embedding :path /v1/e :image-d some-registry-path} {retriever :path /v1/r} {llm :api-token hfuilsadty894 :url chatGPT4-url}}}` type thing. This supports invoking external services. If no container image is provided, the system could search its context for Helm charts/YAML for a service with the same name.
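The same example pipeline could equally be rendered in YAML; a hypothetical sketch (all key names illustrative, mapping the s-expression above one-to-one):

```yaml
# Hypothetical YAML rendering of the pipeline language sketched above.
mega-service:
  version: 0.1
  sequence:
    - embedding:
        path: /v1/e
        image-d: some-registry-path
    - retriever:
        path: /v1/r
    - llm:
        api-token: hfuilsadty894
        url: chatGPT4-url
```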
I am wondering if we should make a grammar-like language for pipelines versus a YAML-like language. The former might be friendlier for non-cloud folks. We can always write a tool to convert to YAML and fill in things like version, etc.
**Alternatives Considered**
@irisdingbj it would be good to list, for completeness, that you explored what KServe, ArgoCD, etc. use and the gaps therein, e.g. one covering only inference while RAG has different kinds of services.
```yaml
    endpoint: /generate
    isDownstreamService: true
```
There should be an available `gmconnectors.gmc.opea.io` CR named `chatqna` under the namespace `gmcsample`, as shown below:
Add an expansion for CR (custom resource) and a link to the documentation.
@irisdingbj were we thinking of a tool to ease creating this YAML down the road? If yes, we could mention that here and file a feature-request issue to address later on.
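For illustration, a hypothetical sketch of what a `chatqna` `GMConnector` CR in namespace `gmcsample` might look like. The API version and field names here are assumptions for discussion, not the final CRD schema:

```yaml
# Illustrative sketch only; apiVersion and field names are assumptions.
apiVersion: gmc.opea.io/v1alpha3
kind: GMConnector
metadata:
  name: chatqna
  namespace: gmcsample
spec:
  routerConfig:
    name: router
    serviceName: router-service
  nodes:
    root:
      routerType: Sequence
      steps:
        - name: Embedding
          internalService:
            serviceName: embedding-svc
        - name: Llm
          internalService:
            serviceName: llm-svc
            config:
              endpoint: /generate
            isDownstreamService: true
```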
And the user can access the application pipeline via the value of the `URL` field above.
The whole deployment process illustrated by the diagram below.
s/illustrated/is illustrated/
s/composer/compose in the figure.
We will not create Helm charts or Kubernetes manifests for the GenAI applications, only for the GenAI components.
Signed-off-by: irisdingbj <[email protected]>
Signed-off-by: irisdingbj <[email protected]>
LGTM
@@ -0,0 +1,203 @@
**Author**
Please follow the example RFC template in rfc_template.txt (in the same directory).
Suggested change:
```diff
-**Author**
+# 24-05-17-OPEA-001-Deployment-Design
+## Authors
```
[ftian1](https://github.com/ftian1), [lvliang-intel](https://github.com/lvliang-intel), [hshen14](https://github.com/hshen14), [mkbhanda](https://github.com/mkbhanda), [irisdingbj](https://github.com/irisdingbj), [KfreeZ](https://github.com/kfreez), [zhlsunshine](https://github.com/zhlsunshine) **Edit Here to add your id**
**Status**
Suggested change:
```diff
-**Status**
+## Status
```
Under Review
**Objective**
Suggested change:
```diff
-**Objective**
+## Objective
```
Have a clear and good design for users to deploy their own GenAI applications in a Docker or Kubernetes environment.
**Motivation**
Suggested change:
```diff
-**Motivation**
+## Motivation
```
The proposed OPEA deployment workflow is:
<a target="_blank" href="opea_deploy_workflow.png">
Put all the images into the assets folder, and use normal markdown syntax to reference them instead of raw HTML
Suggested change:
```diff
-<a target="_blank" href="opea_deploy_workflow.png">
+![OPEA deploy workflow](assets/opea_deploy_workflow.png)
```
</a>

**Alternatives Considered**
Suggested change:
```diff
-**Alternatives Considered**
+## Alternatives Considered
```
[Kserve](https://github.com/kserve/kserve): provides [InferenceGraph](https://kserve.github.io/website/0.9/modelserving/inference_graph/); however, it only supports inference services and lacks deployment support.
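For comparison, a KServe InferenceGraph definition looks roughly like the sketch below, based on the linked docs (service names here are placeholders, not from the RFC):

```yaml
# Sketch of a KServe InferenceGraph chaining two inference services.
apiVersion: serving.kserve.io/v1alpha1
kind: InferenceGraph
metadata:
  name: model-chainer
spec:
  nodes:
    root:
      routerType: Sequence
      steps:
        - serviceName: first-model    # output of this step...
          name: first
        - serviceName: second-model   # ...is fed to this one
          name: second
          data: $response
```

Note that every step is an inference service; there is no notion of deploying non-inference components such as retrievers or vector stores, which is the gap called out above.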
**Compatibility**
Suggested change:
```diff
-**Compatibility**
+## Compatibility
```
n/a
**Miscs**
Suggested change:
```diff
-**Miscs**
+## Miscs
```
The whole deployment process illustrated by the diagram below.

<a target="_blank" href="opea_deploy_process.png">
<img src="opea_deploy_process_v1.png" alt="Deployment Process" width=480 height=310>
Put all the images into the assets folder, and use normal markdown syntax to reference them instead of raw HTML
Suggested change:
```diff
-<img src="opea_deploy_process_v1.png" alt="Deployment Process" width=480 height=310>
+![deployment process](assets/opea_deploy_process_v1.png)
```
You've also got extra images for v0 and v2 that aren't referenced, so delete them when you can.
This RFC discusses how to deploy OPEA examples in a cloud environment.