Add LLM Pipeline #137
Conversation
@rickstaa I have reviewed this and confirmed it works. The code needed to be rebased with the new codegen updates from recent SDK releases. @kyriediculous can update this PR, or we can move to the other PR. Some brief research showed there are other implementations for serving LLM pipelines, which was also briefly discussed with @kyriediculous. We settled on researching and testing alternative implementations if the need arises from user feedback. The LLM SPE will continue to support and enhance this pipeline to suit the network's requirements as the network evolves. Notes from review/testing:
There were only a couple of small changes I made in addition to the changes needed to rebase this PR:
All comments have been addressed and commit history has been cleaned up.
runner/app/routes/util.py (Outdated)

```diff
@@ -58,6 +58,11 @@ class TextResponse(BaseModel):
     chunks: List[chunk] = Field(..., description="The generated text chunks.")


+class LlmResponse(BaseModel):
```
Can we make this `LLMResponse`, since LLM is an abbreviation?
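For context, a minimal sketch of what the renamed model could look like; the field names below are assumptions for illustration, since only the class line is visible in the diff above:

```python
from pydantic import BaseModel, Field


class LLMResponse(BaseModel):
    # Hypothetical fields; the actual definition lives in runner/app/routes/util.py.
    response: str = Field(..., description="The generated text.")
    tokens_used: int = Field(..., description="Number of tokens generated.")
```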
worker/worker.go (Outdated)

```diff
@@ -341,7 +341,7 @@ func (w *Worker) LLM(ctx context.Context, req LlmLlmPostFormdataRequestBody) (in
	if err != nil {
		return nil, err
	}
	return w.handleNonStreamingResponse(resp)
```
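The `handleNonStreamingResponse` call implies the pipeline also has a streaming path on the runner side. As a hedged aside for readers, here is a rough Python sketch of that kind of split, assuming a route that branches on a `stream` flag; none of the names below are taken from the PR:

```python
# Illustrative only: a FastAPI route that returns either an SSE stream or a
# single JSON body, mirroring a streaming/non-streaming split on the worker side.
import json
from typing import AsyncGenerator

from fastapi import FastAPI
from fastapi.responses import StreamingResponse

app = FastAPI()


async def fake_pipeline(prompt: str) -> AsyncGenerator[str, None]:
    # Stand-in for the real LLM pipeline: yields text chunks as they are generated.
    for word in ("hello", "world"):
        yield word


@app.post("/llm")
async def llm(prompt: str, stream: bool = False):
    chunks = fake_pipeline(prompt)
    if stream:
        # Streaming path: emit server-sent events, one JSON chunk per event.
        async def sse() -> AsyncGenerator[str, None]:
            async for chunk in chunks:
                yield f"data: {json.dumps({'chunk': chunk})}\n\n"
        return StreamingResponse(sse(), media_type="text/event-stream")
    # Non-streaming path: drain the generator into one response body, which is
    # the case a handler like handleNonStreamingResponse would consume.
    text = " ".join([c async for c in chunks])
    return {"response": text, "tokens_used": len(text.split())}
```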
Could you also apply the following commits done by @ad-astra-video to your branch?
We're planning to move the other pipelines into their own Docker containers to make the interface more generic and the base container leaner. With that in mind, I think it would make sense to also include the LLM pipeline in this new setup. The orchestrator experience will remain unchanged once PR #200 is merged.
Could you also take a look at this commit to see if it is needed?
Lastly, please add SDK tags where you think they fit, using this commit by @ad-astra-video as an example.
Apart from the small naming comment, everything else can also be addressed in a separate pull request, so approved for now 👍.
I get the intention of the commits, but I have several points of feedback on them. Usually PR feedback is left to the author to address, and commits aren't added by anyone other than the author.
- 240e16d -> bench.py is also in the root folder; it's not a runtime piece of code but a dev-tool script. Either move both or neither.
- b6a790d -> I don't really see what this accomplishes other than nuking CI? Models can be downloaded separately regardless.
- 9e1c48a -> Probably not the most idiomatic way to do this, but fine for now. It slipped through the cracks because I mainly tested with an external container.
- 9461530 -> I made the changes, but this naming convention isn't good. Studio should not dictate our naming conventions; I know I'm repeating myself here.
Overall, there are too many breaking changes being made in this repo that affect open PRs, and there needs to be a better culture of getting PRs across the finish line and respecting the dynamics between author and reviewer.
```diff
@@ -36,7 +36,8 @@ var containerHostPorts = map[string]string{
	"image-to-video":     "8200",
	"upscale":            "8300",
	"audio-to-text":      "8400",
-	"segment-anything-2": "8500",
+	"llm":                "8500",
```
@kyriediculous Could we keep the order the same?
LLM was added before segment-anything-2 and has been using port 8500 prior to this.
I’m out of the office and don’t have time for a deep dive, but I took a quick look to keep things moving. I’ve left a few comments, but overall, it looks good to merge 🎉.
runner/app/routes/llm.py (Outdated)

```python
@router.post("/llm",
             response_model=LLMResponse, responses=RESPONSES, operation_id="genLLM",)
@router.post("/llm/", response_model=LLMResponse, responses=RESPONSES, include_in_schema=False)
```
Can you add metadata for the OpenAPI docs & SDK? Here's a suggestion to make it easy:
@router.post("/llm", | |
response_model=LLMResponse, responses=RESPONSES, operation_id="genLLM",) | |
@router.post("/llm/", response_model=LLMResponse, responses=RESPONSES, include_in_schema=False) | |
@router.post( | |
"/llm", | |
response_model=LLMResponse, | |
responses=RESPONSES, | |
operation_id="llm", | |
description="Generate text using a language model.", | |
summary="LLM", | |
tags=["generate"], | |
openapi_extra={"x-speakeasy-name-override": "llm"}, | |
) | |
@router.post( | |
"/llm/", | |
response_model=LLMResponse, | |
responses=RESPONSES, | |
include_in_schema=False, | |
) |
I changed the `operation_id` to just `llm` since Rick mentioned you wanted not to have the `gen` prefix here. As I mentioned in Discord, this `operation_id` doesn't matter for the SDK, so feel free to skip it in your API if you think something else is more important than consistency.
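As a quick aside, one way to sanity-check that this metadata actually lands in the generated schema is FastAPI's test client; the `app.main` import path below is an assumption, not necessarily the repo's actual layout:

```python
# Hypothetical check that the route metadata appears in the OpenAPI schema.
from fastapi.testclient import TestClient

from app.main import app  # assumed import path for the runner's FastAPI app

client = TestClient(app)
schema = client.get("/openapi.json").json()
op = schema["paths"]["/llm"]["post"]

assert op["operationId"] == "llm"
assert op["summary"] == "LLM"
assert op["tags"] == ["generate"]
# Entries in openapi_extra are merged into the operation object verbatim.
assert op["x-speakeasy-name-override"] == "llm"
```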