Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Automate ensembler image building with Turing API #170

Conversation

deadlycoconuts
Copy link
Contributor

@deadlycoconuts deadlycoconuts commented Feb 22, 2022

Context

In PR #164 and most notably PR #165, a real-time ensembler engine was introduced to process live ensembling requests, using an ensembler that is predefined in the mlflow pyfunc flavour. This engine can then be run in a container to serve real-time ensembling requests. However, the creation of such a container is a manual process which required the user to build a Docker image, followed by spinning up his/hero own container from the template files provided.

Hence, this PR introduces a process to automatically allow Turing API to orchestrate the Docker image building process when a user specifies a pre-defined pyfunc ensembler (identified by its given project_id and ensembler_id) when creating a Turing Router. Turing API then pushes the newly created image to a specified registry, and saves the image name as part of the configuration of the said Turing Router, which will then use this image to run a container containing the real-time ensembler web service with the specified pyfunc ensembler.

Features

  • Modification to the existing router creation/update process that builds a Docker image of the real-time ensembler web service with the ensembling logic given by the specified pyfunc ensembler
  • Introduction of a new field pyfunc_config/PyfuncConfig that contains resource request and timeout information regarding a pyfunc ensembler to be automatically deployed as a containerised web service

Fixes

  • Minor bug fixes to the Dockerfiles and makefiles specified for the web service
  • Changing fo the port that the ensembler web service listens on from 8080 to 8083, since Turing API already listens on 8080

Main Modifications

  • api/turing/api/deployment_controller.go - Addition of steps to orchestrate the building of the web service image when deploying a new router (version)
  • api/turing/api/appcontext.go - Addition of an EnsemblerServiceBuilder to the appContext
  • api/turing/api/request/request.go - Addition of checks to verify HTTP requests to Turing API that specify pyfunc ensemblers contain a valid project_id and ensembler_id (one that points to a pyfunc ensembler that has already been stored in Turing API)
  • api/turing/models/ensembler.go - Addition of a new field for pyfunc_config/PyfuncConfig
  • api/turing/imagebuilder/ensembler.go - Addition of new utility functions to create ensembler builder and service names
  • api/config-dev.yaml - Addition of new configuration for the web service building
  • api/db-migrations/* - Addition of SQL scripts for up/down migrations due to addition of pyfunc_config/PyfuncConfig
  • api/api/* - Changes to the OpenAPI specs from the addition of pyfunc_config/PyfuncConfig
  • infra/charts/turing/values.yaml - Addition of new default configuration for web service building to helm charts

@deadlycoconuts deadlycoconuts marked this pull request as ready for review February 25, 2022 11:49
@deadlycoconuts deadlycoconuts requested a review from a team February 25, 2022 11:49
Copy link
Collaborator

@krithika369 krithika369 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the detailed summary, @deadlycoconuts ! I took a quick look and left some comments / questions. You may also want to get another set of eyes on this MR.

Could you also help clean up the README? There is a section showing the config-dev.yaml contents. But I think it can be removed, with a link to the config file, which you have already updated with the required configs for this feature.

Lastly, I imagine there will some user docs being updated, for the Pyfunc ensembler creation and router deployment (similar to batch ensemblers, both for the public docs and our internal docs; although batch docs are currently only internal - we can sort that out later). Will this be a part of the UI card? If not, please capture this in the backlog.

Thank you!

api/api/openapi.bundle.yaml Show resolved Hide resolved
api/api/openapi.bundle.yaml Show resolved Hide resolved
api/config-dev.yaml Show resolved Hide resolved
api/db-migrations/000010_add_py_func_ref_config.down.sql Outdated Show resolved Hide resolved
api/turing/api/deployment_controller.go Outdated Show resolved Hide resolved
api/turing/api/deployment_controller.go Outdated Show resolved Hide resolved
}

// Verify if the ensembler given by its ProjectID and EnsemblerID exist
ensembler, err := ensemblersSvc.FindByID(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Question - can the ensembler be deleted while it is being referenced by a router? Can it be deleted at all (I can't seem to find the delete method in the codebase, for the real or the batch ensemblers). May be @ashwinath has some context too?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nope there doesn't seem to be an API endpoint for deleting ensemblers. Not sure if this was a conscious design choice though.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok.. I'm not sure either. Could you create a card for this (delete for batch / real-time ensemblers) in the backlog, just to track it?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can't remember why there isn't a delete ensembler API for this or maybe I wasn't kept in the loop. Ensemblers I believe was done by Roman where as I did the orchestration. There is a delete for the orchestration part.

api/turing/cluster/servicebuilder/router.go Outdated Show resolved Hide resolved
api/turing/cluster/servicebuilder/router.go Outdated Show resolved Hide resolved
api/turing/models/ensembler.go Outdated Show resolved Hide resolved
@deadlycoconuts deadlycoconuts force-pushed the automate_ensembler_image_building branch from 2dd0eb8 to 8a383c1 Compare March 2, 2022 08:42
@@ -551,7 +551,7 @@ paths:
content:
application/json:
schema:
$ref: '#/components/schemas/inline_response_200'
$ref: '#/components/schemas/IdObject'
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cleaned up all these stuff in the latest update

@@ -199,7 +199,7 @@ func (sb *clusterSvcBuilder) buildRouterEnvs(
{Name: envEnricherTimeout, Value: ver.Enricher.Timeout},
}...)
}
if ver.Ensembler != nil && ver.Ensembler.Type == models.EnsemblerDockerType {
if ver.HasDockerConfig() {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Simplified this to make the intention of the check explicit

experimentConfig json.RawMessage,
eventsCh *EventChannel,
) (string, error) {
var endpoint string

// If pyfunc ensembler is specified as an ensembler service, build/retrieve its image
if pyfuncEnsembler != nil {
err := ds.buildEnsemblerServiceImage(pyfuncEnsembler, project, routerVersion, eventsCh)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

DeploymentService now does the weightlighting for image building.

@@ -551,7 +551,7 @@ paths:
content:
application/json:
schema:
$ref: '#/components/schemas/inline_response_200'
$ref: '#/components/schemas/IdObject_1'
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is automatically generated by the Makefile right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep that's right. I corrected the source openapi specs that was generated with make bundle-openapi

@deadlycoconuts
Copy link
Contributor Author

Thanks a lot for all the comments everyone! I'll rebase this on @leonlnj 's latest changes and it'll be good to go!

@deadlycoconuts deadlycoconuts force-pushed the automate_ensembler_image_building branch from 0e06c66 to f2767fc Compare March 7, 2022 07:22
@deadlycoconuts deadlycoconuts merged commit 44d4782 into caraml-dev:main Mar 7, 2022
@deadlycoconuts deadlycoconuts deleted the automate_ensembler_image_building branch March 8, 2022 02:48
@deadlycoconuts deadlycoconuts self-assigned this Mar 10, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants