Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Nop Ensembler Config #192

Merged
merged 7 commits into from
Apr 21, 2022
Merged

Conversation

krithika369
Copy link
Collaborator

@krithika369 krithika369 commented Apr 20, 2022

Background

The default_route_id configuration on the Turing routers has proven to be confusing to some of our users. It also only plays a direct role when the ensembler chosen is of type "none" or "standard" - it acts as the final response and fallback route in these 2 scenarios, respectively. When the custom ensembler (in the form of "docker" or "pyfunc" configuration) is used, this property of the route is passed down to the ensembler but does not play a critical role, since the user is free to set up the ensembling logic in any manner (eg: simply compare the route name instead of the default route property). Thus, in this PR and in the future ones, the concept of the default route will gradually be phased out to the end user (in the SDK / UI).

Note: As of this PR, the Default Route configuration still remains but will be overridden by the Final Response Route chosen for Nop Ensemblers.

Changes

UI

When the user selects "None" ensembler, a dropdown to select the final response route will be displayed.

Screenshot 2022-04-20 at 8 56 06 AM

Screenshot 2022-04-20 at 8 53 21 AM

Screenshot 2022-04-20 at 8 54 18 AM

To facilitate this, the Router and RouterVersion data used in the Turing components will have to always be parsed and constructed from the raw API response data (to copy the default route as the Nop ensembler's final response route). Similarly, when the data is posted to the API server, the reverse must be done (i.e., copy the Nop ensembler's final response route as the router's default route). This is required because the API does not introduce any changes for the configuration of the final response route and we continue to rely on the router's default route to achieve the intended behavior. (See ui/src/services/router/TuringRouter.js.)

Main Changes:

  • ui/src/services/ensembler/NopEnsembler.js - Add Nop Ensembler type
  • ui/src/services/router/TuringRouter.js, ui/src/services/version/RouterVersion.js - Handle the conversion between the Nop ensembler's final response route and the router's default route.
  • ui/src/router/components/form/components/nop_config/NopConfigFormGroup.js - Nop Ensembler form panel

SDK

The same logic described for the UI above is also implemented on the Python SDK, so the user may provide the Nop ensembler configuration like below. If provided, it will override the default route.

    ensembler = NopRouterEnsemblerConfig(
        final_response_route_id="test-route"
    )

Main changes:

  • sdk/turing/router/config/router_ensembler_config.py - Introduction of NopRouterEnsemblerConfig as a child class of RouterEnsemblerConfig
  • sdk/turing/router/config/router_config.py - Handle the conversion between the Nop ensembler's final response route and the router's default route.

More details on the changes are added as comments.

@@ -454,8 +453,8 @@ def generic_router_version(
error="NONE",
image="test.io/just-a-test/turing-router:0.0.0-build.0",
routes=[generic_route for _ in range(2)],
default_route="http://models.internal/default",
default_route_id="control",
default_route=generic_route.endpoint,
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These tests were originally using a default route id that wasn't a part of the routes themselves. Now, the _verify_default_route_exists method added to sdk/turing/router/config/router_config.py will prevent that, so the route must be valid.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Whoops, thanks a lot for adding this so more upfront checks are performed before sending any requests to the API!

useEffect(() => {
fetchEventsOnce();
}, [fetchEventsOnce]);

useEffect(() => {
if (routerStatus === Status.PENDING) {
if (router.status === Status.PENDING) {
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Such changes are required here and in other places because, the router / version objects are now the parsed objects rather than the raw response. So, for example, the status property of the router here is not merely a string, but an Enum object (ref).

const interval = setInterval(fetchRouterDetails, 5000);
return () => clearInterval(interval);
}
}, [router.status, fetchRouterDetails]);

useEffect(() => {
// Parse router details
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Every time the router details change, parse the data for use and store in the local state.

@krithika369 krithika369 changed the title Draft: Nop Ensembler Config Nop Ensembler Config Apr 20, 2022
@krithika369 krithika369 requested a review from a team April 20, 2022 00:50

@dataclass
class RouterEnsemblerConfig:
"""
Class to create a new RouterEnsemblerConfig

:param type: type of the ensembler; must be one of {'standard', 'docker'}
:param type: type of the ensembler; must be one of {'nop', 'standard', 'docker', 'pyfunc'}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for spotting that pyfunc had been left out!

@@ -19,14 +19,14 @@ Check out [samples](./samples) for examples on how to use Turing SDK.

#### Prerequisites

* Python >= 3.8
* Python 3.7.*
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for catching this! (as well as the other non-updated makefile commands)

InvalidRouteException
)
])
def test_set_router_config_with_invalid_default_route(actual, default_route_id, expected, request):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Extreme nit: can we rename default_route_id to something along the lines of invalid_route_id or not_the_default_route_id to make it a little clearer that the route in that variable is actually invalid?

sdk/turing/router/config/router_config.py Show resolved Hide resolved
for route in self.routes:
if route.id == default_route_id:
return
raise turing.router.config.route.InvalidRouteException(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since we're already importing Route via from turing.router.config.route import Route, perhaps we could just import InvalidRouteException and use it here too?

sdk/turing/router/config/router_config.py Outdated Show resolved Hide resolved
@krithika369
Copy link
Collaborator Author

Thanks for the review and the comments, all! I'll be merging this PR once the pipeline succeeds.

@krithika369 krithika369 merged commit 9276cd3 into caraml-dev:main Apr 21, 2022
@krithika369 krithika369 deleted the none_ensembler_config branch April 21, 2022 08:11
krithika369 added a commit that referenced this pull request Jun 6, 2022
* Update workflows to use Python 3.7 for ensembler engines and sdk (#190)

* Update workflows to use Python 3.7 for ensembler engines and sdk

* Add none return option for config in Router SDK class

* Update text display settings to display entire image name

* Set container image name to overflow instead of being truncated

* Include response headers in logs (#191)

* Update proto and classes

* Add response headers and refactor naming of response bodies

* Refactor logging methods

* Fix tests

* Fix kafka tests

* Fix kafka tests

* Fix line breaks

* Fix line breaks

* Fix line breaks

* Fix typo in error message

* Refactor response fields

* Refactor response headers as map of strings

* Refactor tests to use map of strings to represent headers

* Fix non deterministic serialisation of hashmap in tests

* Refactor log handler

* Remove debug statement

* Refactor HTTP header formatting into a helper function

* Fix kafka protobuf test to use JSON as means of comparison

* Rename body in proto to response to avoid breaking changes

* Refactor all tests to use response instead of body to refer to request body

* Fix lint import suggestion

* Remove debug statements

* Rename variables in http header formatting helper function

* Fix BQ marshalling issues

* Fix lint import suggestion

* Fix lint comments

* Minor fixes for experiment engine configs in the helm chart (#193)

Co-authored-by: Krithika Sundararajan <[email protected]>

* Nop Ensembler Config (#192)

* UI changes for nop ensembler config

* Make handling of router / version status consistent

* SDK changes for default route

* Correct the default route id in unit tests

* Add tests for the nop ensembler config

* Update sample code and doc

* Add PR comments

Co-authored-by: Krithika Sundararajan <[email protected]>

* [BUG] Fix undeploy router error (#194)

* Fix bugs in sample SDK script

* Refactor UndeployRouterVersion to take in cleanup flag

* Update DeploymentService mock class

* Fix bug in deployment controller test

* Refactor if else blocks

* Add tests for IsKnativeServiceInNamespace and IsSecretinNamespace

* Rename test method to make it consistent with the method tested

* Refactor k8s deletion methods into separate methods

* Fix bug in deleting deployments and services

* Fix typo in router name

* Rename default route from nothing to control

* Make undeploying a pending router status a cleanup job

* Refactor code to use ignoreNotFound flag

* Fix go mod file

* Bugfix: Turing API should process experiment engine passkey only if client selection enabled (#196)

* Bugfix: Passkey should not be processed if client selection disabled

* Update hardcoded sample plugin to use experiment variables, consistent with the runner

* Update RPC plugin example and docs

* Correct numbering in doc and plugin name change

* Add debug message

* Update Deployment controller to consider if client selection enabled

* Add another unit test case for TestIsClientSelectionEnabled

Co-authored-by: Krithika Sundararajan <[email protected]>

* Add Fallback Response Route Config for Standard Ensemblers (#197)

* UI changes for standard ensembler Fallback response

* Add validation for fallback resonse route id

* Update docs for the Standard Ensembler config

* Routing stragey changes for default route handling

* SDK changes for fallback response route id

* Amend user docs

Co-authored-by: Krithika Sundararajan <[email protected]>

* Remove the default route configuration (#198)

* Remove unused default route property

* Router: Make the default_route_id only required for DefaultTuringRoutingStrategy

* Make Default Route ID optional for the Turing Router creater / update API

* Update e2e tests

* UI: Update router view / edit to stop handling default route explicitly

* UI: Exclude routes with traffic rules in the final/fallback response options

* SDK: deprecate the default_route_id config

* SDK: Remove default route id from samples

* Update user docs

* Update OpenAPI bundle

* Address PR comments

Co-authored-by: Krithika Sundararajan <[email protected]>

* Adding SDK support for Python 3.8 and 3.9 (#199)

* Update SDK / engines to support multiple Python versions

* Pin cloudpickle at 2.0.0

* Introduce Python Version on the Pyfunc ensembler config

* Update SDK unit tests

* Update Github workflows

* Update chart values

* Update docs, unit tests

* Update SDK CI workflows to test on all Python versions

* Pin protobuf version at 3.20.1

* Address PR comments

Co-authored-by: Krithika Sundararajan <[email protected]>

* Bugfix: Clear default route Id from custom ensembler configs (#200)

* Miscellaneous bug fixes

* Add unit tests for the SDK changes, address PR comments

* Bugfix: Regression in display of container configs for Pyfunc

* Add type annotation to class methods

Co-authored-by: Krithika Sundararajan <[email protected]>

* Update chart version for app release (#201)

Co-authored-by: Krithika Sundararajan <[email protected]>

* Add dynamic loading of Experiment Engine config (#202)

* Add dynamic loading of exp engine config

* Address PR comments

* Add useEffect rerender

* Address PR comments

* Simplify conditional logic

* Attempt to fix yarn install error

Co-authored-by: Ewe Zi Yi <[email protected]>
Co-authored-by: Krithika Sundararajan <[email protected]>
Co-authored-by: Terence Lim <[email protected]>
krithika369 added a commit that referenced this pull request Jun 9, 2022
* Upgrade Go version/Knative/Spark Operator/Kubernetes Client/Docker Compose for dev environment. (#183)

* Bump versions for all k8s based libs

* Use proper context for each scenario.

* Upgrade virtual service version.

* Update k3d version to kubernetes 1.22.

* Upgrade Go to 1.16.

* Fix linting errors.

* Update k3d flags.

* Upgrade knative/istio/spark-operator versions in cluster init.

* Update default versions in cluster-init, change e2e test to new api

* Fail fast if default environment is wrong. Extra logging.

* Fix unit tests.

* Addressed PR comments.

* Pull requests to be run on any target branch.

* Upgrade to go 1.18, upgrade linter.

* Upgrade experiment and router to go 1.18.

* Update PR comments.

* Parameterise Go and Go Linter Versions.

* Update documentation with new versions and ports.

* Merge from main -> knative-upgrade branch (#203)

* Update workflows to use Python 3.7 for ensembler engines and sdk (#190)

* Update workflows to use Python 3.7 for ensembler engines and sdk

* Add none return option for config in Router SDK class

* Update text display settings to display entire image name

* Set container image name to overflow instead of being truncated

* Include response headers in logs (#191)

* Update proto and classes

* Add response headers and refactor naming of response bodies

* Refactor logging methods

* Fix tests

* Fix kafka tests

* Fix kafka tests

* Fix line breaks

* Fix line breaks

* Fix line breaks

* Fix typo in error message

* Refactor response fields

* Refactor response headers as map of strings

* Refactor tests to use map of strings to represent headers

* Fix non deterministic serialisation of hashmap in tests

* Refactor log handler

* Remove debug statement

* Refactor HTTP header formatting into a helper function

* Fix kafka protobuf test to use JSON as means of comparison

* Rename body in proto to response to avoid breaking changes

* Refactor all tests to use response instead of body to refer to request body

* Fix lint import suggestion

* Remove debug statements

* Rename variables in http header formatting helper function

* Fix BQ marshalling issues

* Fix lint import suggestion

* Fix lint comments

* Minor fixes for experiment engine configs in the helm chart (#193)

Co-authored-by: Krithika Sundararajan <[email protected]>

* Nop Ensembler Config (#192)

* UI changes for nop ensembler config

* Make handling of router / version status consistent

* SDK changes for default route

* Correct the default route id in unit tests

* Add tests for the nop ensembler config

* Update sample code and doc

* Add PR comments

Co-authored-by: Krithika Sundararajan <[email protected]>

* [BUG] Fix undeploy router error (#194)

* Fix bugs in sample SDK script

* Refactor UndeployRouterVersion to take in cleanup flag

* Update DeploymentService mock class

* Fix bug in deployment controller test

* Refactor if else blocks

* Add tests for IsKnativeServiceInNamespace and IsSecretinNamespace

* Rename test method to make it consistent with the method tested

* Refactor k8s deletion methods into separate methods

* Fix bug in deleting deployments and services

* Fix typo in router name

* Rename default route from nothing to control

* Make undeploying a pending router status a cleanup job

* Refactor code to use ignoreNotFound flag

* Fix go mod file

* Bugfix: Turing API should process experiment engine passkey only if client selection enabled (#196)

* Bugfix: Passkey should not be processed if client selection disabled

* Update hardcoded sample plugin to use experiment variables, consistent with the runner

* Update RPC plugin example and docs

* Correct numbering in doc and plugin name change

* Add debug message

* Update Deployment controller to consider if client selection enabled

* Add another unit test case for TestIsClientSelectionEnabled

Co-authored-by: Krithika Sundararajan <[email protected]>

* Add Fallback Response Route Config for Standard Ensemblers (#197)

* UI changes for standard ensembler Fallback response

* Add validation for fallback resonse route id

* Update docs for the Standard Ensembler config

* Routing stragey changes for default route handling

* SDK changes for fallback response route id

* Amend user docs

Co-authored-by: Krithika Sundararajan <[email protected]>

* Remove the default route configuration (#198)

* Remove unused default route property

* Router: Make the default_route_id only required for DefaultTuringRoutingStrategy

* Make Default Route ID optional for the Turing Router creater / update API

* Update e2e tests

* UI: Update router view / edit to stop handling default route explicitly

* UI: Exclude routes with traffic rules in the final/fallback response options

* SDK: deprecate the default_route_id config

* SDK: Remove default route id from samples

* Update user docs

* Update OpenAPI bundle

* Address PR comments

Co-authored-by: Krithika Sundararajan <[email protected]>

* Adding SDK support for Python 3.8 and 3.9 (#199)

* Update SDK / engines to support multiple Python versions

* Pin cloudpickle at 2.0.0

* Introduce Python Version on the Pyfunc ensembler config

* Update SDK unit tests

* Update Github workflows

* Update chart values

* Update docs, unit tests

* Update SDK CI workflows to test on all Python versions

* Pin protobuf version at 3.20.1

* Address PR comments

Co-authored-by: Krithika Sundararajan <[email protected]>

* Bugfix: Clear default route Id from custom ensembler configs (#200)

* Miscellaneous bug fixes

* Add unit tests for the SDK changes, address PR comments

* Bugfix: Regression in display of container configs for Pyfunc

* Add type annotation to class methods

Co-authored-by: Krithika Sundararajan <[email protected]>

* Update chart version for app release (#201)

Co-authored-by: Krithika Sundararajan <[email protected]>

* Add dynamic loading of Experiment Engine config (#202)

* Add dynamic loading of exp engine config

* Address PR comments

* Add useEffect rerender

* Address PR comments

* Simplify conditional logic

* Attempt to fix yarn install error

Co-authored-by: Ewe Zi Yi <[email protected]>
Co-authored-by: Krithika Sundararajan <[email protected]>
Co-authored-by: Terence Lim <[email protected]>

* Update CI specs

* Revert UI changes during merge

* Update CI specs

* Update e2e deployment timeout

* Remove WIP inline comment

Co-authored-by: Ashwin <[email protected]>
Co-authored-by: Ewe Zi Yi <[email protected]>
Co-authored-by: Krithika Sundararajan <[email protected]>
Co-authored-by: Terence Lim <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants