This repository hosts the Kubernetes Python Operators for KServe (see CharmHub).
Upstream documentation can be found at https://kserve.github.io/website/0.8/
- Kubernetes 1.24
- istio-operators 1.16/stable
- knative-operators 1.8/stable
- Kubernetes cluster
NOTE: If you are using MicroK8s, it is assumed you have run:
microk8s enable dns storage rbac metallb:"10.64.140.43-10.64.140.49,192.168.0.105-192.168.0.111"
- istio-pilot and istio-ingressgateway. See "Deploy dependencies" for deploy instructions.
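Before continuing, it can help to confirm that the command-line tools used throughout this README are installed. The following is a minimal sketch; require_cmds is a hypothetical helper name, not something shipped with this repository.

```shell
#!/bin/sh
# Sketch of a preflight check: report which of the given commands are
# missing from PATH. require_cmds is a hypothetical helper name.
require_cmds() {
    missing=0
    for cmd in "$@"; do
        if ! command -v "$cmd" >/dev/null 2>&1; then
            echo "missing command: $cmd" >&2
            missing=1
        fi
    done
    # Returns 0 when every command was found, 1 otherwise
    return "$missing"
}

# Usage (not executed here):
#   require_cmds juju kubectl || exit 1
```

The function only checks that the binaries exist; it does not verify versions or cluster connectivity.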
MODEL_NAME="kserve"
DEFAULT_GATEWAY="kserve-gateway"
juju add-model ${MODEL_NAME}
kserve-operators require istio-operators to be deployed in the cluster. To correctly configure them, you can:
ISTIO_CHANNEL=1.16/stable
juju deploy istio-pilot --config default-gateway=${DEFAULT_GATEWAY} --channel ${ISTIO_CHANNEL} --trust
juju deploy istio-gateway istio-ingressgateway --config kind="ingress" --channel ${ISTIO_CHANNEL} --trust
juju relate istio-pilot istio-ingressgateway
For serverless operations, kserve-operators depend on knative-serving. To configure it correctly, you can:
NOTE: these instructions assume you have deployed MicroK8s with MetalLB enabled. If your cloud configuration differs, please refer to the knative-operators documentation.
KNATIVE_CHANNEL=1.8/stable
juju deploy knative-operator --channel ${KNATIVE_CHANNEL} --trust
juju deploy knative-serving --config namespace="knative-serving" --config istio.gateway.namespace=${MODEL_NAME} --config istio.gateway.name=${DEFAULT_GATEWAY} --channel ${KNATIVE_CHANNEL} --trust
kserve-operators support RawDeployment mode to manage InferenceService resources. This mode removes the Knative dependency and lifts some of its limitations, for example by allowing multiple volumes to be mounted. Note that RawDeployment mode does not provide serverless capabilities; for those, deploy in Serverless mode.
- Deploy kserve-controller
juju deploy kserve-controller --channel <channel> --trust
- Relate kserve-controller and istio-pilot
juju relate istio-pilot:gateway-info kserve-controller:ingress-gateway
Here, <channel> is one of the available Charmed KServe channels:
- latest/edge
- 0.10/stable
kserve-operators support Serverless mode to manage event-driven InferenceService resources. This mode enables on-demand autoscaling and supports scaling down to zero.
- Deploy kserve-controller
juju deploy kserve-controller --channel <channel> --config deployment-mode="serverless" --trust
- Relate kserve-controller and istio-pilot
juju relate istio-pilot:gateway-info kserve-controller:ingress-gateway
- Relate kserve-controller and knative-serving
juju relate kserve-controller:local-gateway knative-serving:local-gateway
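The deployment steps for both modes can be collected into one small script. This is only a sketch: deploy_kserve and run are hypothetical helper names, and the script assumes the dependencies described earlier are already deployed in the current model. The juju commands themselves are the ones listed in this README. Setting DRY_RUN=1 prints the commands instead of running them.

```shell
#!/bin/sh
# Sketch: deploy kserve-controller in either mode and add its relations.
# deploy_kserve and run are hypothetical helper names.
# Set DRY_RUN=1 to print the juju commands instead of executing them.

run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

deploy_kserve() {
    mode="$1"      # "raw" or "serverless"
    channel="$2"   # for example 0.10/stable
    case "$mode" in
        raw)
            run juju deploy kserve-controller --channel "$channel" --trust || return 1
            ;;
        serverless)
            run juju deploy kserve-controller --channel "$channel" \
                --config deployment-mode="serverless" --trust || return 1
            ;;
        *)
            echo "usage: deploy_kserve raw|serverless <channel>" >&2
            return 2
            ;;
    esac
    # Both modes relate the controller to istio-pilot
    run juju relate istio-pilot:gateway-info kserve-controller:ingress-gateway || return 1
    # Only Serverless mode needs the knative-serving relation
    if [ "$mode" = "serverless" ]; then
        run juju relate kserve-controller:local-gateway knative-serving:local-gateway || return 1
    fi
}

# Usage (not executed here):
#   DRY_RUN=1; deploy_kserve serverless 0.10/stable
```

Running in dry-run mode first makes it easy to review the exact commands before they touch the model.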
To deploy a simple example of an InferenceService, you can use the one provided in examples/.
NOTE: this example is based on First InferenceService
- Create an InferenceService in a testing namespace
USER_NS="kserve-testing"
kubectl create ns ${USER_NS}
kubectl apply -f sklearn-iris.yaml -n${USER_NS}
- Check the InferenceService status
kubectl get inferenceservices sklearn-iris -n${USER_NS}
- Determine the URL for performing inference
- Using the ClusterIP
NOTE: this method can only be used for performing inference within the cluster.
SERVICE_IP=$(kubectl get svc sklearn-iris-predictor-default -n${USER_NS} -ojsonpath='{.spec.clusterIP}')
INFERENCE_URL="${SERVICE_IP}/v1/models/sklearn-iris:predict"
- Using the InferenceService URL
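kubectl get inferenceservice sklearn-iris -n${USER_NS}
# Take the URL from the output, or read it directly from the resource status
URL=$(kubectl get inferenceservice sklearn-iris -n${USER_NS} -ojsonpath='{.status.url}')
INFERENCE_URL="${URL}/v1/models/sklearn-iris:predict"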
- Perform inference
Create a file with the input request:
cat <<EOF > "./iris-input.json"
{
"instances": [
[6.8, 2.8, 4.8, 1.4],
[6.0, 3.4, 4.5, 1.6]
]
}
EOF
Now call the InferenceService:
curl -v $INFERENCE_URL -d @iris-input.json
Expected output:
{"predictions": [1, 1]}
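The steps above can be chained into a single helper that waits for the InferenceService to become ready, resolves its URL, and sends the request. This is a sketch under assumptions: isvc_predict_url and predict are hypothetical function names, the URL is read from the .status.url field of the InferenceService, and the 300s timeout is an arbitrary choice.

```shell
#!/bin/sh
# Sketch: wait for an InferenceService, resolve its URL, send a request.
# isvc_predict_url and predict are hypothetical helper names.

isvc_predict_url() {
    # $1 = InferenceService name, $2 = namespace
    base=$(kubectl get inferenceservice "$1" -n "$2" -o jsonpath='{.status.url}') || return 1
    if [ -z "$base" ]; then
        echo "InferenceService $1 has no URL yet" >&2
        return 1
    fi
    echo "${base}/v1/models/${1}:predict"
}

predict() {
    # $1 = name, $2 = namespace, $3 = JSON request file
    # Block until the Ready condition is true (timeout value is an assumption)
    kubectl wait --for=condition=Ready "inferenceservice/${1}" -n "$2" --timeout=300s >&2 || return 1
    url=$(isvc_predict_url "$1" "$2") || return 1
    curl -sS -H "Content-Type: application/json" -d "@${3}" "$url"
}

# Usage (not executed here):
#   predict sklearn-iris "${USER_NS}" ./iris-input.json
```

Because this uses the InferenceService URL rather than the ClusterIP, it assumes that URL is resolvable and reachable from wherever the script runs.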
Canonical Charmed Kubeflow is a state-of-the-art, fully supported MLOps platform that helps data scientists collaborate on AI innovation on any cloud, from concept to production. It is offered by Canonical, the publisher of Ubuntu.
Charmed Kubeflow is free to use: the solution can be deployed in any environment without constraints, paywall or restricted features. Data labs and MLOps teams only need to train their data scientists and engineers once to work consistently and efficiently on any cloud – or on-premise.
Charmed Kubeflow offers a centralised, browser-based MLOps platform that runs on any conformant Kubernetes, offering enhanced productivity and improved governance while reducing the risks associated with shadow IT.
Learn more about deploying and using Charmed Kubeflow at https://charmed-kubeflow.io.
- Centralised, browser-based data science workspaces: familiar experience
- Multi user: one environment for your whole data science team
- NVIDIA GPU support: accelerate deep learning model training
- Apache Spark integration: empower big data driven model training
- Ideation to production: automate model training & deployment
- AutoML: hyperparameter tuning, architecture search
- Composable: edge deployment configurations available
- LDAP Authentication
- Jupyter Notebooks
- Work with Python and R
- Support for TensorFlow, PyTorch, MXNet, XGBoost
- TFServing, Seldon-Core
- Katib (autoML)
- Apache Spark
- Argo Workflows
- Kubeflow Pipelines
- Maintenance: Charmed Kubeflow offers up to two years of maintenance on select releases
- Optional 24/7 support available, contact us here for more information
- Optional dedicated fully managed service available, contact us here for more information or learn more about Canonical’s Managed Apps service.
- Portability: Charmed Kubeflow can be deployed on any conformant Kubernetes, on any cloud or on-premise
Please see the official docs site for complete documentation of the Charmed Kubeflow distribution.
KServe controller comes with a set of preconfigured images that are used in KServe workloads. The default images are listed in default-custom-images.json.
These images can be overridden through the custom_images option of the charm configuration, defined in charms/kserve-controller/config.yaml. If custom_images is left empty, the default images listed above are used. To use your own images, set one or more entries; the option accepts either YAML or JSON. For example:
juju config kserve-controller custom_images='{"configmap__agent": "custom:1.0", "serving_runtimes__lgbserver": "custom:2.1"}'
These images are used in the .j2 template files under charms/kserve-controller/src/templates/.
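Since custom_images also accepts YAML, the overrides can be kept in a file and passed to juju config. The sketch below is illustrative only: the helper names and the file name are hypothetical, the two keys are the ones from the JSON example, and the image references are placeholders.

```shell
#!/bin/sh
# Sketch: manage custom_images overrides from a YAML file.
# Helper names and the file name are hypothetical; image values are placeholders.

write_custom_images() {
    # $1 = output file
    cat > "$1" <<'EOF'
configmap__agent: "custom:1.0"
serving_runtimes__lgbserver: "custom:2.1"
EOF
}

apply_custom_images() {
    # $1 = YAML (or JSON) file holding the overrides
    juju config kserve-controller custom_images="$(cat "$1")"
}

clear_custom_images() {
    # An empty value makes the charm fall back to the default images
    juju config kserve-controller custom_images=''
}

# Usage (not executed here):
#   write_custom_images ./custom-images.yaml
#   apply_custom_images ./custom-images.yaml
```

Keeping the overrides in a file makes them easier to review and version than a long inline JSON string.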
If you find a bug in our operator or want to request a specific feature, please file a bug here: https://github.com/canonical/dex-auth-operator/issues
Charmed Kubeflow is free software, distributed under the Apache Software License, version 2.0.
Canonical welcomes contributions to Charmed Kubeflow. Please check out our contributor agreement if you're interested in contributing to the distribution.
Security issues in Charmed Kubeflow can be reported through LaunchPad. Please do not file GitHub issues about security issues.