Automated User Acceptance Tests (UATs) are essential for evaluating the stability of Charmed Kubeflow and for catching issues early. They are intended to be a testing tool both pre-release and post-installation. They combine different components of Charmed Kubeflow in a way that gives us confidence that everything works as expected, and are meant to be used by end-users and developers alike.
Charmed Kubeflow UATs are broken down into test scenarios implemented as Python notebooks, which are
easy to share, understand, and maintain. We provide a standalone test suite included in tests
that users can run directly from inside a Notebook with pytest, as well as a driver that
automates the execution on an existing Kubeflow cluster. More details on running the tests can be
found in the Run the tests section.
Executing the UATs requires a deployed Kubeflow cluster. That said, the deployment and configuration steps are outside the scope of this project. In other words, the automated tests are going to assume programmatic access to a Kubeflow installation. Such a deployment consists (at the very least) of the following pieces:
- A Kubernetes cluster, e.g.
  - MicroK8s
  - Charmed Kubernetes
  - EKS cluster
  - AKS cluster
- Charmed Kubeflow deployed on top of it
- MLflow (optional) deployed alongside Kubeflow
For instructions on deploying and getting started with Charmed Kubeflow, we recommend that you start with this guide.
The UATs include tests that assume MLflow is installed alongside Kubeflow; these tests fail if it is not. For instructions on deploying MLflow you can start with this guide, ignoring the EKS-specific steps.
As mentioned before, when it comes to running the tests, you've got 2 options:
- Running the tests suite directly with pytest inside a Jupyter Notebook
- Running the tests on an existing cluster using the driver along with the provided automation
NOTE: Depending on the version of Charmed Kubeflow you want to test, make sure to check out the appropriate branch with git checkout:
- Charmed Kubeflow 1.8 -> track/1.8
- Charmed Kubeflow 1.7 -> track/1.7
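The version-to-branch mapping in the note above can be captured in a small lookup. This is only an illustrative sketch: the names TRACK_BRANCHES, branch_for and checkout_command are hypothetical and not part of the repository, and only the two versions listed above are covered.

```python
from typing import List

# Hypothetical helper: maps a Charmed Kubeflow version to the UATs branch
# listed in the note above. Only the two documented versions are included.
TRACK_BRANCHES = {
    "1.8": "track/1.8",
    "1.7": "track/1.7",
}


def branch_for(version: str) -> str:
    """Return the branch to check out for a given Charmed Kubeflow version."""
    try:
        return TRACK_BRANCHES[version]
    except KeyError:
        raise ValueError(f"no documented UATs branch for version {version!r}") from None


def checkout_command(version: str) -> List[str]:
    """Build the `git checkout` argv for that branch (it is not executed here)."""
    return ["git", "checkout", branch_for(version)]
```

For example, checkout_command("1.8") yields the argument list for git checkout track/1.8, which could then be passed to subprocess.run from inside the cloned repository.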
- Create a new Notebook using the jupyter-scipy image:
  - Navigate to Advanced options > Configurations
  - Select all available configurations in order for Kubeflow integrations to work as expected
  - Launch the Notebook and wait for it to be created
- Start a new terminal session and clone this repo locally:
  git clone https://github.com/canonical/charmed-kubeflow-uats.git
- Navigate to the tests directory:
  cd charmed-kubeflow-uats/tests
- Follow the instructions of the provided README.md to execute the test suite with pytest
To run the tests, Python 3.8 and Tox must be installed on your system. If your default Python version is higher than 3.8, you can set up Python 3.8 with the following commands:
sudo add-apt-repository ppa:deadsnakes/ppa -y
sudo apt update -y
sudo apt install python3.8 python3.8-distutils python3.8-venv -y
Next, create a virtual environment with Python 3.8 and install Tox:
python3.8 -m venv venv
source venv/bin/activate
pip install tox
Next, clone this repo locally and navigate to the repo directory:
git clone https://github.com/canonical/charmed-kubeflow-uats.git
cd charmed-kubeflow-uats/
Then, in order to run the UATs, there are a couple of options: the -remote and the -local tox environments.
With the -remote environments, tests are fetched from a remote commit of the charmed-kubeflow-uats repository. To determine the commit, the tests use the hash of the HEAD of the local checkout. This means that when you want to run tests from a specific branch, you need to check out that branch and then run the tests. Note that if the locally checked out commit is not pushed to the remote repository, the tests will fail.
# assumes an existing `kubeflow` Juju model
tox -e uats-remote
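Since the -remote environments fail when the checked-out commit has not been pushed, a quick pre-flight check can save a failed run. The sketch below is an assumption about how one might do this with plain git commands; head_is_pushed is a hypothetical helper, not something the repository provides. It relies on remote-tracking branches, so it reflects the state of the last fetch or push.

```python
import subprocess
from typing import Callable, List


def _git(args: List[str]) -> str:
    """Run a git command in the current directory and return its stdout."""
    result = subprocess.run(
        ["git", *args], check=True, capture_output=True, text=True
    )
    return result.stdout


def head_is_pushed(git: Callable[[List[str]], str] = _git) -> bool:
    """Return True if the checked-out commit is contained in a remote-tracking branch."""
    commit = git(["rev-parse", "HEAD"]).strip()
    # `git branch -r --contains <commit>` lists the remote-tracking branches
    # that include the commit; empty output means it only exists locally.
    remote_branches = git(["branch", "-r", "--contains", commit])
    return bool(remote_branches.strip())
```

The git runner is passed in as a parameter so that the check can be exercised without a real repository; called with no arguments from inside the clone, it shells out to git.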
The -local option works only when running the tests from the same node where the tests job is deployed (e.g. running from the same machine where the MicroK8s cluster lives). In this case, the tests job instantiates a volume that is mounted to the local directory of the repository where the tests reside. If unsure about your setup, use the -remote option.
# assumes an existing `kubeflow` Juju model
tox -e uats-local
You can also run a subset of the provided tests using the --filter option and passing a filter that follows the same syntax as the pytest -k option, e.g.
# run any test that doesn't contain 'kserve' in its name
tox -e uats-remote -- --filter "not kserve"
# run all tests containing 'kfp' or 'katib' in their name
tox -e uats-local -- --filter "kfp or katib"
This simulates the behaviour of running pytest -k "some filter" directly on the test suite.
You can read more about the options provided by Pytest in the corresponding section of the
documentation.
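To build intuition for how such a filter selects tests, here is a simplified Python model of the matching. It is a sketch under stated assumptions, not pytest's implementation: it only handles and, or, not, parentheses and case-insensitive substring matches on the test name, whereas pytest's real -k also considers other keywords attached to a test. The function names matches and select are hypothetical.

```python
import re
from typing import List

_OPERATORS = {"and", "or", "not"}


def matches(expression: str, test_name: str) -> bool:
    """Evaluate a simplified `-k`-style expression against one test name."""
    if not expression.strip():
        return True  # an empty filter selects everything
    name = test_name.lower()

    def to_bool(match):
        token = match.group(0)
        if token in _OPERATORS:
            return token
        # Any other token is treated as a case-insensitive substring match.
        return str(token.lower() in name)

    # Tokens are runs of characters other than whitespace and parentheses.
    python_expr = re.sub(r"[^\s()]+", to_bool, expression)
    # Only True/False, and/or/not and parentheses remain at this point.
    return bool(eval(python_expr, {"__builtins__": {}}, {}))


def select(expression: str, test_names: List[str]) -> List[str]:
    """Return the test names that the expression selects, in order."""
    return [name for name in test_names if matches(expression, name)]
```

With the names e2e-wine, katib, kfp_v2, kserve and mlflow-kserve, the filter "not kserve" keeps e2e-wine, katib and kfp_v2, because mlflow-kserve also contains the substring kserve.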
In order to only run the Kubeflow-specific tests (i.e. no MLflow integration) you can use the
dedicated kubeflow tox test environment:
# assumes an existing `kubeflow` Juju model
# run tests from the checked out commit after fetching them remotely
tox -e kubeflow-remote
# run tests from the local copy of the repo
tox -e kubeflow-local
In order to only run the tests that test integration with MLflow, you can use the
dedicated mlflow tox test environment:
# assumes an existing `kubeflow` Juju model
# run tests from the checked out commit after fetching them remotely
tox -e mlflow-remote
# run tests from the local copy of the repo
tox -e mlflow-local
To be able to run the UATs that require KServe (e2e-wine, kserve, mlflow-kserve) behind a proxy, you first need to configure the kserve-controller and knative-serving charms to function behind a proxy.
Note: For information on how to fill out the proxy config values, see the Running using Notebook > Prerequisites section below.
- Set the http-proxy, https-proxy, and no-proxy configs in the kserve-controller charm
juju config kserve-controller http-proxy=<proxy_address>:<proxy_port> https-proxy=<proxy_address>:<proxy_port> no-proxy=<cluster cidr>,<service cluster ip range>,127.0.0.1,localhost,<nodes internal ip(s)>/24,<cluster hostname>,.svc,.local,.kubeflow
- Set the http-proxy, https-proxy, and no-proxy configs in the knative-serving charm
juju config knative-serving http-proxy=<proxy_address>:<proxy_port> https-proxy=<proxy_address>:<proxy_port> no-proxy=<cluster cidr>,<service cluster ip range>,127.0.0.1,localhost,<nodes internal ip(s)>/24,<cluster hostname>,.svc,.local
For example:
juju config kserve-controller http-proxy=http://10.0.13.50:3128/ https-proxy=http://10.0.13.50:3128/ no-proxy=10.1.0.0/16,10.152.183.0/24,127.0.0.1,localhost,10.0.2.0/24,ip-10-0-2-157,.svc,.local,.kubeflow
juju config knative-serving http-proxy=http://10.0.13.50:3128/ https-proxy=http://10.0.13.50:3128/ no-proxy=10.1.0.0/16,10.152.183.0/24,127.0.0.1,localhost,10.0.2.0/24,ip-10-0-2-157,.svc,.local
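The two commands differ only in the charm name and in the trailing .kubeflow entry, so they can be generated from one list of values. The following is a minimal sketch; juju_proxy_config is a hypothetical helper that only builds the argument list and does not call Juju.

```python
import shlex
from typing import List


def juju_proxy_config(charm: str, proxy: str, no_proxy: List[str]) -> List[str]:
    """Build the `juju config` argv that sets the three proxy options on a charm."""
    return [
        "juju", "config", charm,
        f"http-proxy={proxy}",
        f"https-proxy={proxy}",
        f"no-proxy={','.join(no_proxy)}",
    ]


# Values taken from the example above.
PROXY = "http://10.0.13.50:3128/"
BASE_NO_PROXY = [
    "10.1.0.0/16", "10.152.183.0/24", "127.0.0.1", "localhost",
    "10.0.2.0/24", "ip-10-0-2-157", ".svc", ".local",
]

# kserve-controller additionally needs .kubeflow; knative-serving does not.
kserve_cmd = juju_proxy_config("kserve-controller", PROXY, BASE_NO_PROXY + [".kubeflow"])
knative_cmd = juju_proxy_config("knative-serving", PROXY, BASE_NO_PROXY)

print(shlex.join(kserve_cmd))
print(shlex.join(knative_cmd))
```

Keeping the shared entries in one list makes it harder for the two charms to drift apart when the proxy settings change.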
Edit the PodDefault to replace the placeholders for:
- http_proxy and https_proxy - The address and port of your proxy server, in the format <proxy_address>:<proxy_port>
- no_proxy - A comma-separated list of items that should not be proxied. It is recommended to include the following:
  <cluster cidr>,<service cluster ip range>,127.0.0.1,localhost,<nodes internal ip(s)>/24,<cluster hostname>,.svc,.local,.kubeflow
where:
- <cluster cidr>: you can get this value by running:
  cat /var/snap/microk8s/current/args/kube-proxy | grep cluster-cidr
- <service cluster ip range>: you can get this value by running:
  cat /var/snap/microk8s/current/args/kube-apiserver | grep service-cluster-ip-range
- <nodes internal ip(s)>: the internal IP of the nodes where your cluster is running. You can get this value by running:
  microk8s kubectl get nodes -o wide
  It is the INTERNAL-IP value.
- <cluster hostname>: the name of the host on which the cluster is deployed. You can use the hostname command to get it.
- localhost and 127.0.0.1 are recommended to avoid proxying requests to localhost
- .kubeflow is needed in the no-proxy values to allow communication with the minio service.
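Assembling the recommended no_proxy value by hand is error-prone, so a small helper may be useful. The sketch below is hypothetical (build_no_proxy is not part of the repository) and assumes that "<nodes internal ip(s)>/24" means the /24 network containing each node IP, which is how the earlier example (10.0.2.0/24) reads. The node IP 10.0.2.157 in the usage note is likewise an assumption inferred from the example hostname.

```python
import ipaddress
from typing import Iterable


def build_no_proxy(
    cluster_cidr: str,
    service_ip_range: str,
    node_ips: Iterable[str],
    hostname: str,
    include_kubeflow: bool = True,
) -> str:
    """Compose the comma-separated no_proxy value in the recommended order."""
    entries = [cluster_cidr, service_ip_range, "127.0.0.1", "localhost"]
    for ip in node_ips:
        # Widen each node IP to its enclosing /24, e.g. 10.0.2.157 -> 10.0.2.0/24.
        network = str(ipaddress.ip_network(f"{ip}/24", strict=False))
        if network not in entries:  # nodes in the same /24 are listed once
            entries.append(network)
    entries.append(hostname)
    entries += [".svc", ".local"]
    if include_kubeflow:
        # Needed for communication with the minio service.
        entries.append(".kubeflow")
    return ",".join(entries)
```

Calling build_no_proxy("10.1.0.0/16", "10.152.183.0/24", ["10.0.2.157"], "ip-10-0-2-157") reproduces the value used for kserve-controller in the earlier example; passing include_kubeflow=False gives the knative-serving variant.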
To run the tests behind a proxy using a Notebook:
- Log in to the Dashboard and create a Profile
- Apply the PodDefault to your Profile's namespace. Make sure you have already followed the Prerequisites section to modify the PodDefault. Apply it with:
  microk8s kubectl apply -f ./tests/proxy-poddefault.yaml -n <your_namespace>
- Create a Notebook and, from Advanced Options > Configurations, select Add proxy settings, then click Launch to start the Notebook. Wait for the Notebook to be Ready, then Connect to it.
- From inside the Notebook, start a new terminal session and clone this repo:
  git clone https://github.com/canonical/charmed-kubeflow-uats.git
  Open the charmed-kubeflow-uats/tests directory and, for each .ipynb test file there, open it and run the Notebook.
Currently, the following tests are supported to run behind a proxy:
- e2e-wine
- katib
- kfp_v2
- kserve
- mlflow
- mlflow-kserve
- mlflow-minio
- training
You can pass the --proxy flag, together with the proxy values, to the tox command; this should automatically apply the changes required to run behind a proxy.
tox -e kubeflow-<local|remote> -- --proxy http_proxy="http_proxy:port" https_proxy="https_proxy:port" no_proxy="<cluster cidr>,<service cluster ip range>,127.0.0.1,localhost,<nodes internal ip(s)>/24,<cluster hostname>,.svc,.local,.kubeflow"
Any environment that can be used to access and configure the Charmed Kubeflow deployment is
considered a configured management environment. That is, essentially, any machine with kubectl
access to the underlying Kubernetes cluster. This is crucial, since the driver directly depends on
a Kubernetes Job to run the tests. More specifically, the driver
executes the following steps:
- Create a Kubeflow Profile (i.e. test-kubeflow) to run the tests in
- Submit a Kubernetes Job (i.e. test-kubeflow) that runs tests. The Job performs the following:
  - If a -local tox environment is run, then it mounts the local tests directory to a Pod that uses jupyter-scipy as the container image. Else (in -remote tox environments), it creates an emptyDir volume which it syncs to the commit that the repo is currently checked out at locally, using a git-sync initContainer.
  - Install the Python dependencies specified in the requirements.txt
  - Run the test suite by executing pytest
- Wait until the Job completes (regardless of the outcome)
- Collect and report its logs, corresponding to the pytest execution of tests
- Cleanup (remove created Job and Profile)
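The control flow of the steps above can be sketched as follows. This is not the driver's actual code: the Cluster interface and all of its method names are hypothetical stand-ins for whatever Kubernetes client calls the driver makes, and the sketch assumes the Job runs in the namespace named after the Profile. It only illustrates the ordering, and that cleanup happens whatever the outcome.

```python
from typing import Protocol, Tuple


class Cluster(Protocol):
    """Hypothetical interface standing in for the cluster operations used."""

    def create_profile(self, name: str) -> None: ...
    def submit_job(self, name: str, namespace: str) -> None: ...
    def wait_for_job(self, name: str, namespace: str) -> bool: ...
    def job_logs(self, name: str, namespace: str) -> str: ...
    def delete_job(self, name: str, namespace: str) -> None: ...
    def delete_profile(self, name: str) -> None: ...


def run_uats(cluster: Cluster, name: str = "test-kubeflow") -> Tuple[bool, str]:
    """Run the UATs Job and return (succeeded, logs), always cleaning up."""
    cluster.create_profile(name)
    try:
        # Assumption: the Job lives in the namespace named after the Profile.
        cluster.submit_job(name, namespace=name)
        # Block until the Job finishes, whether it succeeded or failed.
        succeeded = cluster.wait_for_job(name, namespace=name)
        # Logs are only fetched once the Job has completed.
        logs = cluster.job_logs(name, namespace=name)
        return succeeded, logs
    finally:
        # Remove the created Job and Profile regardless of the outcome.
        cluster.delete_job(name, namespace=name)
        cluster.delete_profile(name)
```

Placing the cleanup in a finally block means the Job and Profile are removed even if waiting or log collection raises an error.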
With the current implementation we have to wait until the Job completes to fetch its logs. Of
course this makes for a suboptimal UX, since the user might have to wait long before they learn
about the outcome of their tests. Ideally, the Job logs should be streamed directly to the pytest
output, providing real-time insight. This is a known limitation that will be addressed in a future
iteration.