
OWCA - Orchestration-Aware Workload Collocation Agent

https://travis-ci.com/intel/owca.svg?branch=master

This software is pre-production and should not be deployed to production servers.

The goal of the Orchestration-aware Workload Collocation Agent (OWCA) is to reduce interference between collocated tasks and increase task density while ensuring quality of service for high-priority tasks. The chosen approach is to manage resource isolation in real time, so that high-priority jobs meet their Service Level Objectives (SLOs) and best-effort jobs use as many idle resources as possible.

Resource usage can be increased by:

  • collocating best-effort and high-priority tasks to exploit resources that are underutilized by high-priority applications,
  • collocating tasks that do not compete for shared resources on the platform.

docs/overview.png

OWCA abstracts the compute node, workloads, monitoring and resource allocation. An externally provided algorithm is responsible for the resource allocation or anomaly detection logic. OWCA and the algorithm exchange information about current resource usage, isolation actuations or detected anomalies. OWCA stores information about detected anomalies, resource allocation and platform utilization metrics in remote storage such as Kafka.

The diagram below puts OWCA in the context of an example Mesos cluster and monitoring infrastructure:

docs/context.png

See OWCA Architecture 1.5.pdf for further details.


OWCA targets and is tested on CentOS 7.5.

Note: for a full production installation, please follow the detailed installation guide.

# Install required software.
sudo yum install epel-release -y
sudo yum install git python36 -y
python3.6 -m ensurepip --user
python3.6 -m pip install --user pipenv

# Clone the repository & build.
git clone https://github.com/intel/owca
cd owca
pipenv install --dev
pipenv shell
tox

# Run manually (alongside Mesos agent):
sudo dist/owca.pex --config configs/mesos_example.yaml --root

OWCA introduces a simple but extensible mechanism to inject dependencies into classes and build a complete software stack of components. The OWCA main control loop is based on the Runner base class, which implements a single blocking run method. Depending on the Runner class used, OWCA runs in a different execution mode (e.g. detection, allocation).

Example runners:

  • DetectionRunner implements a loop that calls the detect function at regular, configurable intervals. See detection API for details.
  • AllocationRunner (work in progress) implements a loop that calls the allocate function at regular, configurable intervals. See allocation API for details.

Conceptually, a Runner reads the state of the system (both metrics and workloads), passes this information to an external component (an algorithm), logs the algorithm's input and output using an implementation of Storage, and allocates resources if instructed.

The following snippet is an example configuration of a runner:
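The control loop described above can be illustrated with a minimal Python sketch. This is not OWCA's actual code: the class IntervalDetectionRunner, its constructor parameters (read_state, detect, store, interval_s, max_iterations, sleep) and the record layout are all hypothetical names chosen for illustration. Only the idea of a Runner base class with a single blocking run method comes from the text.

```python
import time
from abc import ABC, abstractmethod
from typing import Any, Callable, Dict, List, Optional


class Runner(ABC):
    """Base class exposing a single blocking run method."""

    @abstractmethod
    def run(self) -> None:
        ...


class IntervalDetectionRunner(Runner):
    """Hypothetical runner: read state, call the algorithm, log, sleep, repeat."""

    def __init__(
        self,
        read_state: Callable[[], Dict[str, Any]],
        detect: Callable[[Dict[str, Any]], List[str]],
        store: Callable[[Dict[str, Any]], None],
        interval_s: float = 1.0,
        max_iterations: Optional[int] = None,  # None means loop forever
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        self.read_state = read_state
        self.detect = detect
        self.store = store
        self.interval_s = interval_s
        self.max_iterations = max_iterations
        self.sleep = sleep

    def run(self) -> None:
        iteration = 0
        while self.max_iterations is None or iteration < self.max_iterations:
            state = self.read_state()        # metrics and workloads
            anomalies = self.detect(state)   # external algorithm
            # Log both the algorithm input and its output.
            self.store({"input": state, "output": anomalies})
            iteration += 1
            self.sleep(self.interval_s)


# Usage: three iterations with a toy detector and an in-memory "storage".
records: List[Dict[str, Any]] = []
runner = IntervalDetectionRunner(
    read_state=lambda: {"cpu_usage": 0.97},
    detect=lambda state: ["cpu_saturated"] if state["cpu_usage"] > 0.9 else [],
    store=records.append,
    interval_s=0.0,
    max_iterations=3,
)
runner.run()
```

The max_iterations and sleep parameters exist only to make the sketch easy to test; a real agent would typically loop until it is stopped.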

runner: !SomeRunner
    node: !SomeNode
    callback_component: !ClassImplementingCallback
    storage: !SomeStorage

After OWCA starts with the above configuration, an instance of the class SomeRunner is created. The instance's properties are set as follows:

  • node - to an instance of SomeNode
  • callback_component - to an instance of ClassImplementingCallback
  • storage - to an instance of SomeStorage
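In plain Python terms, the configuration above is roughly equivalent to constructing the objects by hand. The sketch below assumes the placeholder classes are simple dataclasses with no fields of their own; the real components would carry their own parameters.

```python
from dataclasses import dataclass


# Placeholder components, mirroring the names used in the configuration.
@dataclass
class SomeNode:
    pass


@dataclass
class ClassImplementingCallback:
    pass


@dataclass
class SomeStorage:
    pass


@dataclass
class SomeRunner:
    node: SomeNode
    callback_component: ClassImplementingCallback
    storage: SomeStorage


# What the YAML tags conceptually expand to: one instance per tagged entry,
# each injected into the runner under the matching property name.
runner = SomeRunner(
    node=SomeNode(),
    callback_component=ClassImplementingCallback(),
    storage=SomeStorage(),
)
```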

The configuration mechanism allows you to:

  • Create and configure complex Python objects (e.g. DetectionRunner, MesosNode, KafkaStorage) using YAML tags.
  • Inject dependencies (with type checking support) into constructed objects using dataclass annotations.
  • Register external classes using the -r command line argument or the owca.config.register decorator API.
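To make the three capabilities above more concrete, here is a small, standard-library-only sketch of a registry with type-checked injection. It does not reproduce owca.config: the register decorator, the build function and the {"tag": ..., "args": ...} dictionary layout are assumptions made for illustration, standing in for parsed YAML tags. The classes ExampleStorage and ExampleRunner are likewise hypothetical.

```python
from dataclasses import dataclass
from typing import Any, Dict, Type, get_type_hints

_REGISTRY: Dict[str, Type] = {}


def register(cls: Type) -> Type:
    """Hypothetical stand-in for a registration decorator: index the class by name."""
    _REGISTRY[cls.__name__] = cls
    return cls


def build(spec: Dict[str, Any]) -> Any:
    """Build an object from {"tag": ..., "args": {...}}, checking annotated types."""
    cls = _REGISTRY[spec["tag"]]
    hints = get_type_hints(cls)
    kwargs = {}
    for name, value in spec.get("args", {}).items():
        # Nested tagged specs are built first, then injected as dependencies.
        if isinstance(value, dict) and "tag" in value:
            value = build(value)
        expected = hints.get(name)
        # Only plain classes are checked here; generic annotations are skipped.
        if isinstance(expected, type) and not isinstance(value, expected):
            raise TypeError(
                f"{cls.__name__}.{name}: expected {expected.__name__}, "
                f"got {type(value).__name__}"
            )
        kwargs[name] = value
    return cls(**kwargs)


@register
@dataclass
class ExampleStorage:
    topic: str


@register
@dataclass
class ExampleRunner:
    storage: ExampleStorage
    interval: float = 1.0


runner = build({
    "tag": "ExampleRunner",
    "args": {
        "storage": {"tag": "ExampleStorage", "args": {"topic": "anomalies"}},
        "interval": 5.0,
    },
})
```

Passing a value of the wrong type, for example a number for topic, raises TypeError at construction time, so misconfiguration surfaces before the runner starts.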

See external detector example for more details.

The following built-in components are available:

The project contains Dockerfiles and helper scripts for preparing reference workloads to run on a Mesos cluster using the Aurora framework.

To enable validation of anomaly detection algorithms, the workloads are prepared to:

  • provide a continuous stream of Application Performance Metrics using wrappers (all workloads),
  • simulate varying load (patches that generate a sine-like pattern of requests per second are available for YCSB and rpc-perf).

See the workloads directory for the list of supported applications and load generators.
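As an illustration of what a sine-like load pattern means, the sketch below computes a target request rate as a function of time. It is a generic example, not the logic of the YCSB or rpc-perf patches; the function name sine_rps and its parameters are assumptions.

```python
import math


def sine_rps(t_s: float, base_rps: float, amplitude_rps: float, period_s: float) -> float:
    """Target requests per second at time t_s for a sine-shaped load.

    The rate oscillates around base_rps with the given amplitude and period,
    and is clamped at zero so it never becomes negative.
    """
    rps = base_rps + amplitude_rps * math.sin(2 * math.pi * t_s / period_s)
    return max(0.0, rps)


# Sample one 60-second period every 15 seconds.
schedule = [sine_rps(t, base_rps=1000, amplitude_rps=500, period_s=60) for t in range(0, 60, 15)]
```

A load generator could query such a function once per second and throttle its request rate accordingly, giving the detection algorithm a predictable, repeating load to validate against.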
