Gain more insight into your scientific workflow management system and optimize workflow execution while minimizing costs.
Explore the project report for more detailed information »
View Demo »
This repository contains the university project "Master Project: Distributed Systems - Monitoring of Scientific Workflows", carried out during the summer term 2022 at the Technical University of Berlin. The goal of the project is to gain practical experience with so-called Scientific Workflow Management Systems (SWMS) and to extend existing ones with additional functionality that adds value. In our subproject, we extend the monitoring capabilities of the SWMS Apache Airflow as follows:
- use the extended Berkeley Packet Filter (eBPF) to collect low-level kernel-space information and process it to gain more insight into Airflow tasks
The project is meant to run in a semi-realistic environment and is therefore set in the Kubernetes ecosystem, which reflects real-world scenarios with large workloads in highly distributed systems.
To keep deployment as simple as possible, we use Kubernetes and Helm to deploy a working prototype to the Google Cloud Platform.
To get a copy up and running, follow these simple steps.
- Install kubectl
- Install Helm
- Install and configure gcloud CLI
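Before running the setup, it can help to confirm that the three tools above are actually on your PATH. The following is a minimal sketch, not part of the repository scripts; the helper name missing_tools is hypothetical and only used for illustration.

```shell
#!/bin/sh
# Hypothetical helper (not part of this repository): print the names of
# the given commands that cannot be found on PATH, separated by spaces.
missing_tools() {
  missing=""
  for tool in "$@"; do
    # command -v exits non-zero when the command is not available.
    if ! command -v "$tool" >/dev/null 2>&1; then
      missing="$missing $tool"
    fi
  done
  # Strip the leading space before printing the result.
  echo "${missing# }"
}

# Check the prerequisites listed above.
missing="$(missing_tools kubectl helm gcloud)"
if [ -n "$missing" ]; then
  echo "Missing prerequisites: $missing" >&2
else
  echo "All prerequisites found."
fi
```

The check only verifies that the binaries exist. It does not verify that the gcloud CLI is configured for a project, so that step still has to be done separately.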
Simply run ./setup.sh in the cloned repository directory.
Run ./deprovision.sh to remove all artifacts and confirm with y.
Once deployed, the important services are forwarded to localhost via kubectl and can be accessed at the following addresses:
- Airflow UI accessible via http://localhost:8080
- Grafana dashboard accessible via http://localhost:3000
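To verify that both forwarded services respond, a small reachability check can be run from the same machine. This is a sketch under the assumption that curl is installed; the function name wait_for_url is hypothetical and the check is not part of the repository scripts. Only the two addresses are taken from the list above.

```shell
#!/bin/sh
# Hypothetical helper: poll a URL until it answers or the attempts run out.
# Usage: wait_for_url URL [ATTEMPTS]   (ATTEMPTS defaults to 10)
wait_for_url() {
  url="$1"
  attempts="${2:-10}"
  i=0
  while [ "$i" -lt "$attempts" ]; do
    # -s: silent, -f: fail on HTTP status >= 400, -o: discard the body,
    # --max-time: give up on a single request after 2 seconds.
    if curl -sf -o /dev/null --max-time 2 "$url"; then
      return 0
    fi
    i=$((i + 1))
    sleep 1
  done
  return 1
}

# Addresses as listed above; two attempts each keeps the check short.
for url in http://localhost:8080 http://localhost:3000; do
  if wait_for_url "$url" 2; then
    echo "reachable:   $url"
  else
    echo "unreachable: $url"
  fi
done
```

If a service shows up as unreachable shortly after deployment, rerunning the check with a higher attempt count may be enough, since the pods can take a while to become ready.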
The login credentials for the Airflow UI and Grafana are by default the following:
- Username: pjds
- Password: pjds
Distributed under the MIT License. See LICENSE for more information.