Gemini

About

Gemini is an efficient GPU resource sharing system with fine-grained control for Linux platforms.

It shares an NVIDIA GPU among multiple clients under specified resource constraints and works seamlessly with any CUDA-based GPU program. It is also work-conserving and has low overhead, so almost no compute resources are wasted.

Our implementation is based on "Gemini: Enabling Multi-Tenant GPU Sharing Based on Kernel Burst Estimation".

We extend the original Gemini design in a number of ways:

Compatible with Kubeshare
Multi-GPU support. For instance, a client can request multiple half-sized GPUs.

System Structure

Gemini consists of three parts: a scheduler, a pod manager, and a hook library.

scheduler (GPU device manager) (gem-schd): A daemon process that manages tokens. Based on the information in the resource configuration file (resource-config.txt), the scheduler decides which client receives the token. Clients can launch CUDA kernels only while holding a valid token.
hook library (libgemhook.so.1): A library that intercepts CUDA-related function calls. It relies on the LD_PRELOAD mechanism, which forces the hook library to be loaded before any other dynamically linked library.
pod manager (gem-pmgr): A proxy that forwards messages between applications and the scheduler. It acts as a client to the scheduler, and all applications that send requests to the scheduler through the same pod manager share one token.
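The LD_PRELOAD mechanism used by the hook library can be illustrated with a minimal Python sketch. It only shows how a process can be started with a shared library preloaded; the helper names `preload_env` and `run_with_hook` are hypothetical, and the sketch assumes nothing about any other settings the hook library may need in order to reach a pod manager.

```python
import os
import subprocess


def preload_env(hook_path, base_env=None):
    """Return a copy of the environment with hook_path prepended to LD_PRELOAD."""
    env = dict(os.environ if base_env is None else base_env)
    existing = env.get("LD_PRELOAD", "")
    # LD_PRELOAD takes a list of libraries separated by colons or spaces.
    # Prepending keeps any previously configured entries after the hook.
    env["LD_PRELOAD"] = hook_path if not existing else f"{hook_path}:{existing}"
    return env


def run_with_hook(command, hook_path):
    """Run command (a list of arguments) with the hook library preloaded."""
    return subprocess.run(command, env=preload_env(hook_path), check=False)
```

For example, `run_with_hook(["./my_cuda_app"], "/place/to/install/lib/libgemhook.so.1")` would start a hypothetical application with the hook library loaded first. In practice, applications should be started with the launch script described under Run.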

Currently, the components communicate with each other over TCP sockets.
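To make the token idea more concrete, here is a toy simulation of token-based time sharing. The selection policy is an assumption made for illustration only and is not Gemini's actual algorithm: it skips clients that have reached their maximum share and otherwise prefers the client furthest below its minimum share. The names `Client`, `pick_next`, and `simulate`, as well as the fixed window and quota sizes, are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Client:
    name: str
    request: float     # minimum share of GPU time (0..1)
    limit: float       # maximum share of GPU time (0..1)
    used: float = 0.0  # GPU time consumed in the current window


def pick_next(clients, window):
    """Choose which client gets the token next, or None if all are capped.

    Toy policy (an assumption, not Gemini's algorithm): skip clients that
    have reached their limit, then prefer the one furthest below its request.
    """
    eligible = [c for c in clients if c.used / window < c.limit]
    if not eligible:
        return None
    return max(eligible, key=lambda c: c.request - c.used / window)


def simulate(clients, window=100.0, quota=5.0):
    """Hand out fixed-size time quotas until the window is spent."""
    elapsed = 0.0
    while elapsed < window:
        client = pick_next(clients, window)
        if client is None:
            break  # every client has reached its limit
        client.used += quota
        elapsed += quota
    return {c.name: c.used / window for c in clients}
```

Because an uncapped client keeps receiving the token whenever others are at their limit, the toy policy is work-conserving in the same spirit as the system described above: GPU time is left idle only when every client has reached its limit.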

Build

All components can be built with the following command:

make [CUDA_PATH=/path/to/cuda/installation] [PREFIX=/place/to/install] [DEBUG=1]

This command installs the built binaries in $(PREFIX)/bin and $(PREFIX)/lib. The default value of PREFIX is $(pwd)/...

Adding DEBUG=1 to the command above makes the hook library and executables output more scheduling details.

Usage

resource configuration file format

The first line contains an integer N, the number of clients.

The following N lines are of the format:

[ID] [REQUEST] [LIMIT] [GPU_MEM]

ID: name of the client (an ASCII string of fewer than 63 characters). This name identifies the client, so it must be unique.
REQUEST: minimum required ratio of GPU usage time (between 0 and 1).
LIMIT: maximum allowed ratio of GPU usage time (between 0 and 1).
GPU_MEM: maximum allowed GPU memory usage (in bytes).

gem-schd monitors this file for changes. After each change, the scheduler reads the file again and updates its settings. (Note: a client must be restarted for a new memory limit to take effect.)
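The file format described above can be checked with a small Python sketch. It is not the parser used by gem-schd; the names `ClientConfig` and `parse_resource_config`, the client names in the sample, and the decision to skip blank lines are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class ClientConfig:
    client_id: str
    request: float  # minimum ratio of GPU usage time
    limit: float    # maximum ratio of GPU usage time
    gpu_mem: int    # maximum GPU memory usage in bytes


def parse_resource_config(text):
    """Parse the text of a resource configuration file into ClientConfig objects."""
    # Skipping blank lines is this sketch's own leniency.
    lines = [line for line in text.splitlines() if line.strip()]
    if not lines:
        raise ValueError("empty configuration")
    count = int(lines[0])
    if len(lines) - 1 < count:
        raise ValueError(f"expected {count} client lines, found {len(lines) - 1}")
    clients = {}
    for line in lines[1:count + 1]:
        fields = line.split()
        if len(fields) != 4:
            raise ValueError(f"expected 4 fields: {line!r}")
        name = fields[0]
        request, limit, gpu_mem = float(fields[1]), float(fields[2]), int(fields[3])
        if not name.isascii() or len(name) >= 63:
            raise ValueError(f"invalid client name: {name!r}")
        if name in clients:
            raise ValueError(f"duplicate client name: {name!r}")
        if not (0.0 <= request <= 1.0 and 0.0 <= limit <= 1.0):
            raise ValueError(f"REQUEST and LIMIT must be between 0 and 1: {line!r}")
        clients[name] = ClientConfig(name, request, limit, gpu_mem)
    return list(clients.values())


# Sample with two hypothetical clients: 1 GiB and 512 MiB memory limits.
EXAMPLE = """2
client-a 0.5 1.0 1073741824
client-b 0.3 0.6 536870912
"""
```

Calling `parse_resource_config(EXAMPLE)` returns two `ClientConfig` objects, and a file that repeats a client name is rejected with a `ValueError`.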

Run

We provide two Python scripts under tools/: launch-backend.py launches the scheduling system (the scheduler and pod managers), and launch-command.py launches applications.

By default, the scheduler uses port 50051, and pod managers use consecutive ports starting from 50052 (50052, 50053, ...).
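The default port layout can be written down as a short sketch. It assumes one pod manager per client and that ports are handed out in the order the clients are listed; both are assumptions of this example, and `assign_ports` is a hypothetical helper rather than a function from the launch scripts.

```python
SCHEDULER_PORT = 50051  # default scheduler port stated above


def assign_ports(client_names, scheduler_port=SCHEDULER_PORT):
    """Map each client name to a pod manager port, counting up from scheduler_port + 1."""
    return {name: scheduler_port + 1 + i for i, name in enumerate(client_names)}
```

With two hypothetical clients, `assign_ports(["client-a", "client-b"])` yields ports 50052 and 50053.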

For more details, refer to those scripts and source code.

Contributors

jim90247 eee4017 ncy9371 kerwenwwer
