How to guide on Dask-Nebari config #79
@HarshCasper @iameskild feel free to continue on this PR or close it.
@HarshCasper could you look at integrating these docs into this how-to guide?
In this tutorial we will dive into the `Nebari-Dask` configuration details. Nebari config is essentially
a `yaml` file which is at the heart of all things (most of them) related to configurations.
Our main focus in this tutorial will be the `profiles` & `dask_worker` section of the config file.
In this tutorial we will dive into the configuration requirements for running Dask on Nebari. The Nebari config (`qhub_config.yml`) is at the heart of most things related to configuration.
Our main focus in this tutorial will be the `profiles` & `dask_worker` sections of the config file.
Before we dive deeper in configuration details, let's understand about how are the core configuration
components.

### Core components:

- Dask-gateway
- dask workers
- Dask scheduler
There are three core configuration components necessary for setting up Dask on Nebari.
These are:
- [Dask-gateway](https://gateway.dask.org/): provides a secure, multi-tenant server for managing [Dask](https://dask.org/) clusters
- [Dask workers](https://distributed.dask.org/en/stable/worker.html): compute tasks as directed by the scheduler and store/serve results
- [Dask scheduler](https://docs.dask.org/en/stable/scheduler-overview.html): executes the task graph by coordinating task distribution
a `yaml` file which is at the heart of all things (most of them) related to configurations.
Our main focus in this tutorial will be the `profiles` & `dask_worker` section of the config file.

### Basic infrastructure details
## Basic infrastructure details
- dask workers
- Dask scheduler

### How to configure dask gateway profiles?
## Configuring Dask Gateway profiles
```yaml
scheduler_cores: 6
scheduler_memory_limit: 30G
scheduler_memory: 28G
```
Next Steps
Now that you have your Nebari instance all set up to run Dask, check out our user guide on Working with Big Data using Dask!
I'm only guessing at the link to that page, so please double-check that it works.
- `limit` is the absolute maximum memory a given pod can consume. If a process within the pod consumes more memory than the `limit`, the Linux OS will kill the process. `limit` is not used for scheduling purposes with Kubernetes.

- `guarantee`: is the amount of memory the Kubernetes scheduler uses to place a given pod. In general, the `guarantee` will be less than the limit. Often the node itself has less available memory than the node specification. See this [guide from DigitalOcean](https://docs.digitalocean.com/products/kubernetes/#allocatable-memory), which generally applies to other clouds.
- `guarantee` is the amount of memory the Kubernetes scheduler uses to place a given pod. In general, the `guarantee` will be less than the `limit`. Often the node itself has less available memory than the node specification. You may want to check out this [guide from DigitalOcean](https://docs.digitalocean.com/products/kubernetes/#allocatable-memory), which also generally applies to other clouds.
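For readers of the guide, the relationship between `guarantee` and `limit` described here could be illustrated with a small sanity check. This is only a sketch: `parse_memory` and `check_memory` are hypothetical helper names, `mem_limit` is assumed by analogy with the `mem_guarantee` key shown in the config, the `8G` value is made up for illustration, and the suffixes are treated as binary multiples, which is my understanding of how these values are read and worth verifying.

```python
# Hypothetical helpers, not part of Nebari. Assumption: K/M/G/T suffixes are
# binary multiples (1G = 1024**3 bytes); verify against your deployment.
_UNITS = {"K": 1024, "M": 1024**2, "G": 1024**3, "T": 1024**4}


def parse_memory(value: str) -> int:
    """Convert a string such as "4G" into a number of bytes."""
    value = value.strip()
    suffix = value[-1].upper()
    if suffix in _UNITS:
        return int(float(value[:-1]) * _UNITS[suffix])
    return int(value)  # no suffix: treat the value as a plain byte count


def check_memory(mem_guarantee: str, mem_limit: str) -> bool:
    """Return True when the scheduling guarantee does not exceed the hard limit."""
    return parse_memory(mem_guarantee) <= parse_memory(mem_limit)


# "4G" is the mem_guarantee from the config; "8G" is an illustrative limit.
print(check_memory("4G", "8G"))
print(check_memory("8G", "4G"))
```

A check like this only catches inconsistent pairs. It does not account for the memory the node reserves for itself, which is the point the DigitalOcean guide makes.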
```yaml
mem_guarantee: 4G
```

### How to configure dask scheduler?
## Configuring the Dask Scheduler
### How to configure dask scheduler?

In a few instances, the dask worker node-group might be running on quite a large instance, perhaps with 8 CPUs and 32 GB of memory (or more). In this case, you might also want to increase the resource levels associated with the dask scheduler.
For analyses requiring heavy compute, the Dask worker node-group might be running on quite a large cluster, perhaps with 8 CPUs and 32 GB of memory (or more). In this case, you might also want to increase the resource levels associated with the Dask scheduler.
The following is an example of a Dask worker configuration with the Scheduler resources modified. |
dask_worker:
  "Huge Worker":
    worker_cores_limit: 7
    worker_cores: 6
    worker_memory_limit: 30G
    worker_memory: 28G
    scheduler_cores_limit: 7
    scheduler_cores: 6
    scheduler_memory_limit: 30G
    scheduler_memory: 28G
Is this a jupyterlab profile?
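In case it helps the guide, here is a rough sketch of how the "Huge Worker" values could be checked against the 8 CPU / 32 GB instance mentioned in the text. It assumes, as the key names suggest, that `*_cores` and `*_memory` are the requested amounts and `*_cores_limit` and `*_memory_limit` are the hard caps. `gigabytes` and `profile_problems` are hypothetical names, and the node size is passed in by hand rather than read from any real API.

```python
# Values copied from the "Huge Worker" profile quoted in this thread.
huge_worker = {
    "worker_cores_limit": 7,
    "worker_cores": 6,
    "worker_memory_limit": "30G",
    "worker_memory": "28G",
    "scheduler_cores_limit": 7,
    "scheduler_cores": 6,
    "scheduler_memory_limit": "30G",
    "scheduler_memory": "28G",
}


def gigabytes(value: str) -> float:
    """Read a value such as "28G"; only the G suffix is handled in this sketch."""
    if not value.upper().endswith("G"):
        raise ValueError(f"unsupported memory value: {value!r}")
    return float(value[:-1])


def profile_problems(profile: dict, node_cpus: int, node_memory_gb: float) -> list:
    """List human-readable problems for the worker and scheduler settings."""
    problems = []
    for role in ("worker", "scheduler"):
        cores = profile[f"{role}_cores"]
        cores_limit = profile[f"{role}_cores_limit"]
        memory = gigabytes(profile[f"{role}_memory"])
        memory_limit = gigabytes(profile[f"{role}_memory_limit"])
        if cores > cores_limit:
            problems.append(f"{role}: cores exceed cores_limit")
        if memory > memory_limit:
            problems.append(f"{role}: memory exceeds memory_limit")
        if cores_limit > node_cpus or memory_limit > node_memory_gb:
            problems.append(f"{role}: limits exceed the node size")
    return problems


# Node size taken from the "8 CPUs and 32 GB" example in the guide text.
print(profile_problems(huge_worker, node_cpus=8, node_memory_gb=32))
```

The check uses the nominal node size, so it is optimistic: the allocatable memory on a real node is lower than its specification, as noted in the `guarantee` discussion.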
The content in this PR is covered in #115. :)
resolves #68
This PR is a work in progress. I am currently referring to this guide to pick bits and pieces.