
How to guide on Dask-Nebari config #79

Closed · wants to merge 3 commits

Conversation

sayantikabanik
Contributor

resolves #68

This PR is a work in progress. I am currently referring to this guide to pick bits and pieces.

@netlify
Copy link

netlify bot commented Jun 30, 2022

Deploy Preview for nebari-docs ready!

🔨 Latest commit: 1b7704b
🔍 Latest deploy log: https://app.netlify.com/sites/nebari-docs/deploys/6321d975bbe0b30009e09d42
😎 Deploy Preview: https://deploy-preview-79--nebari-docs.netlify.app

@sayantikabanik
Contributor Author

@HarshCasper @iameskild feel free to continue on this PR or close it.

@iameskild
Member

@HarshCasper could you look at integrating these docs into this how-to guide?

@netlify

netlify bot commented Jul 19, 2022

Deploy Preview for nebari-dev ready!

🔨 Latest commit: 40db948
🔍 Latest deploy log: https://app.netlify.com/sites/nebari-dev/deploys/62d6d025aa46d80009cc7abd
😎 Deploy Preview: https://deploy-preview-79--nebari-dev.netlify.app

@trallard trallard requested a review from kcpevey July 26, 2022 14:15
@trallard trallard requested a review from iameskild July 28, 2022 15:06
@kcpevey kcpevey marked this pull request as ready for review September 8, 2022 13:15
@kcpevey kcpevey changed the title from "WIP- How to guide on Dask-Nebari config" to "How to guide on Dask-Nebari config" Sep 8, 2022
Comment on lines +8 to +10
In this tutorial we will dive into the `Nebari-Dask` configuration details. Nebari config is essentially
a `yaml` file which is at the heart of all things (most of them) related to configurations.
Our main focus in this tutorial will be the `profiles` & `dask_worker` section of the config file.

Suggested change
In this tutorial we will dive into the `Nebari-Dask` configuration details. Nebari config is essentially
a `yaml` file which is at the heart of all things (most of them) related to configurations.
Our main focus in this tutorial will be the `profiles` & `dask_worker` section of the config file.
In this tutorial we will dive into the configuration requirements for running Dask on Nebari. The Nebari config (`qhub_config.yml`) is at the heart of all things (most of them) related to configurations.
Our main focus in this tutorial will be the `profiles` & `dask_worker` section of the config file.
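For orientation, here is a minimal sketch of where those two sections live in the config file. This assumes the QHub-era schema that `qhub_config.yml` follows; the profile names and values are purely illustrative and should be checked against your deployed version:

```yaml
profiles:
  jupyterlab:                 # interactive JupyterLab server profiles
    - display_name: Small Instance
      description: 1 CPU / 4 GB RAM
      kubespawner_override:
        cpu_limit: 1
        mem_limit: 4G
        mem_guarantee: 2.5G
  dask_worker:                # named Dask worker/scheduler profiles
    "Small Worker":
      worker_cores: 1
      worker_memory: 2G
```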

Comment on lines +14 to +21
Before we dive deeper in configuration details, let's understand about how are the core configuration
components.

### Core components:

- Dask-gateway
- dask workers
- Dask scheduler

Suggested change
Before we dive deeper in configuration details, let's understand about how are the core configuration
components.
### Core components:
- Dask-gateway
- dask workers
- Dask scheduler
There are three core configuration components necessary for setting up Dask on Nebari.
These are:
- [Dask-gateway](https://gateway.dask.org/): provides a secure, multi-tenant server for managing [Dask](https://dask.org/) clusters
- [Dask workers](https://distributed.dask.org/en/stable/worker.html): compute tasks as directed by the scheduler and store/serve results
- [Dask scheduler](https://docs.dask.org/en/stable/scheduler-overview.html): executes the task graph by coordinating task distribution
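
Dask-Gateway itself is deployed as part of Nebari, so of these three components only the workers and the scheduler are sized through the config file. As a sketch of how they pair up under a `dask_worker` profile (the profile name and values below are illustrative):

```yaml
dask_worker:
  "Example Worker":
    worker_cores_limit: 2       # hard CPU cap for each worker pod
    worker_cores: 1             # CPU request Kubernetes uses to place the pod
    worker_memory_limit: 4G     # exceeding this gets the worker process killed
    worker_memory: 2G           # memory request (guarantee)
    scheduler_cores_limit: 2    # the scheduler pod follows the same pattern
    scheduler_cores: 1
    scheduler_memory_limit: 4G
    scheduler_memory: 2G
```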

### Basic infrastructure details

Suggested change
### Basic infrastructure details
## Basic infrastructure details

### How to configure dask gateway profiles?

Suggested change
### How to configure dask gateway profiles?
## Configuring Dask Gateway profiles

    scheduler_cores: 6
    scheduler_memory_limit: 30G
    scheduler_memory: 28G
```

Suggested change
```

Next Steps

Now that you have your Nebari instance all set up to run Dask, check out our user guide on Working with Big Data using Dask!


Just guessing at that link to the page; you should double-check that it works.


- `limit` is the absolute max memory a given pod can consume. Suppose a process within the pod consumes more than the `limit` memory. In that case, the Linux OS will kill the process. `limit` is not used for scheduling purposes with Kubernetes.

- `guarantee`: is the amount of memory the Kubernetes scheduler uses to place a given pod. In general, the `guarantee` will be less than the limit. Often the node itself has less available memory than the node specification. See this [guide from digital ocean](https://docs.digitalocean.com/products/kubernetes/#allocatable-memory), which generally applies to other clouds.

Suggested change
- `guarantee`: is the amount of memory the Kubernetes scheduler uses to place a given pod. In general, the `guarantee` will be less than the limit. Often the node itself has less available memory than the node specification. See this [guide from digital ocean](https://docs.digitalocean.com/products/kubernetes/#allocatable-memory), which generally applies to other clouds.
- `guarantee` is the amount of memory the Kubernetes scheduler uses to place a given pod. In general, the `guarantee` will be less than the limit. Often the node itself has less available memory than the node specification. You may want to check out this [guide from DigitalOcean](https://docs.digitalocean.com/products/kubernetes/#allocatable-memory), which also generally applies to other clouds.
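
To make the limit/guarantee distinction concrete, here is a hypothetical `jupyterlab` profile using the same `kubespawner_override` keys that appear elsewhere in this guide (the name and values are illustrative):

```yaml
profiles:
  jupyterlab:
    - display_name: Medium Instance
      description: 2 CPU / 8 GB RAM
      kubespawner_override:
        cpu_limit: 2          # hard cap enforced by the kernel's cgroup controls
        cpu_guarantee: 1.5    # request the Kubernetes scheduler uses for placement
        mem_limit: 8G         # the process is killed if it exceeds this
        mem_guarantee: 6G     # must fit within the node's allocatable memory
```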

        mem_guarantee: 4G
```

### How to configure dask scheduler?

Suggested change
### How to configure dask scheduler?
## Configuring the Dask Scheduler


In a few instances, the dask worker node-group might be running on quite a large instance, perhaps with 8 CPUs and 32 GB of memory (or more). In this case, you might also want to increase the resource levels associated with the dask scheduler.

Suggested change
In a few instances, the dask worker node-group might be running on quite a large instance, perhaps with 8 CPUs and 32 GB of memory (or more). In this case, you might also want to increase the resource levels associated with the dask scheduler.
For analyses requiring heavy compute, there may be some situations where the Dask worker node-group might be running on quite a large cluster, perhaps with 8 CPUs and 32 GB of memory (or more). In this case, you might also want to increase the resource levels associated with the Dask scheduler.

In a few instances, the dask worker node-group might be running on quite a large instance, perhaps with 8 CPUs and 32 GB of memory (or more). In this case, you might also want to increase the resource levels associated with the dask scheduler.


Suggested change
The following is an example of a Dask worker configuration with the Scheduler resources modified.

Comment on lines +72 to +81
dask_worker:
  "Huge Worker":
    worker_cores_limit: 7
    worker_cores: 6
    worker_memory_limit: 30G
    worker_memory: 28G
    scheduler_cores_limit: 7
    scheduler_cores: 6
    scheduler_memory_limit: 30G
    scheduler_memory: 28G

Is this a `jupyterlab` profile?

@kcpevey kcpevey added the "needs: changes 🧱 Review completed - some changes are needed before merging" label and removed the "needs: review 👀" label Sep 14, 2022
@kcpevey kcpevey assigned iameskild and unassigned HarshCasper Oct 18, 2022
@pavithraes pavithraes self-assigned this Dec 22, 2022
@pavithraes
Member

The content in this PR is covered in #115. :)

@pavithraes pavithraes closed this Jan 20, 2023
@kcpevey kcpevey deleted the how_to_guide branch August 4, 2023 17:01
Labels
area: documentation 📖 · content: doc/how-to Diataxis - how to · needs: changes 🧱 Review completed - some changes are needed before merging
Development

Successfully merging this pull request may close these issues.

[DOC] How To configure Dask on Nebari
6 participants