-
Notifications
You must be signed in to change notification settings - Fork 12
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Investigate / review / setup node selection logic #1970
Comments
Started looking at the code in the toleration injector controller. Will study it some more and post follow up questions here and request comments from Pat and other people |
Studied some alternative methods of node selection. There is one based on nodeSelector field and node labels, another one based on affinity field that allows for more expressive conditions on node labels and also distinguishes between required and preferential conditions, finally there is one based on node taints and corresponding tolerations. |
Pat, could you comment on what types of kubernetes workloads are expected to be openm workloads? How will they be distinguished from other workloads? Do we have a set of labels in mind that will be applied to the pod manifests? And do you prefer a particular node selection method to be used over others (ie toleration injection versus nodeSelector or affinity specified in pod specs)? |
Hi @jacek-dudek, I think there are two workloads to consider:
I think we would need a new openm/microsimulation specific label to target a d64 node pool, is that what you were thinking @Souheil-Yazji ? |
Just at an initial glance, it seems the best approach is to always have the users submit their Open M jobs as a separate workload. This will allow us to build the foundation for MPI jobs in the future, if that ever becomes functional. This would also limit the cost factor for users scaling larger notebooks to run jobs but then idle resources after.
Whether we use a node selector label or taint/toleration isn't very problematic. |
The two scenarios that I can see for users not submitting the jobs as a separate workload are:
|
@vexingly
|
I think notebooks should target intermittent workloads of ~4 CPU and should over provision / expect some slowness for multiple users, non-production runs more development and testing / configuring. When you say big-cpu do you mean the 72 core machines? Is that what we will use for the time being? They may not have enough memory for some users workloads, although the CPU's are sufficient. We will need more nodes for sure, I think each of the 4 projects have a quota of 200 CPU currently. |
The nodepools will be updated in #1967 but we will require some logic for node selection.
Existing logic for notebooks is here: https://github.com/StatCan/aaw-toleration-injector/blob/main/mutate.go
The text was updated successfully, but these errors were encountered: