Make KubeCluster more configurable #26
Conversation
This allows us an easy escape hatch for the various configuration items that we will not directly support on the constructor
We didn't have it in some places (like env) and did in others, so let's standardize on *not* having them.
This allows it to bind on any available port
This is the kubernetes convention. Reversed order is docker label convention.
Forgot to git add
I'm concerned that the current approach requires more configuration in the notebook than the average science user will be comfortable with. For example, the following code snippet is probably too much boilerplate for a science user to fully grok:

```python
cluster = KubeCluster(
    loop=loop,
    n_workers=0,
    env={"TEST": "HI"},
    extra_container_config={
        "env": [{"name": "BOO", "value": "FOO"}],
        "args": ["last-item"],
    },
)
```

And yet it might also be necessary for everyone in a certain group (I expect research groups to have similar specs like this). I'm still a fan of optionally placing the entire PodSpec in a yaml file if possible. This conforms to a pre-existing specification (Kubernetes itself), and is easy for a group tech leader to configure and distribute to their group.
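To make the YAML-file idea concrete, a group lead could distribute a pod spec in the standard Kubernetes Pod schema. This is a hypothetical example (the image name and the `EXTRA_PIP_PACKAGES` environment variable are my assumptions, not something this PR defines):

```yaml
kind: Pod
metadata:
  labels:
    app: dask
spec:
  restartPolicy: Never
  containers:
    - name: dask-worker
      image: daskdev/dask:latest
      args: [dask-worker, --nthreads, '2', --memory-limit, 4GB]
      env:
        - name: EXTRA_PIP_PACKAGES
          value: xarray
```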
daskernetes/core.py (Outdated)

```python
        )
        return pod

    def _set_k8s_attribute(self, obj, attribute, value):
```
It looks like the only reference to self is to get at core_api. Perhaps this should be a standalone function that accepts an api object? That might also make it easier to test.
good idea, I'll do that!
I think the serialize function there could be a static or classmethod, I'll file a bug upstream.
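A minimal sketch of what the standalone version might look like. This is illustrative only: the `api_client` parameter stands in for the api object the comment suggests passing, and the real helper would also need to deserialize complex values, which is omitted here:

```python
import re


def set_k8s_attribute(api_client, obj, attribute, value):
    """Set a kubernetes-style camelCase attribute on a python client object.

    The kubernetes python client uses snake_case attribute names, so a
    key like 'imagePullPolicy' must first be converted to
    'image_pull_policy'. The api_client argument would be used to
    deserialize nested object values; that step is omitted in this sketch.
    """
    snake_case = re.sub(r'(?<!^)(?=[A-Z])', '_', attribute).lower()
    setattr(obj, snake_case, value)
```

Being a free function, this can be unit-tested with any plain object, with no cluster or api access required.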
```python
assert pod.spec.containers[0].image_pull_policy == 'IfNotReady'
assert pod.spec.containers[0].security_context == {
    'runAsUser': 0
}
```
I'm very glad to see this working well :)
@rsignell-usgs @rabernat you two are probably early users for this. The work here is central in allowing people to customize the environments of their dask clusters. This is probably the sort of thing you would do to help support your research groups.

We don't expect much of this to look familiar to you today, but it would be good to have your input on the kinds of interactions that you would be comfortable learning.
@mrocklin I don't disagree with you in any form or way :) I think it's only a question of factoring. We can easily build a YAML-based object on top of this. I don't think we should use YAML in the core of the functionality we have here, because I think the UI for interacting with this should be made available in many ways.

IMO the 'right' factoring is to keep the core agnostic to how users who don't want deep configurability might want to use it (since I strongly suspect that's going to be a JupyterLab UI rather than YAML! Everyone I've asked to write YAML eventually hates it, and we get lots of indent-related errors), and then work on the various GUI aspects. I think the next step after this is to figure out both:

Hope that makes sense.
As @yuvipanda suspected, I would like all of them, and likely would use the GUI with a JupyterLab extension the most. But for sure others would prefer the YAML... |
Yeah, sorry, I wasn't trying to suggest that users must write YAML in order to configure their Pods, merely that we're able to pass through PodSpecs as are commonly defined in other established forms (like standard yaml spec, or the
My thought is that this would just be the same as someone would define a PodSpec in a normal Kubernetes file. This way we know that our escape hatch is complete.
@mrocklin right, I think that is a very valid concern. So the extremes of that, as I see them, are:
I think both these extremes are not ideal, and we should strike something of a middle ground, where the most basic setups are done by constructor arguments, hitting 60-80% of use cases, and the rest goes through an escape hatch. This has worked well for us in kubespawner.

Do you think this is a valid framework to think about it? If so, then the question becomes 'where is the line?'. IMO that line is fairly clear, and we're pretty much at it. We might want the ability to add annotations, but that should be it for the constructor. Everything else should go through the 'escape hatch'. You can pass the direct output of yaml.load to the escape hatch now and that should work...

It's good to write this down however, and if you think this makes sense I can document this in the comments of the class.
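The 'escape hatch' behavior described above amounts to a deep merge of a user-supplied dict into the generated configuration. A minimal sketch of that idea follows; the function name and the choice to have lists replace (rather than append to) existing values are my assumptions, not necessarily what this PR implements:

```python
def deep_merge(base, overrides):
    """Recursively merge `overrides` into `base`, mutating `base`.

    Nested dicts are merged key by key; any other value in `overrides`
    (including lists) replaces the corresponding value in `base`.
    """
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(base.get(key), dict):
            deep_merge(base[key], value)
        else:
            base[key] = value
    return base


# A container config as the constructor might generate it, plus an
# escape-hatch override (values here are illustrative):
container = {
    'image': 'daskdev/dask:latest',
    'resources': {'limits': {'cpu': '1'}},
}
extra_container_config = {
    'resources': {'limits': {'memory': '4G'}},
    'args': ['last-item'],
}
deep_merge(container, extra_container_config)
```

Because the override is itself k8s-shaped, the output of yaml.load on a pod-spec fragment can be passed straight through, as the comment above notes.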
I'm trying to spin up on this to the point where I can give some feedback. But kubernetes is a new language to me. I understand that this PR adds options for customization of dask workers. Does it also allow custom conda / pip packages to be installed? If so, how does that fit in to the various options discussed above? |
@rabernat I think it's still a little removed from that. For me, the most useful thing to know would be 'what would your ideal workflow look like?'
@mrocklin upon more reflection, I wonder if:
Is actually not a bad idea at all. We can provide convenience methods on top of it (essentially what we have now in the KubeCluster constructor), but the core can be kept simple. |
Having this as an option would make me happy. Does this stop us from also accepting keywords that layer dictionaries on top of this base spec for particularly common cases? Which keywords to accept becomes a complex question, of course; I'm just curious if it's feasible to have the best of both worlds.
- Have a function, make_pod_spec, that accepts a bunch of stuff and gives back a pod template
- This allows us to add creating pod templates from YAML, JSON, etc. in the future
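A sketch of what such a make_pod_spec helper might look like, building a plain dict in the standard Kubernetes Pod schema. The parameter names and defaults here are illustrative, not the PR's exact signature:

```python
def make_pod_spec(image, labels=None, env=None,
                  threads_per_worker=1, memory_limit='4G'):
    """Build a pod template dict in the standard Kubernetes Pod schema."""
    return {
        'kind': 'Pod',
        'metadata': {'labels': dict(labels or {})},
        'spec': {
            'restartPolicy': 'Never',
            'containers': [{
                'name': 'dask-worker',
                'image': image,
                'args': ['dask-worker',
                         '--nthreads', str(threads_per_worker),
                         '--memory-limit', memory_limit],
                # Turn a simple {name: value} mapping into the k8s env list
                'env': [{'name': k, 'value': v}
                        for k, v in (env or {}).items()],
            }],
        },
    }


pod = make_pod_spec('daskdev/dask:latest',
                    labels={'app': 'dask'},
                    env={'TEST': 'HI'})
```

Returning a plain dict keeps the helper decoupled from the core: the same shape could later be produced from YAML or JSON files, as the bullet above suggests.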
This is useful because not everyone's host has the same kind of python.
@mrocklin ok, I now have the core KubeCluster object only accepting a pod template, and there's a convenience function for generating said template. We can easily add a convenience function for generating it from yaml as well. LMK how this looks!

I got all tests to pass, although now you have to explicitly pass a '--worker-image' to pytest to tell it what image to use (since my local python is very different from the python used in dask upstream images). I think easy ways to improve here would be:
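For reference, the '--worker-image' pytest option mentioned above can be wired up with a conftest.py hook, roughly like this (a sketch; the option name matches the comment, the default image is my assumption):

```python
# conftest.py (sketch)

def pytest_addoption(parser):
    # Registers a custom command-line option so tests can be told
    # which docker image to run worker pods with, e.g.:
    #   pytest --worker-image daskdev/dask:latest
    parser.addoption(
        "--worker-image",
        default="daskdev/dask:latest",
        help="Docker image to use for worker pods in tests",
    )
```

Tests then read the value via request.config.getoption("--worker-image"), typically through a fixture.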
@mrocklin ok, I've added the ability to read pod template info from YAML now. This code could use a little more clarity around naming and whatnot, but hopefully this approach makes both of us happy? :)
I also wonder if I should just move the extra_config merging code to its own function, since then it can be chained with both yaml and non-yaml methods. We could also move the overrides back to KubeCluster constructor. |
daskernetes/core.py (Outdated)

```python
        self.worker_labels['component'] = 'dask-worker'
        worker_pod_template.metadata.labels['dask.pydata.org/cluster-name'] = name
        worker_pod_template.metadata.labels['app'] = 'dask'
        worker_pod_template.metadata.labels['component'] = 'dask-worker'
```
Should we do a copy before mutating the given input?

Also, should we expect a Pod or a PodSpec? Do we want the user to specify an ObjMeta?
Yeah, I think we should accept Pod objects rather than PodSpec, since that's what all the k8s docs generally specify. And having users be able to set annotations / labels is also quite useful.
```python
        self.worker_image = worker_image
        self.worker_labels = (worker_labels or {}).copy()
        self.threads_per_worker = threads_per_worker
        self.env = dict(env)
```
If we want to, we can continue supporting these kinds of common configurations by modifying the template in the _make_pod method. I don't know if this is desired, but it might be useful if there are parameters that we expect users to want to change?
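That suggestion, combined with the earlier point about copying before mutating, could look roughly like this sketch (the function name and the dict-based template shape are my assumptions; the real code works with kubernetes client objects):

```python
import copy


def make_pod(template, env):
    """Sketch: apply common per-cluster configuration to a pod template.

    Deep-copies the template first so the caller's object is never
    mutated, then injects the requested environment variables into the
    first container.
    """
    pod = copy.deepcopy(template)
    container = pod['spec']['containers'][0]
    container.setdefault('env', []).extend(
        {'name': k, 'value': v} for k, v in env.items()
    )
    return pod
```

Keeping the user's template immutable means the same template can safely back several clusters.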
I think I'm going to move the extra_* stuff to the constructor, and just have a 'common configuration' thing as another classmethod.
```
@@ -55,27 +52,29 @@ class KubeCluster(object):
    """
    def __init__(
            self,
            worker_pod_template,
```
It might be nice to accept this as either a kubernetes library object, a dictionary, or a filename, and then normalize all options into a kubernetes object.
I made individual classmethods for these variants instead. I'm generally not a fan of doing polymorphism-style stuff in Python constructors, since I think this is clearer.
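Roughly what that classmethod factoring might look like, as a heavily simplified sketch (the real constructor does far more than store the template, and the method names here are illustrative of the pattern rather than quoted from the PR):

```python
class KubeCluster:
    """Simplified sketch: the constructor takes exactly one canonical form."""

    def __init__(self, pod_template):
        self.pod_template = pod_template

    @classmethod
    def from_dict(cls, spec):
        # Accept a k8s-style JSON/YAML-shaped dictionary.
        return cls(spec)

    @classmethod
    def from_yaml(cls, path):
        # Accept a filename of a YAML pod spec.
        import yaml  # PyYAML; imported lazily so it stays optional
        with open(path) as f:
            return cls(yaml.safe_load(f))
```

Each input form gets a named entry point, so the constructor itself never has to guess what it was given.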
daskernetes/objects.py (Outdated)

```python
        pod_template = apiclient.deserialize(
            _FakeResponse(yaml.safe_load(f)),
            client.V1Pod
        )
```
I wasn't able to make this approach work before. A number of components of the kubernetes object were silently dropped. I had a bunch of hacks on top of this to get my use case to work. Those hacks were pretty fragile. https://github.com/mrocklin/daskernetes/blob/deploy/daskernetes/core.py#L338-L374
This particular problem was my main blocker. I wasn't familiar enough with the kubernetes library to easily find a path out. Most other issues I'm able to handle.

This includes failures in the deserialize method that we saw when deploying pangeo.pydata.org.
Add failing test for make_pod_from_dict
This makes things much clearer - you are either using python client objects and making an object yourself, or using k8s style JSON / YAML specs with a dictionary.
Ah, I found the source of my

`extra_container_config` and `extra_pod_config`, which are dicts that are deep merged into the pod / container configuration