Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Making RemoteKernelManager more independent #803

Closed
golf-player opened this issue Apr 22, 2020 · 11 comments
Closed

Making RemoteKernelManager more independent #803

golf-player opened this issue Apr 22, 2020 · 11 comments

Comments

@golf-player
Copy link

Original thread: jupyter/nbclient#26

Some config-related issues that crop up using RKM (RemoteKernelManager) for one-off executions with nbclient/nbconvert:

  • RKM depends on kernel_id for pod name and connection file. As discussed on the other thread, the concept of kernel_id is only in the context of MultiKernelManager, so maybe there could be an option for oneoffs to make essentially the same thing as a kernel_id, but call it differently to avoid confusion?
  • To make RKM usable without a JEG app or kernel_manager_factory, there's config that needs to be added that is common to JEG app and RemoteMappingKernelManager

@kevin-bates suggested using the kernel provider architecture here, but that seems to not be accepted, and there'd be some work to be done to get RKM to use the kernel provider. I'm gonna read through kernel providers to get some kind of understanding of how this would work.

Pasting in the code from the other thread here.

import os
import nbformat
from nbclient import NotebookClient
from enterprise_gateway.services.kernels.remotemanager import RemoteKernelManager, RemoteMappingKernelManager
from enterprise_gateway.enterprisegatewayapp import EnterpriseGatewayApp

eg = EnterpriseGatewayApp()
rmkm = RemoteMappingKernelManager(parent=eg,
                                  log=eg.log,
                                  connection_dir=eg.runtime_dir,)


class RKM(RemoteKernelManager):
    def __init__(self, **kwargs):
        kernel_id = rmkm.new_kernel_id()
        kwargs.update({
            "parent": rmkm,
            "log": rmkm.log,
            "kernel_name": "python_kubernetes",
            "connection_file": os.path.join(rmkm.connection_dir, "kernel-%s.json" % kernel_id)
        })
        super(RKM, self).__init__(**kwargs)
        self.kernel_id = kernel_id


with open("/home/ish/notebooks/Untitled.ipynb") as fp:
    test_notebook = nbformat.read(fp, as_version=4)

client = NotebookClient(nb=test_notebook, kernel_manager_class=RKM)

For me, ideally this would work by just setting the kernel_manager_class=RemoteKernelManager without the weird config stuff I'm forced to do otherwise.

I'm happy to do the work on this one btw, I'm just not sure how y'all prefer to make this sort of change; Do I just add a bunch of config to RKM and make fallback values for everything that's not automatically present? Since RKM expects to have a traitlets parent chain like JEG App -> RemteMappingKernelManager -> RKM, there's a lot of code accessing parent.parent which I would change to work with RKM's own config, which accepts the values from parent.parent if available.

@kevin-bates
Copy link
Member

@golf-player - thanks for opening this issue. I think this could be pretty cool.

I think it would be good to understand what aspects of configuration are used "under" RMK. I know most of it is global information that can also be overridden at the kernelspec level and, for this use case, it might make sense that we only support the kernelspec-specified options only. This implies we might want to tolerate "null-ness" in some of the lower areas where the global and local definitions are consulted.

We should identify which config items are not locally supported and determine if a tolerance to null is sufficient for those items.

Do you mind taking a look? I'm happy to guide and answer questions along the way.

@golf-player
Copy link
Author

golf-player commented Apr 23, 2020

@kevin-bates thanks for the reply

I'm don't understand what you mean by configuration used "under" RMK. Are you asking about a minimal set of config that fully configures RMK?

Which kernelspec are we talking about specifically? Different processproxies have different kernelspecs right? I'm currently basing everything I'm doing on the python in kubernetes kernelspec. All that needs is a kernel_id and response_address.

As for configuration required to get RMK to run, I'm able to run kernels with the following config object passed to RKM as parent:

config = Configurable()
config.parent = Configurable()
config.parent.port_range = "0..0"
config.parent.unauthorized_users = ["root"]
config.parent.authorized_users = ["ish"]
config.parent.impersonation_enabled = True
config.parent.max_kernels_per_user = -1

kernel_id had to be set to something (necessary to get things to run right, but it's not really a kernel_id; more of an identifier for kube pod name uniqueness and connection file.

It needs connection_file passed to it, which uses EG's runtime_dir config indirectly through the MappingKernelManager.

Logs are a good idea as well, but aren't necessary.

I would assume the first step to making it more independent would be to make it agnostic of the call hierarchy via the traitlet parent field ie: remove all references to parent.parent (which actually live in the processproxy code.

Q: What do you mean when you ask which config items are not locally supported?

@kevin-bates
Copy link
Member

Hi @golf-player - thanks for the great response.

Yes, this is what I'm driving at. Many of these config items can actually appear in the process-proxy stanza of the kernelspec and that's what I mean by locally supported.

For configuration items that cannot be supported locally, there's isn't really any "carrier" for those items. Of the items you list, all but impersonation_enabled and max_kernels_per_user are not available at the kernelspec (process-proxy) level. Of these, max_kernels_per_user doesn't apply in this case, and I believe impersonation_enabled is primarily used to ensure the user running the gateway server is not allowed to start kernels since impersonation requires the gateway server run as root authority. However, this value ultimately gets set into the kernel's env and this could also be done via the env stanza of the kernel.json file using EG_IMPERSONATION_ENABLED.

Looking through the various references to self.kernel_manager.parent and self.kernel_manager.parent.parent, I'm inclined to make each of the configuration references a property on RemoteKernelManager. Then, the reference in the code would change from this to something like this...

self.unauthorized_users = self.kernel_manager.unauthorized_users

where RMK is extended with something like this (and as appropriate for each item)...

    @property
    def unauthorized_users(self):
        if self.parent and self.parent.parent:
            unauthorized_users = self.parent.parent.authorized_users
        else:  # external mode
            unauthorized_users = os.getenv("RMK_UNAUTHORIZED_USERS", 'root').split(',')
        return unauthorized_users

This would alleviate all the null checks throughout.

How does that sound?

I think a bit more will be necessary for the "parent" references that result in access to a class like this:

    current_kernel_count = self.kernel_manager.parent.parent.kernel_session_manager.active_sessions(username)

will require a check of the property first...

    if self.kernel_manager.kernel_session_manager:
        current_kernel_count = self.kernel_manager.kernel_session_manager.active_sessions(username)

Regarding kernel-id, I think we need to do something similar to what you have above...

  • pass the result of new_kernel_id() as a kwarg ('kernel_id') into the RMK initializer.
  • update the initializer to use the optional kwarg.
  • update the base proxy initializer to tolerate self.kernel_manager.kernel_id's pre-existence. (I hate that we have to pull the kernel_id from the connection file!). With this, you shouldn't need to create the connection filename.
  • Not sure what else...

Sidebar:
Impersonation is typically used in Spark environments where kernels run in the same space (non-containerized) so that better isolation can occur. Are you successfully using impersonation on k8s/Spark?

@golf-player
Copy link
Author

golf-player commented Apr 24, 2020

I didn't know about the kernelspec proxy stanza supporting config, so that's great.
For most of the rest of it, it's in line with what I was thinking. I'm curious though why it makes more sense to extend RKM with @property vs using traitlets
Really appreciate the deep breakdown.

I'll be working on an MR tonight.

Outstanding question is how, in the case of a RKM instance being instantiated from nbclient, does the kernel_id get passed? Maybe if the kernel_id isn't passed or set, it's lazily generated or something? But that would require kernel_id generation code to reside in the RKM (but to me this isn't a big deal as it's just using uuid)

We're setting impersonation off; not sure why it's true above lol

@golf-player
Copy link
Author

Hmm another thing I forgot is that the "locally supported" config items are required by RKM, not by the kernel, so at the point they're required, the kernel config isn't in the picture yet. I could probably read it from the kernelspec manager, but that doesn't seem like a good way here.

@golf-player
Copy link
Author

#810

@kevin-bates
Copy link
Member

I could probably read it from the kernelspec manager, but that doesn't seem like a good way here.

This seems like an issue. The location of kernelspecs is also something the kernelspec manager determines and then you'd also need to load the file, reference the resources - things the kernelspec manager does. This is also where the command to launch the kernel comes from and is used by jupyter_client code.

@kevin-bates
Copy link
Member

It just occurred to me, after re-reading the opening description, that what is passed to nbclient is a class, not an instance of a class. This is worse than I thought since, as you show, the constructor is accessing other globally defined instances, etc.

I would not be opposed to defining a wrapper class that contains an instance of RemoteKernelManager and have this wrapper class's constructor instantiate a kernelspec manager, create the kernel_id, etc. then return the actual RemoteKernelManager instance - or define methods that forward to the contained instance. This way, that class could be passed to nbclient and its construtor would replace the need for the global instance. (I suspect you're already thinking this way and I'm just catching up. 😄)

I think the work to convert to properties or whatever at the RemoteKernelManager level would still be useful in this approach, although you could seed the parent attributes of the contained RemoteKernelManager instance in the wrapping class's constructor.

@golf-player
Copy link
Author

This seems like an issue.
I was wrong about this. It seems to be doing what you expect it to be doing and I misunderstood. I'll need to modify my code to get it to use max_kernels_per_user and the other one from the config if available.

the constructor is accessing other globally defined instances, etc.
I could have just as easily created the instances of JEG app, and the mappingkernelmanager inside the constructor. But in general, I don't want to have to make an instance of the JEG app when all I want is the RKM; same for the mapping kernelmanager.

I think using parent makes sense when there's some guarantees about what parent is. Any RKM is guaranteed to have either no parent or an RMKM-like object, right? In this case, there were problems because the code assumed a chain of JEG->RMKM->RKM. If we're making RKM independent, in this case, I think it should have 2 sets of behaviour: 1, when it's attached to an RMKM, and 2, when it's not (I mention something like this here: #810 (comment))

I would not be opposed to defining a wrapper class
Would this wrapper live in EG?

@golf-player
Copy link
Author

OK I messed up a bit. Now I get what you were worried about with traitlets. I completely misunderstood how that worked. I'm looking deeper to see if I can get the mixin to work, but if not, I'll have to change that to a bunch of @propertys.

Sorry, but can you hold off on code review for now?

@kevin-bates
Copy link
Member

Yeah, no worries at all.

golf-player pushed a commit to golf-player/enterprise_gateway that referenced this issue Apr 28, 2020
Make RemoteKernelManager (RKM) able to be run without a (grand)parent
EnterpriseGatewayApp (EGA) instance so it can be used by nbclient.

The config which RKM with EGA was being referred to using traitlet
lineage like `self.parent.parent.property_name`. Change both to inherit
from a Configurable mixin which contains all of EGA's previous config.
Link the attributes of the RKM instance with EGA if available to keep
old behaviour.

Modify other properties RKM uses that are not traits to become
`@property`s which fall back to sane defaults if running independently.

Change RKM to be able to generate a kernel id if necessary. Change
ProcessProxy to use provided kernel ids. Move kernel id generation
logic out of RemoteMappingKernelManager.

Resolves jupyter-server#803
golf-player pushed a commit to golf-player/enterprise_gateway that referenced this issue May 20, 2020
Make RemoteKernelManager (RKM) able to be run without a (grand)parent
EnterpriseGatewayApp (EGA) instance so it can be used by nbclient.

The config which RKM with EGA was being referred to using traitlet
lineage like `self.parent.parent.property_name`. Change both to inherit
from a Configurable mixin which contains all of EGA's previous config.
Link the attributes of the RKM instance with EGA if available to keep
old behaviour.

Modify other properties RKM uses that are not traits to become
`@property`s which fall back to sane defaults if running independently.

Change RKM to be able to generate a kernel id if necessary. Change
ProcessProxy to use provided kernel ids. Move kernel id generation
logic out of RemoteMappingKernelManager.

Resolves jupyter-server#803
golf-player pushed a commit to golf-player/enterprise_gateway that referenced this issue May 20, 2020
Make RemoteKernelManager (RKM) able to be run without a (grand)parent
EnterpriseGatewayApp (EGA) instance so it can be used by nbclient.

The config which RKM with EGA was being referred to using traitlet
lineage like `self.parent.parent.property_name`. Change both to inherit
from a Configurable mixin which contains all of EGA's previous config.
Link the attributes of the RKM instance with EGA if available to keep
old behaviour.

Modify other properties RKM uses that are not traits to become
`@property`s which fall back to sane defaults if running independently.

Change RKM to be able to generate a kernel id if necessary. Change
ProcessProxy to use provided kernel ids. Move kernel id generation
logic out of RemoteMappingKernelManager.

Resolves jupyter-server#803
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants