
[PROPOSAL] Parameterized Kernel Launch #38

Closed
kevin-bates opened this issue Dec 3, 2019 · 4 comments

kevin-bates commented Dec 3, 2019

This proposal formalizes the changes that introduced launch parameters by defining kernel launch parameter metadata and how it is to be returned from kernel providers and interpreted by client applications. This feature is known as Parameterized Kernel Launch (a.k.a. Parameterized Kernels). The name includes 'launch' because many of the parameters really apply to the environment in which the kernel will run and are not actual parameters to the kernel itself. Memory, CPUs, and GPUs are examples of such "environmental" parameters.

I'm using this repository as the primary location because this proposal relies on the Kernel Provider model introduced in this library. That said, the proposal affects other repositories, namely jupyter_server, jupyterlab, and notebook once jupyter_server is adopted as the primary backend server.

Launch Parameter Schema

The set of available launch parameters for a given kernel will be conveyed from the server to the client application via the kernel type information (formerly known as the kernelspec) as JSON returned from the /api/kernelspecs REST endpoint. When available, launch parameter metadata will be included within the existing metadata stanza under launch_parameter_schema, and will consist of JSON schema that describes each available parameter. Because this is pure JSON schema, this information can convey required values, default values, choice lists, etc. and be easily consumed by applications. (Although I'd prefer to avoid this, we could introduce a custom schema if we find the generic schema metadata is not sufficient.)

   "metadata": {
       "launch_parameter_schema": {
           "$schema": "http://json-schema.org/draft-07/schema#",
           "title": "Available parameters for kernel type 'Spark - Scala (Kubernetes)'",
           "properties": {
               "cpus": {"type": "number", "minimum": 0.5, "maximum": 8.0, "default": 4.0, "description": "The number of CPUs to use for this kernel"},
               "memory": {"type": "integer", "minimum": 2, "maximum": 1024, "default": 8, "description": "The number of GB to reserve for memory for this kernel"}
           },
           "required": ["cpus"]
       }
   }

Because the population of the metadata.launch_parameter_schema entry is a function of the provider, how the provider determines what to include is an implementation detail. The requirement is that metadata.launch_parameter_schema contain valid JSON schema. However, since nearly 100% of kernels today are based on kernelspec information located in kernel.json, this proposal will also address how the KernelSpecProvider goes about composing metadata.launch_parameter_schema and acting on the returned parameter values.

KernelSpecProvider Schema Population

I believe we should support two forms of population: referential and embedded.

Referential Schema Population

Referential schema population is intended for launch parameters that are shared across kernel configurations, typically the aforementioned "environmental" parameters. When the KernelSpecProvider loads the kernel.json file, it will look for a key under metadata named launch_parameter_schema_file. If the key exists and its value is an existing file, that file's contents will be loaded into a dictionary object.

Embedded Schema Population

Once the referential population step has taken place, the KernelSpecProvider will check if metadata.launch_parameter_schema exists and contains a value. If so, the KernelSpecProvider will load that value, then update the dictionary resulting from the referential population step. This allows per-kernel parameter information to override the shared parameter information. For example, some kernel types may require more CPUs than are generally available to all kernel types.

KernelSpecProvider will then use the merged dictionaries from the two population steps as the value for metadata.launch_parameter_schema that is returned from its find_kernels() method and, ultimately, the /api/kernelspecs REST API. Any entry for metadata.launch_parameter_schema_file will not appear in the returned payload.
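The two population steps above can be sketched roughly as follows. This is an illustrative sketch only: the helper name `compose_launch_parameter_schema` and the depth of the merge are assumptions, not part of any real API (the proposal says "update" without specifying merge depth; a one-level merge of `properties` is used here so shared parameters the kernel does not redefine are preserved).

```python
# Hypothetical sketch of the referential + embedded population steps.
import json
import os

def compose_launch_parameter_schema(kernel_json: dict, kernel_dir: str) -> dict:
    metadata = kernel_json.get("metadata", {})
    schema = {}

    # Referential step: load the shared schema from the referenced file, if present.
    schema_file = metadata.get("launch_parameter_schema_file")
    if schema_file:
        path = os.path.join(kernel_dir, schema_file)
        if os.path.isfile(path):
            with open(path) as f:
                schema = json.load(f)

    # Embedded step: per-kernel entries override the shared entries.  A
    # one-level merge of "properties" keeps shared parameters that the
    # kernel.json does not redefine (this depth is an interpretation).
    embedded = metadata.get("launch_parameter_schema")
    if embedded:
        merged_props = {**schema.get("properties", {}),
                        **embedded.get("properties", {})}
        schema.update(embedded)
        if merged_props:
            schema["properties"] = merged_props

    return schema
```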

Client Applications

Parameter-aware applications that retrieve kernel type information from /api/kernelspecs will recognize the existence of any metadata.launch_parameter_schema values. When a kernel type is selected and contains launch parameter schema information, the application should construct a dialog from the schema that prompts for parameter values. Required values should be noted and default values should be pre-filled. (We will need to emphasize that all required values have reasonable defaults, but how that is handled is more a function of the kernel provider.)
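As a sketch of how an application might derive its dialog fields from the schema, the snippet below pulls the `required` list, `default` values, and `description` text out of a `launch_parameter_schema` value. The function name and field shape are hypothetical; a real client would feed something like this into its own form widgets.

```python
# Hypothetical helper: derive pre-filled dialog fields from the schema.
def dialog_fields(schema: dict) -> list:
    required = set(schema.get("required", []))
    fields = []
    for name, spec in schema.get("properties", {}).items():
        fields.append({
            "name": name,
            "label": spec.get("description", name),  # fall back to the name
            "default": spec.get("default"),          # pre-fill value, if any
            "required": name in required,            # mark required inputs
        })
    return fields
```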

Once the application has obtained the desired set of parameters, it will create an entry in the JSON body of the /api/kernels POST request that is a dictionary of name/value pairs. The key under which this set of pairs resides will be named launch_params. The kernels handler will then pass this dictionary to the framework, where the kernel provider launch method will act on it.

   "launch_params": {
       "cpus": 4,
       "memory": 512
   }
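From the client side, building such a request might look like the sketch below. The base URL and kernel type name are assumptions for illustration; only the `launch_params` key in the POST body is what this proposal specifies.

```python
# Hypothetical client-side sketch: composing a POST to /api/kernels that
# carries launch_params.  The request is built but not sent here.
import json
from urllib import request

def build_start_kernel_request(base_url: str, kernel_name: str,
                               launch_params: dict) -> request.Request:
    body = json.dumps({"name": kernel_name, "launch_params": launch_params})
    return request.Request(
        base_url + "/api/kernels",
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = build_start_kernel_request(
    "http://localhost:8888",          # assumed server URL
    "spark_scala_k8s",                # hypothetical kernel type name
    {"cpus": 4, "memory": 512},
)
```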

Note that applications that are unaware of launch_parameter_schema will still behave in a reasonable manner provided the kernel provider applies reasonable default values to any required parameters.

In addition, it would be beneficial if the set of parameter name/value pairs could be added into the notebook metadata so that subsequent launch attempts could use those values in the pre-filled dialog.

Kernel Provider Launch

Once the kernel provider launch method is called, the provider should validate the parameters and their values against the schema. Any validation errors should result in a failure to launch - although the decision to fail the launch will be a function of the provider. The provider will need to differentiate between "environmental" parameters and actual kernel parameters and apply the values appropriately. jupyter_kernel_mgmt will likely provide a helper method for validation.
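Since the validation helper mentioned above does not exist yet, the sketch below shows the kind of checks involved (required keys, types, and bounds) using only the standard library. In practice a provider would more likely delegate to a full JSON-schema validator; this minimal version covers just the constraints used in the example schema and is an assumption, not a proposed API.

```python
# Hypothetical minimal validation of launch_params against the schema.
# Covers only "required", "type" (number/integer), "minimum", "maximum".
def validate_launch_params(schema: dict, params: dict) -> list:
    """Return a list of error messages; an empty list means valid."""
    errors = []
    props = schema.get("properties", {})
    for name in schema.get("required", []):
        if name not in params:
            errors.append(f"missing required parameter: {name}")
    for name, value in params.items():
        spec = props.get(name)
        if spec is None:
            continue  # provider may ignore or reject unknown parameters
        if spec.get("type") == "integer" and not isinstance(value, int):
            errors.append(f"{name}: expected integer")
        elif spec.get("type") == "number" and not isinstance(value, (int, float)):
            errors.append(f"{name}: expected number")
        if isinstance(value, (int, float)):
            if "minimum" in spec and value < spec["minimum"]:
                errors.append(f"{name}: below minimum {spec['minimum']}")
            if "maximum" in spec and value > spec["maximum"]:
                errors.append(f"{name}: above maximum {spec['maximum']}")
    return errors
```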

Note: Since KernelSpecProvider will be the primary provider, at least initially, applications that wish to take advantage of kernel launch parameters may want to create their own providers. Fortunately, we've provided a mechanism whereby KernelSpecProvider can be extended such that much of the discovery and launch machinery can be reused. In these cases, the kernel.json file would need to be prefixed with the new provider id so that KernelSpecProvider doesn't include those same kernel types in its set.

Virtual Kernel Types

One of the advantages of kernel launch parameters is that one could conceivably have a single kernel configured, yet allow for a plethora of configuration options based on the parameter values, as @rgbkrk points out here. This facility essentially fabricates kernel types that, today, would each require a separately configured type for every set of options.

References

#22
jupyter/jupyter_client#434
jupyter-server/enterprise_gateway#640
https://paper.dropbox.com/doc/Day-1-Kernels-jupyter_client-IPython-Notebook-server--ApyJEjYtqrjfoPg1QpbxZfcpAg-MyS7d8X4wkkhRQy7wClXY
#9

cc (based on inclusion in related threads): @takluyver @SylvainCorlay @Zsailer @lresende @rolweber @jasongrout @blink1073 @echarles @minrk @rgbkrk @MSeal @Carreau


MSeal commented Dec 9, 2019

I'm not ignoring this issue, but it'll take me a few days with the holidays going on to get some time to thoroughly review it.

Glad there's been progress on this front and a well organized write-up. I will give some deeper thoughts later.

@takluyver
Owner

Should this be a JEP by itself? As you mention, it affects things beyond this repository.

@blink1073
Contributor

👍 for a JEP

@kevin-bates
Collaborator Author

I agree - it seems to fit the definition of a JEP. I've gone ahead and created a JEP for it. Please note that content has been added to the JEP that doesn't exist in this issue.

Closing issue in favor of JEP.
