Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add an extension for executing code in a jupyter kernel #22

Merged
merged 20 commits into from
Nov 25, 2018

Conversation

jbweston
Copy link
Member

@jbweston jbweston commented Aug 16, 2018

Co-Authored-By: Anton Akhmerov [email protected]

@SylvainCorlay as discussed on Gitter the other day.

You can try it out on the following sample rst file: https://pastebin.com/tG10Q9U0

The basic functionality is there:

  • execution is handled by a Jupyter kernel (currently this is hard-coded to a Python kernel)
  • support for the following output formats: images, HTML, Latex and plaintext
  • the format selected for each output is done based on a configurable priority list
  • directive allows to configure hiding input or output
  • currently we use nbconvert to handle both execution and output extraction

Before this could be merged we still need to:

  • add support for jupyter widgets (implemented in a separate PR that may be discussed after this one, as there are orthogonal issues to discuss that are specific to widgets)
  • refactor selection of mimetype to display (currently it's a nested if/else)
  • add a proper module docstring

It would also be good to allow configuration of the kernel, but that can perhaps be in a future PR.

Please take a look and see if you agree with the overall structure, and if anything else stands out as something we could already improve.

jbweston and others added 11 commits August 16, 2018 13:22
also factor out notebook execution and output writing into
separate functions.
If the 'new-notebook' option is given, then a new kernel is
started and the cells get appended to a new notebook. The name
may be optionally provided. This also affects the naming of the
cell output files.

The kernel may also be specified with the 'kernel' option, and this
is only legal if 'new-notebook' also occurs in the same directive.
Previously, the use of a proper URI scheme (file://) meant that
Sphinx was not copying any files, and instead was just copying
the URI verbatim. Now we signal to Sphinx that we want the files
in question to be copied into the output directory.
- stop assuming that current directory is the parent of the build directory
- correctly treat documents not in the root of the source directory
@SylvainCorlay
Copy link
Member

I am looking at it now. Making a few nitpicking commands, but in general, I love the direction that this is taking.

@akhmerov
Copy link
Member

akhmerov commented Sep 5, 2018

Some points worthy to consider at this stage:

  • How much flexibility do we want to allow? Is it OK to let users specify "cells" in segments? Out of order? Separately specify where the output should be plugged in?
  • At which stage of sphinx pipeline should we do the processing?
  • Is the current syntax choice best? Or would it be better to add a different directive that instead goes "please treat all the following code blocks as notebook cells"?

@akhmerov
Copy link
Member

akhmerov commented Sep 5, 2018

Another question: let's imagine the users want to allow handling of a new mime type (say geojson). Would we be able to allow them to do it at the level of configuration?

jupyter_sphinx/execute.py Outdated Show resolved Hide resolved
'image/png',
'image/jpeg',
'text/latex',
'text/plain'
Copy link
Member

@SylvainCorlay SylvainCorlay Sep 5, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The mime type for Jupyter interactive widgets is application/vnd.jupyter.widget-view+json.

It is rendered with a script tag with the right type:

<script type="application/vnd.jupyter.widget-view+json">
{
    "model_id": "91b9a192d52e4f178c3a80ad3bee61e5",
    "version_major": 2,
    "version_minor": 0
}
</script>

Maybe this should be the case for any mime type ending with +json? - cc @minrk @jasongrout

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What takes care of the rendering the script? Extra page header? What if we want to render something also in latex output?

Is it a standard behavior for various types of data? I am particularly interested in plotly, but they seem to not have a renderer that would function this way.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is a library called the HTML embedder which can be included as a script tag. This is what we use to render widgets in e.g. http://nbviewer.jupyter.org/github/jupyter-widgets/ipywidgets/blob/master/docs/source/examples/Widget%20List.ipynb

Although, it will also require the widgets state to be stored in a separate application/vnd.jupyter.widget-view+json script tag. I will probably open a PR adding this later.

Note that plotly is now based on Jupyter widgets (since 3.0) so if they did things right, this will also enable the rendering of plotly charts.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They have FigureWidget, but it doesn't seem to save the actual figure data in the widget state right now. (This should go into widget_state, right?) They also support bundling json with the output.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you have a link to the HTML embedder library? I failed to find it after a quick google search.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The name of the package is @jupyter-widgets/html-manager
and the code is in the ipywidgets repository. You can read more about it here:

https://ipywidgets.readthedocs.io/en/stable/embedding.html

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would this also work with scripts that have src tag set? Some widgets will contain a lot of data, and extracting them into a separate file would improve the page load times.

Thinking of producing output other than html: I remember there was a way to render a widget into a static image, is that right? Does it work on custom widgets? Can this be done without a browser (e.g. via an electron app)?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is now jbweston#2 where I've started implementing ipywidgets support. Perhaps we should first merge this PR and then we can address that one (there are still several unanswered questions.

hide_output=('hide-output' in self.options),
code_below=('code-below' in self.options),
kernel_name=self.options.get('kernel', '').strip(),
new_notebook=('new-notebook' in self.options),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure about the new_notebook option, which seems to be used as a means to make code snippets run in separate execution contexts.

I would prefer something like restart_kernel. Usingnotebook in the name of the options seems to expose the fact that it is creating a notebook under the hood, which is an implementation detail IMO.

We could remove the feature altogether for a first version, since this may be confusing to users. Or we could havesomething like kernel_id which will specify the kernel id to be created or something like this.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree about the naming, indeed. I also agree that under-the-hood creation of notebooks is not a good thing to advertise.

At the same time we do want to allow the users to easily extract and run the code, and also specify the filename. Does that make sense?

So the user may need to:

  • Specify the filename
  • Indicate that the new kernel is needed

Not entirely sure what's the best way to translate these two parameters into options, and how to call those.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure what you mean about the file name?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in the jupyter-download:... directives we allow to download the code as a notebook or a script. So we'd need to specify a filename for it to be human-readable.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

603ef72 introduces a jupyter-kernel directive that is used to specify the kernel to use:

.. jupyter-kernel:: python3
    :id: my-kernel-id

the id is used as the filename and the argument to the directive is the kernel name.

@SylvainCorlay
Copy link
Member

How much flexibility do we want to allow? Is it OK to let users specify "cells" in segments? Out of > order? Separately specify where the output should be plugged in?

Maybe we should keep the package as simple and lean as possible at first before adding feature that we don't know are really needed.

At which stage of sphinx pipeline should we do the processing?
Is the current syntax choice best? Or would it be better to add a different directive that instead goes "please treat all the following code blocks as notebook cells"?

I like one-size-fits-all directive that you implemented better than the two directives that we have now.

Another question: let's imagine the users want to allow handling of a new mime type (say geojson). Would we be able to allow them to do it at the level of configuration?

The case of geo+json should be handled with the +json suffix that catches all into a script tag. Maybe registering custom mime type renderers will be the way to go in the future.

FilesWriter(build_directory=output_dir).write(
nbformat.writes(notebook), resources,
os.path.join(output_dir, notebook_name + '.ipynb')
)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we necessarily need to write a notebook file to disk?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is in case we want to let the user download it. I think this is quite useful for more involved tutorials.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Gotcha, although it exposes the "notebook" nature of the directive.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, we didn't think too hard about the name of that parameter so far. Alternative names could be new-script, new-file, ...

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I addressed the naming of the directive options elsewhere, but I believe that we should keep writing the notebooks. Are we in agreement on that?

@akhmerov
Copy link
Member

Summary of the discussion:

  • Rename the directive to jupyter-execute
  • Extend it with source option allowing to include code from files (relevant for e.g. initialization code)
  • Factor out the "global" options into a separate directive, named e.g. new-kernel, with optional options kernel-name, filename (and not notebook-name!), maybe even default state for showing/hiding code and source.
  • Rely on nbconvert 5.4, implement querying the kernel for widget state after we have submitted all the code to it (@SylvainCorlay).
  • Read the file extension off from the kernel metadata (requires nbconvert 5.4).

@jbweston
Copy link
Member Author

+ Rename the directive to jupyter-execute
+ Extend it with source option allowing to include code from files (relevant for e.g. initialization code)
+ Factor out the "global" options into a separate directive, named e.g. new-kernel, with optional options kernel-name, filename (and not notebook-name!), maybe even default state for showing/hiding code and source.

@akhmerov I addressed these 3 points. There is jbweston#2 that is aiming to address the point about widgets

@akhmerov
Copy link
Member

Awesome, let's merge that one (there are already conflicts). Then, as far as I can judge, the remaining bits will amount to cleanup and review.

@jbweston
Copy link
Member Author

jbweston commented Oct 19, 2018

@akhmerov I would prefer to merge this first and then I can point jbweston#2 to this repository instead.

IMO there's already a lot in this PR as it is, and there are actually a few more points that we need clarity on wrt. widget support (e.g. how to know what versions of the embedding JS to use).

I'd wager we'll still land jbweston#2 before Sylvain wants to make a release.

@jbweston
Copy link
Member Author

The case of geo+json should be handled with the +json suffix that catches all into a script tag. Maybe registering custom mime type renderers will be the way to go in the future.

Indeed, I have a branch where I'm trying this out, but I believe this change can be made in a later PR, as the output rendering is quite isolated and won't require changes in several places.

@jbweston
Copy link
Member Author

@SylvainCorlay I believe that we've responded to all of your comments, it'd be good to get another look over to see if there's anything we've missed.

jupyter_sphinx/execute.py Outdated Show resolved Hide resolved
Bump nbconvert dependency to 5.4 to ensure that 'language_info' is
populated after notebook execution.
@basnijholt
Copy link
Contributor

Here is the adaptive documentation that is generated with the code of this merge request, it works great!

(I only added this commit to be able to handle javascript.)

@jbweston
Copy link
Member Author

@SylvainCorlay did you get a chance to look at this?

@jbweston jbweston changed the title WIP: add an extension for executing code in a jupyter kernel add an extension for executing code in a jupyter kernel Nov 16, 2018
@SylvainCorlay SylvainCorlay merged commit a7b9e69 into jupyter:master Nov 25, 2018
@jbweston jbweston deleted the feature/execute branch April 25, 2019 13:22
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants