Guess params by introspecting the _parameters_ cell #531

Merged on Sep 6, 2020 (17 commits)

@fcollonval fcollonval commented Aug 26, 2020

This is a reboot of #158

It adds a new option, --help-notebook, which enables a new CLI usage:

papermill --help-notebook input_notebook.ipynb

Because of that, I had to remove the required constraint on OUTPUT_PATH. The check is still performed inside the papermill function, so the existing behavior is preserved when --help-notebook is not set.
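A minimal sketch of that "optional argument, checked later" arrangement. It uses the standard-library argparse purely for illustration (it is not papermill's actual CLI code), and the `run` function and its return strings are hypothetical:

```python
import argparse


def build_parser():
    # Hypothetical stand-in for the real CLI: OUTPUT_PATH is declared optional
    # so that --help-notebook can be used with only the input notebook.
    parser = argparse.ArgumentParser(prog="papermill-sketch")
    parser.add_argument("notebook_path")
    parser.add_argument("output_path", nargs="?", default=None)
    parser.add_argument("--help-notebook", action="store_true")
    return parser


def run(argv):
    parser = build_parser()
    args = parser.parse_args(argv)
    if args.help_notebook:
        # Inspection mode: no output path is needed.
        return f"would display parameters of {args.notebook_path}"
    if args.output_path is None:
        # The "required" check moved from the argument declaration into the body.
        parser.error("OUTPUT_PATH is required unless --help-notebook is set")
    return f"would execute {args.notebook_path} -> {args.output_path}"


print(run(["--help-notebook", "input_notebook.ipynb"]))
print(run(["input_notebook.ipynb", "output_notebook.ipynb"]))
```

Calling `run(["input_notebook.ipynb"])` exits with a usage error, which mirrors the behavior the description says is kept when --help-notebook is not set.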

Fixes #225. In particular, the default parameters found by code introspection are stored as a dictionary in the notebook metadata at nb.metadata['papermill']['default_parameters'].
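To make the metadata location concrete, here is a rough sketch that works on a plain dict shaped like a notebook's JSON. The helper names, the simplified pattern, and the choice to store raw value strings are all assumptions for illustration; the shape of the entries papermill actually stores is not specified in this description.

```python
import re

# Simplified, hypothetical pattern: `name = value` with an optional trailing comment.
SIMPLE_ASSIGNMENT = re.compile(r"^(?P<name>\w+)\s*=\s*(?P<value>.*?)(\s*#.*)?$")


def find_parameters_source(nb):
    """Return the source of the first code cell tagged `parameters`, if any."""
    for cell in nb.get("cells", []):
        tags = cell.get("metadata", {}).get("tags", [])
        if cell.get("cell_type") == "code" and "parameters" in tags:
            source = cell.get("source", "")
            # Notebook JSON may store source as a string or a list of lines.
            return "".join(source) if isinstance(source, list) else source
    return None


def record_default_parameters(nb):
    """Store name -> raw default value under metadata.papermill.default_parameters."""
    source = find_parameters_source(nb) or ""
    defaults = {}
    for line in source.splitlines():
        match = SIMPLE_ASSIGNMENT.match(line.strip())
        if match:
            defaults[match.group("name")] = match.group("value")
    nb.setdefault("metadata", {}).setdefault("papermill", {})["default_parameters"] = defaults
    return defaults


notebook = {
    "cells": [
        {
            "cell_type": "code",
            "metadata": {"tags": ["parameters"]},
            "source": ["alpha = 0.1  # learning rate\n", "msg = 'hi'\n"],
        }
    ],
    "metadata": {},
}
defaults = record_default_parameters(notebook)
print(notebook["metadata"]["papermill"]["default_parameters"])
```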

For now this adds the ability for Python only. I'm not familiar enough with the other languages to build the proper regex.

Note: I use a regex instead of ast for Python for two reasons: (1) ast discards all comments, but I rely on them to extract a help string; (2) ast parsing is not available for non-Python languages. So it seems better to use regex for all languages.
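The first reason can be checked quickly with the standard library. This small sketch (the sample line is made up) shows that the trailing comment is present in the raw source but absent from the parsed tree:

```python
import ast

source = "alpha = 0.1  # learning rate\n"

tree = ast.parse(source)
assignment = tree.body[0]

# The assignment target survives parsing...
target_name = assignment.targets[0].id

# ...but the comment text does not appear anywhere in the tree.
comment_in_tree = "learning rate" in ast.dump(tree)
comment_in_source = "# learning rate" in source

print(target_name, comment_in_tree, comment_in_source)
```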

A follow-up could be to address #55, but there is an important open question about how to reliably evaluate the variable type for non-Python languages.

codecov bot commented Aug 26, 2020

Codecov Report

Merging #531 into main will increase coverage by 0.52%.
The diff coverage is 98.49%.

@@            Coverage Diff             @@
##             main     #531      +/-   ##
==========================================
+ Coverage   91.77%   92.30%   +0.52%     
==========================================
  Files          14       16       +2     
  Lines        1289     1403     +114     
==========================================
+ Hits         1183     1295     +112     
- Misses        106      108       +2     

@fcollonval fcollonval marked this pull request as ready for review August 26, 2020 09:26
MSeal commented Sep 1, 2020

Thanks for posting the PR -- I just got back and will get a chance to go through it in the next day or two

@MSeal MSeal left a comment

Thanks for building this out. There are some comments to address, and probably a little rearrangement is needed to avoid code paths when executing, but I think the core implementation idea is solid.

papermill/inspection.py (outdated, resolved)
papermill/iorw.py (outdated, resolved)

class PythonTranslator(Translator):
    # Pattern to capture parameters within cell input
    PARAMETER_PATTERN = re.compile(
        r"^(?P<target>\w[\w_]*)\s*(:\s*[\"']?(?P<annotation>\w[\w_\[\],\s]*)[\"']?\s*)?=\s*(?P<value>.*?)(\s*#\s*(type:\s*(?P<type_comment>[^\s]*)\s*)?(?P<help>.*))?$" # noqa
Member

Really nice actually -- it even extracts the comment on the end of foo = bar # a comment!
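For readers who want to try the pattern, here is a small sketch that applies the regex quoted above to a few made-up lines. The `describe` helper and the sample inputs are illustrative only and are not part of the PR; stripping whitespace from the groups is a choice made here for readability.

```python
import re

# Pattern copied verbatim from the diff excerpt above.
PARAMETER_PATTERN = re.compile(
    r"^(?P<target>\w[\w_]*)\s*(:\s*[\"']?(?P<annotation>\w[\w_\[\],\s]*)[\"']?\s*)?=\s*(?P<value>.*?)(\s*#\s*(type:\s*(?P<type_comment>[^\s]*)\s*)?(?P<help>.*))?$"  # noqa
)


def describe(line):
    """Return the named groups for one source line, or None if it is not an assignment."""
    match = PARAMETER_PATTERN.match(line)
    if match is None:
        return None
    # Strip surrounding whitespace; groups that did not participate stay None.
    return {key: (val.strip() if val else val) for key, val in match.groupdict().items()}


samples = [
    "foo = bar  # a comment",
    "count: int = 3",
    "rate = 0.5  # type: float  learning rate",
    "import os",
]
for line in samples:
    print(repr(line), "->", describe(line))
```

Note that a `#` inside a string literal (for example `s = "a#b"`) would be treated as the start of a comment by this pattern, which is one of the trade-offs of a line-based regex.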

Member

The other way to do this for Python is to use the ast module (e.g. https://www.mattlayman.com/blog/2018/decipher-python-ast/) and extract the assignments programmatically. However, this won't work cross-language and could even hiccup on IPython-specific syntax in Python kernels (e.g. !run-something). So I think regex is the KISS way to go.
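As a rough illustration of that alternative and its limitation, the sketch below extracts top-level assignments with ast and then shows the parser rejecting an IPython shell escape. The `extract_assignments` helper is hypothetical and handles only simple `name = literal` statements.

```python
import ast


def extract_assignments(source):
    """Collect top-level `name = literal` assignments from Python source."""
    tree = ast.parse(source)
    found = {}
    for node in tree.body:
        if not isinstance(node, ast.Assign):
            continue
        for target in node.targets:
            if isinstance(target, ast.Name):
                try:
                    found[target.id] = ast.literal_eval(node.value)
                except ValueError:
                    # Non-literal default (e.g. another variable): leave unresolved.
                    found[target.id] = None
    return found


print(extract_assignments("x = 1\nname = 'demo'\n"))

# A cell containing an IPython shell escape is not valid Python, so ast.parse fails.
try:
    extract_assignments("x = 1\n!run-something\n")
    parsed_shell_escape = True
except SyntaxError:
    parsed_shell_escape = False
print(parsed_shell_escape)
```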

@fcollonval fcollonval Sep 4, 2020

That is exactly my note in the PR description 😉

Note: I use a regex instead of ast for Python for two reasons: (1) ast discards all comments, but I rely on them to extract a help string; (2) ast parsing is not available for non-Python languages. So it seems better to use regex for all languages.

Member

Yep, just reinforcing the idea

papermill/translators.py (resolved)
return mock


@pytest.mark.parametrize(
Member

Probably want more examples (maybe by supplying the parameter cell directly to the inspect call in a different test parameterization) with a variety of code arrangements to prove your comment compressing code all works as expected.

Member Author

This is actually done in papermill/tests/test_translators.py:test_inspect_python (at line 112)

requirements-dev.txt (outdated, resolved)
@MSeal MSeal left a comment

Thanks for the improvement and the quick turnaround on the review!

@MSeal MSeal merged commit 82b5c5d into nteract:main Sep 6, 2020
@fcollonval fcollonval deleted the guess-params-reboot branch September 7, 2020 06:16
Successfully merging this pull request may close these issues.

Introspect Parameters before Execution