Papermill raises error for id field when running jupyter-notebook 6.2 with the -p flag #568

Closed
malcolmbovey opened this issue Jan 18, 2021 · 16 comments

@malcolmbovey

malcolmbovey commented Jan 18, 2021

When running papermill v2.2.1 in an environment where jupyter-notebook v6.2 is installed, I see the following error when running with parameters (the -p flag):

papermill notebook.ipynb outputs/notebook.ipynb -p my_config --no-progress-bar --log-output

Output:

[NbConvertApp] ERROR | Notebook JSON is invalid: Additional properties are not allowed ('id' was unexpected)
[2021-01-18T09:44:24.218Z] Failed validating 'additionalProperties' in code_cell:
[2021-01-18T09:44:24.218Z] On instance['cells'][0]:
[2021-01-18T09:44:24.218Z] {'cell_type': 'code',
[2021-01-18T09:44:24.218Z]  'execution_count': 1,
[2021-01-18T09:44:24.218Z]  'id': 'nasty-bearing',
[2021-01-18T09:44:24.218Z]  'metadata': {'execution': {'iopub.execute_input': '2021-01-18T09:44:22.903942Z',
[2021-01-18T09:44:24.218Z]                             'iopub.status.busy': '2021-01-18T09:44:22.903349Z',
[2021-01-18T09:44:24.218Z]                             'iopub.status.idle': '2021-01-18T09:44:22.905999Z',
[2021-01-18T09:44:24.218Z]                             'shell.execute_reply': '2021-01-18T09:44:22.905474Z'},
[2021-01-18T09:44:24.218Z]               'papermill': {'duration': 0.01294,
[2021-01-18T09:44:24.218Z]                             'end_time': '2021-01-18T09:44:22.906187',
[2021-01-18T09:44:24.218Z]                             'exception': False,
[2021-01-18T09:44:24.218Z]                             'start_time': '2021-01-18T09:44:22.893247',
[2021-01-18T09:44:24.218Z]                             'status': 'completed'},
[2021-01-18T09:44:24.218Z]               'tags': ['injected-parameters']}

I think this may be due to a change in jupyter-notebook v6.2 that added an "id" field to the cell properties: jupyter/notebook#5928.

The error does not occur when jupyter-notebook v6.1.6 or earlier is installed.

@malcolmbovey malcolmbovey changed the title Papermill raises error for id field when running jupyter-notebook 6.2 Papermill raises error for id field when running jupyter-notebook 6.2 with the -p flag Jan 18, 2021
@riderx

riderx commented Jan 18, 2021

Same here, it broke my pytest run.
How did you force the jupyter-notebook version?
When I pin it, it doesn't use the right one.

@malcolmbovey
Author

Same here, it broke my pytest run.
How did you force the jupyter-notebook version?
When I pin it, it doesn't use the right one.

I run Jupyter inside Docker, so I just pin the tag of the base image to an older version.

@MSeal
Member

MSeal commented Jan 19, 2021

Make sure you have the latest nbformat version, 5.1.2 -- it includes a spec file for the new schema addition for cell ids. If that doesn't work, check the JSON of the notebook. Is it minor version 5? If it's minor 4 and has cell ids, then something is producing invalid ipynb files.

@malcolmbovey
Author

This is the config I have in the environment where the error occurs:

jupyter core     : 4.7.0
jupyter-notebook : 6.2.0
qtconsole        : not installed
ipython          : 7.19.0
ipykernel        : 5.4.2
jupyter client   : 6.1.11
jupyter lab      : 2.2.9
nbconvert        : 6.0.7
ipywidgets       : 7.6.3
nbformat         : 5.1.2
traitlets        : 5.0.5

So I am running the recommended version of nbformat.

I checked and my notebook is at minor version 4. When you say something is producing invalid ipynb files, is it papermill itself that is inserting the ids?

@MSeal
Member

MSeal commented Jan 19, 2021

Papermill doesn't directly manage id fields, but nbformat should be auto populating ids if the notebook is a version 4.5 format.

I don't have time today to debug this a lot, but if the notebook has an id field in each cell and is minor version 4 before papermill processes it, then another library is publishing invalid ipynb files (basically saving 4.5 spec files but saying they're 4.4). If the id isn't present in a 4.4 file, then there's a bug in how papermill uses nbformat to load the document and auto-upgrade it to the 4.5 format. If someone could verify which case is occurring, it would speed up getting this addressed.
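
One way to tell the two cases apart is to inspect the notebook JSON before papermill touches it. The following is a minimal sketch using only the standard library; the helper names `classify_notebook` and `classify_file` are hypothetical, and the check only looks at the top-level `nbformat_minor` field and at whether any cell carries an `id` key.

```python
import json


def classify_notebook(nb):
    """Classify a notebook dict by its declared minor version and cell ids."""
    minor = nb.get("nbformat_minor", 0)
    has_ids = any("id" in cell for cell in nb.get("cells", []))
    if minor >= 5:
        return "4.5+ notebook: cell ids are allowed"
    if has_ids:
        # First case: the file was saved with ids but declares an older format.
        return "invalid input: cell ids present but minor version < 5"
    # Second case: the input is clean, so any id seen later was added in processing.
    return "valid pre-4.5 input: ids must have been added during processing"


def classify_file(path):
    """Load an .ipynb file from disk and classify it."""
    with open(path, encoding="utf-8") as fh:
        return classify_notebook(json.load(fh))
```

Running `classify_file("notebook.ipynb")` on the input notebook (not the papermill output) should indicate which branch of the diagnosis applies.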

@MSeal
Member

MSeal commented Jan 19, 2021

For now, downgrading nbformat to 5.0.8, which is only aware of the 4.4 format, might alleviate the errors for you until this is sorted.

@malcolmbovey
Author

Thanks for the response. Here's the metadata from one of my notebooks. I've checked, and prior to running papermill there is no id field in any of the cells. So it sounds like the second case?

 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.8.6"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}

@MSeal
Member

MSeal commented Jan 19, 2021

Ok, that helps narrow it down to nbformat validation within papermill. Downgrading nbformat for now should unblock you until we can get a fix in here.

@malcolmbovey
Author

Thanks, yes, downgrading nbformat to 5.0.8 via pip appears to prevent the error from occurring.

@cristobalcl
Contributor

Hi, I think the problem is with the nbformat module. When papermill calls nbformat.v4.new_code_cell(source=param_content) (in parameterize.py:82), it adds an id that is only valid in notebook format 4.5. So the later validation (when writing the output) fails for notebooks that declare an older version such as 4.4.

One quick fix is to force minor version to 5 in the __init__ of NotebookExecutionManager (it worked in my case):

        self.nb["nbformat_minor"] = 5

I could work on this, or on any other solution you suggest, and open a PR if you'd like.
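
To illustrate the mismatch without depending on a particular nbformat release, here is a simplified model of the rule involved: a cell `id` is only acceptable when the notebook declares minor version 5 or later. The function `find_unexpected_ids` is a hypothetical stand-in for the real schema validation, not part of nbformat's API, and the sample notebook dict is made up for the example.

```python
def find_unexpected_ids(nb):
    """Return indexes of cells carrying an 'id' that the declared format forbids."""
    if nb.get("nbformat_minor", 0) >= 5:
        return []
    return [i for i, cell in enumerate(nb.get("cells", [])) if "id" in cell]


# A 4.4 notebook after a parameters cell with an id has been injected.
nb = {
    "nbformat": 4,
    "nbformat_minor": 4,
    "cells": [
        {"cell_type": "code", "id": "nasty-bearing", "source": "# Parameters",
         "metadata": {"tags": ["injected-parameters"]}},
        {"cell_type": "code", "source": "print('hello')", "metadata": {}},
    ],
}

before = find_unexpected_ids(nb)  # the injected cell (index 0) is flagged
nb["nbformat_minor"] = 5          # the quick fix: declare the 4.5 format
after = find_unexpected_ids(nb)   # nothing is flagged any more
```

As far as I understand the 4.5 schema, every cell is expected to carry an id, so bumping the declared version alone may leave cells without ids incomplete.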

@MSeal
Member

MSeal commented Jan 19, 2021

Ahh, I see. Probably these functions need to take the notebook version as an optional argument: https://github.com/jupyter/nbformat/blob/e7e16bd92ec9a7c567a6b65aabcdf345b94f99c8/nbformat/v4/nbbase.py#L114-L153 OR we call https://github.com/jupyter/nbformat/blob/b8bad3b052ebeba449eef458941e869918b91999/nbformat/v4/convert.py#L26 on notebooks that flow through papermill. I think I prefer the latter because this is (or should be) the pattern that other Jupyter stacks will be adopting.
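
A rough sketch of the second option, assuming the upgrade step amounts to declaring minor version 5 and giving every cell an id. The function `upgrade_to_4_5` below is a hypothetical, simplified stand-in written with the standard library only; it is not the nbformat implementation linked above, and the id format (8 hex characters) is an assumption made for the example.

```python
import copy
import uuid


def upgrade_to_4_5(nb):
    """Return a copy of a v4 notebook dict declared as 4.5, with ids on all cells."""
    if nb.get("nbformat") != 4:
        raise ValueError("only v4 notebooks are handled in this sketch")
    upgraded = copy.deepcopy(nb)  # leave the caller's notebook untouched
    cells = upgraded.get("cells", [])
    seen = {cell["id"] for cell in cells if "id" in cell}
    for cell in cells:
        if "id" not in cell:
            new_id = uuid.uuid4().hex[:8]
            while new_id in seen:  # ids must be unique within the notebook
                new_id = uuid.uuid4().hex[:8]
            cell["id"] = new_id
            seen.add(new_id)
    # Never lower a minor version that is already 5 or above.
    upgraded["nbformat_minor"] = max(upgraded.get("nbformat_minor", 0), 5)
    return upgraded
```

Upgrading once on load means any cell created afterwards with an id, such as the injected parameters cell, is consistent with the declared format when the output is written.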

cristobalcl added a commit to cristobalcl/papermill that referenced this issue Jan 20, 2021
@cristobalcl
Contributor

I opened a PR that fixes the issue using the suggested solution with the upgrade function. I also added a small test.

Suggestions are welcome!

@MSeal MSeal closed this as completed in a4bf8a9 Jan 20, 2021
@MSeal
Member

MSeal commented Jan 20, 2021

Papermill 2.3.0 on PyPI has the fix included. Conda-forge should update when it gets to auto-rebuilding the package there. Thanks for helping get that patched @cristobalcl

@malcolmbovey
Author

Thanks both, appreciate the fast turnaround in getting this one fixed.

alexdunncs pushed a commit to alexdunncs/notebook_pge_wrapper that referenced this issue Mar 24, 2021
v2.3.0 and up includes a fix to a bug that breaks unit-tests
nteract/papermill#568
@DanielHabenicht

Sorry to interrupt, but I am running into the same error with the most recent versions (papermill==2.3.4, nbconvert==6.4.1, nbformat==5.1.3) and an older .ipynb file in GitHub Actions.

Input Notebook:  03_Pcap.ipynb
Output Notebook: /tmp/ipynb/03_Pcap.ipynb
Input notebook does not contain a cell with tag 'parameters'

Executing:   0%|          | 0/27 [00:00<?, ?cell/s]Notebook JSON is invalid: Additional properties are not allowed ('id' was unexpected)

Failed validating 'additionalProperties' in code_cell:

On instance['cells'][0]:
{'cell_type': 'code',
 'execution_count': None,
 'id': '90498313',
 'metadata': {'papermill': {'duration': None,
                            'end_time': None,
                            'exception': None,
                            'start_time': None,
                            'status': 'pending'},
              'tags': ['injected-parameters']},
 'outputs': ['...0 outputs...'],
 'source': '# Parameters\n'
           '( = ["\'", "c", "i", "\'", ",", " ", "\'", "t", "r", "...'}

Executing:   0%|          | 0/27 [00:00<?, ?cell/s]
Notebook JSON is invalid: Additional properties are not allowed ('id' was unexpected)

Failed validating 'additionalProperties' in code_cell:

On instance['cells'][0]:
{'cell_type': 'code',
 'execution_count': None,
 'id': '90498313',
 'metadata': {'papermill': {'duration': None,
                            'end_time': None,
                            'exception': None,
                            'start_time': None,
                            'status': 'completed'},
              'tags': ['injected-parameters']},
 'outputs': ['...0 outputs...'],
 'source': '# Parameters\n'
           '( = ["\'", "c", "i", "\'", ",", " ", "\'", "t", "r", "...'}
Traceback (most recent call last):
  File "/opt/hostedtoolcache/Python/3.10.2/x64/bin/papermill", line 8, in <module>
    sys.exit(papermill())
  File "/opt/hostedtoolcache/Python/3.10.2/x64/lib/python3.10/site-packages/click/core.py", line 1128, in __call__
    return self.main(*args, **kwargs)
  File "/opt/hostedtoolcache/Python/3.10.2/x64/lib/python3.10/site-packages/click/core.py", line 1053, in main
    rv = self.invoke(ctx)
  File "/opt/hostedtoolcache/Python/3.10.2/x64/lib/python3.10/site-packages/click/core.py", line 1395, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/opt/hostedtoolcache/Python/3.10.2/x64/lib/python3.10/site-packages/click/core.py", line 754, in invoke
    return __callback(*args, **kwargs)
  File "/opt/hostedtoolcache/Python/3.10.2/x64/lib/python3.10/site-packages/papermill/cli.py", line 242, in papermill
    execute_notebook(
  File "/opt/hostedtoolcache/Python/3.10.2/x64/lib/python3.10/site-packages/papermill/execute.py", line 91, in execute_notebook
    nb = papermill_engines.execute_notebook_with_engine(
  File "/opt/hostedtoolcache/Python/3.10.2/x64/lib/python3.10/site-packages/papermill/engines.py", line 49, in execute_notebook_with_engine
    return self.get_engine(engine_name).execute_notebook(nb, kernel_name, **kwargs)
  File "/opt/hostedtoolcache/Python/3.10.2/x64/lib/python3.10/site-packages/papermill/engines.py", line 310, in execute_notebook
    nb = cls.execute_managed_notebook(nb_man, kernel_name, log_output=log_output, **kwargs)
  File "/opt/hostedtoolcache/Python/3.10.2/x64/lib/python3.10/site-packages/papermill/engines.py", line 372, in execute_managed_notebook
    preprocessor.preprocess(nb_man, safe_kwargs)
  File "/opt/hostedtoolcache/Python/3.10.2/x64/lib/python3.10/site-packages/papermill/preprocess.py", line 20, in preprocess
    with self.setup_preprocessor(nb_man.nb, resources, km=km):
AttributeError: 'PapermillExecutePreprocessor' object has no attribute 'setup_preprocessor'

  "kernelspec": {
   "display_name": "Python 3.9.6 64-bit",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.9.6"
  },
  "orig_nbformat": 4
 },
 "nbformat": 4,
 "nbformat_minor": 2
}

If I upgrade the notebook to version 4.5, this error appears instead:

Input Notebook:  03_Pcap.ipynb
Output Notebook: /tmp/ipynb/03_Pcap.ipynb
Input notebook does not contain a cell with tag 'parameters'

Executing:   0%|          | 0/27 [00:00<?, ?cell/s]
Executing:   0%|          | 0/27 [00:00<?, ?cell/s]
Traceback (most recent call last):
  File "/opt/hostedtoolcache/Python/3.10.2/x64/bin/papermill", line 8, in <module>
    sys.exit(papermill())
  File "/opt/hostedtoolcache/Python/3.10.2/x64/lib/python3.10/site-packages/click/core.py", line 1128, in __call__
    return self.main(*args, **kwargs)
  File "/opt/hostedtoolcache/Python/3.10.2/x64/lib/python3.10/site-packages/click/core.py", line 1053, in main
    rv = self.invoke(ctx)
  File "/opt/hostedtoolcache/Python/3.10.2/x64/lib/python3.10/site-packages/click/core.py", line 1395, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/opt/hostedtoolcache/Python/3.10.2/x64/lib/python3.10/site-packages/click/core.py", line 754, in invoke
    return __callback(*args, **kwargs)
  File "/opt/hostedtoolcache/Python/3.10.2/x64/lib/python3.10/site-packages/papermill/cli.py", line 242, in papermill
    execute_notebook(
  File "/opt/hostedtoolcache/Python/3.10.2/x64/lib/python3.10/site-packages/papermill/execute.py", line 91, in execute_notebook
    nb = papermill_engines.execute_notebook_with_engine(
  File "/opt/hostedtoolcache/Python/3.10.2/x64/lib/python3.10/site-packages/papermill/engines.py", line 49, in execute_notebook_with_engine
    return self.get_engine(engine_name).execute_notebook(nb, kernel_name, **kwargs)
  File "/opt/hostedtoolcache/Python/3.10.2/x64/lib/python3.10/site-packages/papermill/engines.py", line 310, in execute_notebook
    nb = cls.execute_managed_notebook(nb_man, kernel_name, log_output=log_output, **kwargs)
  File "/opt/hostedtoolcache/Python/3.10.2/x64/lib/python3.10/site-packages/papermill/engines.py", line 372, in execute_managed_notebook
    preprocessor.preprocess(nb_man, safe_kwargs)
  File "/opt/hostedtoolcache/Python/3.10.2/x64/lib/python3.10/site-packages/papermill/preprocess.py", line 20, in preprocess
    with self.setup_preprocessor(nb_man.nb, resources, km=km):
AttributeError: 'PapermillExecutePreprocessor' object has no attribute 'setup_preprocessor'

Here is my CI build and the StackOverflow question that led me here.

@DanielHabenicht

Looks like my requirements.txt got mixed up. It included an old package called papermill-nb-runner, and pip ended up installing an older version of papermill while displaying a newer one. (Reinstalling only papermill also worked.)
I removed the extra requirements and now it's working.
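
For anyone hitting a similar mismatch, a quick check of what is actually installed in the running interpreter can help. This is a minimal sketch using the standard library's `importlib.metadata` (Python 3.8+); the helper name `installed_versions` is hypothetical, and the package list is just the set of names mentioned in this thread.

```python
from importlib import metadata


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    names = ["papermill", "papermill-nb-runner", "nbformat", "nbconvert"]
    for name, version in installed_versions(names).items():
        print(f"{name:<20} {version or 'not installed'}")
```

Running it with the same interpreter that runs papermill (for example inside the CI job) reports the versions that interpreter sees, which may differ from what the requirements file asks for.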
