Cell ID JEP Implementation #189

MSeal · 2020-08-27T01:21:59Z

Introduces 4.5 format with cell ids added following https://github.com/jupyter/enhancement-proposals/blob/master/62-cell-id/cell-id.md. Cell ids are unique, and restricted in the character set is allowed.

nbformat/v4/nbformat.v4.5.schema.json

MSeal · 2020-10-02T07:43:16Z

Tested in lab and classic against:

Is there a good way to avoid some of the issues such as #167? Currently in JupyterLab 2.1.2 one can open a notebook with format 4.5, add cells, then save resulting in a 4.5 notebook without cell_ids.
successfully.

However when loading and then saving a fully valid 4.5 notebook from classic or lab, the id field is being stripped out between the nbformat.read and nbformat.write calls. The auto-fill for notebook cell ids addressing the above concern is repopulating the document on validation with new ids, so it doesn't throw any errors, but instead replaces all ID every save. I tested using nbformat's read and then write on their own and the ids are preserved. A look at the code paths in notebook server didn't reveal to me why this is being stripped out. @jasongrout or @minrk would you maybe know what might be causing that in the server side of the equation so we can patch that behavior?

MSeal · 2020-10-02T07:46:08Z

@willingc Do you have permissions to the appveyor for nbformat? I don't seem to have permissions disable that build which was replaced by github actions in a separate PR.

willingc · 2020-10-02T14:14:38Z

@vidartf Do you know how to completely remove appveyor on this project? Deleted the webhook but didn't seem to have any effect.

@MSeal Can you push again to this PR?

vidartf · 2020-10-03T18:39:39Z

@willingc I should have removed the project from appveyor now I think 👍

willingc · 2020-10-03T20:05:58Z

Thanks @vidartf 👍

MSeal · 2020-10-06T21:25:23Z

@afshin Any chance you have a little time / knowledge about the cell node reset I describe above with this change?

afshin · 2020-10-07T12:37:46Z

Hi @MSeal! I'm sorry, I am not very familiar with what is happening here. I wonder if there is something in the TS typings of the front-end that does the ID stripping. Or have you already isolated it and established it's the server that's doing it?

MSeal · 2020-10-07T16:50:30Z

@afshin I haven't narrowed it down, except that it happens in both lab and classic, and since classic is an older setup it made me suspicious that it was a server issue and not a frontend one. I could check this by seeing what payload reaches the UI over the network... that'd probably narrow it down. I tried digging through server code to see what might be causing it but I couldn't find any obvious culprit.

MSeal · 2020-10-07T20:59:45Z

Maybe @jasongrout might know here? I'll check if the cell id reaches the front-end contents payload later today when I get a minute to do so

rgbkrk · 2020-10-15T20:29:45Z

This isn't anything to do with additionalProperties being set to false while the allowed fields didn't include the ID is it? There could be a different version of the nbformat schema used at validation time causing the disappearance issue.

MSeal · 2020-10-15T21:17:54Z

Oh that does look like it is the source of the problem. I'm having difficulty getting a local installation to use my local npm package for nbformat to prove the described issue resolves once that's available. I don't actually even see a reference to nbformat-schema in the jupyter/notebook package info to point to my local installation. @rgbkrk do you know how to get that injected for classic? I'll take a look at lab next to see if I can try it there.

MSeal · 2020-10-15T21:51:04Z

Hmm I don't seem to know these front-ends enough to successfully patch them. It does look like lab has it's own clone of nbformat baked into the project but modifying that to add cell id alone doesn't appear to be sufficient to persist ids as they get read.

I don't think the front-ends ignoring id fields and having them be auto-replaced is a blocker to merging / releasing nbformat given the current behavior. I'm going to be unavailable for a while starting tomorrow, so if folks could help take over merging / releasing this PR once there's some thumbs up from anyone who maintains lab and classic I'd appreciate it.

rgbkrk · 2020-10-15T22:01:25Z

There might be more than one spot to hit this within notebook.js and cell.js --

The nbformat_minor is set to 1 in notebook.js:
https://github.com/jupyter/notebook/blob/2ba296039aa70afbfd90630fea14bb8461b40410/notebook/static/notebook/js/notebook.js#L145, so once we're ready to do the (almost) full thing in classic, then we'll want to bump this.

However... The main issue appears to be the toJSON and fromJSON functions on the Cell classes (and subclasses). Note the default behavior on toJSON:

https://github.com/jupyter/notebook/blob/1550410229e302ecf74d5bf82d5caba6e6296a64/notebook/static/notebook/js/cell.js#L489

We need to do something like data.id = this.id which would mean we'd also be reading it in the fromJSON call: https://github.com/jupyter/notebook/blob/1550410229e302ecf74d5bf82d5caba6e6296a64/notebook/static/notebook/js/cell.js#L510

You can tell it's working from the server because it doesn't give that big warning banner about the notebook validation (which does happen server side).

willingc · 2020-11-20T17:15:52Z

@jasongrout @afshin @rgbkrk Any updates on lab and classic? It would be great to get this merged and we can wait on cutting the next nbformat release if needed for lab or classic. FYI/ @MSeal

blois · 2020-11-25T18:49:41Z

I was looking into supporting this for Colab and am wondering what the order of operations is- are editors implementing and releasing support for the 4.5 schema before it is officially merged into the spec or should implementations wait for this to be merged in?

teucer · 2020-11-27T20:46:22Z

Planning also to use this functionality for cell level comments. Any timelines, please?

MSeal · 2020-11-28T22:34:33Z

@blois We've adopted the new format ahead of time for Noteable. I was hoping lab maintainers would put a say on this before I went on paternity leave in case they wanted to get ahead. I am thinking of just merging and releasing at this point as it doesn't break anything in lab/classic, just replaces ids periodically because those front-ends have their own management of cell structures outside of nbformat in the UI layers. @willingc @rgbkrk opinions on just merging and getting this out of pending status?

willingc · 2020-11-28T22:56:05Z

@MSeal I would say go ahead and merge with and then release as in #199.

goanpeca · 2020-11-30T23:27:26Z

Hi @willingc, fixed the merge conflicts. Should be good now :)

blois · 2020-12-02T00:59:39Z

For reference I was planning on supporting this in two phases- first would support cell IDs if the notebook was version 4.5 but not create new notebooks with version 4.5. Once most tools performing strict nbformat validation have updated to support 4.5 then we can start generating these files.

jasongrout · 2020-12-02T05:21:51Z

FYI, I think we can work on this for our JupyterLab 3.1 release.

I think the plan @blois puts forth makes sense. Since a 4.5 format file does not validate as a 4.4 file (if I understand correctly), we'll need time for people to upgrade their nbformat package and tools to support the format before we can start generating them by default.

rgbkrk · 2020-12-15T23:52:09Z

Thank you for moving this forward!

MSeal · 2020-12-17T20:10:21Z

@rgbkrk Should I kick off the npm and pypi release this weekend? I could also do a beta release first so folks could use it without auto-upgrading while they prepare releases.

goanpeca · 2020-12-17T20:49:00Z

Hi @MSeal, nice to have you back 👶🏽 :)

So @rgbkrk provided me access to PyPI and npm, so I really wanted to have #195 in to test it to make the releases.

Would you like to meet over the weekend and go over the process together? otherwise I would be happy to merge and release a Beta to try things out.

MSeal · 2020-12-18T01:50:34Z

Yea, let's meet this weekend @goanpeca. Afternoons US/Pacific time zone is best for me. Shoot me an email or ping me on the nteract slack and we can pick a time

goanpeca · 2020-12-18T01:52:57Z

I do not think I have access to the nteract slack 🙃?

Afternoons US/Pacific time zone is best for me

Sounds goo, will send an invite.

goanpeca · 2020-12-18T01:55:27Z

Just sent an invite, @MSeal does that work?

MSeal · 2020-12-18T04:10:09Z

Can we do two hours earlier or on Saturday?

The nteract slack is public fyi: https://nteract.slack.com/

willingc · 2020-12-18T22:02:50Z

Thanks for picking this up @MSeal

MSeal · 2020-12-20T23:15:49Z

@blois @jasongrout @rgbkrk FYI there's a beta0 release now with these changes for npm and pypi. @goanpeca got the bump2version all working today to support beta and prod releases automatically with tags / github actions and we used to it today to do the beta release.

I was planning to let the beta sit until early Jan and then do the full release as 5.0.9 without any beta indicators if that sounds reasonable to folks.

jasongrout · 2021-03-29T17:56:50Z

Just curious, is there a specific reason the cell ids were not added to unrecognized cells as top-level fields, along with cell_type and metadata? I could see it both ways, but it would seem nice and consistent for tooling if all cells had ids, not just the three official types.

willingc · 2021-03-29T21:41:55Z

@jasongrout At the point where @MSeal and I wrote the JEP, we wrote it with the 3 official types specified as a cell in mind. We explicitly focused on the 3 official types since those would be the ones most commonly referenced.

MSeal · 2021-03-30T06:39:28Z

Echoing @willingc. Additional I think it'd be reasonable to add it to unidentified cells... however that definition is kind of off, because I believe the spec doesn't actually allow for any cells to be anything but the 3 officials types via:

nbformat/nbformat/v4/nbformat.v4.schema.json

Lines 109 to 116 in 94c17b0

    
           "cell": { 
        
               "type": "object", 
        
               "oneOf": [ 
        
                   {"$ref": "#/definitions/raw_cell"}, 
        
                   {"$ref": "#/definitions/markdown_cell"}, 
        
                   {"$ref": "#/definitions/code_cell"} 
        
               ] 
        
           },

. Nothing references the unrecognized_cell definition so it's effectively a no-op in the spec.

jmtayloruk · 2021-03-31T10:07:28Z

Apologies if this is not the right place to flag this up (a brief search didn't find a more suitable place to raise this, and it's an observation that applies directly to the Cell ID implementation).

I think it would be worth sanitising the dictionary that is used to generate these cell IDs. One notebook I happened to peruse contained "transsexual-death" and "brown-israeli". An auto-generated system with scope for this sort of output just seems to be asking for trouble. I presume there must be methods out there for avoiding potentially controversial combinations? e.g. if whatthreewords had problems like this, I imagine we would have heard about it by now. Theirs are often entertaining, but never seem controversial.

MSeal · 2021-03-31T17:20:38Z

@jmtayloruk This issue was raised in #216 -- we're looking at how to prune the word list to avoid those issues. If you have input or suggestions after reading the thread we'd be happy to hear. At a minimum we're going to prune the list more aggressively in the near future.

jmtayloruk · 2021-03-31T17:25:42Z

Thanks @MSeal.

willingc reviewed Aug 27, 2020

View reviewed changes

nbformat/v4/nbformat.v4.5.schema.json Outdated Show resolved Hide resolved

MSeal changed the title ~~[WIP for JEP proposal review] Initial pass at adding proposed cell ids~~ Cell ID JEP Implementation Oct 2, 2020

MSeal force-pushed the cellIdProposal branch from d89b3b3 to a20a9ca Compare October 2, 2020 07:44

MSeal force-pushed the cellIdProposal branch from a20a9ca to 2ab7514 Compare October 2, 2020 15:58

MSeal requested review from minrk and takluyver October 6, 2020 21:22

MSeal mentioned this pull request Oct 6, 2020

PR: Use fastjsonschema if installed and add tests #191

Merged

Implemented CellIds from JEP-62

9ede360

MSeal force-pushed the cellIdProposal branch from 2ab7514 to 9ede360 Compare October 10, 2020 22:02

nicholaswold mentioned this pull request Oct 21, 2020

Applying minor version upgrades to notebook #196

Closed

willingc mentioned this pull request Nov 21, 2020

Prepare next release 5.1 #199

Closed

6 tasks

Merge with master and fix conflicts

3b0abec

willingc merged commit e7e16bd into jupyter:master Dec 1, 2020

willingc mentioned this pull request Dec 1, 2020

Create an alpha release including cell id for testing nteract/nteract#5384

Closed

mishaschwartz mentioned this pull request Jan 6, 2021

Ensure that cell ids persist after save jupyter/notebook#5928

Merged

jasongrout mentioned this pull request Feb 3, 2021

Support notebook format 4.5, adding cell IDs jupyterlab/jupyterlab#9729

Closed

jameslamb mentioned this pull request Feb 20, 2021

reorganize notebooks, add dask-ml to images jameslamb/lightgbm-dask-testing#7

Merged

MSeal mentioned this pull request Feb 27, 2021

Upgrade Jupyter Notebook files to use 4.5 format microsoft/vscode#117839

Closed

blois mentioned this pull request Mar 15, 2021

Missing proper language field googlecolab/colabtools#625

Closed

jasongrout mentioned this pull request Mar 29, 2021

Add cell id per notebook format 4.5 jupyterlab/jupyterlab#10018

Merged

blois mentioned this pull request Sep 29, 2021

jupyter: include cell id's when exporting to ipynb sagemathinc/cocalc#5504

Open

MSeal mentioned this pull request Oct 11, 2021

Upgrade Jupyter Notebook files to use 4.5 format microsoft/vscode#134834

Closed

sineed mentioned this pull request Nov 11, 2022

Where is version 4.5? #323

Closed

maartenbreddels mentioned this pull request Apr 26, 2024

fix: trust new notebooks and notebooks with new cells jupyter/nbclassic#275

Merged

Cell ID JEP Implementation #189

Cell ID JEP Implementation #189

Conversation

MSeal commented Aug 27, 2020 • edited Loading

MSeal commented Oct 2, 2020

MSeal commented Oct 2, 2020

willingc commented Oct 2, 2020

vidartf commented Oct 3, 2020

willingc commented Oct 3, 2020

MSeal commented Oct 6, 2020

afshin commented Oct 7, 2020

MSeal commented Oct 7, 2020

MSeal commented Oct 7, 2020

rgbkrk commented Oct 15, 2020

MSeal commented Oct 15, 2020

MSeal commented Oct 15, 2020

rgbkrk commented Oct 15, 2020

willingc commented Nov 20, 2020

blois commented Nov 25, 2020

teucer commented Nov 27, 2020

MSeal commented Nov 28, 2020

willingc commented Nov 28, 2020

goanpeca commented Nov 30, 2020

blois commented Dec 2, 2020

jasongrout commented Dec 2, 2020

rgbkrk commented Dec 15, 2020

MSeal commented Dec 17, 2020

goanpeca commented Dec 17, 2020

MSeal commented Dec 18, 2020

goanpeca commented Dec 18, 2020

goanpeca commented Dec 18, 2020 • edited Loading

MSeal commented Dec 18, 2020

willingc commented Dec 18, 2020

MSeal commented Dec 20, 2020

jasongrout commented Mar 29, 2021

willingc commented Mar 29, 2021

MSeal commented Mar 30, 2021

jmtayloruk commented Mar 31, 2021

MSeal commented Mar 31, 2021

jmtayloruk commented Mar 31, 2021 • edited Loading

MSeal commented Aug 27, 2020 •

edited

Loading

goanpeca commented Dec 18, 2020 •

edited

Loading

jmtayloruk commented Mar 31, 2021 •

edited

Loading