Paper: Papyri: Better documentation for the Scientific Ecosystem in Jupyter #700

Carreau · 2022-05-27T11:09:40Z

See http://procbuild.scipy.org/ for logs generated by the build process.

Thanks for all your work on organising SciPy.

Editor: @stargaser

Reviewers: @wd15, @karthikmurugadoss

karthikmurugadoss · 2022-06-13T15:30:32Z

The paper presented here discusses Papyri, an approach for unifying the documentation experience in the scientific Python ecosystem. Primarily, Papyri aims to decouple the process of creating documentation (using the presented IRD format) from actually rendering it (which can often be user-specific).

The paper describes the motivation and current state of Papyri quite well. The following are my major and minor comments.

Major Comments:

The section on the current implementation conveys a lot of information and is a bit difficult to follow for someone who isn't well-versed in existing documentation workflows. A few of the concepts introduced are RST parsing, CBOR representation, etc.
The section on IRD file installation can benefit from a visual showing the different components (SQLite database, Raw storage on disk, etc.) and how these components interact with each other.
Related to the above point, it is not clear why the object information is stored in 3 different places and what exactly is stored in each of these locations. More clarity here would be very helpful.
The context and usefulness of the local graph visualization is not described. Does Papyri create this visual as a part of the documentation generation process? Or is it here specifically to highlight the connections into/from ndarray?

Minor Comments:

There are typos and missing words in a number of places which need to be addressed.
In the Current Implementation section, there are a number of referenced tools for which links to their sources would be helpful. e.g. Jedi, Pygments, Quart, Trio, etc.
The numbering for sections can be improved.

deniederhut · 2022-06-14T00:47:57Z

@scoobies mark pending comment

from wd15:

> The intermediate format or IRD is a very important step for the community. Other tools can build from this format either by generating the documentation view or by generating the IRD from the source code. It would be nice in the paper if the authors could actually describe the details of the format or schema for the IRD. The schema itself could become a standard for documentation and, thus, making it transparent to the reader would be useful.

We've extended the paragraph that discusses this. As the IRD is still changing rapidly, we don't believe a description that would be outdated in a week would be useful in the paper.

from wd15:

> The intermediate format or IRD is a very important step for the community. Other tools can build from this format either by generating the documentation view or by generating the IRD from the source code. It would be nice in the paper if the authors could actually describe the details of the format or schema for the IRD. The schema itself could become a standard for documentation and, thus, making it transparent to the reader would be useful.

We want to avoid duplication, and would prefer to point to other metadata sources. Currently we limit ourselves to only what is necessary.

Carreau · 2022-06-15T09:51:23Z

Many thanks for both reviews. I've tried to address most of the points above in separate commits to make re-review easier.

Here is a small summary of the changes.

The intermediate format or IRD is a very important step for the
community. Other tools can build from this format either by
generating the documentation view or by generating the IRD from the
source code. It would be nice in the paper if the authors could
actually describe the details of the format or schema for the
IRD. The schema itself could become a standard for documentation
and, thus, making it transparent to the reader would be useful.

I believe the project is too young to give a complete description of the IRD.
There are still changes to the format every 2 to 3 weeks, depending on
the activity, so a detailed description would be premature. The IRD is
changing much less frequently than it did initially, but still too frequently IMHO.
I tried to clarify this.

Suggestion: Pandoc is a tool that uses an AST and can convert
between many markup and documentation formats. Would it be useful
to mention Pandoc in the paper as an example of a successful tool
that uses a similar approach?

Yes, this is a good tool, and I've added it. I was also recently made aware of
https://markdoc.io/docs/nodes, which was released recently, but I haven't had a
chance to try it, so I didn't want to make major changes this late in the
proceedings process.

Suggestion: As part of the IRD, package metadata is stored. Would
it be useful to use an existing schema such as CODEMETA.yaml for
this?

I've extended this section a bit. My main take so far is to limit the stored
metadata to the bare minimum and rely on metadata that is stored somewhere else.
I would much prefer something like codemeta or another JSON-LD format to be part
of the package on PyPI.

Section numbering could be improved by using "X.Y" for
subsections. Also, shouldn't the Introduction be numbered as Section 1?

We removed section numbering altogether; it should be handled
at the proceedings level. If the proceedings are compiled with a directive like
sectnum, then sections will be numbered automatically, and if we add numbers
manually they will appear twice.

Section on current implementation conveys a lot of information and is a bit
difficult to follow for someone who isn't well-versed with existing
documentation workflows. A few concepts that are introduced are RST parsing,
CBOR representation, etc.

I've extended this where possible, but as with the other comments above about the
IRD schema, I don't want to dive too far into the implementation, as it is still
changing quickly and should not be necessary to understand the idea.

I hope that in the future other competing projects will emerge that
produce/consume IRD bundles, and potentially make completely different technical
choices.
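
As a rough illustration of that separation (a minimal sketch, not Papyri's API or the actual IRD schema; the dictionary layout and every function name below are hypothetical), a producer can emit plain serialisable data once, and independent consumers can render it in different ways:

```python
import json


def generate(qualname, summary, see_also):
    """Producer side: build a hypothetical intermediate document once."""
    return {"qualname": qualname, "summary": summary, "see_also": list(see_also)}


def render_text(doc):
    """One consumer: plain-text rendering, e.g. for a terminal."""
    lines = [doc["qualname"], "=" * len(doc["qualname"]), doc["summary"]]
    if doc["see_also"]:
        lines.append("See also: " + ", ".join(doc["see_also"]))
    return "\n".join(lines)


def render_html(doc):
    """Another consumer: HTML rendering, written independently of the producer."""
    links = "".join(f'<li><a href="#{n}">{n}</a></li>' for n in doc["see_also"])
    return f"<h1>{doc['qualname']}</h1><p>{doc['summary']}</p><ul>{links}</ul>"


# The bundle is serialised once, then rendered later without re-running generation.
# JSON is used here only as a stand-in serialisation format.
bundle = json.dumps(generate("numpy.ndarray", "An N-dimensional array.", ["numpy.array"]))
doc = json.loads(bundle)
text_output = render_text(doc)
html_output = render_html(doc)
```

Neither renderer needs to import or execute the documented library; a competing tool could replace either side as long as both agree on the data in between.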

The section on IRD file installation can benefit from a visual showing the
different components (SQLite database, Raw storage on disk, etc.) and how
these components interact with each other.

I've tried to clarify this as well, and added a diagram. I've also explained that
these choices are mostly driven by the use case I currently target and could/should
be reconsidered by an implementation with different targets.

Related to the above point, it is not clear why the object information is
stored in 3 different places and what exactly is stored in each of these
locations. More clarity here would be very helpful.

As above, I've tried to clarify this; let me know if it is clearer.
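
To make the idea of splitting storage more concrete, here is a minimal sketch under stated assumptions: it covers only two of the components named in the review (raw files on disk and a SQLite database), uses JSON in place of CBOR to stay within the standard library, and the table layout, file naming, and function names are all hypothetical rather than Papyri's actual installation code.

```python
import json
import sqlite3
import tempfile
from pathlib import Path


def install(bundle, root, db):
    """Install a hypothetical bundle: blobs go to disk, links go to SQLite."""
    db.execute("CREATE TABLE IF NOT EXISTS links (source TEXT, dest TEXT)")
    for qualname, doc in bundle.items():
        # Raw storage: one file per documented object.
        (root / f"{qualname}.json").write_text(json.dumps(doc))
        # Index: one row per cross-reference, so link queries never open the blobs.
        db.executemany(
            "INSERT INTO links VALUES (?, ?)",
            [(qualname, target) for target in doc["see_also"]],
        )
    db.commit()


def backrefs(db, qualname):
    """List installed pages that refer to `qualname`, using the index alone."""
    rows = db.execute(
        "SELECT source FROM links WHERE dest = ? ORDER BY source", (qualname,)
    )
    return [source for (source,) in rows]


bundle = {
    "numpy.ndarray": {"summary": "An N-dimensional array.", "see_also": ["numpy.array"]},
    "numpy.zeros": {"summary": "Array of zeros.", "see_also": ["numpy.ndarray"]},
    "numpy.ones": {"summary": "Array of ones.", "see_also": ["numpy.ndarray"]},
}

with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    db = sqlite3.connect(":memory:")
    install(bundle, root, db)
    referrers = backrefs(db, "numpy.ndarray")
    stored = json.loads((root / "numpy.zeros.json").read_text())
    db.close()
```

In this sketch the database answers "who links here?" quickly, while the full page content is only read from disk when a page is actually rendered.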

The context and usefulness of the local graph visualization is not described.
Does Papyri create this visual as a part of the documentation generation
process? Or is it here specifically to highlight the connections into/from
ndarray?

Thanks, this was indeed not clear; I've reworked this section. It was trying
to demonstrate that changes to the documentation can be made without having
to redo the generation step. This graph is indeed generated at render time,
updates depending on which libraries you have documentation installed for, and
gives an idea of the types of UI changes that could be implemented later.
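
A minimal sketch of what "generated at render time" can mean, assuming cross-references are available as (source, destination) pairs; the function, the link data, and the rule of matching on the top-level package name are hypothetical and are not taken from Papyri:

```python
def local_graph(links, installed, center):
    """Return edges touching `center` whose endpoints belong to installed packages."""

    def is_installed(qualname):
        # Hypothetical rule: the top-level package name decides availability.
        return qualname.split(".")[0] in installed

    return sorted(
        (src, dst)
        for src, dst in links
        if center in (src, dst) and is_installed(src) and is_installed(dst)
    )


# Hypothetical cross-references between documented objects.
links = [
    ("numpy.zeros", "numpy.ndarray"),
    ("scipy.sparse.csr_matrix", "numpy.ndarray"),
    ("numpy.ndarray", "numpy.array"),
    ("numpy.zeros", "numpy.ones"),
]

only_numpy = local_graph(links, {"numpy"}, "numpy.ndarray")
with_scipy = local_graph(links, {"numpy", "scipy"}, "numpy.ndarray")
```

The same stored links yield a larger graph once documentation for another library is installed, with no regeneration of the existing pages.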

There are typos and missing words in a number of places which need to be addressed.

We tried to fix things the best we could, and would appreciate any pointers to
remaining mistakes.

In the Current Implementation section, there are a number of referenced tools for which links to their sources would be helpful. e.g. Jedi, Pygments, Quart, Trio, etc.

This should be mostly fixed, apart from a couple of citations I need to expand with
the right DOIs.

deniederhut · 2022-06-24T01:18:21Z

Awesome! @wd15 and @karthikmurugadoss -- do you feel this paper is now ready for inclusion in the proceedings?

karthikmurugadoss · 2022-06-24T02:58:54Z

Yes did another read through and it looks good to me!

Carreau · 2022-06-27T09:10:27Z

Yes did another read through and it looks good to me!

Many thanks, that's giving me extra motivation after the weekend!

wd15 · 2022-06-27T14:28:28Z

Looks much better. Nice work!

stargaser · 2022-07-01T21:34:11Z

@scoobies mark ready

Carreau and others added 30 commits May 4, 2022 10:27

start

a0739bb

edit papyri.rst test

c941a8a

Merge pull request #1 from carvalhocamille/2022

5697e4b

edit papyri.rst test

updates

237eefa

updates

631189b

misc

6366b99

Merge remote-tracking branch 'github/2022' into 2022

916d6b3

small edits

f1df383

edits rst intro/motiv

00d94e8

Merge pull request #2 from carvalhocamille/2022

1643ba3

small edits abstract

Merge pull request #3 from carvalhocamille/edits051022

d609b57

Edits051022

syn

06303b2

Merge remote-tracking branch 'github/2022' into 2022

3e713df

add ping

8086bf5

update

dd6e35b

more

008caed

edits intro

2747f92

edits intro and section 1 (need 1.3 to be edited still)

cfad97d

Merge pull request #4 from carvalhocamille/editsCC0516

b089e61

Edits cc0516

typoes

c0af4e5

edits section 1.3

67ef338

Merge pull request #6 from carvalhocamille/section1.3_bis

96146e2

edits section 1.3

precision

f3a2370

table pblm

7d80ba2

more updates

6381f10

more updates

d782e16

edits papyri solution part

c70fe00

new edits on IRD files

d89f0d5

Merge pull request #7 from carvalhocamille/edits_train_0522

d70a1b2

edits papyri solution part

syntax

de05236

fcollonval mentioned this pull request Jun 9, 2022

Weekly Team Meetings: Jan-Jun 2022 jupyterlab/frontends-team-compass#135

Closed

full read and typos grammar edits

b7ec884

scoobies added pending-comment and removed needs-more-review labels Jun 14, 2022

Carreau added 8 commits June 14, 2022 10:08

Merge pull request #11 from carvalhocamille/CC_typos_edits

9d485c4

full read and typos grammar edits

rework section on IRD file installation

b63ac3e

major rework on current implementation section

b4b7545

Extend on network graph

5e6c590

Add multiple reference reference and citations

b1129c7

finish citation

80a48d1

carvalhocamille and others added 2 commits June 16, 2022 15:38

full check

f18bc04

Merge pull request #12 from carvalhocamille/last_spell_check

a24c009

full check

wd15 approved these changes Jun 27, 2022

karthikmurugadoss approved these changes Jun 27, 2022

stargaser reviewed Jun 29, 2022

Carreau commented Jun 30, 2022

Update README.md

2d6bb04

scoobies added ready-for-review and removed pending-comment labels Jul 1, 2022

deniederhut merged commit 219b06f into scipy-conference:2022 Jul 6, 2022
