Please transform NumpyDoc/ RestructuredText docstrings to nice tooltips to support useful docstrings across VS Code, Jupyter and Sphinx #5363

Closed
judej opened this issue Jan 17, 2024 Discussed in #2677 · 14 comments
Labels
docstrings enhancement New feature or request fixed in next version (main) A fix has been implemented and will appear in an upcoming version

Comments

@judej
Contributor

judej commented Jan 17, 2024

Discussed in #2677

Originally posted by MarcSkovMadsen January 26, 2022
Hi Pyright.

I believe you provide the tooltip formatting for Python files in VS Code. So here goes 😄

I am a VS Code/ Pylance/ Pyright user and a contributor to the HoloViz ecosystem, especially Panel. Panel builds on Param, which provides parameters for your Python classes, similar to dataclasses, attrs, traits, traitlets, pydantic, Django models etc. Param is especially well suited for building reactive GUI applications.

I started out wanting to improve the development experience for the HoloViz ecosystem in VS Code. VS Code does not understand param.Parameterized classes. After some discussions, I figured I needed to autogenerate stub files. I experimented with that in holoviz/panel#3132. It's now at a level where I can autogenerate stubs for Panel.

But then I realized that tooltips in VS Code only format docstrings to a limited extent, and they clearly format Markdown docstrings better than RestructuredText docstrings.

This makes it hard to provide docstrings that work great in VS Code, IPython/ Jupyter and Sphinx. For Sphinx, the HoloViz ecosystem uses NumpyDoc/ RestructuredText based docstrings, which are not well supported as tooltips in VS Code.

RestructuredText module docstring example

Tooltips in VS Code cannot format RestructuredText hyperlinks, code blocks, figure blocks etc., which makes a module docstring like the attached one not very useful in VS Code.

panel_init.csv (rename to .rst before use)

panel_init_rst.mp4

Here I would expect to have clickable hyperlinks, well-formatted code blocks and some nice .gifs displayed. But I don't.

Markdown module docstring example

Tooltips in VS Code can format Markdown hyperlinks, code blocks, figure blocks etc., which makes a module docstring like the attached one very useful in VS Code. But it is not very useful for a Python package that is based on NumpyDoc/ RestructuredText and uses Sphinx to autogenerate documentation.

panel_init.md

panel_init_markdown.mp4

Solution

Please support nice tooltips based on NumpyDoc/ RestructuredText docstrings. For me, a minimum useful solution would be to add support for hyperlinks, code blocks and figure blocks. Then I would be able to create documentation that is useful for both VS Code and Sphinx.
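To make the request concrete, here is a minimal sketch of what converting two of the requested constructs (inline hyperlinks and `code-block` directives) to Markdown could look like. It is only an illustration: `rst_to_markdown` is a hypothetical helper, it covers a tiny subset of reST, and it is not how Pylance implements anything.

```python
import re
import textwrap

# `Link text <https://example.org>`_  ->  [Link text](https://example.org)
_LINK = re.compile(r"`([^`<]+?)\s*<([^>]+)>`__?")
_CODE_DIRECTIVE = re.compile(r"^\.\. code(?:-block)?::\s*(\w*)\s*$")
_FENCE = "`" * 3  # Markdown code fence, built here to avoid nesting fences


def rst_to_markdown(text: str) -> str:
    """Convert a small, illustrative subset of reST to Markdown."""
    lines = text.splitlines()
    out = []
    i = 0
    while i < len(lines):
        match = _CODE_DIRECTIVE.match(lines[i])
        if match:
            # The directive body is every following blank or indented line.
            i += 1
            body = []
            while i < len(lines) and (not lines[i].strip() or lines[i][0] in " \t"):
                body.append(lines[i])
                i += 1
            code = textwrap.dedent("\n".join(body)).strip("\n")
            out.extend([_FENCE + match.group(1), code, _FENCE])
            continue
        # Outside code blocks, rewrite inline hyperlinks.
        out.append(_LINK.sub(r"[\1](\2)", lines[i]))
        i += 1
    return "\n".join(out)
```

Figure blocks would need similar handling (mapping the directive to a Markdown image), which is left out here to keep the sketch short.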

@judej judej added enhancement New feature or request needs decision Do we want this enhancement? docstrings labels Jan 17, 2024
@github-actions github-actions bot added the needs repro Issue has not been reproduced yet label Jan 17, 2024
@luabud
Member

luabud commented Jan 17, 2024

we just converted this back to an issue to add it to our roadmap

@riziles

riziles commented Jan 18, 2024

@judej and @luabud, FYI: the excellent Executable Books organization has built an RST-to-Markdown converter in Python: https://github.com/executablebooks/rst-to-myst. They maintain both Python and TypeScript versions for many parts of their ecosystem, but I'm not sure if there is a TypeScript version of that one. @rowanc1 or @fwkoch might have more info.

@rchiodo
Contributor

rchiodo commented May 2, 2024

One thought for implementation was to translate all docstrings into some type of AST and then translate the AST into markdown.
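As a rough illustration of that idea, the sketch below defines a tiny docstring AST and renders it to markdown. The node types (`Section`, `ListItem`, `Paragraph`) and the output style are assumptions made for this example, not the model that was actually implemented.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class Paragraph:
    text: str


@dataclass
class ListItem:
    name: str
    description: str


@dataclass
class Section:
    title: str
    children: List[Union[Paragraph, ListItem]] = field(default_factory=list)


def to_markdown(sections: List[Section]) -> str:
    """Render the hypothetical AST as markdown, one bold title per section."""
    lines = []
    for section in sections:
        lines.append(f"**{section.title}**")
        lines.append("")
        for child in section.children:
            if isinstance(child, ListItem):
                lines.append(f"- `{child.name}`: {child.description}")
            else:
                lines.append(child.text)
        lines.append("")
    return "\n".join(lines).rstrip() + "\n"
```

The appeal of this split is that each docstring flavor only needs its own front end that produces the AST, while the markdown renderer is written once.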

I looked into using tree-sitter for this. Seems there is a tree-sitter grammar for restructured text.

It does an okay job of generating a nice AST for say a numpy doc string:

Image

In this example, the sections are for the Parameters and the Returns. Each parameter is a list item.

The problem with this approach is shipping the tree-sitter grammar.

  1. There's no npm module for this grammar. We'd have to build the .node entries for the different OSes ourselves.
  2. This is just one grammar for restructured text. We'd have to create other grammars for google doc and Epytext and build and ship those too.
  3. The other grammars may not generate the same AST.

@rchiodo
Contributor

rchiodo commented May 2, 2024

Another idea would be to run python code to do the conversion. The example mentioned above was this:
https://github.com/executablebooks/rst-to-myst

That has a number of problems though:

  1. It doesn't convert to markdown, but to MyST markdown, which VS Code can't render. At least the original poster's example that I tried didn't render.
  2. It has a lot of dependencies. We'd have to ship these in order for this to work everywhere.
  3. It of course has to run python code to do this. Not really fast enough if you want to do this conversion on hover.

There's also pandoc, but that's GPL and would require shipping and running a separate executable. There's also docutils, but that has the same problems as any other Python-based conversion.

@rchiodo
Contributor

rchiodo commented May 2, 2024

Another idea is to do the conversion ourselves.

This has one major problem: the number of cases to handle. restructuredText is rather large. It's unlikely we'd handle all of the cases seen in panel_init.csv in a short amount of time.

You might argue that, without direct conversion to markdown, the tree-sitter solution suffers from the same problem. We'd still need to handle all of the possible AST nodes and convert them to markdown.

@rchiodo
Contributor

rchiodo commented May 2, 2024

Another idea is to use some other npm modules. I found this one that kind of does what tree-sitter does:

https://github.com/seikichi/restructured

The problem with that module is how out of date it is. Last serious commit was 8 years ago.

@rchiodo
Contributor

rchiodo commented May 2, 2024

This looks promising:
https://peggyjs.org/documentation.html

Need to figure out how hard it is to generate a grammar, or whether somebody already has a grammar for reST.

@rchiodo
Contributor

rchiodo commented May 3, 2024

Turns out generating a grammar is rather difficult. Essentially we start getting back to what https://github.com/seikichi/restructured has (but with a different AST generated). PEG grammars aren't particularly good at non-greedy matching, so you have to do custom things to get, say, the header sections to work out. They're also terrible at handling errors: if the docstrings aren't perfect, parsing just fails.

Maybe I'm going about this wrong. Perhaps I should create the object model (or AST first) and then figure out different ways to generate it.
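For comparison, a hand-written, line-based scanner can degrade gracefully where a strict grammar would fail. The sketch below is an assumption about what such a scanner might look like, not code from this project: it treats a non-blank line followed by an underline of repeated punctuation (at least as long as the title) as a section header and leaves everything else as plain text.

```python
from typing import List, Tuple

_UNDERLINE_CHARS = set("=-~^#*+")


def is_underline(line: str, title: str) -> bool:
    """True if `line` is a run of one punctuation char covering the title."""
    stripped = line.strip()
    return (
        len(stripped) >= 2
        and stripped[0] in _UNDERLINE_CHARS
        and set(stripped) == {stripped[0]}
        and len(stripped) >= len(title.strip())
    )


def split_sections(text: str) -> List[Tuple[str, List[str]]]:
    """Split text into (title, body_lines) pairs; the first title is ''."""
    lines = text.splitlines()
    sections: List[Tuple[str, List[str]]] = [("", [])]
    i = 0
    while i < len(lines):
        line = lines[i]
        nxt = lines[i + 1] if i + 1 < len(lines) else ""
        if line.strip() and is_underline(nxt, line):
            sections.append((line.strip(), []))
            i += 2  # skip the title and its underline
            continue
        # Anything that is not a well-formed header stays as body text.
        sections[-1][1].append(line)
        i += 1
    return sections
```

A header with a too-short underline is not recognized here, but the scanner keeps it as ordinary text instead of aborting, so the rest of the docstring can still be rendered.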

@rchiodo
Contributor

rchiodo commented May 6, 2024

GitHub markdown (the opposite side) can be parsed using https://github.com/commonmark/commonmark.js, which looks like it's not based on a generic parser/grammar combo, but rather on a custom parser written in JavaScript.

@judej judej removed the needs decision Do we want this enhancement? label May 8, 2024
@rchiodo
Contributor

rchiodo commented May 29, 2024

See this query for other docstring related issues that might be resolved by this change:
https://github.com/orgs/microsoft/projects/145/views/1?filterQuery=is%3Aopen+docstring+

@rchiodo
Contributor

rchiodo commented May 29, 2024

Current status

  • Using web-tree-sitter and the grammar defined here: https://github.com/stsewd/tree-sitter-rst
  • Parses most restructured text but has some issues
  • A LOT of the docstrings out there are not valid restructured text or are indented incorrectly

Example:

def foo(a:int,b:float,c:str):
    """
    Parameters
    ----------
        a (int): integer number.
        b (float): description that takes
            more than 1 line in docstr
        c (str): word.
    """
    pass

Creates an AST like so:

Image

But what the user likely intended is this:

Image

Which is rather easy to turn into markdown. The first one not so much.

So there's a lot of work to get an AST that's what the user expects.
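One possible piece of that work is normalizing indentation before parsing. The sketch below is a hypothetical pre-processing step (not the project's actual approach): it dedents the body under each underlined section header so that over-indented parameter lists start at column zero.

```python
import inspect
import re
import textwrap

# A line made of one repeated punctuation character, e.g. "----------".
_UNDERLINE = re.compile(r"^([=\-~^#*+])\1+\s*$")


def normalize_section_indent(docstring: str) -> str:
    """Dedent the body of every underlined section in a docstring."""
    lines = inspect.cleandoc(docstring).splitlines()
    out = []
    body = []

    def flush():
        # Remove the common leading whitespace from the collected body.
        if body:
            out.extend(textwrap.dedent("\n".join(body)).splitlines())
            body.clear()

    i = 0
    while i < len(lines):
        is_header = (
            i + 1 < len(lines)
            and lines[i].strip()
            and _UNDERLINE.match(lines[i + 1])
        )
        if is_header:
            flush()
            out.extend(lines[i : i + 2])  # keep title and underline as-is
            i += 2
        else:
            body.append(lines[i])
            i += 1
    flush()
    return "\n".join(out)
```

Applied to the `foo` docstring above, the three parameter lines end up flush left while the continuation line keeps its relative indent, which is closer to the structure the author presumably intended.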

However for those things that actually have well formatted restructured text (like numpy), the results are pretty nice:

Image

@rchiodo
Contributor

rchiodo commented May 29, 2024

Panel works pretty well too:

Image

Image

@microsoft microsoft deleted a comment from heejaechang May 29, 2024
@rchiodo rchiodo added the fixed in next version (main) A fix has been implemented and will appear in an upcoming version label Jun 12, 2024
@rchiodo
Contributor

rchiodo commented Jun 14, 2024

This issue has been fixed in prerelease version 2024.6.100, which we've just released. You can find the changelog here: CHANGELOG.md

@rchiodo rchiodo closed this as completed Jun 14, 2024
@rchiodo
Contributor

rchiodo commented Jun 14, 2024

The fix for this issue is behind an experimental feature flag:

"python.analysis.supportRestructuredText": true

If you want to try out our new restructuredText support, enable this flag. It's behind a flag at the moment until we can make sure it handles all the possible Sphinx/GoogleDoc/Epytext scenarios that customers need. Please log additional issues if this setting isn't working out for you.

Thanks :)

5 participants