Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Automatically dedent docstring constants by default #81283

Closed
gpshead opened this issue May 30, 2019 · 21 comments
Closed

Automatically dedent docstring constants by default #81283

gpshead opened this issue May 30, 2019 · 21 comments
Labels
interpreter-core (Objects, Python, Grammar, and Parser dirs) type-feature A feature request or enhancement

Comments

@gpshead
Copy link
Member

gpshead commented May 30, 2019

BPO 37102
Nosy @gpshead, @methane, @Carreau, @pablogsal, @remilapeyre

Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

Show more details

GitHub fields:

assignee = None
closed_at = None
created_at = <Date 2019-05-30.18:00:00.547>
labels = ['interpreter-core', 'type-feature', '3.9']
title = 'Automatically dedent docstring constants by default'
updated_at = <Date 2019-06-06.14:36:45.161>
user = 'https://github.com/gpshead'

bugs.python.org fields:

activity = <Date 2019-06-06.14:36:45.161>
actor = 'remi.lapeyre'
assignee = 'none'
closed = False
closed_date = None
closer = None
components = ['Interpreter Core']
creation = <Date 2019-05-30.18:00:00.547>
creator = 'gregory.p.smith'
dependencies = []
files = []
hgrepos = []
issue_num = 37102
keywords = []
message_count = 4.0
messages = ['343990', '344128', '344801', '344819']
nosy_count = 5.0
nosy_names = ['gregory.p.smith', 'methane', 'mbussonn', 'pablogsal', 'remi.lapeyre']
pr_nums = []
priority = 'normal'
resolution = None
stage = None
status = 'open'
superseder = None
type = 'enhancement'
url = 'https://bugs.python.org/issue37102'
versions = ['Python 3.9']

Linked PRs

@gpshead
Copy link
Member Author

gpshead commented May 30, 2019

I'm spawning this issue of as a separate feature from https://bugs.python.org/issue36906 (adding string dedent method and an optimization to do it at compile timer on constants).

It'd be great if docstrings were given a similar treatment. Right now we carry the whitespace burden within the str constants for __doc__ stored in all code objects in the process. This adds up.

This is not _quite_ the same as calling textwrap.dedent() or the upcoming str.dedent method on them at compile time. We need to special case the first line of docstrings. inspect.getdoc() is our library function that does this for us today.

Chance of breaking things with this change? not impossible, but extremely minor. Something using docstrings as data _and_ depending on leading indentation whitespace preservation would not like this. I am not aware of anything that would ever do that.

@gpshead gpshead added 3.9 only security fixes interpreter-core (Objects, Python, Grammar, and Parser dirs) type-feature A feature request or enhancement labels May 30, 2019
@remilapeyre
Copy link
Mannequin

remilapeyre mannequin commented May 31, 2019

Hi, I'm working on a PR. It should be ready in a couple of days. It's more involved than what I thought as to avoid importing inspect during compilation I will probably need to port cleandoc() in C.

@methane
Copy link
Member

methane commented Jun 6, 2019

How about do inspect.cleandoc() instead of just a dedent?
Some method has docstring like this:

"""First line

blah blah blah blah

    example code here
...
"""

In such docstring, dedent can not strip indent well.

There is existing attempt (in Japanese):

https://qiita.com/hhatto/items/3da6c6820817395f2c39#%E6%84%9A%E7%9B%B4%E3%81%AB%E5%AE%9F%E8%A3%85%E3%81%97%E3%81%9F%E3%82%B3%E3%83%BC%E3%83%89

@remilapeyre
Copy link
Mannequin

remilapeyre mannequin commented Jun 6, 2019

This is the function I inlined and as far as I can tell, my approach as been similar to the one you linked.

I'm still need to fix some issues as doctest was expecting to find the string before dedenting though.

@ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
@methane methane removed the 3.9 only security fixes label Jan 12, 2023
@methane
Copy link
Member

methane commented Jul 4, 2023

cleandoc is not idempotent. If we cleandoc on compile time, pydoc and inspect.getdoc() shouldn't cleandoc(doc).

And if user write to obj.__doc__, user need to cleandoc. Otherwise, pydoc will have broken layout.
This is backward incompatibile change.

import inspect

s = """
  first line
    second line
      third line
"""

while True:
    print('---')
    print(s)
    t = inspect.cleandoc(s)
    if t == s:
        break
    s = t

print('---')

output:

---

  first line
    second line
      third line

---
first line
  second line
    third line
---
first line
second line
  third line
---

@methane
Copy link
Member

methane commented Jul 4, 2023

In the PR, compile time cleandoc doesn't remove leading and trailing newlines.

code:

import inspect

def foo():
    """
    first line
      second
        third
    """

print('---')
print(foo.__doc__)
print('---')
print(inspect.getdoc(foo))
print('---')

current behavior:

---

    first line
      second
        third

---
first line
  second
    third
---

c-cleandoc (#106066)

---

first line
  second
    third

---
first line
  second
    third
---

By this way, I think the PR is minimize incompatibility.
Only one small change to docstring is needed to pass test suite.

https://github.com/python/cpython/pull/106411/files#diff-2752e009dea2759fb0a5a0866446e8bbd3f5ed30c030b26d252d7a1178b413e7R1290

methane added a commit that referenced this issue Jul 15, 2023
@methane methane closed this as completed Jul 15, 2023
hroncok added a commit to hroncok/oauthlib that referenced this issue Oct 24, 2023
Since Python 3.13.0a1, docstrings are automatically dedented.
See python/cpython#81283
and https://docs.python.org/3.13/whatsnew/3.13.html#other-language-changes

As a result, using a docstring with leading space as a test case
breaks the test assumption.

The initial commit which introduced this test a decade ago
(6c0c791)
does not specify why testing the spaces is important.
@hroncok
Copy link
Contributor

hroncok commented Oct 24, 2023

Chance of breaking things with this change? not impossible, but extremely minor. Something using docstrings as data and depending on leading indentation whitespace preservation would not like this. I am not aware of anything that would ever do that.

Tests often do that.

So far I saw:

_________________________________ test_excepts _________________________________

    def test_excepts():
        # These are descriptors, make sure this works correctly.
        assert excepts.__name__ == 'excepts'
>       assert (
            'A wrapper around a function to catch exceptions and\n'
            '    dispatch to a handler.\n'
        ) in excepts.__doc__
E       AssertionError: assert 'A wrapper around a function to catch exceptions and\n    dispatch to a handler.\n' in 'A wrapper around a function to catch exceptions and\ndispatch to a handler.\n\nThis is like a functional try/except block, in the same way that\nifexprs are functional if/else blocks.\n\nExamples\n--------\n>>> excepting = excepts(\n...     ValueError,\n...     lambda a: [1, 2].index(a),\n...     lambda _: -1,\n... )\n>>> excepting(1)\n0\n>>> excepting(3)\n-1\n\nMultiple exceptions and default except clause.\n>>> excepting = excepts((IndexError, KeyError), lambda a: a[0])\n>>> excepting([])\n>>> excepting([1])\n1\n>>> excepting({})\n>>> excepting({0: 1})\n1\n'
E        +  where 'A wrapper around a function to catch exceptions and\ndispatch to a handler.\n\nThis is like a functional try/except block, in the same way that\nifexprs are functional if/else blocks.\n\nExamples\n--------\n>>> excepting = excepts(\n...     ValueError,\n...     lambda a: [1, 2].index(a),\n...     lambda _: -1,\n... )\n>>> excepting(1)\n0\n>>> excepting(3)\n-1\n\nMultiple exceptions and default except clause.\n>>> excepting = excepts((IndexError, KeyError), lambda a: a[0])\n>>> excepting([])\n>>> excepting([1])\n1\n>>> excepting({})\n>>> excepting({0: 1})\n1\n' = excepts.__doc__

toolz/tests/test_functoolz.py:741: AssertionError

(toolz)

And:

________________________ UtilsTests.test_filter_params _________________________

self = <tests.oauth1.rfc5849.test_utils.UtilsTests testMethod=test_filter_params>

    def test_filter_params(self):
    
        # The following is an isolated test function used to test the filter_params decorator.
        @filter_params
        def special_test_function(params, realm=None):
            """ I am a special test function """
            return 'OAuth ' + ','.join(['='.join([k, v]) for k, v in params])
    
        # check that the docstring got through
>       self.assertEqual(special_test_function.__doc__, " I am a special test function ")
E       AssertionError: 'I am a special test function ' != ' I am a special test function '
E       - I am a special test function 
E       +  I am a special test function 
E       ? +

tests/oauth1/rfc5849/test_utils.py:60: AssertionError

(oauthlib)

auvipy pushed a commit to oauthlib/oauthlib that referenced this issue Oct 25, 2023
Since Python 3.13.0a1, docstrings are automatically dedented.
See python/cpython#81283
and https://docs.python.org/3.13/whatsnew/3.13.html#other-language-changes

As a result, using a docstring with leading space as a test case
breaks the test assumption.

The initial commit which introduced this test a decade ago
(6c0c791)
does not specify why testing the spaces is important.
@hroncok
Copy link
Contributor

hroncok commented Oct 26, 2023

======================================================================
FAIL: test_asReStructuredText_empty_with_multiline_docstring (zope.interface.tests.test_document.Test_asReStructuredText.test_asReStructuredText_empty_with_multiline_docstring)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/builddir/build/BUILDROOT/python-zope-interface-6.0-2.fc40.x86_64/usr/lib64/python3.13/site-packages/zope/interface/tests/test_document.py", line 298, in test_asReStructuredText_empty_with_multiline_docstring
    self.assertEqual(self._callFUT(IEmpty), EXPECTED)
AssertionError: '``IE[39 chars]n \n It can be used to annotate any class or o[66 chars]\n\n' != '``IE[39 chars]n \n             It can be used to annotate an[90 chars]\n\n'
  ``IEmpty``
  
   This is an empty interface.
   
-  It can be used to annotate any class or object, because it promises
+              It can be used to annotate any class or object, because it promises
? ++++++++++++
-  nothing.
+              nothing.
  
   Attributes:
  
   Methods:
  


======================================================================
FAIL: test_asStructuredText_empty_with_multiline_docstring (zope.interface.tests.test_document.Test_asStructuredText.test_asStructuredText_empty_with_multiline_docstring)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/builddir/build/BUILDROOT/python-zope-interface-6.0-2.fc40.x86_64/usr/lib64/python3.13/site-packages/zope/interface/tests/test_document.py", line 74, in test_asStructuredText_empty_with_multiline_docstring
    self.assertEqual(self._callFUT(IEmpty), EXPECTED)
AssertionError: 'IEmp[35 chars]n \n It can be used to annotate any class or o[66 chars]\n\n' != 'IEmp[35 chars]n \n             It can be used to annotate an[90 chars]\n\n'
  IEmpty
  
   This is an empty interface.
   
-  It can be used to annotate any class or object, because it promises
+              It can be used to annotate any class or object, because it promises
? ++++++++++++
-  nothing.
+              nothing.
  
   Attributes:
  
   Methods:
  


----------------------------------------------------------------------
Ran 1355 tests in 0.391s

(zope-interface)

@hroncok
Copy link
Contributor

hroncok commented Nov 21, 2023

Even scipy:

________________________________ test_decorator ________________________________
[gw1] linux -- Python 3.13.0 /usr/bin/python3

    @pytest.mark.skipif(DOCSTRINGS_STRIPPED, reason="docstrings stripped")
    def test_decorator():
        with suppress_warnings() as sup:
            sup.filter(category=DeprecationWarning)
            # with unindentation of parameters
            decorator = doccer.filldoc(doc_dict, True)
    
            @decorator
            def func():
                """ Docstring
                %(strtest3)s
                """
>           assert_equal(func.__doc__, """ Docstring
                Another test
                   with some indent
                """)
E           AssertionError: 
E           Items are not equal:
E            ACTUAL: 'Docstring\nAnother test\n   with some indent\n'
E            DESIRED: ' Docstring\n            Another test\n               with some indent\n            '

scipy/misc/tests/test_doccer.py:86: AssertionError

icemac pushed a commit to zopefoundation/zope.interface that referenced this issue Dec 20, 2023
In Python 3.13, compiler strips indents from docstrings.
See python/cpython#81283

Fixes: #279
@Carreau
Copy link
Contributor

Carreau commented Dec 31, 2023

Yep, this breaks IPython as well, as we have test testing that docstrings actually have leading spaces.

Edit: I'll see if we can workaround this as I understand why one might want to do that.

@methane
Copy link
Member

methane commented Jan 5, 2024

Do you think we should revert this change in 3.13?

@hroncok
Copy link
Contributor

hroncok commented Jan 5, 2024

If there is a plan to include it in 3.14 anyway, then probably not. This is a breaking change but I have no idea how to properly have a deprecation period.

@Carreau
Copy link
Contributor

Carreau commented Jan 5, 2024

I think this is still early enough, but the more time goes on, the more Python compile and ast lose informations that is useful for tooling. And it often would be nice to have a 2 passes tokenizer/compiler with maybe an option to not run the cleanup and normalisation passes ?

@gpshead
Copy link
Member Author

gpshead commented Jan 5, 2024

It is ideal if the impacted things can improve their tests to not rely on leading spaces in the middle of docstrings.

IMNSHO, it is still too early in 3.13 to decide if we should roll this back. So far I haven't seen any compelling examples that are not just over specific tests asserting things to be "as previously implemented" rather than code relying specifically on code indentation style/level based leading spaces being present in docstrings.

The win for the world by reducing space consumed by docstrings by default feels rather large. So lets try to see if we can keep this.


regarding future tokenizer & compiler do less or more options... other things have a desire for some of those as well, but so far it has been hard to get people interested in providing that kind of thing within CPython and the standard library as CPython won't use such things and offering such options is by its nature both slower and painful to maintain. It feels like third party Python code analysis tooling may be a better way to get that in a less-disruptive manner.

@pablogsal
Copy link
Member

This is also a compiler change which means that the tokenize and AST modules won't be affected by the optimisation. Tools can still analyse source code normally before this optimisation is applied.

@gpshead
Copy link
Member Author

gpshead commented Jan 5, 2024

This is a breaking change but I have no idea how to properly have a deprecation period.

agreed. it'd be hard to do that in a meaningful manner.

There are plausible designs to retain strict indentation included compatibility while reducing memory consumption & pyc size. They'd add implementation complexity and could cpu usage as a tradeoff. Example: Storing docstrings compressed and transparently decompressing them upon .__doc__ access.

@webknjaz
Copy link
Contributor

Does this affect Sphinx's API generation from docstrings? Has anyone checked? It appears to me that it could break syntaxes like doctests or example code blocks that should be indented in RST which is what often ends up in docstrings.

@methane
Copy link
Member

methane commented Jan 30, 2024

Does this affect Sphinx's API generation from docstrings? Has anyone checked? It appears to me that it could break syntaxes like doctests or example code blocks that should be indented in RST which is what often ends up in docstrings.

This PR doesn't remove all indent. This PR remoes only common indent.
Sphinx uses inspect.getdoc() and inspect.getdoc() uses inspect.cleandoc().
This PR doesn't break it because inspect.cleandoc(s) == inspect.cleandoc(compile_time_remove_indent(s)) is guaranteed.

@webknjaz
Copy link
Contributor

Ah, great! Thanks for clarifying. I wasn't sure by looking at that PR.

@domdfcoding
Copy link
Contributor

Was it intended that with this change tabs are now automatically converted into 8 spaces in __doc__, where previously they were left alone.

For example:

def f():
	"""
	hello
		world
	"""

print(repr(f.__doc__))

Python 3.12: '\n\thello\n\t\tworld\n\t'

Python 3.13: '\nhello\n world\n'

Previously I could call __doc__.expandtabs to get the desired tab size.

@methane
Copy link
Member

methane commented Feb 6, 2024

Was it intended that with this change tabs are now automatically converted into 8 spaces in __doc__, where previously they were left alone.

Yes it is intended. I intended almost ident to inspect.cleandoc() and inspect.getdoc().
Is __doc__ incompatible to inspect.cleandoc() important use case?

penguinpee added a commit to penguinpee/nibabel that referenced this issue Apr 1, 2024
- Dedent docstrings in Python 3.13+
- Fix nipy#1311
- Ref: python/cpython#81283
frenzymadness added a commit to frenzymadness/ipywidgets that referenced this issue Jun 18, 2024
Python compiler newly removes indent from docstrings
python/cpython#81283
martinRenou pushed a commit to frenzymadness/ipywidgets that referenced this issue Aug 21, 2024
Python compiler newly removes indent from docstrings
python/cpython#81283
martinRenou pushed a commit to jupyter-widgets/ipywidgets that referenced this issue Aug 21, 2024
Python compiler newly removes indent from docstrings
python/cpython#81283
lelit added a commit to lelit/pglast that referenced this issue Sep 30, 2024
I think the test failure may be caused by recent Python dedenting
docstrings (see python/cpython#81283).
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
interpreter-core (Objects, Python, Grammar, and Parser dirs) type-feature A feature request or enhancement
Projects
None yet
Development

No branches or pull requests

7 participants