Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Automatically dedent docstring constants by default #81283

Closed
gpshead opened this issue May 30, 2019 · 21 comments
Closed

Automatically dedent docstring constants by default #81283

gpshead opened this issue May 30, 2019 · 21 comments
Labels
interpreter-core (Objects, Python, Grammar, and Parser dirs) type-feature A feature request or enhancement

Comments

@gpshead
Copy link
Member

gpshead commented May 30, 2019

BPO 37102
Nosy @gpshead, @methane, @Carreau, @pablogsal, @remilapeyre

Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

Show more details

GitHub fields:

assignee = None
closed_at = None
created_at = <Date 2019-05-30.18:00:00.547>
labels = ['interpreter-core', 'type-feature', '3.9']
title = 'Automatically dedent docstring constants by default'
updated_at = <Date 2019-06-06.14:36:45.161>
user = 'https://github.com/gpshead'

bugs.python.org fields:

activity = <Date 2019-06-06.14:36:45.161>
actor = 'remi.lapeyre'
assignee = 'none'
closed = False
closed_date = None
closer = None
components = ['Interpreter Core']
creation = <Date 2019-05-30.18:00:00.547>
creator = 'gregory.p.smith'
dependencies = []
files = []
hgrepos = []
issue_num = 37102
keywords = []
message_count = 4.0
messages = ['343990', '344128', '344801', '344819']
nosy_count = 5.0
nosy_names = ['gregory.p.smith', 'methane', 'mbussonn', 'pablogsal', 'remi.lapeyre']
pr_nums = []
priority = 'normal'
resolution = None
stage = None
status = 'open'
superseder = None
type = 'enhancement'
url = 'https://bugs.python.org/issue37102'
versions = ['Python 3.9']

Linked PRs

@gpshead
Copy link
Member Author

gpshead commented May 30, 2019

I'm spawning this issue of as a separate feature from https://bugs.python.org/issue36906 (adding string dedent method and an optimization to do it at compile timer on constants).

It'd be great if docstrings were given a similar treatment. Right now we carry the whitespace burden within the str constants for __doc__ stored in all code objects in the process. This adds up.

This is not _quite_ the same as calling textwrap.dedent() or the upcoming str.dedent method on them at compile time. We need to special case the first line of docstrings. inspect.getdoc() is our library function that does this for us today.

Chance of breaking things with this change? not impossible, but extremely minor. Something using docstrings as data _and_ depending on leading indentation whitespace preservation would not like this. I am not aware of anything that would ever do that.

@gpshead gpshead added 3.9 only security fixes interpreter-core (Objects, Python, Grammar, and Parser dirs) type-feature A feature request or enhancement labels May 30, 2019
@remilapeyre
Copy link
Mannequin

remilapeyre mannequin commented May 31, 2019

Hi, I'm working on a PR. It should be ready in a couple of days. It's more involved than what I thought as to avoid importing inspect during compilation I will probably need to port cleandoc() in C.

@methane
Copy link
Member

methane commented Jun 6, 2019

How about do inspect.cleandoc() instead of just a dedent?
Some method has docstring like this:

"""First line

blah blah blah blah

    example code here
...
"""

In such docstring, dedent can not strip indent well.

There is existing attempt (in Japanese):

https://qiita.com/hhatto/items/3da6c6820817395f2c39#%E6%84%9A%E7%9B%B4%E3%81%AB%E5%AE%9F%E8%A3%85%E3%81%97%E3%81%9F%E3%82%B3%E3%83%BC%E3%83%89

@remilapeyre
Copy link
Mannequin

remilapeyre mannequin commented Jun 6, 2019

This is the function I inlined and as far as I can tell, my approach as been similar to the one you linked.

I'm still need to fix some issues as doctest was expecting to find the string before dedenting though.

@ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
@methane methane removed the 3.9 only security fixes label Jan 12, 2023
@methane
Copy link
Member

methane commented Jul 4, 2023

cleandoc is not idempotent. If we cleandoc on compile time, pydoc and inspect.getdoc() shouldn't cleandoc(doc).

And if user write to obj.__doc__, user need to cleandoc. Otherwise, pydoc will have broken layout.
This is backward incompatibile change.

import inspect

s = """
  first line
    second line
      third line
"""

while True:
    print('---')
    print(s)
    t = inspect.cleandoc(s)
    if t == s:
        break
    s = t

print('---')

output:

---

  first line
    second line
      third line

---
first line
  second line
    third line
---
first line
second line
  third line
---

@methane
Copy link
Member

methane commented Jul 4, 2023

In the PR, compile time cleandoc doesn't remove leading and trailing newlines.

code:

import inspect

def foo():
    """
    first line
      second
        third
    """

print('---')
print(foo.__doc__)
print('---')
print(inspect.getdoc(foo))
print('---')

current behavior:

---

    first line
      second
        third

---
first line
  second
    third
---

c-cleandoc (#106066)

---

first line
  second
    third

---
first line
  second
    third
---

By this way, I think the PR is minimize incompatibility.
Only one small change to docstring is needed to pass test suite.

https://github.com/python/cpython/pull/106411/files#diff-2752e009dea2759fb0a5a0866446e8bbd3f5ed30c030b26d252d7a1178b413e7R1290

methane added a commit that referenced this issue Jul 15, 2023
Co-authored-by: Éric <merwok@netwok.org>
@methane methane closed this as completed Jul 15, 2023
hroncok added a commit to hroncok/oauthlib that referenced this issue Oct 24, 2023
Since Python 3.13.0a1, docstrings are automatically dedented.
See python/cpython#81283
and https://docs.python.org/3.13/whatsnew/3.13.html#other-language-changes

As a result, using a docstring with leading space as a test case
breaks the test assumption.

The initial commit which introduced this test a decade ago
(6c0c791)
does not specify why testing the spaces is important.
@hroncok
Copy link
Contributor

hroncok commented Oct 24, 2023

Chance of breaking things with this change? not impossible, but extremely minor. Something using docstrings as data and depending on leading indentation whitespace preservation would not like this. I am not aware of anything that would ever do that.

Tests often do that.

So far I saw:

_________________________________ test_excepts _________________________________

    def test_excepts():
        # These are descriptors, make sure this works correctly.
        assert excepts.__name__ == 'excepts'
>       assert (
            'A wrapper around a function to catch exceptions and\n'
            '    dispatch to a handler.\n'
        ) in excepts.__doc__
E       AssertionError: assert 'A wrapper around a function to catch exceptions and\n    dispatch to a handler.\n' in 'A wrapper around a function to catch exceptions and\ndispatch to a handler.\n\nThis is like a functional try/except block, in the same way that\nifexprs are functional if/else blocks.\n\nExamples\n--------\n>>> excepting = excepts(\n...     ValueError,\n...     lambda a: [1, 2].index(a),\n...     lambda _: -1,\n... )\n>>> excepting(1)\n0\n>>> excepting(3)\n-1\n\nMultiple exceptions and default except clause.\n>>> excepting = excepts((IndexError, KeyError), lambda a: a[0])\n>>> excepting([])\n>>> excepting([1])\n1\n>>> excepting({})\n>>> excepting({0: 1})\n1\n'
E        +  where 'A wrapper around a function to catch exceptions and\ndispatch to a handler.\n\nThis is like a functional try/except block, in the same way that\nifexprs are functional if/else blocks.\n\nExamples\n--------\n>>> excepting = excepts(\n...     ValueError,\n...     lambda a: [1, 2].index(a),\n...     lambda _: -1,\n... )\n>>> excepting(1)\n0\n>>> excepting(3)\n-1\n\nMultiple exceptions and default except clause.\n>>> excepting = excepts((IndexError, KeyError), lambda a: a[0])\n>>> excepting([])\n>>> excepting([1])\n1\n>>> excepting({})\n>>> excepting({0: 1})\n1\n' = excepts.__doc__

toolz/tests/test_functoolz.py:741: AssertionError

(toolz)

And:

________________________ UtilsTests.test_filter_params _________________________

self = <tests.oauth1.rfc5849.test_utils.UtilsTests testMethod=test_filter_params>

    def test_filter_params(self):
    
        # The following is an isolated test function used to test the filter_params decorator.
        @filter_params
        def special_test_function(params, realm=None):
            """ I am a special test function """
            return 'OAuth ' + ','.join(['='.join([k, v]) for k, v in params])
    
        # check that the docstring got through
>       self.assertEqual(special_test_function.__doc__, " I am a special test function ")
E       AssertionError: 'I am a special test function ' != ' I am a special test function '
E       - I am a special test function 
E       +  I am a special test function 
E       ? +

tests/oauth1/rfc5849/test_utils.py:60: AssertionError

(oauthlib)

auvipy pushed a commit to oauthlib/oauthlib that referenced this issue Oct 25, 2023
Since Python 3.13.0a1, docstrings are automatically dedented.
See python/cpython#81283
and https://docs.python.org/3.13/whatsnew/3.13.html#other-language-changes

As a result, using a docstring with leading space as a test case
breaks the test assumption.

The initial commit which introduced this test a decade ago
(6c0c791)
does not specify why testing the spaces is important.
@hroncok
Copy link
Contributor

hroncok commented Oct 26, 2023

======================================================================
FAIL: test_asReStructuredText_empty_with_multiline_docstring (zope.interface.tests.test_document.Test_asReStructuredText.test_asReStructuredText_empty_with_multiline_docstring)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/builddir/build/BUILDROOT/python-zope-interface-6.0-2.fc40.x86_64/usr/lib64/python3.13/site-packages/zope/interface/tests/test_document.py", line 298, in test_asReStructuredText_empty_with_multiline_docstring
    self.assertEqual(self._callFUT(IEmpty), EXPECTED)
AssertionError: '``IE[39 chars]n \n It can be used to annotate any class or o[66 chars]\n\n' != '``IE[39 chars]n \n             It can be used to annotate an[90 chars]\n\n'
  ``IEmpty``
  
   This is an empty interface.
   
-  It can be used to annotate any class or object, because it promises
+              It can be used to annotate any class or object, because it promises
? ++++++++++++
-  nothing.
+              nothing.
  
   Attributes:
  
   Methods:
  


======================================================================
FAIL: test_asStructuredText_empty_with_multiline_docstring (zope.interface.tests.test_document.Test_asStructuredText.test_asStructuredText_empty_with_multiline_docstring)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/builddir/build/BUILDROOT/python-zope-interface-6.0-2.fc40.x86_64/usr/lib64/python3.13/site-packages/zope/interface/tests/test_document.py", line 74, in test_asStructuredText_empty_with_multiline_docstring
    self.assertEqual(self._callFUT(IEmpty), EXPECTED)
AssertionError: 'IEmp[35 chars]n \n It can be used to annotate any class or o[66 chars]\n\n' != 'IEmp[35 chars]n \n             It can be used to annotate an[90 chars]\n\n'
  IEmpty
  
   This is an empty interface.
   
-  It can be used to annotate any class or object, because it promises
+              It can be used to annotate any class or object, because it promises
? ++++++++++++
-  nothing.
+              nothing.
  
   Attributes:
  
   Methods:
  


----------------------------------------------------------------------
Ran 1355 tests in 0.391s

(zope-interface)

@hroncok
Copy link
Contributor

hroncok commented Nov 21, 2023

Even scipy:

________________________________ test_decorator ________________________________
[gw1] linux -- Python 3.13.0 /usr/bin/python3

    @pytest.mark.skipif(DOCSTRINGS_STRIPPED, reason="docstrings stripped")
    def test_decorator():
        with suppress_warnings() as sup:
            sup.filter(category=DeprecationWarning)
            # with unindentation of parameters
            decorator = doccer.filldoc(doc_dict, True)
    
            @decorator
            def func():
                """ Docstring
                %(strtest3)s
                """
>           assert_equal(func.__doc__, """ Docstring
                Another test
                   with some indent
                """)
E           AssertionError: 
E           Items are not equal:
E            ACTUAL: 'Docstring\nAnother test\n   with some indent\n'
E            DESIRED: ' Docstring\n            Another test\n               with some indent\n            '

scipy/misc/tests/test_doccer.py:86: AssertionError

@webknjaz
Copy link
Contributor

Does this affect Sphinx's API generation from docstrings? Has anyone checked? It appears to me that it could break syntaxes like doctests or example code blocks that should be indented in RST which is what often ends up in docstrings.

@methane
Copy link
Member

methane commented Jan 30, 2024

Does this affect Sphinx's API generation from docstrings? Has anyone checked? It appears to me that it could break syntaxes like doctests or example code blocks that should be indented in RST which is what often ends up in docstrings.

This PR doesn't remove all indent. This PR remoes only common indent.
Sphinx uses inspect.getdoc() and inspect.getdoc() uses inspect.cleandoc().
This PR doesn't break it because inspect.cleandoc(s) == inspect.cleandoc(compile_time_remove_indent(s)) is guaranteed.

@webknjaz
Copy link
Contributor

Ah, great! Thanks for clarifying. I wasn't sure by looking at that PR.

@domdfcoding
Copy link
Contributor

Was it intended that with this change tabs are now automatically converted into 8 spaces in __doc__, where previously they were left alone.

For example:

def f():
	"""
	hello
		world
	"""

print(repr(f.__doc__))

Python 3.12: '\n\thello\n\t\tworld\n\t'

Python 3.13: '\nhello\n world\n'

Previously I could call __doc__.expandtabs to get the desired tab size.

@methane
Copy link
Member

methane commented Feb 6, 2024

Was it intended that with this change tabs are now automatically converted into 8 spaces in __doc__, where previously they were left alone.

Yes it is intended. I intended almost ident to inspect.cleandoc() and inspect.getdoc().
Is __doc__ incompatible to inspect.cleandoc() important use case?

penguinpee added a commit to penguinpee/nibabel that referenced this issue Apr 1, 2024
- Dedent docstrings in Python 3.13+
- Fix nipy#1311
- Ref: python/cpython#81283
frenzymadness added a commit to frenzymadness/ipywidgets that referenced this issue Jun 18, 2024
Python compiler newly removes indent from docstrings
python/cpython#81283
martinRenou pushed a commit to frenzymadness/ipywidgets that referenced this issue Aug 21, 2024
Python compiler newly removes indent from docstrings
python/cpython#81283
martinRenou pushed a commit to jupyter-widgets/ipywidgets that referenced this issue Aug 21, 2024
Python compiler newly removes indent from docstrings
python/cpython#81283
lelit added a commit to lelit/pglast that referenced this issue Sep 30, 2024
I think the test failure may be caused by recent Python dedenting
docstrings (see python/cpython#81283).
jtniehof added a commit to jtniehof/spacepy that referenced this issue Oct 14, 2024
potiuk added a commit to potiuk/airflow that referenced this issue Oct 31, 2024
In Python 3.13 compiler, docstrings are automatically dedented. And
while Python 3.13 is not yet officially supported by Airflow for
some strange (and unknown) reasons, when you install breeze with
uv tools that have Python 3.13 installed, it might get into
a state that the docstrings are dedented in Breeze's bytecode and
it causes hash calculation for configuration options that have
docstring-generated help differs on different installations.

In order to make it consistent - we are now always dedenting the
help strings before hash calculation.

See python/cpython#81283
potiuk added a commit to apache/airflow that referenced this issue Oct 31, 2024
In Python 3.13 compiler, docstrings are automatically dedented. And
while Python 3.13 is not yet officially supported by Airflow for
some strange (and unknown) reasons, when you install breeze with
uv tools that have Python 3.13 installed, it might get into
a state that the docstrings are dedented in Breeze's bytecode and
it causes hash calculation for configuration options that have
docstring-generated help differs on different installations.

In order to make it consistent - we are now always dedenting the
help strings before hash calculation.

See python/cpython#81283
drsteve pushed a commit to spacepy/spacepy that referenced this issue Nov 1, 2024
ellisms pushed a commit to ellisms/airflow that referenced this issue Nov 13, 2024
In Python 3.13 compiler, docstrings are automatically dedented. And
while Python 3.13 is not yet officially supported by Airflow for
some strange (and unknown) reasons, when you install breeze with
uv tools that have Python 3.13 installed, it might get into
a state that the docstrings are dedented in Breeze's bytecode and
it causes hash calculation for configuration options that have
docstring-generated help differs on different installations.

In order to make it consistent - we are now always dedenting the
help strings before hash calculation.

See python/cpython#81283
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
interpreter-core (Objects, Python, Grammar, and Parser dirs) type-feature A feature request or enhancement
Projects
None yet
Development

No branches or pull requests

7 participants