Docstring Errors Examples

Galuh Sahid edited this page Feb 10, 2020 · 19 revisions

GL01: Docstring text (summary) should start in the line immediately after the opening quotes (not in the same line, or leaving a blank line in between)

Bad

def assert_categorical_equal(
    left, right, check_dtype=True, check_category_order=True, obj="Categorical"
):
    """Test that Categoricals are equivalent.

    Parameters
    ----------
    left : Categorical
    ...
    """
    pass

Good

def assert_categorical_equal(
    left, right, check_dtype=True, check_category_order=True, obj="Categorical"
):
    """
    Test that Categoricals are equivalent.

    Parameters
    ----------
    left : Categorical
    ...
    """
    pass

Read more about summary in Contributing Docstring - Section 2: Extended Summary
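
The rule can be checked mechanically from the raw `__doc__` string. The following is a simplified sketch, not the validator's actual implementation; `count_leading_blank_lines` and `violates_gl01` are hypothetical helper names, and one-line docstrings are assumed to be exempt.

```python
def count_leading_blank_lines(raw_doc):
    """Count the lines before the first non-blank line of a raw docstring."""
    count = 0
    for line in raw_doc.split("\n"):
        if line.strip():
            break
        count += 1
    return count


def violates_gl01(raw_doc):
    """Return True unless the summary starts on the line after the quotes."""
    # Assumption: a one-line docstring is exempt from this rule.
    if "\n" not in raw_doc:
        return False
    # Exactly one "empty" line (the rest of the opening-quotes line) is expected.
    return count_leading_blank_lines(raw_doc) != 1


def bad():
    """Test that Categoricals are equivalent.

    More text.
    """


def good():
    """
    Test that Categoricals are equivalent.

    More text.
    """


print(violates_gl01(bad.__doc__))   # True
print(violates_gl01(good.__doc__))  # False
```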

Possible false positives (need verification)

None:None:GL01:pandas.tseries.offsets.DateOffset.normalize:Docstring text (summary) should start in the line immediately after the opening quotes (not in the same line, or leaving a blank line in between)

Can't find this in the codebase.

GL02: Closing quotes should be placed in the line after the last text in the docstring (do not close the quotes in the same line as the text, or leave a blank line between the last text and the quotes)

Bad

def unstack():
    """
    Pivot a row index to columns.

    When using a MultiIndex, a level can be pivoted so each value in
    the index becomes a column. This is especially useful when a subindex
    is repeated for the main index, and data is easier to visualize as a
    pivot table.

    The index level will be automatically removed from the index when added
    as columns."""
    pass

Good

def unstack():
    """
    Pivot a row index to columns.

    When using a MultiIndex, a level can be pivoted so each value in
    the index becomes a column. This is especially useful when a subindex
    is repeated for the main index, and data is easier to visualize as a
    pivot table.

    The index level will be automatically removed from the index when added
    as columns.
    """
    pass
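
A rough way to test for this by hand is to count the whitespace-only lines at the end of the raw docstring: exactly one is expected (the line holding the closing quotes). This is a sketch under that assumption, with hypothetical helper names, not the validator's own code.

```python
def count_trailing_blank_lines(raw_doc):
    """Count the whitespace-only lines after the last line of text."""
    count = 0
    for line in reversed(raw_doc.split("\n")):
        if line.strip():
            break
        count += 1
    return count


def violates_gl02(raw_doc):
    """Return True unless the closing quotes sit on the line after the text."""
    # Assumption: a one-line docstring is exempt from this rule.
    if "\n" not in raw_doc:
        return False
    return count_trailing_blank_lines(raw_doc) != 1


# Raw docstring contents as Python would store them in __doc__.
closed_on_text_line = "\n    Pivot a row index to columns.\n\n    as columns."
closed_on_own_line = "\n    Pivot a row index to columns.\n\n    as columns.\n    "
blank_before_close = "\n    Pivot a row index to columns.\n\n    as columns.\n\n    "

print(violates_gl02(closed_on_text_line))  # True
print(violates_gl02(closed_on_own_line))   # False
print(violates_gl02(blank_before_close))   # True
```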

Possible false positives (need verification)

None:None:GL02:pandas.Timestamp.weekday:Closing quotes should be placed in the line after the last text in the docstring (do not close the quotes in the same line as the text, or leave a blank line between the last text and the quotes)

Notes:

  • Can't find this in the codebase.
  • These are the only GL02 reports left.

GL08: The object does not have a docstring

Read how to write a docstring here.

Bad

@property
def right(self):
    return self._data._right

Good

@property
def right(self):
    """
    Return the right endpoints of each Interval in the IntervalIndex as
    an Index.
    """
    return self._data._right
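
To find candidates in a class quickly, the members' `__doc__` attributes can be inspected directly. A minimal sketch: `Example` and `members_without_docstring` are hypothetical names used only for illustration, and only plain functions and properties defined on the class itself are considered.

```python
import inspect


def members_without_docstring(cls):
    """List public functions and properties of ``cls`` that lack a docstring."""
    missing = []
    for name, member in vars(cls).items():
        if name.startswith("_"):
            continue
        if not (isinstance(member, property) or inspect.isfunction(member)):
            continue
        # A property exposes the docstring of its getter through __doc__.
        if not (member.__doc__ or "").strip():
            missing.append(name)
    return sorted(missing)


class Example:
    """Hypothetical class used only for this illustration."""

    @property
    def right(self):
        return 1

    @property
    def left(self):
        """
        Return the left endpoint.
        """
        return 0

    def closed(self):
        return "right"


print(members_without_docstring(Example))  # ['closed', 'right']
```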

Possible false positives (need verification)

Notes

  • Most of these come from pandas/pandas/tseries/offsets.py:381, which appears to be a shared docstring, so fixing it would remove many GL08 errors.

SS06: Summary should fit in a single line

Rewriting the docstring might be required.

Bad

def duplicated(self, subset=None, keep="first"):
    """
    Return boolean Series denoting duplicate rows, optionally only
    considering certain columns.
    """
    ...

Good

def duplicated(self, subset=None, keep="first"):
    """
    Return boolean Series denoting duplicate rows.

    Optionally, only certain columns are considered.
    """
    ...
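
The summary is the first paragraph of the docstring, so the check amounts to counting the lines before the first blank line. A simplified sketch with hypothetical helper names, using `inspect.cleandoc` to normalise indentation:

```python
import inspect


def summary_lines(raw_doc):
    """Return the lines of the first paragraph of a docstring."""
    summary = []
    for line in inspect.cleandoc(raw_doc).split("\n"):
        if not line.strip():
            break
        summary.append(line)
    return summary


def violates_ss06(raw_doc):
    """Return True when the summary spans more than one line."""
    return len(summary_lines(raw_doc)) > 1


two_line_summary = """
    Return boolean Series denoting duplicate rows, optionally only
    considering certain columns.
    """
one_line_summary = """
    Return boolean Series denoting duplicate rows.

    A second paragraph belongs to the extended summary.
    """

print(violates_ss06(two_line_summary))  # True
print(violates_ss06(one_line_summary))  # False
```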

Possible false positives (need verification)

None:None:SS06:pandas.core.groupby.SeriesGroupBy.is_monotonic_increasing:Summary should fit in a single line

Can't find it in the codebase.

ES01: No extended summary found

Read more about extended summary here.

Bad

def unstack():
    """
    Pivot a row index to columns.
    """
    pass

Good

def unstack():
    """
    Pivot a row index to columns.

    When using a MultiIndex, a level can be pivoted so each value in
    the index becomes a column. This is especially useful when a subindex
    is repeated for the main index, and data is easier to visualize as a
    pivot table.

    The index level will be automatically removed from the index when added
    as columns.
    """
    pass
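
One way to think about this check: after the summary paragraph there must be at least one more paragraph that is not a section title such as "Parameters". The sketch below follows that reading; `has_extended_summary` is a hypothetical helper, and section titles are assumed to be recognisable by their dashed underline.

```python
import inspect


def has_extended_summary(raw_doc):
    """Return True when free text follows the summary paragraph."""
    lines = inspect.cleandoc(raw_doc).split("\n")
    index = 0
    # Skip the summary: everything up to the first blank line.
    while index < len(lines) and lines[index].strip():
        index += 1
    # Skip the blank lines that separate the summary from what follows.
    while index < len(lines) and not lines[index].strip():
        index += 1
    if index >= len(lines):
        return False  # nothing comes after the summary
    # A line underlined with dashes is a section title, not extended text.
    next_line = lines[index + 1].strip() if index + 1 < len(lines) else ""
    is_section_title = bool(next_line) and set(next_line) == {"-"}
    return not is_section_title


summary_only = """
    Pivot a row index to columns.
    """
summary_then_section = """
    Pivot a row index to columns.

    Parameters
    ----------
    level : int
        Level to pivot.
    """
with_extended = """
    Pivot a row index to columns.

    When using a MultiIndex, a level can be pivoted so each value in
    the index becomes a column.
    """

print(has_extended_summary(summary_only))          # False
print(has_extended_summary(summary_then_section))  # False
print(has_extended_summary(with_extended))         # True
```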

PR01: Parameters {missing_params} not documented

Bad

class Series:
    def plot(self, kind, **kwargs):
        """
        Generate a plot.

        Render the data in the Series as a matplotlib plot of the
        specified kind.

        Note the blank line between the parameters title and the first
        parameter. Also, note that after the name of the parameter `kind`
        and before the colon, a space is missing.

        Also, note that the parameter descriptions do not start with a
        capital letter, and do not finish with a dot.

        Finally, the `**kwargs` parameter is missing.

        Parameters
        ----------

        kind: str
            kind of matplotlib plot
        """
        pass

Good

We need to add **kwargs to the docstring. This example also adds a color parameter to the signature and documents it.

class Series:
    def plot(self, kind, color='blue', **kwargs):
        """
        Generate a plot.

        Render the data in the Series as a matplotlib plot of the
        specified kind.

        Parameters
        ----------
        kind : str
            Kind of matplotlib plot.
        color : str, default 'blue'
            Color name or rgb code.
        **kwargs
            These parameters will be passed to the matplotlib plotting
            function.
        """
        pass
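
The names a docstring is expected to list can be derived from the signature with `inspect.signature`. In this sketch, `signature_parameter_names` and `missing_params` are hypothetical helpers; the assumption is that `self`/`cls` are skipped and that variadic parameters are written with their `*` or `**` prefix.

```python
import inspect


def signature_parameter_names(func):
    """Return parameter names in the form a docstring should list them."""
    names = []
    for param in inspect.signature(func).parameters.values():
        if param.name in ("self", "cls"):
            continue
        if param.kind is inspect.Parameter.VAR_POSITIONAL:
            names.append("*" + param.name)
        elif param.kind is inspect.Parameter.VAR_KEYWORD:
            names.append("**" + param.name)
        else:
            names.append(param.name)
    return names


def missing_params(func, documented):
    """Return signature parameters that are absent from ``documented``."""
    return [n for n in signature_parameter_names(func) if n not in documented]


def plot(self, kind, color="blue", **kwargs):
    pass


print(missing_params(plot, ["kind"]))                       # ['color', '**kwargs']
print(missing_params(plot, ["kind", "color", "**kwargs"]))  # []
```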

PR02: Unknown parameters {unknown_params}

Bad

def astype(self, dtype, copy=True, errors="raise", **kwargs):
    """
    Cast a pandas object to a specified dtype ``dtype``.

    Parameters
    ----------
    dtype : data type, or dict of column name -> data type
        Use a numpy.dtype or Python type to cast entire pandas object to
        the same type. Alternatively, use {col: dtype, ...}, where col is a
        column label and dtype is a numpy.dtype or Python type to cast one
        or more of the DataFrame's columns to column-specific types.
    copy : bool, default True
        Return a copy when ``copy=True`` (be very careful setting
        ``copy=False`` as changes to values then may propagate to other
        pandas objects).
    errors : {'raise', 'ignore'}, default 'raise'
        Control raising of exceptions on invalid data for provided dtype.

        - ``raise`` : allow exceptions to be raised
        - ``ignore`` : suppress exceptions. On error return original object.

        .. versionadded:: 0.20.0
    kwargs : keyword arguments to pass on to the constructor

    Returns
    -------
    casted : same type as caller
    """
    ...

kwargs is not recognized as a parameter because the signature declares **kwargs. The docstring entry must be written as **kwargs.

Good

Change kwargs to **kwargs:

def astype(self, dtype, copy=True, errors="raise", **kwargs):
    """
    Cast a pandas object to a specified dtype ``dtype``.

    Parameters
    ----------
    dtype : data type, or dict of column name -> data type
        Use a numpy.dtype or Python type to cast entire pandas object to
        the same type. Alternatively, use {col: dtype, ...}, where col is a
        column label and dtype is a numpy.dtype or Python type to cast one
        or more of the DataFrame's columns to column-specific types.
    copy : bool, default True
        Return a copy when ``copy=True`` (be very careful setting
        ``copy=False`` as changes to values then may propagate to other
        pandas objects).
    errors : {'raise', 'ignore'}, default 'raise'
        Control raising of exceptions on invalid data for provided dtype.

        - ``raise`` : allow exceptions to be raised
        - ``ignore`` : suppress exceptions. On error return original object.

        .. versionadded:: 0.20.0
    **kwargs : keyword arguments to pass on to the constructor

    Returns
    -------
    casted : same type as caller
    """
    ...
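
The reverse comparison, names listed in the docstring that the signature does not know, can be sketched by reading the entries under the Parameters title. This is a simplified parser written for illustration (`documented_parameter_names` and `unknown_params` are hypothetical helpers); it assumes entries are unindented and descriptions are indented, and the expected names are passed in by hand.

```python
import inspect


def documented_parameter_names(raw_doc):
    """Collect the names listed under the ``Parameters`` section title."""
    lines = inspect.cleandoc(raw_doc).split("\n")
    if "Parameters" not in lines:
        return []
    start = lines.index("Parameters") + 2  # skip the title and its underline
    names = []
    for line in lines[start:]:
        if line and set(line) == {"-"}:
            # The previous unindented line was the next section's title.
            if names:
                names.pop()
            break
        # Parameter entries are unindented; descriptions are indented.
        if line and not line[0].isspace():
            names.append(line.split(" : ")[0].strip())
    return names


def unknown_params(raw_doc, signature_names):
    """Return documented names that do not appear in ``signature_names``."""
    return [n for n in documented_parameter_names(raw_doc) if n not in signature_names]


doc = """
    Cast a pandas object to a specified dtype.

    Parameters
    ----------
    dtype : data type
        Target type.
    kwargs : keyword arguments to pass on to the constructor

    Returns
    -------
    casted : same type as caller
    """

print(documented_parameter_names(doc))             # ['dtype', 'kwargs']
print(unknown_params(doc, ["dtype", "**kwargs"]))  # ['kwargs']
```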

Possible false positives (need verification)

pandas/pandas/plotting/_core.py:542:PR02:pandas.DataFrame.plot:Unknown parameters {'table', 'xlim', 'colormap', 'colorbar', '**kwargs', 'title', 'yticks', 'position', 'xerr', 'mark_right', 'grid', 'style', 'fontsize', 'backend', 'logy', 'y', 'ylim', 'rot', 'kind', 'legend', 'include_bool', 'loglog', 'xticks', 'logx', 'use_index', 'yerr', 'x', 'figsize'}

I'm not sure if this is a false positive or not.

None:None:PR02:pandas.core.groupby.DataFrameGroupBy.tshift:Unknown parameters {'axis', 'periods', 'freq'}

I found tshift in pandas/core/generic.py:8923 but it looks like they already have parameters?

PR07: Parameter description should finish with "."

Possible false positives

Parameters whose description ends with a deprecated marker:

    truediv : bool, optional
        Whether to use true division, like in Python >= 3.
        deprecated:: 1.0.0
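
One possible explanation, not verified against the validator: a check that ignores trailing Sphinx directives would only recognise them when written with the leading `.. `, so a bare `deprecated::` line is treated as the end of the description. The sketch below illustrates that assumption; `description_ends_with_period` is a hypothetical helper.

```python
def description_ends_with_period(description):
    """Check the last character of a description, ignoring trailing directives."""
    # Assumption: everything from a Sphinx directive marker onward is not
    # part of the sentence and should be dropped before checking.
    for marker in (".. deprecated::", ".. versionadded::", ".. versionchanged::"):
        if marker in description:
            description = description[: description.index(marker)]
    return description.rstrip().endswith(".")


with_directive = (
    "Whether to use true division, like in Python >= 3.\n\n.. deprecated:: 1.0.0"
)
without_dots = (
    "Whether to use true division, like in Python >= 3.\ndeprecated:: 1.0.0"
)

print(description_ends_with_period(with_directive))  # True
print(description_ends_with_period(without_dots))    # False
```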

SA04: Missing description for see also

Bad

@property
def categories(self):
    """
    The categories of this categorical.

    Setting assigns new values to each category (effectively a rename of
    each individual category).

    The assigned value has to be a list-like object. All items must be
    unique and the number of items in the new categories must be the same
    as the number of items in the old categories.

    Assigning to `categories` is an inplace operation!

    Raises
    ------
    ValueError
        If the new categories do not validate as categories or if the
        number of new categories is unequal to the number of old categories.

    See Also
    --------
    rename_categories
    reorder_categories
    add_categories
    remove_categories
    remove_unused_categories
    set_categories
    """
    pass

Good

def mean(self, skipna=True):
    """
    Return the mean value of the Array.

    .. versionadded:: 0.25.0

    Parameters
    ----------
    skipna : bool, default True
        Whether to ignore any NaT elements.

    Returns
    -------
    scalar
        Timestamp or Timedelta.

    See Also
    --------
    numpy.ndarray.mean : Returns the average of array elements along a given axis.
    Series.mean : Return the mean value in a Series.

    Notes
    -----
    mean is only defined for Datetime and Timedelta dtypes, not for Period.
    """
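
For SA04, the See Also entries that lack a description can be listed with a small parser. This is a simplified sketch, not the validator's code: `see_also_without_description` is a hypothetical helper, the section is assumed to end at the first blank line, and a description may follow `name : ` on the same line or be indented on the next line.

```python
import inspect


def see_also_without_description(raw_doc):
    """Return the See Also entries that have no description."""
    lines = inspect.cleandoc(raw_doc).split("\n")
    if "See Also" not in lines:
        return []
    start = lines.index("See Also") + 2  # skip the title and its underline
    missing = []
    for index in range(start, len(lines)):
        line = lines[index]
        if not line.strip():
            break  # a blank line ends the section in this simplified sketch
        if line[0].isspace():
            continue  # continuation of the previous entry's description
        name, _, description = line.partition(" : ")
        next_line = lines[index + 1] if index + 1 < len(lines) else ""
        described_below = next_line[:1].isspace()
        if not description.strip() and not described_below:
            missing.append(name.strip())
    return missing


doc = """
    Return the mean value of the Array.

    See Also
    --------
    numpy.ndarray.mean
    Series.mean : Return the mean value in a Series.
    """

print(see_also_without_description(doc))  # ['numpy.ndarray.mean']
```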