-
Notifications
You must be signed in to change notification settings - Fork 4
Docstring Errors Examples
Galuh Sahid edited this page Feb 10, 2020
·
19 revisions
- GL01: Docstring text (summary) should start in the line immediately after the opening quotes (not in the same line, or leaving a blank line in between)
- GL02: Closing quotes should be placed in the line after the last text in the docstring (do not close the quotes in the same line as the text, or leave a blank line between the last text and the quotes)
- GL08: The object does not have a docstring
- SS06: Summary should fit in a single line
- ES01: No extended summary found
- PR01: Parameters {missing_params} not documented
- PR02: Unknown parameters {unknown_params}
- PR07: Parameter description should finish with "."
- SA04: Missing description for see also
GL01: Docstring text (summary) should start in the line immediately after the opening quotes (not in the same line, or leaving a blank line in between)
def assert_categorical_equal(
left, right, check_dtype=True, check_category_order=True, obj="Categorical"
):
"""Test that Categoricals are equivalent.
Parameters
----------
left : Categorical
...
"""
pass
def assert_categorical_equal(
left, right, check_dtype=True, check_category_order=True, obj="Categorical"
):
"""
Test that Categoricals are equivalent.
Parameters
----------
left : Categorical
...
"""
pass
Read more about summary in Contributing Docstring - Section 2: Extended Summary
GL02: Closing quotes should be placed in the line after the last text in the docstring (do not close the quotes in the same line as the text, or leave a blank line between the last text and the quotes)
def unstack():
"""
Pivot a row index to columns.
When using a MultiIndex, a level can be pivoted so each value in
the index becomes a column. This is especially useful when a subindex
is repeated for the main index, and data is easier to visualize as a
pivot table.
The index level will be automatically removed from the index when added
as columns."""
pass
def unstack():
"""
Pivot a row index to columns.
When using a MultiIndex, a level can be pivoted so each value in
the index becomes a column. This is especially useful when a subindex
is repeated for the main index, and data is easier to visualize as a
pivot table.
The index level will be automatically removed from the index when added
as columns.
"""
pass
@property
def right(self):
return self._data._right
@property
def right(self):
"""
Return the right endpoints of each Interval in the IntervalIndex as
an Index.
"""
return self._data._right
Read how to write a docstring here.
def duplicated(self, subset=None, keep="first"):
"""
Return boolean Series denoting duplicate rows, optionally only
considering certain columns.
"""
...
def duplicated(self, subset=None, keep="first"):
"""
Return boolean Series denoting duplicate rows.
Considering certain columns is optional.
"""
...
Rewriting the docstring might be required.
def unstack():
"""
Pivot a row index to columns.
"""
pass
def unstack():
"""
Pivot a row index to columns.
When using a MultiIndex, a level can be pivoted so each value in
the index becomes a column. This is especially useful when a subindex
is repeated for the main index, and data is easier to visualize as a
pivot table.
The index level will be automatically removed from the index when added
as columns.
"""
pass
Read more about extended summary here.
class Series:
def plot(self, kind, **kwargs):
"""
Generate a plot.
Render the data in the Series as a matplotlib plot of the
specified kind.
Note the blank line between the parameters title and the first
parameter. Also, note that after the name of the parameter `kind`
and before the colon, a space is missing.
Also, note that the parameter descriptions do not start with a
capital letter, and do not finish with a dot.
Finally, the `**kwargs` parameter is missing.
Parameters
----------
kind: str
kind of matplotlib plot
"""
pass
We need to add **kwargs**
to the docstring:
class Series:
def plot(self, kind, color='blue', **kwargs):
"""
Generate a plot.
Render the data in the Series as a matplotlib plot of the
specified kind.
Parameters
----------
kind : str
Kind of matplotlib plot.
color : str, default 'blue'
Color name or rgb code.
**kwargs
These parameters will be passed to the matplotlib plotting
function.
"""
pass
def astype(self, dtype, copy=True, errors="raise", **kwargs):
"""
Cast a pandas object to a specified dtype ``dtype``.
Parameters
----------
dtype : data type, or dict of column name -> data type
Use a numpy.dtype or Python type to cast entire pandas object to
the same type. Alternatively, use {col: dtype, ...}, where col is a
column label and dtype is a numpy.dtype or Python type to cast one
or more of the DataFrame's columns to column-specific types.
copy : bool, default True
Return a copy when ``copy=True`` (be very careful setting
``copy=False`` as changes to values then may propagate to other
pandas objects).
errors : {'raise', 'ignore'}, default 'raise'
Control raising of exceptions on invalid data for provided dtype.
- ``raise`` : allow exceptions to be raised
- ``ignore`` : suppress exceptions. On error return original object.
.. versionadded:: 0.20.0
kwargs : keyword arguments to pass on to the constructor
Returns
-------
casted : same type as caller
"""
...
kwargs
is not recognized as a parameter. It should be **kwargs
.
Change kwargs
to **kwargs
:
def astype(self, dtype, copy=True, errors="raise", **kwargs):
"""
Cast a pandas object to a specified dtype ``dtype``.
Parameters
----------
dtype : data type, or dict of column name -> data type
Use a numpy.dtype or Python type to cast entire pandas object to
the same type. Alternatively, use {col: dtype, ...}, where col is a
column label and dtype is a numpy.dtype or Python type to cast one
or more of the DataFrame's columns to column-specific types.
copy : bool, default True
Return a copy when ``copy=True`` (be very careful setting
``copy=False`` as changes to values then may propagate to other
pandas objects).
errors : {'raise', 'ignore'}, default 'raise'
Control raising of exceptions on invalid data for provided dtype.
- ``raise`` : allow exceptions to be raised
- ``ignore`` : suppress exceptions. On error return original object.
.. versionadded:: 0.20.0
**kwargs : keyword arguments to pass on to the constructor
Returns
-------
casted : same type as caller
"""
...
The code below would output an error "Parameter 'path' type should use 'str' instead of 'string'.
def read_spss(
path: Union[str, Path],
usecols: Optional[Sequence[str]] = None,
convert_categoricals: bool = True,
) -> DataFrame:
"""
Load an SPSS file from the file path, returning a DataFrame.
.. versionadded:: 0.25.0
Parameters
----------
path : string or Path
File path.
usecols : list-like, optional
Return a subset of the columns. If None, return all columns.
convert_categoricals : bool, default is True
Convert categorical columns into pd.Categorical.
Returns
-------
DataFrame
"""
def read_spss(
path: Union[str, Path],
usecols: Optional[Sequence[str]] = None,
convert_categoricals: bool = True,
) -> DataFrame:
"""
Load an SPSS file from the file path, returning a DataFrame.
.. versionadded:: 0.25.0
Parameters
----------
path : str or Path
File path.
usecols : list-like, optional
Return a subset of the columns. If None, return all columns.
convert_categoricals : bool, default is True
Convert categorical columns into pd.Categorical.
Returns
-------
DataFrame
"""
In the example below, the parameter axis
is missing a description:
def _get_counts_nanvar(
value_counts: Tuple[int],
mask: Optional[np.ndarray],
axis: Optional[int],
ddof: int,
dtype=float,
) -> Tuple[Union[int, np.ndarray], Union[int, np.ndarray]]:
""" Get the count of non-null values along an axis, accounting
for degrees of freedom.
Parameters
----------
values_shape : Tuple[int]
shape tuple from values ndarray, used if mask is None
mask : Optional[ndarray[bool]]
locations in values that should be considered missing
axis : Optional[int]
ddof : int
degrees of freedom
dtype : type, optional
type to use for count
Returns
-------
count : scalar or array
d : scalar or array
"""
def _get_counts_nanvar(
value_counts: Tuple[int],
mask: Optional[np.ndarray],
axis: Optional[int],
ddof: int,
dtype=float,
) -> Tuple[Union[int, np.ndarray], Union[int, np.ndarray]]:
""" Get the count of non-null values along an axis, accounting
for degrees of freedom.
Parameters
----------
values_shape : Tuple[int]
shape tuple from values ndarray, used if mask is None
mask : Optional[ndarray[bool]]
locations in values that should be considered missing
axis : Optional[int]
axis to count along
ddof : int
degrees of freedom
dtype : type, optional
type to use for count
Returns
-------
count : scalar or array
d : scalar or array
"""
The description of the parameter axis
does not start with a capital letter:
def take_nd(
arr, indexer, axis: int = 0, out=None, fill_value=np.nan, allow_fill: bool = True
):
"""
Specialized Cython take which sets NaN values in one pass
This dispatches to ``take`` defined on ExtensionArrays. It does not
currently dispatch to ``SparseArray.take`` for sparse ``arr``.
Parameters
----------
arr : array-like
Input array.
indexer : ndarray
1-D array of indices to take, subarrays corresponding to -1 value
indices are filed with fill_value
axis : int, default 0
axis to take from
out : ndarray or None, default None
Optional output array, must be appropriate type to hold input and
fill_value together, if indexer has any -1 value entries; call
maybe_promote to determine this type for any fill_value
fill_value : any, default np.nan
Fill value to replace -1 values with
allow_fill : boolean, default True
If False, indexer is assumed to contain no -1 values so no filling
will be done. This short-circuits computation of a mask. Result is
undefined if allow_fill == False and -1 is present in indexer.
Returns
-------
subarray : array-like
May be the same type as the input, or cast to an ndarray.
"""
def take_nd(
arr, indexer, axis: int = 0, out=None, fill_value=np.nan, allow_fill: bool = True
):
"""
Specialized Cython take which sets NaN values in one pass
This dispatches to ``take`` defined on ExtensionArrays. It does not
currently dispatch to ``SparseArray.take`` for sparse ``arr``.
Parameters
----------
arr : array-like
Input array.
indexer : ndarray
1-D array of indices to take, subarrays corresponding to -1 value
indices are filed with fill_value
axis : int, default 0
Axis to take from
out : ndarray or None, default None
Optional output array, must be appropriate type to hold input and
fill_value together, if indexer has any -1 value entries; call
maybe_promote to determine this type for any fill_value
fill_value : any, default np.nan
Fill value to replace -1 values with
allow_fill : boolean, default True
If False, indexer is assumed to contain no -1 values so no filling
will be done. This short-circuits computation of a mask. Result is
undefined if allow_fill == False and -1 is present in indexer.
Returns
-------
subarray : array-like
May be the same type as the input, or cast to an ndarray.
"""
The description of the parameter axis
does not finish with ".":
def cumsum(self, axis=0, *args, **kwargs):
"""
Cumulative sum of non-NA/null values.
When performing the cumulative summation, any non-NA/null values will
be skipped. The resulting SparseArray will preserve the locations of
NaN values, but the fill value will be `np.nan` regardless.
Parameters
----------
axis : int or None
Axis over which to perform the cumulative summation. If None,
perform cumulative summation over flattened array
Returns
-------
cumsum : SparseArray
"""
def cumsum(self, axis=0, *args, **kwargs):
"""
Cumulative sum of non-NA/null values.
When performing the cumulative summation, any non-NA/null values will
be skipped. The resulting SparseArray will preserve the locations of
NaN values, but the fill value will be `np.nan` regardless.
Parameters
----------
axis : int or None
Axis over which to perform the cumulative summation. If None,
perform cumulative summation over flattened array.
Returns
-------
cumsum : SparseArray
"""
Parameters ending with deprecated
truediv : bool, optional
Whether to use true division, like in Python >= 3.
deprecated:: 1.0.0
RT02: The first line of the Returns section should contain only the type, unless multiple values are being returned
The first line of the Returns section should contain only the type:
def is_overlapping(self):
"""
Return True if the IntervalIndex has overlapping intervals, else False.
Two intervals overlap if they share a common point, including closed
endpoints. Intervals that only have an open endpoint in common do not
overlap.
.. versionadded:: 0.24.0
Returns
-------
bool : Boolean indicating if the IntervalIndex has overlapping intervals.
def is_overlapping(self):
"""
Return True if the IntervalIndex has overlapping intervals, else False.
Two intervals overlap if they share a common point, including closed
endpoints. Intervals that only have an open endpoint in common do not
overlap.
.. versionadded:: 0.24.0
Returns
-------
bool
Boolean indicating if the IntervalIndex has overlapping intervals.
def is_overlapping(self):
"""
Return True if the IntervalIndex has overlapping intervals, else False.
Two intervals overlap if they share a common point, including closed
endpoints. Intervals that only have an open endpoint in common do not
overlap.
.. versionadded:: 0.24.0
Returns
-------
bool
def is_overlapping(self):
"""
Return True if the IntervalIndex has overlapping intervals, else False.
Two intervals overlap if they share a common point, including closed
endpoints. Intervals that only have an open endpoint in common do not
overlap.
.. versionadded:: 0.24.0
Returns
-------
bool
Boolean indicating if the IntervalIndex has overlapping intervals.
def __iter__(self):
"""
Return an iterator over the boxed values
"""
...
for i in range(chunks):
start_i = i * chunksize
end_i = min((i + 1) * chunksize, length)
converted = tslib.ints_to_pydatetime(
data[start_i:end_i], tz=self.tz, freq=self.freq, box="timestamp"
)
for v in converted:
yield v
def __iter__(self):
"""
Return an iterator over the boxed values
Yields
------
tstamp : Timestamp
"""
...
for i in range(chunks):
start_i = i * chunksize
end_i = min((i + 1) * chunksize, length)
converted = tslib.ints_to_pydatetime(
data[start_i:end_i], tz=self.tz, freq=self.freq, box="timestamp"
)
for v in converted:
yield v
def mean(self, skipna=True):
"""
Return the mean value of the Array.
.. versionadded:: 0.25.0
Parameters
----------
skipna : bool, default True
Whether to ignore any NaT elements.
Returns
-------
scalar
Timestamp or Timedelta.
See Also
--------
numpy.ndarray.mean
Series.mean
Notes
-----
mean is only defined for Datetime and Timedelta dtypes, not for Period.
"""
def mean(self, skipna=True):
"""
Return the mean value of the Array.
.. versionadded:: 0.25.0
Parameters
----------
skipna : bool, default True
Whether to ignore any NaT elements.
Returns
-------
scalar
Timestamp or Timedelta.
See Also
--------
numpy.ndarray.mean : Returns the average of array elements along a given axis.
Series.mean : Return the mean value in a Series.
Notes
-----
mean is only defined for Datetime and Timedelta dtypes, not for Period.
"""