API: add top-level melt function as method to DataFrame

xref pandas-dev#12640
xref pandas-dev#14876

Author: Aleksey Bilogur <aleksey.bilogur@gmail.com>

Closes pandas-dev#15521 from ResidentMario/12640 and squashes the following commits:

1657246 [Aleksey Bilogur] two doc changes
28a38f2 [Aleksey Bilogur] tweak whatsnew entry.
5f306a9 [Aleksey Bilogur] +whatsnew
ff895fe [Aleksey Bilogur] Add tests, update docs.
11f3fe4 [Aleksey Bilogur] rm stray debug.
3cbbed5 [Aleksey Bilogur] Melt docstring.
d54dc2f [Aleksey Bilogur] +pd.DataFrame.melt.
jreback · Apr 4, 2017 · e50d397
1 parent faf6401
commit e50d397

Showing 6 changed files with 182 additions and 133 deletions.
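
The equivalence this commit describes, with ``DataFrame.melt()`` delegating to ``pd.melt()``, can be checked with a short sketch. It assumes pandas 0.20.0 or later is installed. The ``height`` and ``weight`` columns come from the ``cheese`` example in reshaping.rst; the ``first`` and ``last`` values are illustrative assumptions, since those lines fall outside the hunk shown below.

```python
import pandas as pd

# 'height' and 'weight' match the reshaping.rst example; the 'first' and
# 'last' values are assumed for illustration only.
cheese = pd.DataFrame({'first': ['John', 'Mary'],
                       'last': ['Doe', 'Bo'],
                       'height': [5.5, 6.0],
                       'weight': [130, 150]})

# Top-level function (existing API) and the method added by this commit.
via_function = pd.melt(cheese, id_vars=['first', 'last'])
via_method = cheese.melt(id_vars=['first', 'last'])

# The method forwards its arguments to pd.melt, so the results should match.
assert via_method.equals(via_function)
print(via_method)
```

Both calls should yield one row per (identifier, measured column) pair, with the measured column names in ``variable`` and their values in ``value``.
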
diff --git a/doc/source/api.rst b/doc/source/api.rst
@@ -933,6 +933,7 @@ Reshaping, sorting, transposing
    DataFrame.swaplevel
    DataFrame.stack
    DataFrame.unstack
+   DataFrame.melt
    DataFrame.T
    DataFrame.to_panel
    DataFrame.to_xarray

diff --git a/doc/source/reshaping.rst b/doc/source/reshaping.rst
@@ -265,8 +265,8 @@ the right thing:
 Reshaping by Melt
 -----------------
 
-The :func:`~pandas.melt` function is useful to massage a
-DataFrame into a format where one or more columns are identifier variables,
+The top-level :func:`~pandas.melt` and :func:`~DataFrame.melt` functions are useful to
+massage a DataFrame into a format where one or more columns are identifier variables,
 while all other columns, considered measured variables, are "unpivoted" to the
 row axis, leaving just two non-identifier columns, "variable" and "value". The
 names of those columns can be customized by supplying the ``var_name`` and
@@ -281,10 +281,11 @@ For instance,
                           'height' : [5.5, 6.0],
                           'weight' : [130, 150]})
    cheese
-   pd.melt(cheese, id_vars=['first', 'last'])
-   pd.melt(cheese, id_vars=['first', 'last'], var_name='quantity')
+   cheese.melt(id_vars=['first', 'last'])
+   cheese.melt(id_vars=['first', 'last'], var_name='quantity')
 
-Another way to transform is to use the ``wide_to_long`` panel data convenience function.
+Another way to transform is to use the ``wide_to_long`` panel data convenience
+function.
 
 .. ipython:: python
 

diff --git a/doc/source/whatsnew/v0.20.0.txt b/doc/source/whatsnew/v0.20.0.txt
@@ -324,6 +324,7 @@ Other Enhancements
 - ``Series.sort_index`` accepts parameters ``kind`` and ``na_position`` (:issue:`13589`, :issue:`14444`)
 
 - ``DataFrame`` has gained a ``nunique()`` method to count the distinct values over an axis (:issue:`14336`).
+- ``DataFrame`` has gained a ``melt()`` method, equivalent to ``pd.melt()``, for unpivoting from a wide to long format (:issue:`12640`).
 - ``DataFrame.groupby()`` has gained a ``.nunique()`` method to count the distinct values for all columns within each group (:issue:`14336`, :issue:`15197`).
 
 - ``pd.read_excel()`` now preserves sheet order when using ``sheetname=None`` (:issue:`9930`)

diff --git a/pandas/core/frame.py b/pandas/core/frame.py
@@ -4051,6 +4051,110 @@ def unstack(self, level=-1, fill_value=None):
         from pandas.core.reshape import unstack
         return unstack(self, level, fill_value)
 
+    _shared_docs['melt'] = ("""
+    "Unpivots" a DataFrame from wide format to long format, optionally
+    leaving identifier variables set.
+
+    This function is useful to massage a DataFrame into a format where one
+    or more columns are identifier variables (`id_vars`), while all other
+    columns, considered measured variables (`value_vars`), are "unpivoted" to
+    the row axis, leaving just two non-identifier columns, 'variable' and
+    'value'.
+
+    %(versionadded)s
+    Parameters
+    ----------
+    frame : DataFrame
+    id_vars : tuple, list, or ndarray, optional
+        Column(s) to use as identifier variables.
+    value_vars : tuple, list, or ndarray, optional
+        Column(s) to unpivot. If not specified, uses all columns that
+        are not set as `id_vars`.
+    var_name : scalar
+        Name to use for the 'variable' column. If None it uses
+        ``frame.columns.name`` or 'variable'.
+    value_name : scalar, default 'value'
+        Name to use for the 'value' column.
+    col_level : int or string, optional
+        If columns are a MultiIndex then use this level to melt.
+
+    See also
+    --------
+    %(other)s
+    pivot_table
+    DataFrame.pivot
+
+    Examples
+    --------
+    >>> import pandas as pd
+    >>> df = pd.DataFrame({'A': {0: 'a', 1: 'b', 2: 'c'},
+    ...                    'B': {0: 1, 1: 3, 2: 5},
+    ...                    'C': {0: 2, 1: 4, 2: 6}})
+    >>> df
+       A  B  C
+    0  a  1  2
+    1  b  3  4
+    2  c  5  6
+
+    >>> %(caller)sid_vars=['A'], value_vars=['B'])
+       A variable  value
+    0  a        B      1
+    1  b        B      3
+    2  c        B      5
+
+    >>> %(caller)sid_vars=['A'], value_vars=['B', 'C'])
+       A variable  value
+    0  a        B      1
+    1  b        B      3
+    2  c        B      5
+    3  a        C      2
+    4  b        C      4
+    5  c        C      6
+
+    The names of 'variable' and 'value' columns can be customized:
+
+    >>> %(caller)sid_vars=['A'], value_vars=['B'],
+    ...         var_name='myVarname', value_name='myValname')
+       A myVarname  myValname
+    0  a         B          1
+    1  b         B          3
+    2  c         B          5
+
+    If you have multi-index columns:
+
+    >>> df.columns = [list('ABC'), list('DEF')]
+    >>> df
+       A  B  C
+       D  E  F
+    0  a  1  2
+    1  b  3  4
+    2  c  5  6
+
+    >>> %(caller)scol_level=0, id_vars=['A'], value_vars=['B'])
+       A variable  value
+    0  a        B      1
+    1  b        B      3
+    2  c        B      5
+
+    >>> %(caller)sid_vars=[('A', 'D')], value_vars=[('B', 'E')])
+      (A, D) variable_0 variable_1  value
+    0      a          B          E      1
+    1      b          B          E      3
+    2      c          B          E      5
+
+    """)
+
+    @Appender(_shared_docs['melt'] %
+              dict(caller='df.melt(',
+                   versionadded='.. versionadded:: 0.20.0\n',
+                   other='melt'))
+    def melt(self, id_vars=None, value_vars=None, var_name=None,
+             value_name='value', col_level=None):
+        from pandas.core.reshape import melt
+        return melt(self, id_vars=id_vars, value_vars=value_vars,
+                    var_name=var_name, value_name=value_name,
+                    col_level=col_level)
+
     # ----------------------------------------------------------------------
     # Time series-related
 

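The shared-docstring pattern used in frame.py above and in reshape.py below can be sketched in isolation. This is a minimal illustration, not the pandas implementation: ``append_doc`` is a hypothetical stand-in for ``pandas.util.decorators.Appender``, ``Frame`` is a hypothetical placeholder class, and the template is a shortened version of ``_shared_docs['melt']``.

```python
_shared_docs = {}

# Shortened template; %(caller)s, %(versionadded)s and %(other)s are filled
# in separately for the function and for the method.
_shared_docs['melt'] = """
Unpivot a table from wide format to long format.

%(versionadded)s
See also
--------
%(other)s

Examples
--------
>>> %(caller)sid_vars=['A'])
"""


def append_doc(text):
    """Hypothetical stand-in for Appender: append ``text`` to __doc__."""
    def decorator(func):
        func.__doc__ = (func.__doc__ or '') + text
        return func
    return decorator


@append_doc(_shared_docs['melt'] %
            dict(caller='pd.melt(df, ',
                 versionadded='',
                 other='DataFrame.melt'))
def melt(frame, id_vars=None):
    # Placeholder body; the real function reshapes the frame.
    return frame


class Frame(object):
    """Hypothetical placeholder for DataFrame."""

    @append_doc(_shared_docs['melt'] %
                dict(caller='df.melt(',
                     versionadded='.. versionadded:: 0.20.0\n',
                     other='melt'))
    def melt(self, id_vars=None):
        # The method only forwards to the module-level function.
        return melt(self, id_vars=id_vars)
```

With one template and two substitution dicts, the function and the method share their parameter and example text, while each gets its own call prefix, cross-reference and version note.
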
diff --git a/pandas/core/reshape.py b/pandas/core/reshape.py
@@ -28,6 +28,8 @@
 import pandas.core.algorithms as algos
 from pandas._libs import algos as _algos, reshape as _reshape
 
+from pandas.core.frame import _shared_docs
+from pandas.util.decorators import Appender
 from pandas.core.index import MultiIndex, _get_na_value
 
 
@@ -701,98 +703,12 @@ def _convert_level_number(level_num, columns):
     return result
 
 
+@Appender(_shared_docs['melt'] %
+          dict(caller='pd.melt(df, ',
+               versionadded="",
+               other='DataFrame.melt'))
 def melt(frame, id_vars=None, value_vars=None, var_name=None,
          value_name='value', col_level=None):
-    """
-    "Unpivots" a DataFrame from wide format to long format, optionally leaving
-    identifier variables set.
-
-    This function is useful to massage a DataFrame into a format where one
-    or more columns are identifier variables (`id_vars`), while all other
-    columns, considered measured variables (`value_vars`), are "unpivoted" to
-    the row axis, leaving just two non-identifier columns, 'variable' and
-    'value'.
-
-    Parameters
-    ----------
-    frame : DataFrame
-    id_vars : tuple, list, or ndarray, optional
-        Column(s) to use as identifier variables.
-    value_vars : tuple, list, or ndarray, optional
-        Column(s) to unpivot. If not specified, uses all columns that
-        are not set as `id_vars`.
-    var_name : scalar
-        Name to use for the 'variable' column. If None it uses
-        ``frame.columns.name`` or 'variable'.
-    value_name : scalar, default 'value'
-        Name to use for the 'value' column.
-    col_level : int or string, optional
-        If columns are a MultiIndex then use this level to melt.
-
-    See also
-    --------
-    pivot_table
-    DataFrame.pivot
-
-    Examples
-    --------
-    >>> import pandas as pd
-    >>> df = pd.DataFrame({'A': {0: 'a', 1: 'b', 2: 'c'},
-    ...                    'B': {0: 1, 1: 3, 2: 5},
-    ...                    'C': {0: 2, 1: 4, 2: 6}})
-    >>> df
-       A  B  C
-    0  a  1  2
-    1  b  3  4
-    2  c  5  6
-
-    >>> pd.melt(df, id_vars=['A'], value_vars=['B'])
-       A variable  value
-    0  a        B      1
-    1  b        B      3
-    2  c        B      5
-
-    >>> pd.melt(df, id_vars=['A'], value_vars=['B', 'C'])
-       A variable  value
-    0  a        B      1
-    1  b        B      3
-    2  c        B      5
-    3  a        C      2
-    4  b        C      4
-    5  c        C      6
-
-    The names of 'variable' and 'value' columns can be customized:
-
-    >>> pd.melt(df, id_vars=['A'], value_vars=['B'],
-    ...         var_name='myVarname', value_name='myValname')
-       A myVarname  myValname
-    0  a         B          1
-    1  b         B          3
-    2  c         B          5
-
-    If you have multi-index columns:
-
-    >>> df.columns = [list('ABC'), list('DEF')]
-    >>> df
-       A  B  C
-       D  E  F
-    0  a  1  2
-    1  b  3  4
-    2  c  5  6
-
-    >>> pd.melt(df, col_level=0, id_vars=['A'], value_vars=['B'])
-       A variable  value
-    0  a        B      1
-    1  b        B      3
-    2  c        B      5
-
-    >>> pd.melt(df, id_vars=[('A', 'D')], value_vars=[('B', 'E')])
-      (A, D) variable_0 variable_1  value
-    0      a          B          E      1
-    1      b          B          E      3
-    2      c          B          E      5
-
-    """
     # TODO: what about the existing index?
     if id_vars is not None:
         if not is_list_like(id_vars):