Supporting current panel use cases, interactions with xarray #62

wesm · 2016-11-29T19:05:42Z

The most common use case for panels I've seen has been as an aligning container for data frames -- you can insert a DataFrame "item" as you would a column normally. This can alleviate some awkwardness when working with multi-indexed data.

A couple of questions around panels:

If we drop Panel as an analytical data structure (i.e. what is currently offered by the NDFrame construct), we should consider the API that will replace the current to_panel and to_frame workflows.
It may be worthwhile to consider keeping around Panel as a simple container data structure for maintaining a related collection of DataFrames and supporting rudimentary reshaping / axis-swapping functionality. For example, if you have a dict of DataFrame objects in some orientation, you could create a panel, swap axes, then convert to some other data structure (e.g. xarray, MultiIndex-ed DataFrame). If you want to do deeper analysis, you should convert to xarray.

In either case, we'd be eliminating a bunch of thinly supported code.
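For reference, the "dict of DataFrames as an aligning container" workflow described above can be roughly approximated without Panel. The following is only a sketch under assumptions: the frame names and values are made up, and `pd.concat` with a dict is one possible stand-in, not a proposed replacement API.

```python
import numpy as np
import pandas as pd

# Two frames with partially overlapping rows and columns (made-up data).
frames = {
    "first": pd.DataFrame(
        np.arange(6.0).reshape(3, 2), index=[0, 1, 2], columns=["a", "b"]
    ),
    "second": pd.DataFrame(
        np.arange(4.0).reshape(2, 2), index=[1, 2], columns=["b", "c"]
    ),
}

# Concatenating a dict along axis=1 aligns the frames on the row index
# (outer join) and puts the dict keys in the outer column level, which is
# roughly the role Panel "items" play.
wide = pd.concat(frames, axis=1)

# A rudimentary "swap axes": move the original column labels to the outer
# level and the item names to the inner level.
swapped = wide.swaplevel(0, 1, axis=1).sort_index(axis=1)

# Pull one item back out as a regular DataFrame, now aligned to the
# union of the row labels.
second = wide["second"]
```

Rows missing from an item show up as NaN after alignment, which is the same behaviour the xarray-based approach gives for missing labels.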


shoyer · 2016-11-29T23:35:46Z

I agree, keeping around Panel as a simple data container could make sense. I have also found it to be useful as an intermediate data structure for easier data alignment, though I can't think of particular use cases off the top of my head.

CC @MaximilianR

max-sixty · 2016-11-30T16:31:08Z

I don't have a strong view.

xarray is pretty good for aligning! So I predominately use that:

In [5]: df = pd.DataFrame(np.random.rand(3,4), columns=list('abcd'))

In [6]: df
Out[6]:
          a         b         c         d
0  0.164063  0.014835  0.529693  0.268561
1  0.076066  0.598840  0.887823  0.566114
2  0.599438  0.021646  0.775174  0.959695

In [7]: xr.Dataset({'first': df, 'second': df[list('ab')]})
Out[7]:
<xarray.Dataset>
Dimensions:  (dim_0: 3, dim_1: 4)
Coordinates:
  * dim_0    (dim_0) int64 0 1 2
  * dim_1    (dim_1) object 'a' 'b' 'c' 'd'
Data variables:
    second   (dim_0, dim_1) float64 0.1641 0.01483 nan nan 0.07607 0.5988 ...
    first    (dim_0, dim_1) float64 0.1641 0.01483 0.5297 0.2686 0.07607 ...

And pandas' stack / unstack are pretty good for swapping axes.
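As a small illustration of that point, here is a sketch that treats a three-level Series as a stand-in for panel-shaped data. The level names `item`, `row` and `col` and the values are assumptions made up for the example.

```python
import numpy as np
import pandas as pd

# A long Series indexed by (item, row, col), standing in for 3-D data.
idx = pd.MultiIndex.from_product(
    [["first", "second"], [0, 1, 2], ["a", "b"]],
    names=["item", "row", "col"],
)
s = pd.Series(np.arange(12.0), index=idx)

# Choosing which level to unstack decides which "axis" becomes the columns.
by_col = s.unstack("col")    # rows: (item, row), columns: col
by_item = s.unstack("item")  # rows: (row, col), columns: item

# stack() goes the other way, folding the columns back into the row index.
long_again = by_item.stack()
```

The data are the same in each layout; only the level that ends up on the column axis changes.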

What's the use case where you'd need functionality in pandas?

> we should consider the API that will replace the current to_panel and to_frame workflows

@jreback has built a good .to_xarray, and we've built some decent (not perfect yet) coercion by passing xarray & pandas objects into each other's constructors
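A minimal sketch of what that conversion path can look like, assuming xarray is installed; the data and the level names `item` and `row` are made up for illustration, and this is not meant to show everything the coercion supports.

```python
import numpy as np
import pandas as pd
import xarray as xr

# A MultiIndex-ed DataFrame, similar in shape to what to_frame produces
# (made-up data).
idx = pd.MultiIndex.from_product(
    [["first", "second"], [0, 1, 2]], names=["item", "row"]
)
df = pd.DataFrame(np.arange(12.0).reshape(6, 2), index=idx, columns=["a", "b"])

# Each index level becomes a dimension; each column becomes a data variable.
ds = df.to_xarray()

# Going back gives a DataFrame indexed by the Dataset's dimensions.
back = ds.to_dataframe()

# Constructor coercion: a 2-D DataFrame passed straight to DataArray,
# and DataArray.to_pandas() for the return trip.
da = xr.DataArray(df.loc["first"])
frame = da.to_pandas()
```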

jreback · 2017-04-07T19:35:34Z

this is merged: pandas-dev/pandas#15601

so can think about this (at some point).
