DEPR: Series ndarray properties (strides, data, base, itemsize, flags) #20721

Merged
3 changes: 3 additions & 0 deletions doc/source/whatsnew/v0.23.0.txt
@@ -885,6 +885,9 @@ Deprecations
- The ``convert_datetime64`` parameter in :func:`DataFrame.to_records` has been deprecated and will be removed in a future version. The NumPy bug motivating this parameter has been resolved. The default value for this parameter has also changed from ``True`` to ``None`` (:issue:`18160`).
- :func:`Series.rolling().apply() <pandas.core.window.Rolling.apply>`, :func:`DataFrame.rolling().apply() <pandas.core.window.Rolling.apply>`,
:func:`Series.expanding().apply() <pandas.core.window.Expanding.apply>`, and :func:`DataFrame.expanding().apply() <pandas.core.window.Expanding.apply>` have deprecated passing an ``np.array`` by default. One will need to pass the new ``raw`` parameter to be explicit about what is passed (:issue:`20584`)
- The ``data``, ``base``, ``strides``, ``flags`` and ``itemsize`` properties
of the ``Series`` and ``Index`` classes have been deprecated and will be
removed in a future version (:issue:`20419`).
- ``DatetimeIndex.offset`` is deprecated. Use ``DatetimeIndex.freq`` instead (:issue:`20716`)

.. _whatsnew_0230.prior_deprecations:
2 changes: 2 additions & 0 deletions pandas/core/algorithms.py
@@ -1491,6 +1491,8 @@ def take_nd(arr, indexer, axis=0, out=None, fill_value=np.nan, mask_info=None,
if is_sparse(arr):
arr = arr.get_values()

arr = np.asarray(arr)

if indexer is None:
indexer = np.arange(arr.shape[axis], dtype=np.int64)
dtype, fill_value = arr.dtype, arr.dtype.type()
15 changes: 15 additions & 0 deletions pandas/core/base.py
@@ -737,11 +737,17 @@ def item(self):
@property
def data(self):
""" return the data pointer of the underlying data """
warnings.warn("{obj}.data is deprecated and will be removed "
"in a future version".format(obj=type(self).__name__),
FutureWarning, stacklevel=2)
return self.values.data

@property
def itemsize(self):
""" return the size of the dtype of the item of the underlying data """
warnings.warn("{obj}.itemsize is deprecated and will be removed "
"in a future version".format(obj=type(self).__name__),
FutureWarning, stacklevel=2)
return self._ndarray_values.itemsize

@property
@@ -752,6 +758,9 @@ def nbytes(self):
@property
def strides(self):
""" return the strides of the underlying data """
warnings.warn("{obj}.strides is deprecated and will be removed "
"in a future version".format(obj=type(self).__name__),
FutureWarning, stacklevel=2)
return self._ndarray_values.strides

@property
@@ -762,13 +771,19 @@ def size(self):
@property
def flags(self):
""" return the ndarray.flags for the underlying data """
warnings.warn("{obj}.flags is deprecated and will be removed "
"in a future version".format(obj=type(self).__name__),
FutureWarning, stacklevel=2)
return self.values.flags

@property
def base(self):
""" return the base object if the memory of the underlying data is
shared
"""
warnings.warn("{obj}.base is deprecated and will be removed "
"in a future version".format(obj=type(self).__name__),
FutureWarning, stacklevel=2)
return self.values.base

@property
3 changes: 3 additions & 0 deletions pandas/core/indexes/base.py
@@ -548,6 +548,9 @@ def _deepcopy_if_needed(self, orig, copy=False):
"""
if copy:
# Retrieve the "base objects", i.e. the original memory allocations
if not isinstance(orig, np.ndarray):
# orig is a DatetimeIndex
orig = orig.values
Member Author

@jreback This special case is needed because we sometimes construct DatetimeIndex objects whose _data attribute is itself a DatetimeIndex.
To me this seems like something we should not do. It doesn't hurt, but there is also no reason for it, so wouldn't it be better to always store the ndarray, for consistency?

Contributor

We should never do that.
How did this come up?

Member Author

Just opened an issue about it: #20810
It came up while fixing the warnings in the tests. The case above, where orig is not always an ndarray (hence my special-case if statement), triggered it: orig always comes from a _data attribute, so I expected it to always be an ndarray.
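
For readers unfamiliar with the `base` check that `_deepcopy_if_needed` relies on, here is a minimal sketch using plain NumPy arrays only. The helper names `memory_owner` and `shares_allocation` are hypothetical and not part of pandas; the sketch assumes its inputs are already ndarrays, which is exactly the assumption the special case in this hunk works around.

```python
import numpy as np


def memory_owner(arr):
    """Return the array that owns the memory (hypothetical helper).

    Mirrors the expression used in the diff:
    ``orig if orig.base is None else orig.base``.
    """
    return arr if arr.base is None else arr.base


def shares_allocation(left, right):
    """True if both arrays trace back to the same owning array."""
    return memory_owner(left) is memory_owner(right)


original = np.arange(5)   # owns its data, so .base is None
view = original[1:4]      # a slice is a view; .base points at the owner
copy = original.copy()    # a copy owns fresh memory

print(shares_allocation(original, view))
print(shares_allocation(original, copy))
```

If `orig` were a DatetimeIndex instead of an ndarray, `orig.base` would go through the deprecated property and emit the new warning, which is why the hunk converts it with `orig.values` first.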

orig = orig if orig.base is None else orig.base
new = self._data if self._data.base is None else self._data.base
if orig is new:
6 changes: 6 additions & 0 deletions pandas/core/internals.py
@@ -2816,6 +2816,12 @@ def _maybe_coerce_values(self, values, dtype=None):

return values

@property
def is_view(self):
""" return a boolean if I am possibly a view """
# check the ndarray values of the DatetimeIndex values
return self.values.values.base is not None

def copy(self, deep=True, mgr=None):
""" copy constructor """
values = self.values
2 changes: 1 addition & 1 deletion pandas/tests/groupby/aggregate/test_other.py
@@ -328,7 +328,7 @@ def test_series_agg_multi_pure_python():
'F': np.random.randn(11)})

def bad(x):
- assert (len(x.base) > 0)
+ assert (len(x.values.base) > 0)
Member Author

Not really sure what the purpose of this assert actually is (was introduced in 71e9046)

return 'foo'

result = data.groupby(['A', 'B']).agg(bad)
2 changes: 1 addition & 1 deletion pandas/tests/groupby/test_groupby.py
@@ -1066,7 +1066,7 @@ def convert_fast(x):

def convert_force_pure(x):
# base will be length 0
- assert (len(x.base) > 0)
+ assert (len(x.values.base) > 0)
return Decimal(str(x.mean()))

grouped = s.groupby(labels)
2 changes: 1 addition & 1 deletion pandas/tests/indexes/common.py
@@ -24,7 +24,7 @@
class Base(object):
""" base class for index sub-class tests """
_holder = None
- _compat_props = ['shape', 'ndim', 'size', 'itemsize', 'nbytes']
+ _compat_props = ['shape', 'ndim', 'size', 'nbytes']

def setup_indices(self):
for name, idx in self.indices.items():
2 changes: 1 addition & 1 deletion pandas/tests/indexes/test_multi.py
@@ -30,7 +30,7 @@

class TestMultiIndex(Base):
_holder = MultiIndex
- _compat_props = ['shape', 'ndim', 'size', 'itemsize']
+ _compat_props = ['shape', 'ndim', 'size']

def setup_method(self, method):
major_axis = Index(['foo', 'bar', 'baz', 'qux'])
2 changes: 1 addition & 1 deletion pandas/tests/indexes/test_range.py
@@ -22,7 +22,7 @@

class TestRangeIndex(Numeric):
_holder = RangeIndex
- _compat_props = ['shape', 'ndim', 'size', 'itemsize']
+ _compat_props = ['shape', 'ndim', 'size']

def setup_method(self, method):
self.indices = dict(index=RangeIndex(0, 20, 2, name='foo'),
10 changes: 7 additions & 3 deletions pandas/tests/internals/test_internals.py
@@ -466,16 +466,20 @@ def test_copy(self, mgr):

# view assertion
assert cp_blk.equals(blk)
- assert cp_blk.values.base is blk.values.base
+ if isinstance(blk.values, np.ndarray):
+ assert cp_blk.values.base is blk.values.base
+ else:
+ # DatetimeTZBlock has DatetimeIndex values
+ assert cp_blk.values.values.base is blk.values.values.base

cp = mgr.copy(deep=True)
for blk, cp_blk in zip(mgr.blocks, cp.blocks):

# copy assertion we either have a None for a base or in case of
# some blocks it is an array (e.g. datetimetz), but was copied
assert cp_blk.equals(blk)
- if cp_blk.values.base is not None and blk.values.base is not None:
- assert cp_blk.values.base is not blk.values.base
+ if not isinstance(cp_blk.values, np.ndarray):
+ assert cp_blk.values.values.base is not blk.values.values.base
else:
assert cp_blk.values.base is None and blk.values.base is None

14 changes: 10 additions & 4 deletions pandas/tests/test_base.py
@@ -316,16 +316,22 @@ def test_ndarray_compat_properties(self):

for o in self.objs:
# Check that we work.
- for p in ['shape', 'dtype', 'flags', 'T',
- 'strides', 'itemsize', 'nbytes']:
+ for p in ['shape', 'dtype', 'T', 'nbytes']:
assert getattr(o, p, None) is not None

- assert hasattr(o, 'base')
+ # deprecated properties
+ for p in ['flags', 'strides', 'itemsize']:
+ with tm.assert_produces_warning(FutureWarning):
+ assert getattr(o, p, None) is not None
+
+ with tm.assert_produces_warning(FutureWarning):
+ assert hasattr(o, 'base')

# If we have a datetime-like dtype then needs a view to work
# but the user is responsible for that
try:
- assert o.data is not None
+ with tm.assert_produces_warning(FutureWarning):
+ assert o.data is not None
except ValueError:
pass
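
The deprecation pattern applied to each property in `pandas/core/base.py` can be sketched in isolation as follows. `ArrayHolder` is a hypothetical stand-in class, not a pandas type, and the check uses the standard-library `warnings.catch_warnings` rather than the `tm.assert_produces_warning` helper from the PR's tests; treat it as an illustration under those assumptions, not the pandas implementation.

```python
import warnings

import numpy as np


class ArrayHolder:
    """Hypothetical wrapper around an ndarray (stand-in for Series/Index)."""

    def __init__(self, values):
        self.values = np.asarray(values)

    @property
    def strides(self):
        """Deprecated pass-through to the underlying ndarray's strides."""
        # stacklevel=2 attributes the warning to the caller's line,
        # not to this property body.
        warnings.warn("{obj}.strides is deprecated and will be removed "
                      "in a future version".format(obj=type(self).__name__),
                      FutureWarning, stacklevel=2)
        return self.values.strides


holder = ArrayHolder([1, 2, 3])

# Record warnings so the deprecation can be asserted on.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    deprecated_result = holder.strides

# Going through .values directly does not warn.
with warnings.catch_warnings(record=True) as caught_direct:
    warnings.simplefilter("always")
    direct_result = holder.values.strides

print(caught[0].category.__name__, len(caught_direct))
```

As the updated tests in this PR do with `x.values.base`, callers that still need these attributes can read them from the underlying array via `.values`.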
