ENH: Intervalindex #15309

jreback · 2017-02-05T20:07:03Z

closes #7640
closes #8625

reprise of #8707

redid the construction impl a bit; it now uses the proper Index calling conventions (as well as other sub-classing conventions).
more testing, including all generic index tests (but probably still need some more)
add ._mask to track internal nans (it's lazily computed), IOW you can have null entries in the index.
using tempita, so have intervalindex.pxi.in
the extension is housed in _interval now, rather than being part of lib
IntervalDtype is full-fledged

In [11]: df = DataFrame({'A': range(10)})
    ...: s = pd.cut(df.A, 5)
    ...: df['B'] = s
    ...: df['C'] = np.array(s)
    ...: df2 = df.set_index('B')
    ...: df3 = df.set_index('C')
    ...: 
    ...: 

In [12]: df
Out[12]: 
   A              B              C
0  0  (-0.009, 1.8]  (-0.009, 1.8]
1  1  (-0.009, 1.8]  (-0.009, 1.8]
2  2     (1.8, 3.6]     (1.8, 3.6]
3  3     (1.8, 3.6]     (1.8, 3.6]
4  4     (3.6, 5.4]     (3.6, 5.4]
5  5     (3.6, 5.4]     (3.6, 5.4]
6  6     (5.4, 7.2]     (5.4, 7.2]
7  7     (5.4, 7.2]     (5.4, 7.2]
8  8     (7.2, 9.0]     (7.2, 9.0]
9  9     (7.2, 9.0]     (7.2, 9.0]

In [13]: df.dtypes
Out[13]: 
A       int64
B    category
C      object
dtype: object

In [14]: df.C.values
Out[14]: 
array([Interval(-0.0089999999999999993, 1.8, closed='right'),
       Interval(-0.0089999999999999993, 1.8, closed='right'),
       Interval(1.8, 3.6000000000000001, closed='right'),
       Interval(1.8, 3.6000000000000001, closed='right'),
       Interval(3.6000000000000001, 5.4000000000000004, closed='right'),
       Interval(3.6000000000000001, 5.4000000000000004, closed='right'),
       Interval(5.4000000000000004, 7.2000000000000002, closed='right'),
       Interval(5.4000000000000004, 7.2000000000000002, closed='right'),
       Interval(7.2000000000000002, 9.0, closed='right'),
       Interval(7.2000000000000002, 9.0, closed='right')], dtype=object)

Similar indices

In [6]: df3.index
Out[6]: 
IntervalIndex(left=[-0.009, -0.009, 1.8, 1.8, 3.6, 3.6, 5.4, 5.4, 7.2, 7.2],
              right=[1.8, 1.8, 3.6, 3.6, 5.4, 5.4, 7.2, 7.2, 9.0, 9.0],
              closed='right',
              name='C',
              dtype='interval[float64]')

In [7]: df2.index
Out[7]: CategoricalIndex([(-0.009, 1.8], (-0.009, 1.8], (1.8, 3.6], (1.8, 3.6], (3.6, 5.4], (3.6, 5.4], (5.4, 7.2], (5.4, 7.2], (7.2, 9.0], (7.2, 9.0]], categories=[(-0.009, 1.8], (1.8, 3.6], (3.6, 5.4], (5.4, 7.2], (7.2, 9.0]], ordered=True, name='B', dtype='category')

indexing

In [2]: df2
Out[2]: 
               A              C
B                              
(-0.009, 1.8]  0  (-0.009, 1.8]
(-0.009, 1.8]  1  (-0.009, 1.8]
(1.8, 3.6]     2     (1.8, 3.6]
(1.8, 3.6]     3     (1.8, 3.6]
(3.6, 5.4]     4     (3.6, 5.4]
(3.6, 5.4]     5     (3.6, 5.4]
(5.4, 7.2]     6     (5.4, 7.2]
(5.4, 7.2]     7     (5.4, 7.2]
(7.2, 9.0]     8     (7.2, 9.0]
(7.2, 9.0]     9     (7.2, 9.0]

In [3]: df2.loc[[2, 5]]
Out[3]: 
            A           C
B                        
(1.8, 3.6]  2  (1.8, 3.6]
(1.8, 3.6]  3  (1.8, 3.6]
(3.6, 5.4]  4  (3.6, 5.4]
(3.6, 5.4]  5  (3.6, 5.4]

# work similarly
In [5]: df3.loc[[2, 5]]
Out[5]: 
            A           B
C                        
(1.8, 3.6]  2  (1.8, 3.6]
(1.8, 3.6]  3  (1.8, 3.6]
(3.6, 5.4]  4  (3.6, 5.4]
(3.6, 5.4]  5  (3.6, 5.4]

jreback · 2017-02-05T20:43:23Z

@shoyer so interval_range is sort of trivial for the basic case, but I think you want something like

pd.interval_range(start, stop, step=1)

In [1]: pd.IntervalIndex.from_breaks(np.arange(0, 10, 2))
Out[1]: 
IntervalIndex(left=[0, 2, 4, 6],
              right=[2, 4, 6, 8],
              closed='right',
              dtype='interval[int64]')

what other signature is useful here?
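For reference, the basic case above amounts to pairing consecutive break points. A minimal pure-Python sketch, assuming integer start/stop/step; `interval_range_sketch` is a hypothetical name used only for illustration and is not a pandas function:

```python
def interval_range_sketch(start, stop, step=1):
    """Hypothetical helper: pair consecutive break points into (left, right) tuples."""
    # Same break points as np.arange(start, stop, step) for integer inputs.
    breaks = list(range(start, stop, step))
    # Each interval runs from one break to the next.
    return list(zip(breaks[:-1], breaks[1:]))


print(interval_range_sketch(0, 10, 2))
```

The resulting pairs line up with the left/right arrays shown in Out[1].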

codecov-io · 2017-02-06T13:37:28Z

Codecov Report

❗ No coverage uploaded for pull request base (master@7ee73ff).
The diff coverage is 91.47%.

@@            Coverage Diff            @@
##             master   #15309   +/-   ##
=========================================
  Coverage          ?      91%           
=========================================
  Files             ?      145           
  Lines             ?    50236           
  Branches          ?        0           
=========================================
  Hits              ?    45718           
  Misses            ?     4518           
  Partials          ?        0

Flag	Coverage Δ
#multiple	`88.8% <91.47%> (?)`
#single	`40.38% <25.71%> (?)`

Impacted Files	Coverage Δ
pandas/types/api.py	`100% <ø> (ø)`
pandas/tseries/period.py	`92.74% <100%> (ø)`
pandas/types/inference.py	`98.33% <100%> (ø)`
pandas/types/generic.py	`100% <100%> (ø)`
pandas/indexes/api.py	`98.68% <100%> (ø)`
pandas/indexes/multi.py	`96.7% <100%> (ø)`
pandas/indexes/category.py	`98.43% <100%> (ø)`
pandas/core/algorithms.py	`94.57% <100%> (ø)`
pandas/core/api.py	`100% <100%> (ø)`
pandas/core/indexing.py	`94.01% <100%> (ø)`
... and 10 more

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 7ee73ff...11ab1e1.

shoyer · 2017-02-06T17:07:40Z

so interval_range is sort of trivial for the basic case, but I think you want something like

Yes, that's one use case, though it's not necessarily worth doing if that's all it does.

I was thinking of interval_range also for datetimes (with start/periods arguments), as part of envisioning Interval as an eventual replacement for Period. But that is a long way off (if ever) and I don't think it will be terribly useful in the meanwhile, unless we get Interval working for resample operations.

shoyer

I got about half-way through, still need to look at the Cython.

I'm not convinced that it makes sense to add mask. It introduces potential ambiguity in the data model, because now there are two ways of having null values (in the sub-indexes and in mask) that we have to keep in sync.
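One way to sidestep the sync problem is to treat the endpoints as the only source of truth and derive the mask from them on first access. This is a rough pure-Python sketch of that idea, not the implementation in this PR; `IntervalPairs` and `_is_null` are hypothetical names:

```python
import math


def _is_null(x):
    """Treat None and float NaN as null (hypothetical helper)."""
    return x is None or (isinstance(x, float) and math.isnan(x))


class IntervalPairs:
    """Hypothetical container; not the IntervalIndex from this PR."""

    def __init__(self, left, right):
        if len(left) != len(right):
            raise ValueError('left and right must have the same length')
        self.left = list(left)
        self.right = list(right)
        self._mask = None  # computed lazily, never stored independently

    @property
    def mask(self):
        # Derived from the endpoints, so it cannot drift out of sync with them.
        if self._mask is None:
            self._mask = [_is_null(l) or _is_null(r)
                          for l, r in zip(self.left, self.right)]
        return self._mask


pairs = IntervalPairs([0.0, float('nan'), 2.0], [1.0, float('nan'), 3.0])
print(pairs.mask)
```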

shoyer · 2017-02-06T17:09:56Z

doc/source/whatsnew/v0.20.0.txt

@@ -96,6 +97,9 @@ support for bz2 compression in the python 2 c-engine improved (:issue:`14874`).

 .. _whatsnew_0200.enhancements.uint64_support:

+Enanced UInt64 Support


shoyer · 2017-02-06T17:11:51Z

pandas/core/algorithms.py

+        if dropna and (result.values == 0).all():
+            result = result.iloc[0:0]
+
+        # normalizing is by len of all (regarless of dropna)


shoyer · 2017-02-06T17:18:01Z

pandas/indexes/interval.py

+        if right is None:
+
+            if not isinstance(left, IntervalIndex):
+                return Index(left, closed=closed, name=name, copy=copy)


I would raise here instead. Otherwise something like IntervalIndex(np.arange(10)) would return a Int64Index, which is clearly a bug.

Likewise with mixtures of Intervals with inconsistent closed values

In [46]: pd.IntervalIndex([pd.Interval(0, 1), pd.Interval(2, 3, closed='left')])
Out[46]: Index([(0, 1], [2, 3)], dtype='object')

That should probably raise.
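The kind of check being asked for could look roughly like this. It is a pure-Python sketch: the `Interval` namedtuple is a stand-in for pd.Interval, and `validate_intervals` is a hypothetical helper, not code from this PR:

```python
from collections import namedtuple

# Stand-in for pd.Interval, for illustration only.
Interval = namedtuple('Interval', ['left', 'right', 'closed'])


def validate_intervals(values):
    """Return the shared `closed` side, raising instead of silently falling back."""
    closed_values = set()
    for v in values:
        if not isinstance(v, Interval):
            # e.g. plain integers: refuse rather than build some other index type
            raise TypeError('expected Interval objects, got %r' % (v,))
        closed_values.add(v.closed)
    if len(closed_values) > 1:
        raise ValueError('intervals must all be closed on the same side')
    # Assumption for this sketch: empty input defaults to 'right'.
    return closed_values.pop() if closed_values else 'right'
```

With this, both a list of bare integers and a mix of closed='right' and closed='left' intervals raise, rather than producing an object-dtype Index.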

shoyer · 2017-02-06T17:20:07Z

pandas/indexes/interval.py

+    def __new__(cls, left, right=None, closed='right', mask=None,
+                name=None, copy=False, dtype=None, fastpath=False):
+
+        if right is None:


As you probably know, I'm not a big fan of these highly overloaded constructors.

I do get that it is convenient for consistency with other index types to write IntervalIndex(values).

shoyer · 2017-02-06T17:20:29Z

pandas/indexes/interval.py

+        if right is None:
+
+            if not isinstance(left, IntervalIndex):
+                return Index(left, closed=closed, name=name, **kwargs)


same concern as above

shoyer · 2017-02-06T17:23:00Z

pandas/indexes/interval.py

+            raise ValueError('left and right must have the same length')
+        left_valid = notnull(self.left)
+        right_valid = notnull(self.right)
+        if not (left_valid == right_valid).all():


This check should be updated to account for the mask.

shoyer · 2017-02-06T17:25:48Z

pandas/indexes/interval.py

+                      closed='right')
+        """
+        breaks = np.asarray(breaks)
+        mask = isnull(breaks[:-1])


I guess this is reasonable, but note that the only valid IntervalIndex you can construct with from_breaks that contains nulls is all null.
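To spell out why: every interior break is the right endpoint of one interval and the left endpoint of the next, so once one break is null, the left/right null-consistency check forces its neighbours to be null as well. A small pure-Python sketch of that reasoning (`from_breaks_sketch` and `is_null` are hypothetical names, and the error message is made up for this example):

```python
import math


def is_null(x):
    return isinstance(x, float) and math.isnan(x)


def from_breaks_sketch(breaks):
    """Pair consecutive breaks, requiring left and right to be null together."""
    left, right = breaks[:-1], breaks[1:]
    for l, r in zip(left, right):
        if is_null(l) != is_null(r):
            raise ValueError('left and right must be null in the same positions')
    return list(zip(left, right))


nan = float('nan')
print(len(from_breaks_sketch([nan, nan, nan])))  # all-null breaks pass the check
```

Under this check a break sequence such as [0.0, nan, 2.0] is rejected, because the first interval would have a non-null left and a null right.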

shoyer · 2017-02-06T17:26:52Z

pandas/indexes/interval.py

+        return len(self.left)
+
+    @cache_readonly
+    def values(self):


I think I had a cython function for this (maybe not necessary).

shoyer · 2017-02-06T17:29:04Z

pandas/indexes/interval.py

+    def _maybe_cast_slice_bound(self, label, side, kind):
+        return getattr(self, side)._maybe_cast_slice_bound(label, side, kind)
+
+    def _convert_list_indexer(self, keyarr, kind=None):


Which methods call _convert_list_indexer?

shoyer · 2017-02-06T17:32:30Z

pandas/indexes/interval.py

+
+        else:
+            if isinstance(target, IntervalIndex):
+                raise NotImplementedError(


This might be important. I think it's necessary to ensure df.loc[df.index] and df.reindex(df.index) work?

jreback · 2017-02-08T21:56:47Z

@shoyer

In [1]: df = DataFrame({'A': range(10)})
   ...: s = pd.cut(df.A, 5)
   ...: df['B'] = s
   ...: df = df.set_index('B')
   ...: 

In [2]: df
Out[2]: 
               A
B               
(-0.009, 1.8]  0
(-0.009, 1.8]  1
(1.8, 3.6]     2
(1.8, 3.6]     3
(3.6, 5.4]     4
(3.6, 5.4]     5
(5.4, 7.2]     6
(5.4, 7.2]     7
(7.2, 9.0]     8
(7.2, 9.0]     9

In [3]: df.index
Out[3]: 
IntervalIndex(left=[-0.009, -0.009, 1.8, 1.8, 3.6, 3.6, 5.4, 5.4, 7.2, 7.2],
              right=[1.8, 1.8, 3.6, 3.6, 5.4, 5.4, 7.2, 7.2, 9.0, 9.0],
              closed='right',
              name='B',
              dtype='interval[float64]')

In [4]: df.loc[4]
Out[4]: 
            A
B            
(3.6, 5.4]  4
(3.6, 5.4]  5

In [5]: df.index.get_loc(4)
Out[5]: array([4, 5])

In [6]: df.index.get_indexer([4])
---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
<ipython-input-6-f6dde575d826> in <module>()
----> 1 df.index.get_indexer([4])

/Users/jreback/pandas/pandas/indexes/interval.py in get_indexer(self, target, method, limit, tolerance)
    521                     'for IntervalIndex indexers')
    522             else:
--> 523                 return self._engine.get_indexer(target.values)
    524 
    525     def sort_values(self, return_indexer=False, ascending=True):

/Users/jreback/pandas/pandas/src/intervaltree.pxi in pandas._interval.IntervalTree.get_indexer (pandas/src/interval.c:11925)()
    141                 result.append(-1)
    142             elif result.data.n > old_len + 1:
--> 143                 raise KeyError(
    144                     'indexer does not intersect a unique set of intervals')
    145             old_len = result.data.n

KeyError: 'indexer does not intersect a unique set of intervals'

In [7]: df.index.is_unique
Out[7]: False

Is [6] just not well-represented in the intervaltrees?

I actually think this is a pretty common case.

shoyer · 2017-02-08T22:44:29Z

In [3]: df.index
Out[3]:
IntervalIndex(left=[-0.009, -0.009, 1.8, 1.8, 3.6, 3.6, 5.4, 5.4, 7.2, 7.2],
              right=[1.8, 1.8, 3.6, 3.6, 5.4, 5.4, 7.2, 7.2, 9.0, 9.0],
              closed='right',
              name='B',
              dtype='interval[float64]')

Shouldn't this be a CategoricalIndex with IntervalIndex categories?

Is [6] just not well-represented in the intervaltrees?

I think it's just as well represented for interval trees as it is for hash tables. But my understanding is that get_indexer is always supposed to return an array of the same length as the requested labels, so it doesn't work if you don't have unique values. I think this is the analogous case for Int64Index:

>>> idx = pd.Index([1, 2, 2, 2, 3])
>>> idx.get_indexer([1])  # note that I'm not even indexing the duplicated value (2)
InvalidIndexError: Reindexing only valid with uniquely valued Index objects

So actually, for consistency we should probably raise in get_indexer whenever the index has non-unique or even overlapping values.

For this sort of behavior, I think you want get_indexer_non_unique -- and I actually wrote an IntervalTree method for that, though it may not be hooked up into IntervalIndex.
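The difference between the two contracts can be sketched in plain Python. This uses a linear scan over (left, right) tuples with closed='right' semantics rather than an interval tree, and the function names are hypothetical; it only illustrates the return shapes discussed above:

```python
def contains(interval, x):
    left, right = interval
    return left < x <= right  # closed='right'


def get_indexer_sketch(intervals, targets):
    """One position per target; -1 if missing; raise if a target is ambiguous."""
    result = []
    for x in targets:
        hits = [i for i, iv in enumerate(intervals) if contains(iv, x)]
        if len(hits) > 1:
            raise KeyError('indexer does not intersect a unique set of intervals')
        result.append(hits[0] if hits else -1)
    return result


def get_indexer_non_unique_sketch(intervals, targets):
    """All matching positions per target, plus the target positions that missed."""
    indexer, missing = [], []
    for pos, x in enumerate(targets):
        hits = [i for i, iv in enumerate(intervals) if contains(iv, x)]
        if hits:
            indexer.extend(hits)
        else:
            indexer.append(-1)
            missing.append(pos)
    return indexer, missing


# Same intervals as the df.index shown earlier in the thread.
intervals = [(-0.009, 1.8), (-0.009, 1.8), (1.8, 3.6), (1.8, 3.6), (3.6, 5.4),
             (3.6, 5.4), (5.4, 7.2), (5.4, 7.2), (7.2, 9.0), (7.2, 9.0)]
print(get_indexer_non_unique_sketch(intervals, [4, 10]))
```

For the target 4, the non-unique variant reports positions 4 and 5, matching the get_loc(4) result above, while the unique variant raises because the answer cannot fit in an array of the same length as the target.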

jreback · 2017-02-08T22:52:05Z

Shouldn't this be a CategoricalIndex with IntervalIndex categories?

interesting you should say that, I fiddled with this a bit

In [1]: df = DataFrame({'A': range(10)})
   ...: s = pd.cut(df.A, 5)
   ...: 

In [2]: s
Out[2]: 
0    (-0.009, 1.8]
1    (-0.009, 1.8]
2       (1.8, 3.6]
3       (1.8, 3.6]
4       (3.6, 5.4]
5       (3.6, 5.4]
6       (5.4, 7.2]
7       (5.4, 7.2]
8       (7.2, 9.0]
9       (7.2, 9.0]
Name: A, dtype: category
Categories (5, interval[float64]): [(-0.009, 1.8] < (1.8, 3.6] < (3.6, 5.4] < (5.4, 7.2] < (7.2, 9.0]]

In [4]: s.cat.categories
Out[4]: 
IntervalIndex(left=[-0.009, 1.8, 3.6, 5.4, 7.2],
              right=[1.8, 3.6, 5.4, 7.2, 9.0],
              closed='right',
              dtype='interval[float64]')

so .cut returns a Categorical (Series/Index depending on what's passed in). This is the same as we do now, but the Index is made up of the Interval-like strings.

So this actually makes a lot of sense. BUT, if you assign it to a DataFrame, what should this do? I can leave it as a categorical, but then you get weird inconsistencies, see here: bfd99d3, so instead I coerce this to an ndarray of intervals. Thus the index becomes an IntervalIndex automatically, which is way more useful than a Categorical of IntervalIndex. I think this is ok.

jreback · 2017-02-08T22:53:32Z

So this actually makes a lot of sense. BUT, if you assign it to a DataFrame, what should this do? I can leave it as a categorical, but then you get weird inconsistencies, see here: bfd99d3, so instead I coerce this to an ndarray of intervals. Thus the index becomes an IntervalIndex automatically, which is way more useful than a Categorical of IntervalIndex. I think this is ok.

ahh, yes I can easily try different paths for unique and non-unique, ok.

Do you have an implementation for non-unique? (e.g. given indexers return the included intervals for that indexer).

jreback · 2017-02-08T22:54:07Z

@shoyer ahh I see you DO have a get_indexer_non_unique, ahh let me try that.

jreback · 2017-02-08T23:07:38Z

In [1]:         df = DataFrame({'A': range(10)})
   ...:         s = pd.cut(df.A, 5)
   ...:         df['B'] = s
   ...:         df = df.set_index('B')
   ...: 

In [2]: df
Out[2]: 
               A
B               
(-0.009, 1.8]  0
(-0.009, 1.8]  1
(1.8, 3.6]     2
(1.8, 3.6]     3
(3.6, 5.4]     4
(3.6, 5.4]     5
(5.4, 7.2]     6
(5.4, 7.2]     7
(7.2, 9.0]     8
(7.2, 9.0]     9

this all looks ok

In [3]: df.loc[[4]]
Out[3]: 
            A
B            
(3.6, 5.4]  4
(3.6, 5.4]  5

In [4]: df.loc[[4, 5]]
Out[4]: 
            A
B            
(3.6, 5.4]  4
(3.6, 5.4]  5
(3.6, 5.4]  4
(3.6, 5.4]  5

In [5]: df.loc[[10]]
---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
<ipython-input-5-315006c1f115> in <module>()
----> 1 df.loc[[10]]

/Users/jreback/pandas/pandas/core/indexing.py in __getitem__(self, key)
   1338         else:
   1339             key = com._apply_if_callable(key, self.obj)
-> 1340             return self._getitem_axis(key, axis=0)
   1341 
   1342     def _is_scalar_access(self, key):

/Users/jreback/pandas/pandas/core/indexing.py in _getitem_axis(self, key, axis)
   1538                     raise ValueError('Cannot index with multidimensional key')
   1539 
-> 1540                 return self._getitem_iterable(key, axis=axis)
   1541 
   1542             # nested tuple slicing

/Users/jreback/pandas/pandas/core/indexing.py in _getitem_iterable(self, key, axis)
   1048     def _getitem_iterable(self, key, axis=0):
   1049         if self._should_validate_iterable(axis):
-> 1050             self._has_valid_type(key, axis)
   1051 
   1052         labels = self.obj._get_axis(axis)

/Users/jreback/pandas/pandas/core/indexing.py in _has_valid_type(self, key, axis)
   1428 
   1429                 raise KeyError("None of [%s] are in the [%s]" %
-> 1430                                (key, self.obj._get_axis_name(axis)))
   1431 
   1432             return True

KeyError: 'None of [[10]] are in the [index]'

This is suspect, this should be returning -1 for the indexers (rather than raising). So going to fix.

In [6]: df.loc[[10, 4]]
---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
<ipython-input-6-50e470a1fe86> in <module>()
----> 1 df.loc[[10, 4]]

/Users/jreback/pandas/pandas/core/indexing.py in __getitem__(self, key)
   1338         else:
   1339             key = com._apply_if_callable(key, self.obj)
-> 1340             return self._getitem_axis(key, axis=0)
   1341 
   1342     def _is_scalar_access(self, key):

/Users/jreback/pandas/pandas/core/indexing.py in _getitem_axis(self, key, axis)
   1538                     raise ValueError('Cannot index with multidimensional key')
   1539 
-> 1540                 return self._getitem_iterable(key, axis=axis)
   1541 
   1542             # nested tuple slicing

/Users/jreback/pandas/pandas/core/indexing.py in _getitem_iterable(self, key, axis)
   1068             # have the index handle the indexer and possibly return
   1069             # an indexer or raising
-> 1070             indexer = labels._convert_list_indexer(keyarr, kind=self.name)
   1071             if indexer is not None:
   1072                 return self.obj.take(indexer, axis=axis)

/Users/jreback/pandas/pandas/indexes/interval.py in _convert_list_indexer(self, keyarr, kind)
    415         # TODO: handle keyarr if it includes intervals
    416         if (locs == -1).any():
--> 417             raise KeyError("a list-indexer must only include "
    418                            "existing intervals")
    419 

KeyError: 'a list-indexer must only include existing intervals'

shoyer · 2017-02-09T01:33:27Z

pandas/indexes/interval.py

-                    'for IntervalIndex indexers')
-            else:
-                return self._engine.get_indexer(target.values)
+            indexer = self._engine.get_indexer(target.values)


I think this should go in a separate IntervalIndex.get_indexer_non_unique method, right?

My understanding of the contract of Index.get_indexer is that it returns an integer array with length equal to target.

jreback · 2017-02-10T00:04:29Z

@shoyer

this is from the original issue

IntervalIndex should play nicely when used as the levels for Categorical variable (#7217), but it is not the same as a CategoricalIndex (#7629). For example, a IntervalIndex should not allow for redundant values. To represent redundant or non-continuous intervals, you would need to make in a Categorical or CategoricalIndex which uses a IntervalIndex for the levels. Calling df.reset_index() on an DataFrame with an IntervalIndex would create a new Categorical column.

why should an IntervalIndex NOT allow redundant values? (and instead be a CategoricalIndex with level of an IntervalIndex). Is there a reason for this constraint? (just want to understand your rationale here).

shoyer · 2017-02-10T00:09:27Z

why should an IntervalIndex NOT allow redundant values? (and instead be a CategoricalIndex with level of an IntervalIndex). Is there a reason for this constraint? (just want to understand your rationale here).

Yeah, I don't remember why now but this does seem strange.

The only reason why this makes any sense to me is that there are optimizations one can do for look-ups if the IntervalIndex consists of non-overlapping, continuous intervals (you can use binary search instead of the interval tree).
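To illustrate that optimization: when the intervals are non-overlapping and contiguous they are fully described by a sorted list of breaks, and a lookup is a single bisection. A pure-Python sketch for closed='right' intervals, where `locate` is a hypothetical name:

```python
from bisect import bisect_left


def locate(breaks, x):
    """Position of the interval (breaks[i], breaks[i + 1]] containing x, else -1."""
    # First index j with breaks[j] >= x, so breaks[j - 1] < x <= breaks[j].
    j = bisect_left(breaks, x)
    if j == 0 or j == len(breaks):
        return -1  # x is at or below the first break, or above the last one
    return j - 1


breaks = [-0.009, 1.8, 3.6, 5.4, 7.2, 9.0]
print(locate(breaks, 4))
```

This costs O(log n) per lookup with no tree to build, but it only works under the non-overlapping, contiguous assumption, which is presumably why duplicates were originally excluded.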

qAp · 2017-02-15T08:49:16Z

If you don't mind me asking... if I wanted to use the new pd.cut in this PR right now, what would I need to do? I have:

git remote add jreback https://github.com/jreback/pandas.git
git fetch jreback
git checkout remotes/jreback/intervalindex
git checkout -b intervalindex

The code now has the changes made in this PR, but import pandas raises an error unless I switch back to the master branch. Do I need to somehow reinstall pandas from the intervalindex branch?

jorisvandenbossche · 2017-02-15T08:54:49Z

@qAp This PR has changes in the C code, so you need to rebuild pandas. After the commands you used above to check out the correct branch, you need to run python setup.py build_ext -i.
See the contributing docs for more information: http://pandas.pydata.org/pandas-docs/stable/contributing.html

jreback · 2017-02-15T16:01:03Z

@qAp happy to have you test this out.....pls report any issues here.

alexlenail · 2017-03-02T15:05:42Z

@jreback Any news on this?

jreback · 2017-03-02T15:08:03Z

@zfrenchee I have to make a few more internal changes to make some things consistent.

can I ask your usecases? / examples (e.g. so can have additional tests mainly.....)

jreback · 2017-04-09T13:46:39Z

going to merge in next day or 2. Will almost certainly need a follow up in any event.

jreback · 2017-04-14T13:33:06Z

merged.

give this a whirl.

shoyer · 2017-04-14T16:38:14Z

Jeff -- thanks for pushing this one past the finish line!

jreback · 2017-04-14T16:40:15Z

:>

and let the bugs begin! hahah.

jreback added Enhancement Interval Interval data type labels Feb 5, 2017

jreback force-pushed the intervalindex branch from 63900d9 to 8415bb7 Compare February 5, 2017 20:11

shoyer reviewed Feb 6, 2017

jreback force-pushed the intervalindex branch 4 times, most recently from 9865000 to a7f097b Compare February 7, 2017 18:53

jreback force-pushed the intervalindex branch from a7f097b to 46f5179 Compare February 8, 2017 21:57

shoyer reviewed Feb 9, 2017

jorisvandenbossche mentioned this pull request Feb 9, 2017

cannot get cut() to display desired bins label #15357

Open

jreback force-pushed the intervalindex branch from cc37b9d to 067375c Compare February 15, 2017 16:00

jreback force-pushed the intervalindex branch 2 times, most recently from be778c4 to 9844010 Compare February 16, 2017 14:34

jreback force-pushed the intervalindex branch 5 times, most recently from b1c0c61 to 275d951 Compare April 7, 2017 00:30

jreback force-pushed the intervalindex branch from 275d951 to 758f3e4 Compare April 7, 2017 20:59

jreback modified the milestones: 0.21.0, 0.20.0 Apr 11, 2017

shoyer and others added 12 commits April 13, 2017 18:12

API/ENH: IntervalIndex

74162aa

closes pandas-dev#7640 closes pandas-dev#8625

CLN/COMPAT: IntervalIndex

340c98b

more tests & fixes for non-unique / overlaps

4a5ebea

rename _is_contained_in -> contains add sorting test

allow pd.cut to take an IntervalIndex for bins

e5f8082

more docs

b2d26eb

pep

f0e3ad2

api-types test fixing

4333937

sorting example

3a3e02e

fixup on merge of changes in algorithms.py

7577335

doc example and bug

fbc1cf8

more docs

834df76

merge conflicts

11ab1e1

jreback force-pushed the intervalindex branch from 758f3e4 to 11ab1e1 Compare April 13, 2017 22:12

jreback closed this in 9991579 Apr 14, 2017

alexlenail mentioned this pull request Apr 18, 2017

Pandas recently added IntervalIndex -- would this make sense to add to Blaze? blaze/blaze#1630

Open

has2k1 mentioned this pull request May 7, 2017

Imprecise intervals/labels by pd.cut #16276

Closed

alexlenail mentioned this pull request May 18, 2017

Confusing (possibly buggy) IntervalIndex behavior #16316

Closed
