TST/REF: refactor the arithmetic tests for IntegerArray #34454

jorisvandenbossche · 2020-05-29T11:53:24Z

This is an attempt to make the arithmetic tests for the masked arrays more understandable and maintainable (doing it here for IntegerArray as a start, but the idea would be to do the same for BooleanArray, and later FloatingArray as well, and at that point also share some of those tests).

The problem with the current arrays/integer/test_arithmetic.py is that is quite complex to see what is going on (which I experienced when making a version for FloatingArray in #34307): for example, there is a huge _check_op method that has all the logic to created the expected result, but so this also has all the special cases of all ops combined, making it very difficult to see what is going on or to know what is now exactly tested for a certain op.
The reason for this structure is that those tests originally came from the base extension tests in tests/extension/, where the tests needed to be very generic. However, we already moved out those tests for IntgerArray, since they were all customized (nothing was still being inherited from the base class), but so we also don't need to maintain the original class structure then.

The logic how I constructed the new tests:

I first test each op with an explicitly constructed expected result. Here, we test the IntegerArray-specific special cases (like things as 1 ** pd.NA, or division resulting in a float numpy array, ..). Explicitly writing it down makes it a lot easier to read and to see what is tested (no complex if op == ...: .., constructs, and no complex correcting of expected results based on ndarray)
Those first tests are all with EAs. In a next set of tests, I added tests for array+scalar ops, array+ndarray ops, ... But here, I just rely on creating the expected result with the EA itself (since EA+EA ops were already tested before)
Similar tests for frame+scalar, series+scalar, series+array with the EA dtypes -> rely on the actual EA op for the expected result instead of having _check_op handle that as well.

cc @dsaxton @jreback

dsaxton · 2020-05-29T17:11:49Z

Nice @jorisvandenbossche, this looks much easier to follow / understand

jorisvandenbossche · 2020-06-03T21:06:13Z

All good here?
cc @TomAugspurger @jbrockmendel

jreback · 2020-06-03T22:15:41Z

yep looks good, pls rebase and merge on green.

TomAugspurger · 2020-06-04T11:40:09Z

Have people seen these test failures recently:

________________________ test_on_offset_implementations ________________________

[gw0] linux -- Python 3.7.7 /home/travis/miniconda3/envs/pandas-dev/bin/python

    @given(gen_random_datetime, gen_yqm_offset)

>   def test_on_offset_implementations(dt, offset):

pandas/tests/tseries/offsets/test_offsets_properties.py:89: 

_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

pandas/tests/tseries/offsets/test_offsets_properties.py:94: in test_on_offset_implementations

    compare = (dt + offset) - offset

pandas/_libs/tslibs/offsets.pyx:440: in pandas._libs.tslibs.offsets.BaseOffset.__add__

    return other.__add__(self)

pandas/_libs/tslibs/offsets.pyx:442: in pandas._libs.tslibs.offsets.BaseOffset.__add__

    return self.apply(other)

pandas/_libs/tslibs/offsets.pyx:136: in pandas._libs.tslibs.offsets.apply_wraps.wrapper

    result = result.tz_localize(tz)

pandas/_libs/tslibs/timestamps.pyx:1272: in pandas._libs.tslibs.timestamps.Timestamp.tz_localize

    value = tz_localize_to_utc(np.array([self.value], dtype='i8'), tz,

_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

>   raise pytz.NonExistentTimeError(stamp)

E   pytz.exceptions.NonExistentTimeError: 1905-06-01 00:00:00

pandas/_libs/tslibs/tzconversion.pyx:276: NonExistentTimeError

---------------------------------- Hypothesis ----------------------------------

Falsifying example: test_on_offset_implementations(

    dt=datetime.datetime(1900, 1, 1, 0, 0, tzinfo=tzfile('/usr/share/zoneinfo/Asia/Singapore')),

    offset=<65 * MonthBegins>,

)

And the arm build timed out.

jbrockmendel · 2020-06-04T17:02:36Z

Have people seen these test failures recently: [...] test_on_offset_implementations

Yes. I've been ignoring them until i can get around to them

…-tests

jorisvandenbossche · 2020-06-06T08:29:15Z

@TomAugspurger open an issue for that?

jorisvandenbossche · 2020-06-06T08:29:54Z

Will do a follow-up doing the same for BooleanArray, and then deduplicating some of them

TST/REF: refactor the arithmetic tests for IntegerArray

30ccec0

jorisvandenbossche added Refactor Internal refactoring of code Testing pandas testing functions or related to the test suite NA - MaskedArrays Related to pd.NA and nullable extension arrays labels May 29, 2020

jorisvandenbossche added this to the 1.1 milestone May 29, 2020

jorisvandenbossche mentioned this pull request May 29, 2020

ENH: nullable Float32/64 ExtensionArray #34307

Merged

2 tasks

Merge remote-tracking branch 'upstream/master' into masked-arithmetic…

ae807dd

…-tests

jorisvandenbossche merged commit 6258397 into pandas-dev:master Jun 6, 2020

jorisvandenbossche deleted the masked-arithmetic-tests branch June 6, 2020 08:29

jorisvandenbossche mentioned this pull request Jun 6, 2020

TST/REF: arithmetic tests for BooleanArray + consolidate with integer masked tests #34623

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

TST/REF: refactor the arithmetic tests for IntegerArray #34454

TST/REF: refactor the arithmetic tests for IntegerArray #34454

jorisvandenbossche commented May 29, 2020 •

edited

Loading

dsaxton commented May 29, 2020

jorisvandenbossche commented Jun 3, 2020

jreback commented Jun 3, 2020

TomAugspurger commented Jun 4, 2020

jbrockmendel commented Jun 4, 2020

jorisvandenbossche commented Jun 6, 2020

jorisvandenbossche commented Jun 6, 2020

TST/REF: refactor the arithmetic tests for IntegerArray #34454

TST/REF: refactor the arithmetic tests for IntegerArray #34454

Conversation

jorisvandenbossche commented May 29, 2020 • edited Loading

dsaxton commented May 29, 2020

jorisvandenbossche commented Jun 3, 2020

jreback commented Jun 3, 2020

TomAugspurger commented Jun 4, 2020

jbrockmendel commented Jun 4, 2020

jorisvandenbossche commented Jun 6, 2020

jorisvandenbossche commented Jun 6, 2020

jorisvandenbossche commented May 29, 2020 •

edited

Loading