Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cleaned up imports #18264

Merged
merged 2 commits into from
Nov 23, 2017
Merged

cleaned up imports #18264

merged 2 commits into from
Nov 23, 2017

Conversation

tdpetrou
Copy link
Contributor

This is a continuation of #18148 and does the following

  • puts the reshape imports from pandas.core.api to pandas.core.reshape.api
  • uses pandas.core.dtypes.generic.ABCMultiIndex inplace of MultiIndex in melt.py
  • uses from pandas import DataFrame in melt.py inside functions
  • imports directly from pandas in reshape tests

There is one test failing. Statsmodels imports get_dummies from pandas.core.api which is no longer possible. It should import directly from pandas.

@jorisvandenbossche jorisvandenbossche added Clean Reshaping Concat, Merge/Join, Stack/Unstack, Explode labels Nov 13, 2017
@jorisvandenbossche
Copy link
Member

@jreback Why is it needed to use ABCMultiIndex instead of MultiIndex and move the DataFrame imports inline? There is no circular import here?

@jorisvandenbossche
Copy link
Member

There is one test failing. Statsmodels imports get_dummies from pandas.core.api which is no longer possible. It should import directly from pandas.

Then I think we should put it back, but we should also open an issue at statsmodels that they should not do that (independently of whether we have that import there or not, they should never import from 'core')

@tdpetrou
Copy link
Contributor Author

tdpetrou commented Nov 13, 2017

@jorisvandenbossche Looks like statsmodels has it fixed now.

Edited:

@TomAugspurger fixed this here: statsmodels/statsmodels@45e55d0#diff-a494553e88130577065e0b012246ca51

@jorisvandenbossche
Copy link
Member

jorisvandenbossche commented Nov 13, 2017

The problem is that they didn't yet do a new release. So for now I would not do any change that breaks a released version of statsmodels, as we are not sure at this point a new statsmodel release will be out before a new pandas release.

@tdpetrou
Copy link
Contributor Author

Yes, that makes sense. But, pandas 0.22 won't come out for another 3+ months (where is the release schedule btw?) and statsmodels has been on 0.8 for 9 months now. Should we find out if they are going to release the next version before 0.22?

@TomAugspurger
Copy link
Contributor

TomAugspurger commented Nov 13, 2017 via email

@codecov
Copy link

codecov bot commented Nov 13, 2017

Codecov Report

Merging #18264 into master will increase coverage by <.01%.
The diff coverage is 100%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master   #18264      +/-   ##
==========================================
+ Coverage   91.38%   91.38%   +<.01%     
==========================================
  Files         164      164              
  Lines       49884    49884              
==========================================
+ Hits        45584    45587       +3     
+ Misses       4300     4297       -3
Flag Coverage Δ
#multiple 89.19% <100%> (+0.02%) ⬆️
#single 39.41% <33.33%> (-0.07%) ⬇️
Impacted Files Coverage Δ
pandas/core/api.py 100% <ø> (ø) ⬆️
pandas/core/reshape/melt.py 96.22% <100%> (+0.03%) ⬆️
pandas/core/reshape/api.py 100% <100%> (ø) ⬆️
pandas/io/gbq.py 25% <0%> (-58.34%) ⬇️
pandas/core/frame.py 97.8% <0%> (-0.1%) ⬇️
pandas/plotting/_converter.py 65.25% <0%> (+1.81%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 7f4c960...8782852. Read the comment docs.

@jorisvandenbossche
Copy link
Member

Are you able to just import get_dummies in pandas.core.api, or does that cause circular import issues?

That is no problem, as we were only cleaning up imports in core/api.py that are not really necessary from pandas point of view (but thus used by statsmodels). So I mean easy to just not do the clean-up (yet)

@pandas-dev pandas-dev deleted a comment from codecov bot Nov 13, 2017
@jreback
Copy link
Contributor

jreback commented Nov 14, 2017

@tdpetrou

There is one test failing. Statsmodels imports get_dummies from pandas.core.api which is no longer possible. It should import directly from pandas.

so test_downstream.py is failing? we test against the latest released versions of these packages

I want to add a repro test with the statsmodels failure, then you can leave get_dummies import in pandas.core.api, with a TODO to move later (and we'll create an issue about it).

@TomAugspurger
Copy link
Contributor

TomAugspurger commented Nov 14, 2017 via email

@jreback
Copy link
Contributor

jreback commented Nov 14, 2017

@TomAugspurger that's good, but our test should still fail; its against the released version.

if its NOT failing, then our test is not good enough (maybe it doesn't import that code).

@tdpetrou
Copy link
Contributor Author

@jreback Yes it was test_downstream.py. Ok I'll keep it get_dummies import in pandas.core.api with TODO

pandas/tests/test_downstream.py:60:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
../../anaconda3/lib/python3.6/site-packages/statsmodels/api.py:14: in <module>
    from .discrete.discrete_model import (Poisson, Logit, Probit,
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

    """
    from __future__ import division

    __all__ = ["Poisson", "Logit", "Probit", "MNLogit", "NegativeBinomial"]

    from statsmodels.compat.python import lmap, lzip, range
    import numpy as np
    from scipy.special import gammaln
    from scipy import stats, special, optimize  # opt just for nbin
    import statsmodels.tools.tools as tools
    from statsmodels.tools import data as data_tools
    from statsmodels.tools.decorators import (resettable_cache,
            cache_readonly)
    from statsmodels.regression.linear_model import OLS
    from scipy import stats, special, optimize  # opt just for nbin
    from scipy.stats import nbinom
    from statsmodels.tools.sm_exceptions import PerfectSeparationError
    from statsmodels.tools.numdiff import (approx_fprime, approx_hess,
                                           approx_hess_cs, approx_fprime_cs)
    import statsmodels.base.model as base
    from statsmodels.base.data import handle_data  # for mnlogit
    import statsmodels.regression.linear_model as lm
    import statsmodels.base.wrapper as wrap
    from statsmodels.compat.numpy import np_matrix_rank
>   from pandas.core.api import get_dummies
E   ImportError: cannot import name 'get_dummies'

../../anaconda3/lib/python3.6/site-packages/statsmodels/discrete/discrete_model.py:41: ImportError

@jreback
Copy link
Contributor

jreback commented Nov 15, 2017

@tdpetrou so this PR should be failing as the current code has the move of get_dummies. yes I want to leave it until its fixed in statsmodels, but why does this NOT fail? this would indicate the test is flawed.

@tdpetrou
Copy link
Contributor Author

@jreback test_downstream.py is failing on my machine. Are you implying that the CI services are linking to the unreleased statsmodels versions?

@jreback
Copy link
Contributor

jreback commented Nov 17, 2017

this is puzzling why this is not failing, going to test on another build, #18333 let's see.

@jreback
Copy link
Contributor

jreback commented Nov 17, 2017

@tdpetrou go ahead and rebase, this should fail.

@codecov
Copy link

codecov bot commented Nov 17, 2017

Codecov Report

Merging #18264 into master will increase coverage by <.01%.
The diff coverage is 100%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master   #18264      +/-   ##
==========================================
+ Coverage   91.38%   91.38%   +<.01%     
==========================================
  Files         164      164              
  Lines       49790    49884      +94     
==========================================
+ Hits        45501    45587      +86     
- Misses       4289     4297       +8
Flag Coverage Δ
#multiple 89.19% <100%> (+0.02%) ⬆️
#single 39.41% <33.33%> (-0.15%) ⬇️
Impacted Files Coverage Δ
pandas/core/api.py 100% <ø> (ø) ⬆️
pandas/core/reshape/melt.py 96.22% <100%> (+0.03%) ⬆️
pandas/core/reshape/api.py 100% <100%> (ø) ⬆️
pandas/io/gbq.py 25% <0%> (-58.34%) ⬇️
pandas/util/_decorators.py 80.7% <0%> (-0.34%) ⬇️
pandas/core/frame.py 97.8% <0%> (-0.1%) ⬇️
pandas/core/sparse/series.py 95.28% <0%> (-0.06%) ⬇️
pandas/core/series.py 95% <0%> (-0.02%) ⬇️
pandas/core/generic.py 95.72% <0%> (-0.01%) ⬇️
pandas/core/reshape/reshape.py 100% <0%> (ø) ⬆️
... and 15 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update cfad581...794dd6d. Read the comment docs.

@codecov
Copy link

codecov bot commented Nov 17, 2017

Codecov Report

Merging #18264 into master will increase coverage by 0.29%.
The diff coverage is 100%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master   #18264      +/-   ##
==========================================
+ Coverage   91.35%   91.65%   +0.29%     
==========================================
  Files         163      164       +1     
  Lines       49681    51487    +1806     
==========================================
+ Hits        45386    47189    +1803     
- Misses       4295     4298       +3
Flag Coverage Δ
#multiple 89.52% <100%> (+0.39%) ⬆️
#single 40.54% <33.33%> (+0.8%) ⬆️
Impacted Files Coverage Δ
pandas/core/api.py 100% <ø> (ø) ⬆️
pandas/core/reshape/melt.py 96.22% <100%> (+0.03%) ⬆️
pandas/core/reshape/api.py 100% <100%> (ø) ⬆️
pandas/io/gbq.py 25% <0%> (-58.34%) ⬇️
pandas/tseries/offsets.py 96.7% <0%> (-0.31%) ⬇️
pandas/core/resample.py 96.16% <0%> (-0.19%) ⬇️
pandas/core/frame.py 97.8% <0%> (-0.1%) ⬇️
pandas/compat/__init__.py 58.1% <0%> (-0.09%) ⬇️
pandas/core/indexes/timedeltas.py 91.14% <0%> (-0.04%) ⬇️
pandas/core/generic.py 95.73% <0%> (ø) ⬆️
... and 9 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update dac9b43...1f56dfd. Read the comment docs.

@jreback
Copy link
Contributor

jreback commented Nov 19, 2017

can you rebase 1 more time, I want to see this fail; the statsmodels test was being skipped.

@tdpetrou
Copy link
Contributor Author

@jreback Did this fail correctly?

@jorisvandenbossche
Copy link
Member

Yes, it is correctly failing now. So you can now put it back in core/api.py

@jreback can you answer my question above #18264 (comment) ?

@tdpetrou
Copy link
Contributor Author

@jorisvandenbossche I put get_dummies back in core.api. I left the get_dummies import in core.reshape.api.

@jreback
Copy link
Contributor

jreback commented Nov 22, 2017

Why is it needed to use ABCMultiIndex instead of MultiIndex and move the DataFrame imports inline? There is no circular import here?

we use the ABC* for instance checking generally, these have no dependencies. This actually does have a conditional import, that's why its pandas.core.frame.DataFrame; we try to remove these and import inside the function as its much cleaner.

@jreback
Copy link
Contributor

jreback commented Nov 22, 2017

@tdpetrou

I put get_dummies back in core.api. I left the get_dummies import in core.reshape.api.

thanks. ping on green.

Can you make an issue to move this (on the pandas side, when statsmodels releases).

@tdpetrou
Copy link
Contributor Author

@jreback I think this is good but circleci is failing?

I'll keep a note to make an issue on the next statsmodels release

@jreback jreback added this to the 0.22.0 milestone Nov 23, 2017
@jreback jreback merged commit 04b628f into pandas-dev:master Nov 23, 2017
@jreback
Copy link
Contributor

jreback commented Nov 23, 2017

thanks @tdpetrou

I'll keep a note to make an issue on the next statsmodels release

can you make an issue about this, we have 2 changes we need to make now (I'll edit with the other one)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Clean Reshaping Concat, Merge/Join, Stack/Unstack, Explode
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants