-
-
Notifications
You must be signed in to change notification settings - Fork 18.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
API: expose pandas.errors #15541
API: expose pandas.errors #15541
Conversation
@shoyer xref #15537 (comment) |
ff94598
to
4386409
Compare
@jorisvandenbossche xref #15181 (comment) I think this actually works.
|
@jreback second one should be |
thanks @TomAugspurger yes that works too |
Yes, but that is because you didn't do any deprecations of the original ones. Referring from two locations to the same exception is no problem, but it was deprecating the old one (eg by using a decorator) that caused the problems discussed before. I am in favor of putting this just in |
why would you create another namespace ? when we already have a perfectly good one. |
@jorisvandenbossche can you comment on your objection? |
Well, that comes back to my initial objections to the In this specific case, what I also would do is creating an actual module (let it be And to be clear, big +1 to make those exceptions more officially public! |
@jorisvandenbossche sure, but these are really 2 separate things.
I don't think the 2nd part is relevant here. I mean we can always change how they are imported into the public namespace. The point here is to have a separate public API layer (just as we did with |
I agree with Joris that it would be better to define the exceptions in
`pandas.exceptions` or whereever the public API location is (personally I
like the shorter `pandas.errors`). Unlike __repr__ I don't think Python
provides any builtin way to indicate the canonical location of errors. I
like making a immediately visible location the API location -- it makes it
much harder to unintentionally use the wrong path.
…On Fri, Mar 3, 2017 at 8:03 AM Jeff Reback ***@***.***> wrote:
@jorisvandenbossche <https://github.com/jorisvandenbossche> sure, but
these are really 2 separate things.
- an API for exceptions to the user
- move / change the internal usage
I don't think the 2nd part is relevant here. I mean we can always change
*how* they are imported into the public namespace. The point here is to
have a separate public API layer (just as we did with pandas.api.types.
The actual exceptions being defined elsewhere is fine.
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#15541 (comment)>,
or mute the thread
<https://github.com/notifications/unsubscribe-auth/ABKS1vdT_M_Eh7sMXWUJz9IsDt1PlGqAks5riDnTgaJpZM4MQFAO>
.
|
@shoyer @jorisvandenbossche what is the reason to deviate from the established find with This is the point of the top-level api namespace, IOW.
this is going to eventually proliferate. We cannot have We have 1 and only 1 public namespace (or at least moving in that direction). |
If the default for namespaces like Most libraries don't use special designated |
no, I am thinking about The point is that pandas NEEDS a specifically public api, that is not just everything the kitchen sink. just because scipy/numpy does something does not make it the 'correct' way to do things. sure its a nice reference, but pandas has 2 sets of users, API and 'regular' ones. These are different and need to be treated differently. |
The problem is that for years people building things on top of pandas have just reached in and used anything. That cannot continue to happen. How better to provide implementation flexibility than to have a specific public API? I thought we had settled this discussion a while back. |
Ultimately, I don't think this matters too much, but given a choice between:
I think the later (option 2) is more user friendly and consistent with the top level namespace On the other hand, it's true that option 2 is a little harder to transition to (for us), so maybe it's not worth the trouble. I don't really understand your point about "API" vs "regular" users. I don't think it's a terribly advanced use case to catch appropriate exception types. I do agree that we want to reduce noise in the top level namespace. |
I would this is obvious.
Ultimately these are part of the same namespace that is publicly exposed. The point is that it is completely silly that people can just use any of the namespace. Sure you can, but pandas has gotten so big, that you simply cannot live that that when you need to move things around to change implementation details. |
how exactly is this more friendly? I would argue that this is much less friendly, simply because you don't know where to find things that you might need. This is a huge surface area, both for dev and for usage. Limiting this scope is hugely important. |
998b171
to
56e8631
Compare
Codecov Report
@@ Coverage Diff @@
## master #15541 +/- ##
==========================================
- Coverage 90.98% 90.96% -0.03%
==========================================
Files 143 145 +2
Lines 49477 49483 +6
==========================================
- Hits 45019 45014 -5
- Misses 4458 4469 +11
Continue to review full report at Codecov.
|
related: this now happens in our doc builds.
certainly could do a PR upstream to fix this (e.g. with conditional imports). But might be nice as a starting location for some library code. This is a useful enough routine that its nice to expose to the outside world. |
cc @wesm |
I've been running into these internal API / external API issues lately (e.g. needing to use |
We indeed need to document this better. To come back on the discussion above, I am also more in favor of "option 2" from #15541 (comment). It's indeed maybe a bit harder to get to that point, but IMO it would be much more clear to be able to say: everything accessible from the |
@shoyer
|
@jreback looks great to me -- thanks! One question on the docstring for
|
@shoyer these have long-time meaning in pandas (could be changed, but this is what they are).
|
pushed some examples. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
minor comment, looks good to me
pandas/_libs/src/inference.pyx
Outdated
------- | ||
string describing the common type of the input data. | ||
Results can include: | ||
- floating |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
small rst thing: no indentation needed, and white line before the list
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done, though I don't think we actually have a doc-generation section for pandas.api
ATM. (i'll make an issue).
@jorisvandenbossche if you have a chance. |
@@ -2,6 +2,7 @@ | |||
|
|||
import warnings | |||
warnings.warn("The pandas.lib module is deprecated and will be " |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it be possible to specially catch infer_dtype
here and give a more specific message since it has moved and is public? Noticed I'm using it some real code, I think based on your SO answer here - http://stackoverflow.com/a/20670901/3657742.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In [1]: pd.lib.infer_dtype('foo')
/Users/jreback/miniconda3/envs/pandas/bin/ipython:1: FutureWarning: pandas.lib.infer_dtype is deprecated. Please use pandas._libs.lib.infer_dtype instead.
#!/Users/jreback/miniconda3/envs/pandas/bin/python
Out[1]: 'string'
hmm, let me see.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yep
In [1]: pd.lib.infer_dtype('foo')
/Users/jreback/miniconda3/envs/pandas/bin/ipython:1: FutureWarning: pandas.lib is deprecated and will be removed in a future version. You can access infer_dtype in pandas.api.lib.infer_dtype
#!/Users/jreback/miniconda3/envs/pandas/bin/python
Out[1]: 'string'
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done in latest push
add infer_dtype to pandas.api.lib
move all exception / warning impls (and references) from pandas.core.common and pandas.io.common closes pandas-dev#14800
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me!
* API: expose dtypes in pandas.api.types xref #16015 xref apache/arrow#585 xref #16042 xref #15541 (comment)
* API: expose dtypes in pandas.api.types xref pandas-dev#16015 xref apache/arrow#585 xref pandas-dev#16042 xref pandas-dev#15541 (comment)
closes #14800
this doesn't actually deprecate the existing (as we have discussed in #15181) but baby steps.
I also didn't add the 'hated'!
SettingWithCopy*
ones; these will be removed in pandas 2.0 anyhow, so no point in adding them.