CLN: spin msgpack support to [new] pandas-msgpack #15841
Comments
Thanks for the heads-up. We mainly use this with blaze, so we just need to add that as a dependency to
@llllllllll I was just proposing this to see if people think it's a good idea :> But yes, you could be explicit about the dependency. My reasons for proposing this are severalfold, really.
cc @wesm
+1 on splitting off this code
I also think splitting off pandas-json would be a good idea (I would like to consider deprecating it in favor of a native RapidJSON-based C++ reader via Arrow sometime in the next year).
This, to me, feels like an argument against splitting off this code into a separate package. If the pandas-msgpack format is tied to implementation details of pandas objects, then you would usually want to upgrade pandas-msgpack and pandas in lockstep, or else you risk breakage because your pandas-msgpack version is incompatible with your pandas version. Having the msgpack format in the pandas codebase itself ensures that you don't have to worry as much about version compatibility issues.
With continuous integration tools, we can automate the testing, so that doesn't seem like an issue. I think pandas's monolithic nature has made it harder for the community to make progress on components that may evolve at a different pace vs. the rest of the project.
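To make the lockstep concern concrete, here is a minimal sketch of an import-time guard a split-off package could use to fail early on a pandas version that is too old. The function names and the version numbers are hypothetical, chosen only for illustration; nothing here is taken from pandas or pandas-msgpack.

```python
import re


def parse_version(version):
    """Return the leading numeric components of a version string as a tuple."""
    parts = []
    for piece in version.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def check_compatible(installed, minimum):
    """Raise ImportError if `installed` is older than `minimum`.

    Hypothetical helper: a real package would pass pandas.__version__
    as `installed` and its own tested minimum as `minimum`.
    """
    have, need = parse_version(installed), parse_version(minimum)
    # Pad the shorter tuple with zeros so "0.20" compares equal to "0.20.0".
    width = max(len(have), len(need))
    have += (0,) * (width - len(have))
    need += (0,) * (width - len(need))
    if have < need:
        raise ImportError(
            f"pandas >= {minimum} is required, but {installed} is installed"
        )


# Illustrative version strings only.
check_compatible("0.20.0rc1", "0.20.0")  # passes: pre-release suffix is ignored
try:
    check_compatible("0.19.2", "0.20.0")
except ImportError as err:
    guard_message = str(err)
```

A guard like this only turns silent breakage into a clear error; it does not remove the need to test the two packages together in CI.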
I agree with this, and as someone building an application on top of pandas, I appreciate the value of potentially slimming down the core distribution. At the same time, I suspect that the monolithic nature of pandas is actually a feature for many of pandas' users, since it means they don't have to worry about or manage a constellation of One way to alleviate this concern might be to have something like a
Actually, metapackages would be great for this, e.g. right now we should have a
You could actually go nuts with this.
So a big question is how to organize this to make it useful and not confusing.
Morning project! https://github.com/pydata/pandas-msgpack (not released yet).
Just to add a "lamer's" perspective on this. Not long ago, IPython Notebook / Jupyter went through a similar split, so now there is a bunch of 'sub-packages' available for installation. For example in conda:
When the split happened, my initial reaction was indeed "what should I install now?". Well, it turned out that I might be wrong, but it is my understanding that Without doubt, clearly documented changes become indispensable. I.e. why should not
@wikiped
For example, if you are using SQL, there are myriad options to install; pandas cannot know what to do here (sure we could install A possible future path is to make pandas a meta-package, with sub-packages like:
etc. The questions are:
My feeling at this point is that it is OK to split off non-core functionality and direct the user to install it if they want it.
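One way "direct the user to install it" could look in practice is an import helper that raises an actionable error when the split-off package is missing. This is only a minimal sketch: `import_optional` and the deliberately nonexistent module name below are hypothetical, not an existing pandas function or package.

```python
import importlib


def import_optional(module_name, install_hint=None):
    """Import an optional dependency, or raise ImportError with an install hint.

    Hypothetical helper for illustration; not part of the pandas API.
    """
    try:
        return importlib.import_module(module_name)
    except ImportError as err:
        hint = install_hint or module_name
        raise ImportError(
            f"Missing optional dependency '{module_name}'. "
            f"Install it with: pip install {hint}"
        ) from err


# A module that is present imports normally.
json_mod = import_optional("json")

# A module that is absent produces an actionable message.
# The module name here is made up so the import is guaranteed to fail.
try:
    import_optional("pandas_msgpack_not_installed", install_hint="pandas-msgpack")
except ImportError as err:
    message = str(err)
```

The point of the design is that the core package stays importable without the extra, and the user only learns about the missing piece when they actually call the feature that needs it.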
For
I also think making Complexity might be constrained / driven by:
I can hardly speak for the whole pandas user community, and perhaps it would be best to get broader feedback on this from users. I am not sure what the right way to handle this would be. Perhaps some Part of this exercise is to make a good list of sub-parts to facilitate good feedback. And it might be good to go a bit nuts about it and have a longer-than-needed list of parts to select from:
The final list of sub-parts might be re-grouped depending on feedback and where it makes sense.
msgpack is deprecated #30112
Similar to the split-off for pandas-gbq.
This would simplify the main pandas codebase a bit by making this a separately maintained package.