Unified common dtype discovery #13947

Closed
2 tasks
sstanovnik opened this issue Aug 9, 2016 · 2 comments
Labels
Compat (pandas objects compatibility with NumPy or Python functions), Dtype Conversions (unexpected or buggy dtype conversions)
@sstanovnik
Contributor

Common dtype discovery (like np.find_common_type) should be unified into an internal function, with proper handling of pandas dtypes. As of #13917, there are two such implementations:

  1. pandas/types/common.py:_lcd_dtypes, which works on numpy dtypes only, but differently than np.find_common_type.
  2. pandas/types/cast.py:_find_common_type, which just uses np.find_common_type internally.

Replacing the first with the second (which, judging by the code, is the more correct of the two) breaks some tests.

Proposed tasks:

  • Convert usages of _lcd_dtypes into _find_common_type.
  • Extend _find_common_type to properly evaluate pandas dtypes.
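
To make the proposal a bit more concrete, here is a minimal sketch of what a single common-dtype helper could look like. The name find_common_dtype_sketch, the set of "exclusive" kinds, and the fall-back-to-object rule are all assumptions made for illustration; this is not the pandas implementation. It uses np.result_type rather than np.find_common_type, since the latter has been deprecated in newer NumPy releases.

```python
import numpy as np

# Kinds that this sketch refuses to mix with any other kind:
# m/M = timedelta/datetime, b = bool, O = object, S/U = bytes/str.
# (Assumption for illustration, not pandas' actual rule set.)
_EXCLUSIVE_KINDS = set("mMbOSU")


def find_common_dtype_sketch(dtypes):
    """Hypothetical helper: one dtype able to hold values of all `dtypes`."""
    dtypes = list(dtypes)
    if not dtypes:
        raise ValueError("need at least one dtype")

    first = dtypes[0]
    # Anything that is not a plain numpy dtype (for example a pandas
    # extension dtype) is treated as compatible only with itself.
    if any(not isinstance(dt, np.dtype) for dt in dtypes):
        same = all(type(dt) is type(first) and dt == first for dt in dtypes)
        return first if same else np.dtype(object)

    kinds = {dt.kind for dt in dtypes}
    # Mixing e.g. datetime64 with int64 has no lossless common dtype,
    # so fall back to object instead of reinterpreting values.
    if len(kinds) > 1 and kinds & _EXCLUSIVE_KINDS:
        return np.dtype(object)

    # Purely numeric (or single-kind) input: defer to numpy promotion.
    return np.result_type(*dtypes)
```

With this sketch, int32 and int64 promote to int64, while datetime64[ns] combined with int64 yields object rather than a datetime dtype. Whether bool should be kept apart from integers is a design choice that the real helper would need to settle.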

Occurrences of _lcd_dtypes (the one in internals.py is removed by #13917):

[jreback-~/pandas] grin _lcd_dtype pandas
pandas/core/frame.py:
   47 :                                  _lcd_dtypes,
 3705 :                 new_dtype = _lcd_dtypes(this_dtype, other_dtype)
pandas/core/internals.py:
 4438 :     def _lcd_dtype(l):
 4473 :         lcd = _lcd_dtype(counts[IntBlock])
 4489 :         return _lcd_dtype(counts[FloatBlock] + counts[SparseBlock])
pandas/types/common.py:
  389 : def _lcd_dtypes(a_dtype, b_dtype):
@jreback added the Dtype Conversions, Compat, and Difficulty Intermediate labels on Aug 9, 2016
@jreback added this to the Next Major Release milestone on Aug 9, 2016
@sinhrks
Member

sinhrks commented Aug 10, 2016

_lcd_dtypes is used only in combine and looks buggy (it's called from combine_first):

import pandas as pd

df1 = pd.DataFrame({'a': [pd.Timestamp('2011-01-01'), pd.NaT]})
df2 = pd.DataFrame({'a': [1, 2]})
df1.combine_first(df2)
#                               a
# 0 2011-01-01 00:00:00.000000000
# 1 1970-01-01 00:00:00.000000002
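
The second row of that output is consistent with the integer 2 being reinterpreted as nanoseconds since the Unix epoch. The NumPy-only sketch below reproduces that reinterpretation and contrasts it with an object fallback; it is an illustration of the symptom under that assumption, not a trace of what combine_first does internally.

```python
import numpy as np

ints = np.array([1, 2], dtype="int64")

# Casting int64 to datetime64[ns] treats each integer as a count of
# nanoseconds since 1970-01-01, which matches the odd value above.
as_datetime = ints.astype("datetime64[ns]")
print(as_datetime[1])  # 1970-01-01T00:00:00.000000002

# Falling back to object keeps both kinds of value intact.
mixed = np.empty(2, dtype=object)
mixed[0] = np.datetime64("2011-01-01", "ns")
mixed[1] = ints[1]
print(mixed)
```

A common-dtype helper that returns object for a datetime/integer mix would lead to the second behaviour rather than the first.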

@jreback
Contributor

jreback commented Aug 10, 2016

that might be related to #10567

@jreback modified the milestones: 0.19.0, Next Major Release on Aug 11, 2016