handle nan values in DataFrame.update when overwrite=False #15593

Closed
pcluo opened this issue Mar 6, 2017 · 5 comments · Fixed by #16430
Labels: Dtype Conversions (Unexpected or buggy dtype conversions), Reshaping (Concat, Merge/Join, Stack/Unstack, Explode)

@pcluo
Contributor

pcluo commented Mar 6, 2017

Code Sample

from pandas import DataFrame, date_range
df1 = DataFrame({'A': [1,None,3], 'B': date_range('2000', periods=3)})
df2 = DataFrame({'A': [None, 2, 3]})
df1.update(df2, overwrite=False)
df1

Problem description

I got a TypeError: invalid type promotion error when updating a DataFrame that has a datetime column. The second DataFrame doesn't have this column. The full traceback is in the details below.

IMHO, the culprit is in DataFrame.update. The block checking mask.all should be outside the if block so that it applies to the overwrite=False case as well.

                if overwrite:
                    mask = isnull(that)

                    # don't overwrite columns unecessarily
                    if mask.all():
                        continue
                else:
                    mask = notnull(this)
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
 in ()
      1 df1 = DataFrame({'A': [1,None,3], 'B': date_range('2000', periods=3)})
      2 df2 = DataFrame({'A': [None, 2, 3]})
----> 3 df1.update(df2, overwrite=False)
      4 df1
      5

C:\Users\pcluo\Anaconda3\lib\site-packages\pandas\core\frame.py in update(self, other, join, overwrite, filter_func, raise_conflict)
   3845
   3846                 self[col] = expressions.where(mask, this, that,
-> 3847                                               raise_on_error=True)
   3848
   3849     # ----------------------------------------------------------------------

C:\Users\pcluo\Anaconda3\lib\site-packages\pandas\computation\expressions.py in where(cond, a, b, raise_on_error, use_numexpr)
    228
    229     if use_numexpr:
--> 230         return _where(cond, a, b, raise_on_error=raise_on_error)
    231     return _where_standard(cond, a, b, raise_on_error=raise_on_error)
    232

C:\Users\pcluo\Anaconda3\lib\site-packages\pandas\computation\expressions.py in _where_numexpr(cond, a, b, raise_on_error)
    151
    152     if result is None:
--> 153         result = _where_standard(cond, a, b, raise_on_error)
    154
    155     return result

C:\Users\pcluo\Anaconda3\lib\site-packages\pandas\computation\expressions.py in _where_standard(cond, a, b, raise_on_error)
    126 def _where_standard(cond, a, b, raise_on_error=True):
    127     return np.where(_values_from_object(cond), _values_from_object(a),
--> 128                     _values_from_object(b))
    129
    130

TypeError: invalid type promotion
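
The change proposed above can be sketched outside of pandas internals. What follows is a minimal, simplified stand-in for the column loop in DataFrame.update, not the pandas source: update_sketch is a hypothetical name, it uses only public Series methods (notna, isna, where), and it ignores the join, filter_func and raise_conflict options. The point it illustrates is that once the mask.all() check runs for both branches, the datetime column is skipped and never reaches the where call.

```python
import pandas as pd


def update_sketch(df, other, overwrite=True):
    """Hypothetical, simplified version of the update loop (illustration only)."""
    # Align `other` to df; columns missing from `other` become all-NaN.
    other = other.reindex_like(df)
    for col in df.columns:
        this = df[col]
        that = other[col]
        if overwrite:
            # keep the existing value only where the new value is missing
            mask = that.isna()
        else:
            # keep the existing value wherever it is already present
            mask = this.notna()
        # Check hoisted out of the `if overwrite` branch: nothing to change,
        # so the column (and its dtype) is left untouched.
        if mask.all():
            continue
        df[col] = this.where(mask, that)


df1 = pd.DataFrame({'A': [1, None, 3], 'B': pd.date_range('2000', periods=3)})
df2 = pd.DataFrame({'A': [None, 2, 3]})
update_sketch(df1, df2, overwrite=False)
```

With this arrangement only the missing entry in column A is filled from df2, and column B keeps its datetime dtype because it is skipped entirely.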

@jreback
Contributor

jreback commented Mar 6, 2017

yeah this looks like a bug. .update() has not gotten much TLC. in fact this should be completely changed, see #3025 to generally fix this method.

So a short-term fix is ok if you'd want to push that.

@jreback jreback added the Difficulty Intermediate, Dtype Conversions (Unexpected or buggy dtype conversions), and Reshaping (Concat, Merge/Join, Stack/Unstack, Explode) labels Mar 6, 2017
@jreback jreback added this to the Next Major Release milestone Mar 6, 2017
@mayukh18

mayukh18 commented Mar 8, 2017

@cluoren are you doing it? else I can take it up. I have looked into it already.

@pcluo
Contributor Author

pcluo commented Mar 8, 2017

@mayukh18 just created a pull request. thx tho.

@jreback jreback modified the milestones: 0.20.0, Next Major Release Mar 8, 2017
@jreback jreback modified the milestones: 0.20.0, Next Major Release Mar 23, 2017
pcluo added a commit to pcluo/pandas that referenced this issue May 22, 2017
…s-dev#15593

add nan test for DataFrame.update
update whatsnew v0.20.2
pcluo added a commit to pcluo/pandas that referenced this issue May 22, 2017
…as-dev#15593)

BUG: handle nan values in DataFrame.update when overwrite=False (pandas-dev#15593)

add nan test for DataFrame.update

update whatsnew v0.20.2
pcluo added a commit to pcluo/pandas that referenced this issue May 24, 2017
…as-dev#15593)

add nan test for DataFrame.update

update whatsnew v0.20.2
@jreback jreback modified the milestones: 0.20.2, Next Major Release May 24, 2017
TomAugspurger pushed a commit to TomAugspurger/pandas that referenced this issue May 29, 2017
TomAugspurger pushed a commit that referenced this issue May 30, 2017
@olizhu

olizhu commented Jun 16, 2017

The issue with NaN seems to be fixed in v0.20.2, but a similar problem still exists with NaT if it exists anywhere in the dataframe.
For example, this will throw an error:

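from pandas import DataFrame, to_datetime
# to_datetime('abc', errors='coerce') yields NaT, so column B holds one missing value
df1 = DataFrame({'A': [1, None], 'B': [to_datetime('abc', errors='coerce'), to_datetime('2016-01-01')]})
df2 = DataFrame({'A': [2, 3]})
df1.update(df2, overwrite=False)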
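
A likely explanation, offered as an assumption rather than something confirmed in this thread: with a NaT in column B, the "nothing is missing" shortcut no longer applies to that column, so it still reaches the step that combines the datetime values with the all-NaN float column produced by aligning df2. The sketch below only reproduces those two conditions with public pandas and NumPy calls; the variable names are illustrative and it does not call DataFrame.update.

```python
import numpy as np
import pandas as pd

# Column B of df1, and what column B of df2 looks like after alignment
# (df2 has no such column, so it is all NaN with a float dtype).
this = pd.Series([pd.NaT, pd.Timestamp('2016-01-01')])
that = pd.Series([np.nan, np.nan])

# With overwrite=False the mask is "existing value is present".
mask = this.notna()
skipped = bool(mask.all())  # the column is skipped only if nothing is missing

# NumPy has no common dtype for datetime64 and float64.
try:
    np.promote_types(this.dtype, that.dtype)
    promotable = True
except TypeError:
    promotable = False
```

Here skipped is False because of the NaT, and promotable is False, which matches the kind of type promotion failure reported at the top of this issue.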

@TomAugspurger
Contributor

@olizhu could you search to see if we have an issue for that already (I don't recall seeing it before). If not, could you open a new issue for it?
