DataFrame.update crashes with overwrite=False when NaT present #16713
Comments
I've just tested some more, and it seems that the error occurs whenever there is a null object in a column containing datetimes. So replacing NaT with NaN still produces the same error.
So when we reindex
In [22]: df2.reindex_like(df1).dtypes
Out[22]:
A      int64
B    float64
dtype: object
I wonder if we could add a parameter to
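The dtype mismatch described in this comment can be seen in a small sketch. The frames below are illustrative assumptions, not the ones used in the thread: `df1` has a datetime column `B` that `df2` lacks, so `reindex_like` has to invent `B` for `df2` and fills it with missing values that no longer carry a datetime dtype.

```python
import pandas as pd

# Hypothetical frames: df1 has a datetime column B (with a NaT), df2 has no B.
df1 = pd.DataFrame({
    "A": [1, 2],
    "B": pd.to_datetime(["2016-01-01", None]),
})
df2 = pd.DataFrame({"A": [3, 4]})

# reindex_like adds the missing column B to df2, filled entirely with NA.
aligned = df2.reindex_like(df1)
print(aligned.dtypes)

# The invented column holds only missing values and does not share
# the datetime dtype of df1["B"].
print(aligned["B"].isna().all())
print(aligned["B"].dtype == df1["B"].dtype)
```

Combining the datetime column of `df1` with this all-NA column is where the two dtypes have to be reconciled.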
I just encountered this issue and was wondering if there were any updates or workarounds for it? Thanks.
I'm also having this issue in sklearn.preprocessing with StandardScaler(). It definitely seems to be a datetime issue, so I've just dropped that column for the time being, but eventually I'll need it back, so fingers crossed.
take
Hi, new contributor here, so please correct me if I'm wrong! This seems to be caused by situations where the DataFrame to be updated has a datetime column with NaT values and the input DataFrame has either
Since in the second case the created column contains only NA values, would it be reasonable to solve this by adding a check to the function that skips updating a column if it contains only NA values? I created a PR with an implementation of this, as well as a couple of new test cases, including the one introduced above.
How would this work? Would the dtype be taken from the
Alternatively, there could be an option to exclude null columns from the result of (Lines 8196 to 8198 in 2d126dd)
to skip over columns which aren't in both
At the moment, I'm struggling to see a simpler solution than that proposed in #49395
Maybe generally a full
I pushed a new commit to my PR that only reindexes rows and then skips non-matching columns. Does that seem right for what you were saying?
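The idea discussed in the last few comments (align rows only, skip columns that are not in both frames, and skip columns that would contribute only NA values) can be sketched outside of pandas internals. This is a rough illustration, not the code from the PR: `fill_missing_from` is a hypothetical helper name, and it mimics the `overwrite=False` behaviour by filling only the positions that are missing in the target.

```python
import pandas as pd


def fill_missing_from(target: pd.DataFrame, other: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: fill NA cells of `target` from `other`.

    Only rows are aligned; columns missing from `other` are never created,
    so an all-NA placeholder column is never combined with a datetime column.
    """
    result = target.copy()
    shared = [col for col in result.columns if col in other.columns]
    for col in shared:
        # Align on the row index only.
        candidate = other[col].reindex(result.index)
        # Nothing to contribute: skip the column entirely.
        if candidate.isna().all():
            continue
        # Keep existing values, fill only the gaps (overwrite=False semantics).
        result[col] = result[col].fillna(candidate)
    return result


target = pd.DataFrame({
    "A": [1.0, None],
    "B": pd.to_datetime(["2016-01-01", None]),
})
other = pd.DataFrame({"A": [10.0, 20.0]})

filled = fill_missing_from(target, other)
print(filled)
```

Here column `B` is left untouched because `other` has no `B`, so its NaT and its datetime dtype are preserved, while the gap in `A` is filled from `other`.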
Code Sample
Problem description
This is similar to the problem in issue #15593, which was fixed in pandas version 0.20.2. However, NaT values anywhere in the DataFrame still raise the following exception:
TypeError: invalid type promotion
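A minimal sketch of the kind of call described in the title, under the assumption that the frame being updated has a datetime column containing NaT and the other frame lacks that column. The frames are illustrative and are not the reporter's sample. On affected versions the call is expected to raise the `TypeError` above; on versions where the bug is fixed it should complete, so the sketch records either outcome instead of assuming one.

```python
import pandas as pd

# Illustrative frames (assumed, not taken from the report).
df1 = pd.DataFrame({
    "A": [1.0, None],
    "B": [pd.NaT, pd.Timestamp("2016-01-01")],
})
df2 = pd.DataFrame({"A": [2.0, 3.0]})

try:
    # overwrite=False: only fill values that are missing in df1.
    df1.update(df2, overwrite=False)
    outcome = "updated"
except TypeError as exc:
    # Affected versions fail while combining the datetime column
    # with the all-NA column created during alignment.
    outcome = f"TypeError: {exc}"

print(outcome)
```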
Output of pd.show_versions()
pandas: 0.20.2
pytest: 2.9.2
pip: 9.0.1
setuptools: 36.0.1
Cython: 0.24
numpy: 1.13.0
scipy: 0.17.1
xarray: None
IPython: 6.1.0
sphinx: 1.4.1
patsy: 0.4.1
dateutil: 2.6.0
pytz: 2017.2
blosc: None
bottleneck: 1.1.0
tables: 3.4.2
numexpr: 2.6.2
feather: 0.3.1
matplotlib: 1.5.1
openpyxl: 2.4.0
xlrd: 1.0.0
xlwt: 1.1.2
xlsxwriter: 0.9.2
lxml: 3.6.0
bs4: 4.5.1
html5lib: 0.999999999
sqlalchemy: 1.0.13
pymysql: None
psycopg2: None
jinja2: 2.9.6
s3fs: None
pandas_gbq: None
pandas_datareader: None