ENH: Allow export of mixed columns to Stata strl
Enable export of large columns to Stata strls when the column
contains None as a null value

closes pandas-dev#23633
bashtage committed Nov 14, 2018
1 parent a197837 commit 75f9d80
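
For orientation, the behaviour described in the commit message can be exercised from user code roughly as follows. This is a minimal sketch modelled on the test added in this commit, not part of the change itself: the temporary file name `mixed.dta` is hypothetical, and the column is built with an explicit `object` dtype so that the `None` value is kept as-is (an assumption made for the example).

```python
import os
import tempfile

import pandas as pd

# A string column long enough to need Stata's strl format, with None as the null value.
df = pd.DataFrame({
    "mixed": pd.Series(["string" * 500, None], dtype=object),
    "number": [0, 1],
})

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "mixed.dta")  # hypothetical file name
    # version=117 selects the dta format that supports strl columns.
    df.to_stata(path, write_index=False, version=117)
    reread = pd.read_stata(path)

# The None entry is written as an empty string, so it comes back as ''.
print(repr(reread.loc[1, "mixed"]))
```

As in the test, the round trip is lossy for the null value: `None` goes in and an empty string comes out.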
Showing 3 changed files with 20 additions and 0 deletions.
1 change: 1 addition & 0 deletions doc/source/whatsnew/v0.24.0.txt
@@ -239,6 +239,7 @@ Other Enhancements
- :meth:`Timestamp.tz_localize`, :meth:`DatetimeIndex.tz_localize`, and :meth:`Series.tz_localize` have gained the ``nonexistent`` argument for alternative handling of nonexistent times. See :ref:`timeseries.timezone_nonexistent` (:issue:`8917`)
- :meth:`read_excel()` now accepts ``usecols`` as a list of column names or callable (:issue:`18273`)
- :meth:`MultiIndex.to_flat_index` has been added to flatten multiple levels into a single-level :class:`Index` object.
- :meth:`DataFrame.to_stata` and :class:`pandas.io.stata.StataWriter117` can write mixed string columns to Stata strl format (:issue:`23633`)

.. _whatsnew_0240.api_breaking:

2 changes: 2 additions & 0 deletions pandas/io/stata.py
@@ -2558,6 +2558,8 @@ def generate_table(self):
        for o, (idx, row) in enumerate(selected.iterrows()):
            for j, (col, v) in enumerate(col_index):
                val = row[col]
                # Allow columns with mixed str and None (GH 23633)
                val = '' if val is None else val
                key = gso_table.get(val, None)
                if key is None:
                    # Stata prefers human numbers
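
The effect of the two added lines can be seen in isolation with a simplified, standalone sketch of the lookup loop. This is not the pandas implementation: `build_gso_table` and its arguments are hypothetical names, and the pre-registered `''` entry at `(0, 0)` and the 1-based keys are assumptions made for the illustration.

```python
def build_gso_table(rows, strl_cols):
    """Map each distinct string to the (column, row) key of its first occurrence.

    rows      -- list of dicts, one per observation (hypothetical input shape)
    strl_cols -- list of (column_position, column_name) pairs
    """
    gso_table = {"": (0, 0)}  # assumption: the empty string is registered up front
    keys = []
    for o, row in enumerate(rows):
        row_keys = []
        for v, col in strl_cols:
            val = row[col]
            # Normalise None to '' so it shares the empty-string entry
            # instead of being stored as a non-string key.
            val = "" if val is None else val
            key = gso_table.get(val)
            if key is None:
                key = (v + 1, o + 1)  # 1-based "human" numbering
                gso_table[val] = key
            row_keys.append(key)
        keys.append(row_keys)
    return gso_table, keys


rows = [{"mixed": "string" * 500}, {"mixed": None}]
table, keys = build_gso_table(rows, [(0, "mixed")])
print(keys)
```

Because `None` is mapped to `''` before the lookup, the table only ever holds string keys, and a null row reuses the existing empty-string entry rather than creating a new one.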
17 changes: 17 additions & 0 deletions pandas/tests/io/test_stata.py
@@ -1505,3 +1505,20 @@ def test_unicode_dta_118(self):
        expected = pd.DataFrame(values, columns=columns)

        tm.assert_frame_equal(unicode_df, expected)

    def test_mixed_string_strl(self):
        # GH 23633
        output = [
            {'mixed': 'string' * 500,
             'number': 0},
            {'mixed': None,
             'number': 1}
        ]

        output = pd.DataFrame(output)
        with tm.ensure_clean() as path:
            output.to_stata(path, write_index=False, version=117)
            reread = read_stata(path)
            expected = output.fillna('')
            expected.number = expected.number.astype('int32')
            tm.assert_frame_equal(reread, expected)
