BUG: in Python3 MultiIndex.from_tuples cannot take "zipped" tuples #18434
Comments
I'm not sure this is a bug. An iterator is not a list/sequence of tuples.
Pandas is supposed to have consistent performance between python2 and python3. If the code works in python2, then it is a bug.
zip in Python 2 returns a list. Pandas will work with
To put it another way, the "bug" is in zip which changes its behavior. Pandas
Granted, the issue stems from the changed behavior of zip, but should iterators be supported here? Python 3 changed the behavior of a number of functions, so iterators and similar objects are encountered more often than before. Should they be accepted as valid arguments?
It is hard or impossible to allocate contiguous data blocks such as NumPy arrays unless you know the data size (hence the need for len). This happens in both NumPy and Pandas with iterators.
We generally support list-likes / iterables for most operations (for certain types of indexing a tuple is distinct). @Xbar would you do a PR for this?
An easy fix might be to test, at the beginning of "from_tuples", whether the argument is an iterator and, if so, convert it to a list. I can make the fix and test it.
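The fix proposed above could be sketched roughly as follows. This is only an illustration of the idea, not the actual pandas patch: the helper name `materialize_if_iterator` is hypothetical, and it assumes that checking against `collections.abc.Iterator` is an acceptable way to detect one-shot inputs such as the result of zip.

```python
from collections.abc import Iterator


def materialize_if_iterator(tuples):
    """Return a list for a one-shot iterator; pass any other input through."""
    # Hypothetical helper for illustration only, not part of pandas.
    if isinstance(tuples, Iterator):
        # An iterator has no len(), so consume it into a list first.
        return list(tuples)
    return tuples


# In Python 3, zip returns an iterator, so it gets converted to a list.
pairs = materialize_if_iterator(zip([1, 2], ["a", "b"]))
print(len(pairs))  # len() works because pairs is now a list

# A list is already sized and is returned unchanged.
existing = [(1, "a")]
same = materialize_if_iterator(existing)
```

Inputs that already support len() are passed through untouched, so the extra copy is only paid for iterators.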
MultiIndex.from_tuples/from_arrays/from_product accept iterators in python 3. Ensures compatibility between 2 and 3.
Code Sample, a copy-pastable example if possible
Problem description
The code above gives an Exception
TypeError: object of type 'zip' has no len()
This happens because in Python 3, unlike in Python 2, zip does not return a list, so its result does not support len().
In pandas, there are multiple places in MultiIndex and related classes where the code calls len() on its arguments. Those arguments are valid input but no longer support len() in Python 3.
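The underlying behavior can be reproduced without pandas. The snippet below is a minimal sketch using only the standard library; the variable names are illustrative and it assumes a Python 3 interpreter.

```python
zipped = zip([1, 2], ["a", "b"])

try:
    # In Python 3 this raises, because zip returns an iterator, not a list.
    length = len(zipped)
    message = None
except TypeError as exc:
    length = None
    message = str(exc)

print(message)

# Converting to a list first gives an object that supports len().
as_list = list(zipped)
print(len(as_list))
```

The failed len() call does not consume the iterator, so the later conversion to a list still sees every pair.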
Expected Output
Same as in python2
In the case above, should be
Output of
pd.show_versions()
INSTALLED VERSIONS
commit: None
python: 3.6.3.final.0
python-bits: 64
OS: Darwin
OS-release: 16.7.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8
pandas: 0.21.0
pytest: None
pip: 9.0.1
setuptools: 36.5.0
Cython: None
numpy: 1.13.3
scipy: 1.0.0
pyarrow: None
xarray: None
IPython: 6.0.0
sphinx: None
patsy: 0.4.1
dateutil: 2.6.1
pytz: 2017.3
blosc: None
bottleneck: None
tables: 3.4.2
numexpr: 2.6.4
feather: None
matplotlib: 2.0.2
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml: None
bs4: None
html5lib: None
sqlalchemy: None
pymysql: None
psycopg2: None
jinja2: None
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: None