title | date | tags | categories
---|---|---|---
Working with Office Documents in Python | 2018-12-15 04:08:39 -0800 | |
💠 2024-04-23 13:56:29
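Performance optimization for reading large files
- Problem: pandas takes a long time (on the order of minutes) to read Excel files of 200 MB or more. One approach is to convert the Excel file to CSV first and read the CSV instead.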
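The convert-then-read idea can be sketched with LibreOffice's headless converter. This is a minimal sketch, not the author's method: the helper names `build_convert_command` and `convert_to_csv` are hypothetical, and it assumes the `soffice` binary is on `PATH`. LibreOffice's CSV export typically writes only the first sheet, so multi-sheet workbooks would need extra handling.

```python
import subprocess
from pathlib import Path


def build_convert_command(xlsx_path, out_dir, soffice="soffice"):
    """Return the LibreOffice command line that converts a spreadsheet to CSV.

    Hypothetical helper; `soffice` is assumed to be on PATH.
    """
    return [
        soffice, "--headless", "--convert-to", "csv",
        "--outdir", str(out_dir), str(xlsx_path),
    ]


def convert_to_csv(xlsx_path, out_dir, soffice="soffice"):
    """Run the conversion and return the path of the CSV LibreOffice writes."""
    subprocess.run(build_convert_command(xlsx_path, out_dir, soffice), check=True)
    # LibreOffice names the output file after the input file's stem.
    return Path(out_dir) / (Path(xlsx_path).stem + ".csv")
```

The returned path can then be passed to `pandas.read_csv`, which is usually much faster than parsing the original workbook. Note that the CSV round trip discards cell types and formatting.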
Fast excel python
calamine is the fastest option and preserves cell types
polars.read_excel reads an Excel file into a DataFrame and also uses calamine
Openpyxl
DuckDB
LibreOffice
Tablib
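As a rough illustration of the calamine note above, the following sketch reads one worksheet through `polars.read_excel` with the calamine engine. The wrapper name `read_excel_calamine` is hypothetical, and the code assumes that `polars` and its `fastexcel` dependency are installed.

```python
from pathlib import Path


def read_excel_calamine(path, sheet_name=None):
    """Read one worksheet into a polars DataFrame via the calamine engine.

    Hypothetical wrapper; with sheet_name=None polars reads the first sheet.
    """
    path = Path(path)
    # Fail early with a clear error before loading the (heavy) dependency.
    if not path.exists():
        raise FileNotFoundError(path)

    # Imported lazily so this module loads even where polars is absent.
    import polars as pl

    # engine="calamine" needs the `fastexcel` package alongside polars.
    return pl.read_excel(path, sheet_name=sheet_name, engine="calamine")
```

A call such as `read_excel_calamine("monster.xlsx")` returns a DataFrame with column types inferred from the cells. pandas 2.2 and later expose the same reader through `pandas.read_excel(path, engine="calamine")`, provided `python-calamine` is installed.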
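import xlrd

# Note: xlrd 2.0+ reads only legacy .xls files; opening an .xlsx file
# requires xlrd < 2.0 (or another library such as openpyxl).
data = xlrd.open_workbook('monster.xlsx')
table = data.sheets()[0]  # first worksheet
nrows = table.nrows
# Print every row, with cells separated by ' | '
for i in range(nrows):
    for cell in table.row_values(i):
        print(cell, ' | ', end='')
    print()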