I would like to do statistical analysis on an Excel file. Rather than learn a new programming language VBA (Visual Basic for Applications), I would rather work with Excel files in Python. This post shows how to read and write Excel files with Python.
目录
1. OpenPyXL
The Python library openpyxl
is designed for reading and writing Excel xlsx/xlsm/xltx/xltm
files. Refer to openpyxl documentation for its usage.
Use the following command to install openpyxl
:
$ sudo pip install openpyxl
BTW, xlrd
and xlwt
are for reading and writing spreadsheet files compatible with older Microsoft Excel files (i.e., .xls
) respectively.
2. Read an Excel file as lists
2.1 Get a worksheet
load_workbook(...)
is used to open an Excel file and return a workbook.
from openpyxl import load_workbook
load_workbook(filename, read_only=False, keep_vba=False, data_only=False, guess_types=False)
# for instance
wb = load_workbook(file_workbook, read_only=True)
In gerneral, a workbook contains several worksheet. wb.get_sheet_names()
returns a list of sheet names. With a given name, wb.get_sheet_by_name(sheet_name)
returns a worksheet. For instance,
ws = wb.get_sheet_by_name(sheet_name)
2.2 Read to a list of lists
ws.iter_rows(...)
is used to iterate over cells in the worksheet ws
. For instance,
iter_rows(range_string=None, row_offset=0, column_offset=0)
# read the worksheet to a list of lists
lists = list()
for row in ws.iter_rows(row_offset=1): # row_offset=1, skip the header
# select fields
selected_fields = [
row[IDX_CARD_ID],
row[IDX_NAME],
row[IDX_TIME],
row[IDX_IN_OUT]
]
# deal with blank fields; some fields might contain Chinese characters
lists.append([item.value.encode('utf8') if item.value else None for item in selected_fields])
3. Write to an Excel file
赞赏微信赞赏
支付宝赞赏