Read and write Excel files with Python

I would like to do statistical analysis on an Excel file. Rather than learn a new programming language VBA (Visual Basic for Applications), I would rather work with Excel files in Python. This post shows how to read and write Excel files with Python.

1. OpenPyXL

The Python library openpyxl is designed for reading and writing Excel xlsx/xlsm/xltx/xltm files. Refer to openpyxl documentation for its usage.

Use the following command to install openpyxl:

$ sudo pip install openpyxl

BTW, xlrd and xlwt are for reading and writing spreadsheet files compatible with older Microsoft Excel files (i.e., .xls) respectively.

2. Read an Excel file as lists

2.1 Get a worksheet

load_workbook(...) is used to open an Excel file and return a workbook.

from openpyxl import load_workbook

load_workbook(filename, read_only=False, keep_vba=False, data_only=False, guess_types=False)

# for instance
wb = load_workbook(file_workbook, read_only=True)

In gerneral, a workbook contains several worksheet. wb.get_sheet_names() returns a list of sheet names. With a given name, wb.get_sheet_by_name(sheet_name) returns a worksheet. For instance,

ws = wb.get_sheet_by_name(sheet_name)

2.2 Read to a list of lists

ws.iter_rows(...) is used to iterate over cells in the worksheet ws. For instance,

iter_rows(range_string=None, row_offset=0, column_offset=0)

# read the worksheet to a list of lists
lists = list()

for row in ws.iter_rows(row_offset=1): # row_offset=1, skip the header
    # select fields
    selected_fields = [
                    row[IDX_CARD_ID],
                    row[IDX_NAME],
                    row[IDX_TIME],
                    row[IDX_IN_OUT]
                    ]

    # deal with blank fields; some fields might contain Chinese characters
    lists.append([item.value.encode('utf8') if item.value else None for item in selected_fields])

3. Write to an Excel file

发表评论

电子邮件地址不会被公开。 必填项已用*标注