Reading and Writing to Excel Spreadsheets in Python

Date Published: 28/05/2009 13:13 Python and Excel Logos

One of my primary uses of Python in the business environment is to produce reports from various data sources. The humble spreadsheet has become one of the most widely used file formats within business for its flexibility and scalability. Having the ability to produce spreadsheets using an automated system can be useful, as they can be easily integrated into existing business systems. This is possible to achieve using .NET technologies of course and when working on a Windows platform I would advise using the Jet engine. However, I primarily work in a Linux environment where the .NET framework is not available. Here is an introduction to the simple but useful, xlwt (excel write) and xlrd (excel read) modules of Python for writing to, and reading from spreadsheets. This is compatible with Python versions 2.1 to 2.6, is multi-platform and does not require an installation of excel.

Acquiring and Installing the xlwt and xlrd Modules

The xlwt and xlrd modules are not included as part of the standard Python install, therefore you will have to download them and install them yourself. The modules can be found at the following locations.

If you are using Windows the installation is as simple as downloading the executable and going through the installer. For linux users just download either the .tar.gz or .zip file, extract the contents and run the setup.py file and the setup script will do the rest.

python setup.py install

Creating Excel Spreadsheets with xlwt

Now we have the modules installed we can make a start by creating an excel spreadsheet and putting some data in it. First we must import the xlwt module we have just installed.

import xlwt

Now we can create a new workbook object which we will populate with information. In this example I have set the encoding parameter to utf-8 but this is optional.

book = xlwt.Workbook(encoding="utf-8")

With a workbook object made we can now add some sheets.

sheet1 = book.add_sheet("Python Sheet 1")
sheet2 = book.add_sheet("Python Sheet 2")
sheet3 = book.add_sheet("Python Sheet 3")

We have now formed a workbook object which we can add data to before outputting it to the file system. Adding information to a spreadsheet is simply a case of using the write() function of the sheet objects we created before.

sheet1.write(0, 0, "This is the First Cell of the First Sheet")
sheet2.write(0, 0, "This is the First Cell of the Second Sheet")
sheet3.write(0, 0, "This is the First Cell of the Third Sheet")
sheet2.write(1, 10, "This is written to the Second Sheet")
sheet3.write(0, 2, "This is part of a list of information in the Third Sheet")
sheet3.write(1, 2, "This is part of a list of information in the Third Sheet")
sheet3.write(2, 2, "This is part of a list of information in the Third Sheet")
sheet3.write(3, 2, "This is part of a list of information in the Third Sheet")

After the spreadsheet is formed, the sheets are added and the data is written it is time to commit our new spreadsheet to the file system. To do this we use the book's save() method, passing the path to the file as a parameter. If a file already exists where you intend to save your spreadsheet it will be overwritten.

book.save("python_spreadsheet.xls")

There we have a simple introduction to using Python to create excel spreadsheets. The library lacks support for many standard excel features such as formulas and macros but is very effective for simple tasks. XLWT currently supports up to Excel 2003 with support for Excel 2007 on the way, along with a wave of new functionality. Here is the script above in its entirety to make it easier for you to copy.

#!/usr/bin/python
# -*- coding: utf-8 -*-

import xlwt

book = xlwt.Workbook(encoding="utf-8")

sheet1 = book.add_sheet("Python Sheet 1")
sheet2 = book.add_sheet("Python Sheet 2")
sheet3 = book.add_sheet("Python Sheet 3")

sheet1.write(0, 0, "This is the First Cell of the First Sheet")
sheet2.write(0, 0, "This is the First Cell of the Second Sheet")
sheet3.write(0, 0, "This is the First Cell of the Third Sheet")
sheet2.write(1, 10, "This is written to the Second Sheet")
sheet3.write(0, 2, "This is part of a list of information in the Third Sheet")
sheet3.write(1, 2, "This is part of a list of information in the Third Sheet")
sheet3.write(2, 2, "This is part of a list of information in the Third Sheet")
sheet3.write(3, 2, "This is part of a list of information in the Third Sheet")

book.save("python_spreadsheet.xls")

Reading Excel Spreadsheets with xlrd

Reading spreadsheets is just as easy as writing them but first we must first import the xlrd module we installed earlier.

import xlrd

We must now open a workbook using the open_workbook() method passing the file path as a parameter.

book = xlrd.open_workbook("python_spreadsheet.xls")

In order to access a sheet within the workbook object we have just created, we can use the books sheet_by_name() method passing the name of the sheet as a parameter. If we do not know the names of the sheets in the book we can access the workbook objects sheet_names() method which will return the names of the sheets inside the book as a list. Here is an example which iterates through the sheets of the book object printing out their names to the terminal.

for sheet_name in book.sheet_names():
   print sheet_name

Using the sheet_by_name() method will return a sheet object which we can use to browse the data within that sheet. To access a sheet's contents we can use the sheet object's row_values() method which when passed an index will return a list of the cell values in that row. This list will contain data up until the final cell and any empty cells in between will be empty elements. The below example iterates through each sheet in the workbook and prints out the value of the first cell on each sheet to the terminal (this will error if that cell has no value).

for sheet_name in book.sheet_names():
   sheet = book.sheet_by_name(sheet_name)
   print sheet.row_values(0)[0]

This is just a simple introduction to accessing spreadsheets from Python which covers the basics needed to make a start. Here is the full reading script.

#!/usr/bin/python
# -*- coding: utf-8 -*-

import xlrd

book = xlrd.open_workbook("python_spreadsheet.xls")

for sheet_name in book.sheet_names():
   sheet = book.sheet_by_name(sheet_name)
   print sheet.row_values(0)[0]

Conclusion

Not only is this utility useful for regular tasks such as database reporting but can also be very useful for reducing time on otherwise tedious jobs. I regularly use these modules to read information from spreadsheets into XML files which would otherwise take hours of copying and pasting. I hope this has been useful to someone as there is very little support for this functionality out there on the web, and it can be quite difficult to find a quick simple tutorial to get you going. If you have any problems feel free to comment below and I'll reply as soon as I can.

8 Comments

If you have any suggestions about how this article could be improved or if you want any further information feel free to comment below. No HTML please, your email address will not be displayed and the human check (used to stop spam bots) is not case sensitive.

slide

28/05/2009 16:50

.NET is indeed available on Linux, check out http://www.go-mono.com. You could use something like the NPOI library (http://www.codeplex.com/npoi) to generate the spreadsheets. The library is open source and very nice. In fact, I use it on Windows instead of the Jet engine or the COM interop because it is so much faster.

Jeremy Boyd

28/05/2009 18:37

I've only messed around with xlwt a bit, but I think you're mistaken about the formulas. If PYDIR is your python directory, check out PYDIR/Lib/site-packages/xlwt/examples/formulas.py. The spreadsheet it produces has a number of formulas, including addition, SUM(), booleans, and even IF(). Your article gave me the impression this library was far more incomplete than it actually is.

Jeremy Boyd

28/05/2009 18:44

Sorry, I should thank you for pointing out this lib. My shop's been looking for something like this for a while, so this is a nice find. I'm actually pleasantly surprised to find that it's very capable of things like styling, formulas, etc. Thanks again. :)

mike bayer

29/05/2009 15:38

any plans to support .xlsx ? most spreadsheets are now in this format as most of the world is on MS Office 07/08

Warven Barker

30/05/2009 04:33

Thanks so much for this. The time this will save , and the automation we can perform now because of you, is pretty darn amazing. WOW! Thank-you, thank-you, thank-you. You ROCK dude!

Gabrielle Leviazvani

03/06/2009 20:52

On Windows, I use Bytescout Spreadsheet SDK (for .net). Just perfect to do anything I want with spreadsheets.

John Machin

19/07/2009 15:07

Hi, I'm the xlrd author and xlwt maintainer. A few inaccuracies; corrections: xlwt provides a big chunk of the formula functionality: excludes array formulas, named ranges, external references; includes Analysis ToolPak functions. xlwt writes 97-to-2003-compatible XLS files. xlrd reads 2.0-to-2003-compatible XLS files; data-only (no formatting) support for 2007 XLS[XM] files is scheduled for the next release.

Laurel E.

30/11/2009 17:14

John, First of all, a huge thank-you for this library. I had been doing a lot of pickling and unpickling to get my data in order, and this is SO much better. One request for future versions: Append. This would be incredibly helpful to me, as I am a researcher who invites participants to do tasks, and their data gets saved as they finish. I want to have the program open a single (consolidated) data file, and append a new participant's data into a new row. Anyway, thanks again for the great work.

5 Most Recent Articles

Manually Triggering Events in ASP.NET from JavaScript

A quick guide for ASP.NET developers on how to manually trigger ASP.NET events from JavaScript.

Advanced Use of MySQL Stored Procedures

An article for users of MySQL databases describing how they can use advanced stored procedures to improve efficiently in their applications.

Using MySQL Stored Procedures and Extending MySQLi in PHP

A guide for LAMP developers to using stored procedures in MySQL and extending the MySQLi class.

Reading and Writing to Excel Spreadsheets in Python

An introduction to using the xlwt and xlrd modules for python to interact with Microsoft Excel spreadsheets.

Interact with the Web Using Python and the HTTP Library

This is an introduction to making HTTP requests from a python script/application using httplib.

Sponsors