A module for working with XLS files from Python
Extract data from Excel spreadsheets (.xls and .xlsx, versions 2.0 onwards) on any platform.
Pure Python (2.6, 2.7, 3.2+). Strong support for Excel dates. Unicode-aware.
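To make the "Excel dates" point concrete: Excel stores a date as a serial day count, and a date cell read through xlrd comes back as that raw float. The sketch below is not xlrd's implementation. It assumes the default 1900 date system (datemode 0 in xlrd's terms), Python 3, and a made-up function name. xlrd itself provides xlrd.xldate_as_tuple(value, book.datemode) for the same conversion.

```python
from datetime import datetime, timedelta

# Day zero that matches Excel's 1900 date system for serials >= 61.
# Excel counts a nonexistent 1900-02-29, so smaller serials would be off by one.
EXCEL_1900_EPOCH = datetime(1899, 12, 30)


def excel_serial_to_datetime(serial):
    """Convert an Excel serial date (1900 date system) to a datetime."""
    if serial < 61:
        raise ValueError("serials below 61 are ambiguous in the 1900 date system")
    return EXCEL_1900_EPOCH + timedelta(days=serial)


print(excel_serial_to_datetime(41856))    # a whole-day serial
print(excel_serial_to_datetime(41856.5))  # the fraction is the time of day
```

The integer part of the serial is the day and the fractional part is the time of day, which is why a plain timedelta is enough here.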
$ cd 'C:\Python27\xls'
$ pwd
/c/Python27/xls
$ ls -l
total 175
-rw-r--r-- 1 python_user Administrators 178490 Aug 5 11:10 xlrd-0.9.3.tar.gz
$ tar zxvf xlrd-0.9.3.tar.gz
xlrd-0.9.3/
xlrd-0.9.3/PKG-INFO
xlrd-0.9.3/README.html
xlrd-0.9.3/scripts/
xlrd-0.9.3/setup.py
.. (omitted)
xlrd-0.9.3/tests/test_xldate.py
xlrd-0.9.3/tests/test_xldate_to_datetime.py
xlrd-0.9.3/tests/test_xlsx_comments.py
xlrd-0.9.3/tests/text_bar.xlsx
xlrd-0.9.3/tests/xf_class.xls
xlrd-0.9.3/scripts/runxlrd.py
$ ls -l
total 179
drwxr-xr-x 5 python_user Administrators 4096 Apr 9 16:24 xlrd-0.9.3
-rw-r--r-- 1 python_user Administrators 178490 Aug 5 11:10 xlrd-0.9.3.tar.gz
$ cd xlrd-0.9.3
$ ls -l
total 20
-rw-r--r-- 1 python_user Administrators 994 Apr 9 16:24 PKG-INFO
-rw-r--r-- 1 python_user Administrators 4672 Jun 11 2013 README.html
drwxr-xr-x 2 python_user Administrators 0 Aug 5 11:13 scripts
-rwxr-xr-x 1 python_user Administrators 1887 Jun 11 2013 setup.py
drwxr-xr-x 2 python_user Administrators 8192 Aug 5 11:13 tests
drwxr-xr-x 2 python_user Administrators 4096 Apr 9 16:24 xlrd
$ python setup.py install
running install
running build
running build_py
creating build
creating build\lib
creating build\lib\xlrd
copying xlrd\biffh.py -> build\lib\xlrd
copying xlrd\book.py -> build\lib\xlrd
copying xlrd\compdoc.py -> build\lib\xlrd
.. (omitted)
byte-compiling c:\Python27\Lib\site-packages\xlrd\info.py to info.pyc
byte-compiling c:\Python27\Lib\site-packages\xlrd\licences.py to licences.pyc
byte-compiling c:\Python27\Lib\site-packages\xlrd\sheet.py to sheet.pyc
byte-compiling c:\Python27\Lib\site-packages\xlrd\timemachine.py to timemachine.pyc
byte-compiling c:\Python27\Lib\site-packages\xlrd\xldate.py to xldate.pyc
byte-compiling c:\Python27\Lib\site-packages\xlrd\xlsx.py to xlsx.pyc
byte-compiling c:\Python27\Lib\site-packages\xlrd\__init__.py to __init__.pyc
running install_scripts
creating c:\Python27\Scripts
copying build\scripts-2.7\runxlrd.py -> c:\Python27\Scripts
running install_egg_info
Writing c:\Python27\Lib\site-packages\xlrd-0.9.3-py2.7.egg-info
$
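After an install like this, it can be handy to confirm that the interpreter can actually find the package. The following is a small sketch, assuming Python 3.4 or later (the transcript above uses Python 2.7, where simply running `import xlrd` in the interpreter serves the same purpose). The helper name is made up for illustration.

```python
import importlib.util


def is_installed(module_name):
    """Return True if the module can be located, without importing it."""
    return importlib.util.find_spec(module_name) is not None


print("xlrd installed:", is_installed("xlrd"))
```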
$ cat read_xls_file.py
# coding: utf-8
import xlrd
import urllib

def read_xls(url):
    # Download the workbook into memory (Python 2 urllib API).
    webpage = urllib.urlopen(url)
    webdata = webpage.read()
    webpage.close()
    # Open the workbook from the downloaded bytes instead of a file path.
    book = xlrd.open_workbook(file_contents=webdata)
    sheet1 = book.sheet_by_index(0)
    # Walk the first sheet column by column and print every cell.
    for col in range(sheet1.ncols):
        print "----------------------------"
        for row in range(sheet1.nrows):
            cell = sheet1.cell(row, col)
            if cell.ctype == xlrd.XL_CELL_TEXT:
                # Text cells are unicode; encode them before printing.
                print 'col=', col, 'row=', row, cell.value.encode('UTF-8')
            else:
                print 'col=', col, 'row=', row, cell.value

if __name__ == '__main__':
    import sys
    if len(sys.argv) > 1:
        url = sys.argv[1]
        read_xls(url)
$
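The script above targets Python 2 (print statements, urllib.urlopen). For readers on Python 3, here is a rough equivalent. It is a sketch rather than a drop-in replacement: the helper iter_cells is a name introduced here, the separator line between columns is dropped, and it assumes an xlrd version that still accepts the file_contents argument used in the original.

```python
from urllib.request import urlopen


def iter_cells(sheet):
    """Yield (col, row, value) for every cell, column by column."""
    for col in range(sheet.ncols):
        for row in range(sheet.nrows):
            yield col, row, sheet.cell_value(row, col)


def read_xls(url):
    """Download an .xls workbook and print every cell of its first sheet."""
    import xlrd  # imported here so iter_cells can be tried without xlrd installed

    with urlopen(url) as response:
        data = response.read()
    # Open the workbook from the downloaded bytes, as the original script does.
    book = xlrd.open_workbook(file_contents=data)
    for col, row, value in iter_cells(book.sheet_by_index(0)):
        # Python 3 strings are already Unicode, so no explicit encode is needed.
        print("col=", col, "row=", row, value)
```

Calling read_xls(url) with the URL of an .xls file prints lines in the same "col= N row= M value" layout as the Python 2 script, so they can be filtered the same way.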
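Let's fetch electricity data from the web, extract it, and check how prices have moved over time. The commands below print rows 4 to 6 of the first sheet: 電灯 (lighting), 電力 (power), and 電灯・電力計 (lighting and power combined).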
$ python read_xls_file.py 'http://www.enecho.meti.go.jp/about/whitepaper/2013html/data/whitepaper2013_214-1-7.xls' | grep 'row= 4' | awk '{print $5}'
電灯
24.805595108
24.6026352946
24.2067276624
24.4931131033
23.3280284336
23.061707533
23.0761062356
22.7901832667
21.8335163592
21.4982452593
21.2212843648
20.7917694214
20.7261067007
20.7846825484
21.8873581716
20.5422138002
20.3707924016
21.2596934385
$ python read_xls_file.py 'http://www.enecho.meti.go.jp/about/whitepaper/2013html/data/whitepaper2013_214-1-7.xls' | grep 'row= 5' | awk '{print $5}'
電力
17.1488350119
16.9583292586
16.5184136265
16.7650452054
15.8949172913
15.4675782507
15.4433739677
15.4572175847
14.3915064494
14.0749796905
13.7543726135
13.5120790035
13.6176359413
13.6556277198
15.2149111291
13.7677260803
13.6462341174
14.5917638787
$ python read_xls_file.py 'http://www.enecho.meti.go.jp/about/whitepaper/2013html/data/whitepaper2013_214-1-7.xls' | grep 'row= 6' | awk '{print $5}'
電灯・電力計
19.3784749229
19.2269525733
18.7837216319
19.0340122112
18.1364447046
17.7751731119
17.7624823149
17.7215241422
16.7213643101
16.3852613132
16.1059097541
15.8322628265
15.8420723513
15.9017966876
17.3563940883
16.0163599156
15.9032620599
16.8325761746
$
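The grep and awk step can also be done in Python, which avoids one quirk of the pipeline: grep 'row= 4' would match rows 40 to 49 as well on a taller sheet. The sketch below assumes Python 3 and the "col= N row= M value" line layout printed by the script. The function names are introduced here, and the column numbers in the sample are assumed for illustration (only the values come from the output above).

```python
def extract_row(lines, row):
    """Return the fifth field of every line that belongs to the given row."""
    values = []
    for line in lines:
        fields = line.split()
        # Expected layout: col= <col> row= <row> <value>
        if (len(fields) >= 5 and fields[0] == "col="
                and fields[2] == "row=" and fields[3] == str(row)):
            values.append(fields[4])
    return values


def split_label(values):
    """Separate the leading text label from the numeric values."""
    return values[0], [float(v) for v in values[1:]]


sample = [
    "----------------------------",
    "col= 0 row= 4 電灯",
    "col= 0 row= 5 電力",
    "----------------------------",
    "col= 1 row= 4 24.805595108",
    "col= 1 row= 5 17.1488350119",
    "----------------------------",
    "col= 2 row= 4 24.6026352946",
    "col= 2 row= 5 16.9583292586",
]

label, prices = split_label(extract_row(sample, 4))
print(label, prices)
```

Any iterable of lines works as input, so sys.stdin can be passed in place of the sample list when the script's output is piped in.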