(R-data.info)Reading Excel spreadsheets


Next: References Prev: Network interfaces Up: Top
Enter node , (file) or (file)node

9 Reading Excel spreadsheets
****************************

The most common R data import/export question seems to be 'how do I read
an Excel spreadsheet'.  This chapter collects together advice and
options given earlier.  Note that most of the advice is for pre-Excel
2007 spreadsheets and not the later '.xlsx' format.

   The first piece of advice is to avoid doing so if possible!  If you
have access to Excel, export the data you want from Excel in
tab-delimited or comma-separated form, and use 'read.delim' or
'read.csv' to import it into R. (You may need to use 'read.delim2' or
'read.csv2' in a locale that uses comma as the decimal point.)
Exporting a DIF file and reading it using 'read.DIF' is another
possibility.

   If you do not have Excel, many other programs are able to read such
spreadsheets and export in a text format on both Windows and Unix, for
example Gnumeric (<http://www.gnome.org/projects/gnumeric/>) and
OpenOffice (<https://www.openoffice.org>).  You can also cut-and-paste
between the display of a spreadsheet in such a program and R:
'read.table' will read from the R console or, under Windows, from the
clipboard (via 'file = "clipboard"' or 'readClipboard').  The 'read.DIF'
function can also read from the clipboard.

   Note that an Excel '.xls' file is not just a spreadsheet: such files
can contain many sheets, and the sheets can contain formulae, macros and
so on.  Not all readers can read other than the first sheet, and may be
confused by other contents of the file.

   Windows users (of 32-bit R) can use 'odbcConnectExcel' in package
*RODBC* (https://CRAN.R-project.org/package=RODBC). This can select rows
and columns from any of the sheets in an Excel spreadsheet file (at
least from Excel 97-2003, depending on your ODBC drivers: by calling
'odbcConnect' directly versions back to Excel 3.0 can be read).  The
version 'odbcConnectExcel2007' will read the Excel 2007 formats as well
as earlier ones (provided the drivers are installed, including with
64-bit Windows R: Note: RODBC).  macOS users can also use *RODBC*
(https://CRAN.R-project.org/package=RODBC) if they have a suitable
driver (e.g. that from Actual Technologies).

   'Perl' users have contributed a module 'OLE::SpreadSheet::ParseExcel'
and a program 'xls2csv.pl' to convert Excel 95-2003 spreadsheets to CSV
files.  Package *gdata* (https://CRAN.R-project.org/package=gdata)
provides a basic wrapper in its 'read.xls' function.  With suitable
'Perl' modules installed this function can also read Excel 2007
spreadsheets.

   Packages *dataframes2xls*
(https://CRAN.R-project.org/package=dataframes2xls) and *WriteXLS*
(https://CRAN.R-project.org/package=WriteXLS) each contain a function to
_write_ one or more data frames to an '.xls' file, using Python and Perl
respectively.

   Packages *xlsx* (https://CRAN.R-project.org/package=xlsx) can read
and and manipulate Excel 2007 and later spreadsheets: it requires Java.

   Package *XLConnect* (https://CRAN.R-project.org/package=XLConnect)
can read, write and manipulate both Excel 97-2003 and Excel 2007/10
spreadsheets, using Java.

   Package *readxl* (https://CRAN.R-project.org/package=readxl) can read
both Excel 97-2003 and Excel 2007/10 spreadsheets, using an included C
library.


automatically generated by info2www version 1.2.2.9