1 Introduction
**************

Reading data into a statistical system for analysis and exporting the
results to some other system for report writing can be frustrating tasks
that can take far more time than the statistical analysis itself, even
though most readers will find the latter far more appealing.

   This manual describes the import and export facilities available
either in R itself or via packages which are available from CRAN or
elsewhere.

   Unless otherwise stated, everything described in this manual is (at
least in principle) available on all platforms running R.

   In general, statistical systems like R are not particularly well
suited to manipulations of large-scale data.  Some other systems are
better than R at this, and part of the thrust of this manual is to
suggest that rather than duplicating functionality in R we can make
another system do the work!  (For example, Therneau & Grambsch (2000)
commented that they preferred to do data manipulation in SAS and then
use package *survival* (https://CRAN.R-project.org/package=survival) in
S for the analysis.)  Database manipulation systems are often very
suitable for manipulating and extracting data: several packages to
interact with DBMSs are discussed here.

   There are packages to allow functionality developed in languages such
as 'Java', 'perl' and 'python' to be directly integrated with R code,
making the use of facilities in these languages even more appropriate.
(See the *rJava* (https://CRAN.R-project.org/package=rJava) package from
CRAN and the *SJava*, *RSPerl* and *RSPython* packages from the Omegahat
project, <http://www.omegahat.net>.)

   It is also worth remembering that R, like S, comes from the Unix
tradition of small re-usable tools, and it can be rewarding to use tools
such as 'awk' and 'perl' to manipulate data before import or after
export.  The case study in Becker, Chambers & Wilks (1988, Chapter 9) is
an example of this, where Unix tools were used to check and manipulate
the data before input to S. The traditional Unix tools are now much more
widely available, including for Windows.
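
   As a minimal sketch of this approach, the following filters a file
with 'awk' before R ever sees it, reading the result through a pipe
connection.  The file contents and column names are invented for
illustration, and the example assumes that 'awk' is on the search path
and that commands are run by a Unix-like shell.

```r
# Hypothetical input file, created here so the sketch is self-contained.
infile <- tempfile(fileext = ".csv")
writeLines(c("id,group,value",
             "1,a,10.5",
             "2,b,-1",
             "3,a,7.25"), infile)

# Let awk keep the header line (NR == 1) and any row whose third
# comma-separated field is non-negative; rows failing the check are
# dropped before import.
cmd <- sprintf("awk -F, 'NR == 1 || $3 >= 0' %s", shQuote(infile))

# read.csv() opens the pipe connection, reads the filtered stream,
# and closes the connection when it is done.
dat <- read.csv(pipe(cmd))
print(dat)
```

   Only the rows that pass the 'awk' test reach R, so a check of this
kind can be applied to files too large to load and clean in memory.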

   This manual was first written in 2000, and the number and scope of R
packages have increased a hundredfold since.  For specialist data formats
it is worth searching to see if a suitable package already exists.

Imports
Export to text files
XML