5 Files
*******

R provides many functions to work with files and directories: many of
these have been added relatively recently to facilitate scripting in R
and in particular the replacement of Perl scripts by R scripts in the
management of R itself.

   These functions are implemented by standard C/POSIX library calls,
except on Windows.  That means that filenames must be encoded in the
current locale, as the OS provides no other means to access the file
system.  Increasingly, filenames are stored in UTF-8 and the OS will
translate filenames given in other locales to UTF-8, so using a UTF-8
locale gives transparent access to the whole file system.
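
   Whether a session has this kind of access can be checked from R.  The
sketch below uses 'l10n_info' to test for a UTF-8 locale and, only in
that case, creates and looks up a file with a non-ASCII name in the
session's temporary directory.  The file name is purely illustrative,
and the result outside a UTF-8 locale is deliberately left unknown.

```r
# Is the current locale UTF-8?  l10n_info() returns a list with a
# logical component named "UTF-8".
utf8_locale <- isTRUE(l10n_info()[["UTF-8"]])

# Illustrative file name containing a non-ASCII character (e-acute).
fname <- "caf\u00e9.txt"

if (utf8_locale) {
  path <- file.path(tempdir(), fname)
  file.create(path)
  found <- file.exists(path)   # look the file up by the same name
  unlink(path)
} else {
  found <- NA                  # not attempted outside a UTF-8 locale
}
print(utf8_locale)
print(found)
```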

   Windows is another story.  There the internal view of filenames is in
UTF-16LE (so-called 'Unicode'), and standard C library calls can only
access files whose names can be expressed in the current codepage.  To
circumvent that restriction, there is a parallel set of Windows-specific
calls which take wide-character arguments for filepaths.  Much of the
file-handling in R has been moved over to using these functions, so
filenames can be manipulated in R as UTF-8 encoded character strings,
converted to wide characters (which on Windows are UTF-16LE) and passed
to the OS.  The utilities 'RC_fopen' and 'filenameToWchar' help with
this process.  Currently 'file.copy' to a directory, 'list.files',
'list.dirs' and 'path.expand' work only with filepaths encoded in the
current codepage.
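
   The conversion to wide characters can be illustrated at the R level.
The sketch below is not how 'filenameToWchar' is implemented; it only
uses 'iconv' to show the UTF-16LE bytes that correspond to a UTF-8
encoded name.  The name itself is a made-up example.

```r
# Illustrative name with one non-ASCII character (e-acute, U+00E9).
fname <- enc2utf8("caf\u00e9")

# Re-encode from UTF-8 to UTF-16LE.  With toRaw = TRUE, iconv() returns
# a list of raw vectors, one per input string.
wide <- iconv(fname, from = "UTF-8", to = "UTF-16LE", toRaw = TRUE)[[1]]

# Each of these four characters occupies two bytes, low byte first.
print(wide)
print(length(wide))
```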

   All these functions do tilde expansion, in the same way as
'path.expand', with the deliberate exception of 'Sys.glob'.
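
   As a minimal sketch of tilde expansion, the following compares a path
written with a tilde against its expanded form.  The file name is
hypothetical and need not exist; the expanded value depends on the
user's home directory, so no particular output is assumed.

```r
# A path under the user's home directory, written with a tilde.
p <- "~/notes.txt"            # hypothetical file; it need not exist

# path.expand() replaces the leading tilde with the home directory
# (where R can determine it) and leaves the rest of the path unchanged.
expanded <- path.expand(p)
print(expanded)

# file.exists() expands the tilde itself, so both calls agree.
same <- identical(file.exists(p), file.exists(expanded))
print(same)
```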

   File names may be case sensitive or not: the latter is the norm on
Windows and macOS, the former on other Unix-alikes.  Note that this is a
property of both the OS and the file system: it is often possible to map
names to upper or lower case when mounting the file system.  This can
affect the matching of patterns in 'list.files' and 'Sys.glob'.
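
   One way to avoid depending on the file system's behaviour is to state
the intended case handling explicitly.  The sketch below, using
illustrative file names in a scratch directory, contrasts the default
case-sensitive regular-expression match of 'list.files' with
'ignore.case = TRUE', and then probes whether the underlying file system
treats names case-insensitively.

```r
# Work in a fresh scratch directory; the file names are illustrative.
d <- tempfile("case-demo-")
dir.create(d)
file.create(file.path(d, c("notes.txt", "Report.TXT")))

# The pattern is a regular expression, matched case-sensitively by
# default against the names as stored.
strict <- list.files(d, pattern = "\\.txt$")

# ignore.case = TRUE makes the match independent of case.
loose <- list.files(d, pattern = "\\.txt$", ignore.case = TRUE)

# Probe the file system: does a differently-cased name find the file?
case_insensitive_fs <- file.exists(file.path(d, "NOTES.TXT"))

print(strict)
print(loose)
print(case_insensitive_fs)
unlink(d, recursive = TRUE)
```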

   File names commonly contain spaces on Windows and macOS but not
elsewhere.  As file names are handled as character strings by R, spaces
are not usually a concern unless file names are passed to another
process, e.g. by a 'system' call.
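
   When a file name does have to be handed to a shell, 'shQuote' can be
used to protect the space.  The sketch below uses an illustrative file
name and only builds a command line without running it; the external
command 'wc -l' is an assumption that is meaningful only on Unix-alikes.

```r
# Illustrative file name containing a space.
path <- file.path(tempdir(), "a file.txt")
writeLines("hello", path)

# Within R the name is just a string, so the space needs no special care.
print(readLines(path))

# Before handing the name to a shell, quote it.  type = "sh" is chosen
# here for a reproducible result; the default depends on the platform.
quoted <- shQuote("a file.txt", type = "sh")
print(quoted)

# Build (but do not run) a command line for a hypothetical external
# command.
cmd <- paste("wc -l", shQuote(path, type = "sh"))
unlink(path)
```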

   Windows has another couple of peculiarities.  Whereas a POSIX file
system has a single root directory (and other physical file systems are
mounted onto logical directories under that root), Windows has separate
roots for each physical or logical file system ('volume'), organized
under _drives_ (with file paths starting with a drive specifier such as
'D:', where the letter is ASCII and case-insensitive) and
_network shares_ (with paths like '\\netname\topdir\myfiles\a file').
There is a current drive, and path names without a drive part are
relative to the current drive.  Further,
each drive has a current directory, and relative paths are relative to
that current directory, on a particular drive if one is specified.  So
'D:dir\file' and 'D:' are valid path specifications (the latter being
the current directory on drive 'D:').
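
   The forms of Windows path described above can be told apart by their
leading characters.  The following is a rough sketch only: the helper
'classify_win_path' and its regular expressions are hypothetical, cover
just the forms mentioned here, and are not part of R.  It works purely
on strings, so it can be run on any platform.

```r
# Hypothetical helper: classify a Windows path string by its prefix.
# Both backslash and forward slash are accepted as separators.
classify_win_path <- function(path) {
  sep <- "[\\\\/]"   # regex character class: backslash or slash
  if (grepl(paste0("^", sep, "{2}[^\\\\/]"), path)) {
    "network share"
  } else if (grepl(paste0("^[A-Za-z]:", sep), path)) {
    "absolute on drive"
  } else if (grepl("^[A-Za-z]:", path)) {
    "relative to current directory of drive"
  } else if (grepl(paste0("^", sep), path)) {
    "rooted on current drive"
  } else {
    "relative"
  }
}

# Example paths; in R source a literal backslash is written "\\".
paths <- c("D:dir\\file", "D:", "D:\\dir\\file",
           "\\\\netname\\topdir\\myfiles\\a file",
           "\\dir\\file", "dir\\file")
kinds <- vapply(paths, classify_win_path, character(1), USE.NAMES = FALSE)
print(data.frame(path = paths, kind = kinds))
```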

