[Rd] file.info() on file larger than 2GB

Prof Brian Ripley ripley at stats.ox.ac.uk
Wed Sep 15 17:57:39 CEST 2004


This appears to be fairly easy to solve, at least on Linux.  R-devel now 
has an option --enable-linux-lfs that sets up the appropriate flags (very 
few other code changes were needed).  Similar options work on Solaris.
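
For illustration only, here is a minimal C sketch of what such flags do.
It assumes the option amounts to the usual large-file macro
_FILE_OFFSET_BITS=64 (the exact flags are not listed in this message), and
file_size_bytes is a hypothetical helper, not a function in the R sources.

```c
/* Must come before any system header: asks the C library for a 64-bit
   off_t even in a 32-bit build.  Assumption: this is the kind of flag
   --enable-linux-lfs sets; the message does not list the flags. */
#define _FILE_OFFSET_BITS 64

#include <assert.h>
#include <sys/types.h>
#include <sys/stat.h>

/* Fail the build rather than silently truncate file sizes. */
static_assert(sizeof(off_t) >= 8, "off_t is not 64-bit");

/* Hypothetical helper: size of `path` in bytes, or -1 if stat() fails.
   long long is at least 64 bits, so sizes above 2GB do not overflow. */
long long file_size_bytes(const char *path)
{
    struct stat sb;

    if (stat(path, &sb) != 0)
        return -1;
    return (long long) sb.st_size;
}
```

In a 32-bit build without the macro, stat() on a file over 2GB fails with
EOVERFLOW, which is consistent with the all-NA result quoted below.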

I have been able to create a 2.5GB text file, move around in it, and read
lines from here and there, including near the end, both as a plain file and
as a gzip-ed file.  file.info() also reports on it correctly.
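
Moving around in a plain file of that size can be sketched in C as follows.
This is an assumption-laden illustration, not the R connections code:
read_at_offset is a hypothetical helper, and it relies on the POSIX
fseeko() call, which takes an off_t rather than a long.

```c
#define _FILE_OFFSET_BITS 64      /* 64-bit off_t; must precede headers */
#define _POSIX_C_SOURCE 200809L   /* declares fseeko() under strict modes */

#include <stdio.h>
#include <sys/types.h>

/* Hypothetical helper: seek to byte `offset` in `path` and read from
   there to the end of that line (or until buf is full).
   Returns 0 on success, -1 if the open, seek, or read fails. */
int read_at_offset(const char *path, off_t offset, char *buf, int buflen)
{
    FILE *fp = fopen(path, "r");
    int ok = -1;

    if (fp == NULL)
        return -1;
    /* fseek() takes a long, which is 32 bits on i686 and so cannot
       address positions past 2GB; fseeko() takes an off_t. */
    if (fseeko(fp, offset, SEEK_SET) == 0 && fgets(buf, buflen, fp) != NULL)
        ok = 0;
    fclose(fp);
    return ok;
}
```

Offsets beyond 2GB need a 64-bit constant or variable of type off_t, for
example (off_t) 2500000000LL.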

On Tue, 31 Aug 2004, Prof Brian Ripley wrote:

> This is purely an OS issue: your OS has not been set up so that the fopen
> and stat calls handle files larger than 2GB.  There is one R issue: the
> size will overflow in file.info.
> 
> For example, under Solaris, 64-bit applications can handle such files,
> whereas 32-bit ones need calls to stat64, fopen64, etc.
> 
> It seems a very exotic need, but if someone wants to find out how to use 
> the OS-specific ways to extend stat etc. and supply patches, please do so.
> 
> We don't put OS-specific limitations on help pages.
> 
> On Tue, 31 Aug 2004, Roger D. Peng wrote:
> 
> > I've got a file that's approximately 2.2GB and it seems to be foiling 
> > file.info().  When I run `stat' from the shell I get
> > 
> > zooey:> stat data.csv
> >    File: `data.csv'
> >    Size: 2271197563      Blocks: 4440280    IO Block: 4096   regular file
> > Device: 342h/834d       Inode: 9994308     Links: 1
> > Access: (0644/-rw-r--r--)  Uid: (  500/   rpeng)   Gid: (  500/   rpeng)
> > Access: 2004-08-31 09:50:04.000000000 -0400
> > Modify: 2004-08-26 19:09:42.000000000 -0400
> > Change: 2004-08-31 09:53:29.000000000 -0400
> 
> Take a look at the source code for stat, in coreutils.
> 
> > But, file.info() in R-devel gives me:
> > 
> >  > file.info("data.csv")
> >           size isdir mode mtime ctime atime uid gid uname grname
> > data.csv   NA    NA <NA>  <NA>  <NA>  <NA>  NA  NA  <NA>   <NA>
> > 
> > I assume this has something to do with the underlying call to `stat' 
> > in `do_fileinfo'.
> > 
> > This alone is not much of a problem, but I also can't seem to open a 
> > file connection to the same file.  For example,
> > 
> >  > con <- file("data.csv")
> >  > open(con, "r")
> > Error in open.connection(con, "r") : unable to open connection
> > In addition: Warning message:
> > cannot open file `data.csv'
> > 
> > Also, interestingly,
> > 
> >  > file.exists("data.csv")
> > [1] FALSE
> > 
> > I take it all these things are related.
> > 
> > Is it possible to fix this within R?  Or should there be a note in the 
> > help pages?
> > 
> >  > version
> >           _
> > platform i686-pc-linux-gnu
> > arch     i686
> > os       linux-gnu
> > system   i686, linux-gnu
> > status   Under development (unstable)
> > major    2
> > minor    0.0
> > year     2004
> > month    08
> > day      31
> > language R
> > 
> > -roger

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595


