[Rd] read.table bug in Mac OS X (PR#2469)
ripley@stats.ox.ac.uk
Fri Jan 17 19:40:03 2003
George has supplied an example file which is CR-terminated.
As far as I can see this is an error when using classic MacOS files on a
foreign OS, and is, I presume, about the Darwin port of R (confirmation
please), where the native files are LF-terminated and the example file was
CR-terminated.
It's a bit of a wonder that it ever worked, but it was broken in fixing
PR#2396. I've added a test example and a fix that covers this and
PR#2396, and they will be in R-patched and R-devel shortly.
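
For anyone stuck on a build without the fix, one possible workaround is to normalise the line endings before handing the text to read.table. The following is only a sketch under stated assumptions: `read_table_any_eol` is a hypothetical helper name, not part of R, and it assumes the file is small enough to read into memory and contains no embedded nul bytes.

```r
# Hypothetical helper (not part of R): read a delimited text file whose lines
# may end in CR (classic MacOS), LF (Unix) or CRLF (Windows).
read_table_any_eol <- function(path, ...) {
  # Read the whole file as raw bytes so no line-ending translation happens.
  n <- file.info(path)$size
  txt <- rawToChar(readBin(path, what = "raw", n = n))

  # Convert CRLF first so a Windows line ending does not become two LFs,
  # then convert any remaining bare CR.
  txt <- gsub("\r\n", "\n", txt, fixed = TRUE)
  txt <- gsub("\r", "\n", txt, fixed = TRUE)

  # Split into lines and let read.table parse them from a text connection.
  lines <- strsplit(txt, "\n", fixed = TRUE)[[1]]
  con <- textConnection(lines)
  on.exit(close(con))
  read.table(con, ...)
}

# Usage: build a small CR-terminated, tab-delimited file and read it back.
tmp <- tempfile(fileext = ".txt")
writeBin(charToRaw("Trt\tDead.X\rVg\t2\rVg\t5\r"), tmp)
d <- read_table_any_eol(tmp, header = TRUE, sep = "\t",
                        stringsAsFactors = FALSE)
```

Reading the bytes with readBin rather than readLines is a deliberate choice here, since it sidesteps whatever line-splitting the connection code does on the platform in question.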
It does make me wonder about the testing process: do the testers of the
Darwin port never use classic MacOS files? How does emacs manage to
create CR-terminated files on a unix-based OS? Or is this a case of using
a Carbon MacOS application with Darwin R, and that's rare?
On Fri, 17 Jan 2003 gwgilc@wm.edu wrote:
> Full_Name: George W. Gilchrist
> Version: 1.6.2
> OS: OS X
> Submission from: (NULL) (128.239.124.126)
>
>
> Start with a tab-delimited or comma-delimited text file created on the Mac and
> use read.table("filename.txt", header=T) to read it in. When the first column of
> the file contains a character vector, and there is a header line, the first
> letter of the first column of the fifth row is appended to the start of the
> column name and is omitted from the data entry. See the example below. This
> appears to have something to do with the way text files are encoded on the Mac.
> Text files created in Excel, emacs, Word, and TextEdit on OS X all seem to do
> this, even when you copy the text file over to a PC and run R 1.6.2 there under
> Windows. If you open the Mac text file in a text editor on the PC and save it
> under a different name, the problem goes away. I have tried this with a half
> dozen different files.
>
> > tmp1<-read.table("deadFly.txt", header=T)
> > tmp1[1:10,]
>    VTrt Dead.X Dead.C Live.X Live.C N.X N.C P.Live.X P.Live.C
> 1    Vg      2      0      7     10   9  10     0.78    1.000
> 2    Vg      5      1      5      8  10   9     0.50    0.890
> 3    Vg      0      0      8     10   8  10     1.00    1.000
> 4    Vg      0      0      9      9   9   9     1.00    1.000
> 5     g      1      1      9      7  10   8     0.90    0.875
> 6    Vg      4      1      6      9  10  10     0.60    0.900
> 7    Vg      2      1      7      9   9  10     0.78    0.900
> 8    Vg      0      0      9      8   9   8     1.00    1.000
> 9    Vg      0      0     10     10  10  10     1.00    1.000
> 10   Vg      0      0      8      9   8   9     1.00    1.000
>
> > tmp2<-read.table("musselJen.txt", header=T)
> > tmp2[1:10,]
>    LLoc  Size ID Bac Sec    N   PC
> 1    LS 120.0  1   T   1 32.7 92.0
> 2    LS 120.0  1   T   2 33.3 92.5
> 3    LS 120.0  1   T   3 39.3 96.9
> 4    LS 120.0  2   T   1 36.1 94.3
> 5     S 120.0  2   T   2 38.3 94.5
> 6    LS 120.0  2   T   3 34.3 94.1
> 7    LS 120.0  3   T   1 22.1 83.9
> 8    LS 120.0  3   T   2 25.5 93.1
> 9    LS 120.0  3   T   3 28.7 94.6
> 10   LS   4.2  1   T   1 48.5 93.7
> >
>
--
Brian D. Ripley, ripley@stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595