[R] Read.table - Less rows than original data

Philipp Pagel p.pagel at wzw.tum.de
Wed Jul 9 22:39:08 CEST 2008


> I built a 1,273,230 by 6 data set named "mydata2", it was saved in the
> following command,
> 
> write.table(mydata2, "mydata2.txt", row.name=F,col.name=T,quote=F,sep="\t")
> 
> The next day I read in above saved text file into R,
> 
> temp<-read.table("mydata2.txt",header=T,sep="\t",na.strings="NA")
> 
> However, the dimension of "temp" is 636,615 X 6.


A wild guess: does your table contain strings that include single or
double quotes? Since you are not disabling quoting in read.table, an
unmatched quote makes everything up to the next quote, line breaks
included, part of a single field:

> foo = data.frame(a=c("abc","5'foo","xxx", "3'bar"), b=1:4)
> foo
      a b
1   abc 1
2 5'foo 2
3   xxx 3
4 3'bar 4

> write.table(foo, "mydata2.txt", row.name=F,col.name=T,quote=F,sep="\t")
> foo <- read.table("mydata2.txt",header=T,sep="\t",na.strings="NA")
> foo
                      a b
1                   abc 1
2 5foo\t2\nxxx\t3\n3bar 4

> foo <- read.table("mydata2.txt",header=T,sep="\t",na.strings="NA", quote='')
> foo
      a b
1   abc 1
2 5'foo 2
3   xxx 3
4 3'bar 4


The same applies to comment characters embedded in strings (by default
"#", which can be switched off with comment.char="").
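
A minimal sketch of the comment-character case. The data frame "bar" and
the temporary file are made up for illustration and are not from the
original data. It only covers strings that start with "#"; a "#" in the
middle of a line would cut the line short instead.

```r
# Hypothetical example data: two of the strings start with '#'
bar <- data.frame(a = c("abc", "#5 foo", "xxx", "#3 bar"), b = 1:4)
tmp <- tempfile(fileext = ".txt")
write.table(bar, tmp, row.names = FALSE, col.names = TRUE,
            quote = FALSE, sep = "\t")

# Default comment.char = "#": lines starting with '#' are read as comments
lost <- read.table(tmp, header = TRUE, sep = "\t", na.strings = "NA",
                   quote = "")

# Quoting and comment handling both disabled: every line is kept
kept <- read.table(tmp, header = TRUE, sep = "\t", na.strings = "NA",
                   quote = "", comment.char = "")

nrow(lost)  # fewer rows than were written
nrow(kept)  # all rows
```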

If this is not your problem, I'd first check whether the file has the
expected number of lines (1,273,230 rows plus one header line).
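
One way to do that check from within R, as a rough sketch: the helper
name "compare_rows" is hypothetical, and it assumes a tab-separated file
with one header line and no blank lines.

```r
# Hypothetical helper (not from the original post): compare the number of
# data lines in the file with the number of rows read.table returns.
compare_rows <- function(path, ...) {
  n_lines <- length(readLines(path))   # physical lines, header included
  parsed  <- read.table(path, header = TRUE, sep = "\t",
                        na.strings = "NA", ...)
  c(data_lines = n_lines - 1L, rows_read = nrow(parsed))
}

# Usage, with the file name from the post:
# compare_rows("mydata2.txt")              # default quoting
# compare_rows("mydata2.txt", quote = "")  # quoting disabled
```

If data_lines matches what you wrote out but rows_read is smaller, the
file is fine and the rows are being lost while parsing.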

cu
	Philipp

-- 
Dr. Philipp Pagel
Lehrstuhl für Genomorientierte Bioinformatik
Technische Universität München
Wissenschaftszentrum Weihenstephan
85350 Freising, Germany
http://mips.gsf.de/staff/pagel


