[R] Row limit for read.table

Frank McCown fmccown at cs.odu.edu
Wed Jan 17 18:22:40 CET 2007


> In your case, read.table behaves as documented.
> The ' character is one of the standard quoting characters. Some (but 
> very few) of the entries contain single ' chars, so sometimes more than 
> ten thousand lines are just treated as a single entry. Try using 
> quote="" to disable quoting, as documented on the help page:
> 
> f<-read.table("http://www.cs.odu.edu/~fmccown/R/Tchange_rates_crawled.dat",
> header=TRUE, nrows=123000, comment.char="", sep="\t",quote="")
> 
> length(f$change_rate)
> [1] 122271


So either adding quote="" or removing sep="\t" (without setting quote) 
works.  It seems an odd side effect that specifying the separator changes 
how quoting behaves (because of the ' character).  I don't see that 
association made in the help file.
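
For the archive, here is a minimal sketch of the quote="" fix on a small 
made-up sample. The column names and URLs below are hypothetical stand-ins 
for the real file, and I am only assuming that the real data has apostrophes 
inside tab-separated fields in the same way.

```r
# Hypothetical tab-separated sample: two of the three data rows contain a lone '
txt <- paste(
  "url\tchange_rate",
  "http://a.example/x\t0.1",
  "http://b.example/o'brien\t0.2",
  "http://c.example/y'z\t0.3",
  sep = "\n"
)

# quote = "" turns off quote handling, so ' is kept as an ordinary character
# and every line of input becomes its own row
f <- read.table(text = txt, header = TRUE, sep = "\t", quote = "",
                comment.char = "", stringsAsFactors = FALSE)

nrow(f)    # number of data rows read
f$url[2]   # the apostrophe is preserved in the value
```

With quoting disabled, the row count should match the number of data lines, 
which is a quick way to check whether stray quote characters were swallowing 
rows.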


> There is (colClasses, works as documented). Try
> 
> f<-read.table("http://www.cs.odu.edu/~fmccown/R/Tchange_rates_crawled.dat",
> header=TRUE, nrows=123000, comment.char="", sep="\t", quote="",
> colClasses=c("character","NULL","NULL","NULL","NULL"))
> 
> dim(f)
> [1] 122271      1

> Did you read the help page?

Of course I did.  For me the definition of colClasses wasn't clear... 
"A vector of classes to be assumed for the columns" didn't seem to be 
the same thing as "the columns you would like to be read."  I may have 
made the association if the help page had contained a simple example of 
using colClasses.
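
Something like the following is the kind of simple example I had in mind. 
It is only a sketch on made-up data (the three column names are 
hypothetical); the point is that an entry of "NULL" in colClasses makes 
read.table skip that column entirely.

```r
# Hypothetical three-column, tab-separated sample
txt <- paste(
  "url\tchange_rate\tvisits",
  "http://a.example/x\t0.1\t3",
  "http://b.example/y\t0.2\t5",
  sep = "\n"
)

# One colClasses entry per column in the file:
# "character" reads the column as text, "NULL" drops it
f <- read.table(text = txt, header = TRUE, sep = "\t", quote = "",
                comment.char = "",
                colClasses = c("character", "NULL", "NULL"))

dim(f)     # 2 rows, 1 column
names(f)   # only "url" remains
```

Putting a class name in a different position, for example 
c("NULL", "numeric", "NULL"), would keep the second column instead.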

Thanks for the help,
Frank


-- 
Frank McCown
Old Dominion University
http://www.cs.odu.edu/~fmccown/


