[R] read.table bug

Thu, 9 Sep 1999 16:35:05 +0200 (CEST)

>>>>> Peter Dalgaard BSA writes:

> Li Dongfeng <mavip5@inet.polyu.edu.hk> writes:
>> Suppose we have a data file containing:
>> 
>> "Smith, John", 120, 90
>> "Thomson, Peter", 110, 85
>> 
>> There are 3 variables in it. If we use
>> 
>> x <- read.table("tmp.txt", sep=",")
>> 
>> to read the data into a data.frame,
>> the result will have 4 columns.
>> Splus 4.0 has no problem with this kind
>> of data.

> Splus 3.4 has:

>> read.table("data", sep=",")
>                V2  V3 V4 
>   "Smith   John\" 120 90
> "Thomson  Peter\" 110 85

> i.e. 3 variables but with row.names '"Smith' and '"Thomson'

> Or, closer to what R does:

>> read.table("data", sep=",",row.names=NULL)
>          V1       V2  V3 V4 
> 1   \"Smith   John\" 120 90
> 2 \"Thomson  Peter\" 110 85

> By its definition, this is what sep=',' must do, but it obviously will
> not handle all CSV files properly. Anyone want to write a read.csv()
> function or something of the sort? It would be very useful.

So
	read.table(sep = ",")

would be different from

	read.csv()

?
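
To make the difference concrete, here is a minimal sketch. It assumes a
read.table() that accepts a `quote` argument (as current versions of R
do); `read_csv_sketch` is a hypothetical name, not an existing function.
With quoting switched off, the comma inside the name splits the field;
with double quotes honoured, the three variables come out as intended.

```r
# Write the two example records to a temporary file.
tmp <- tempfile(fileext = ".txt")
writeLines(c('"Smith, John", 120, 90',
             '"Thomson, Peter", 110, 85'), tmp)

# Plain field splitting: quotes are ordinary characters, so the comma
# inside the name acts as a separator and we get 4 columns.
raw <- read.table(tmp, sep = ",", quote = "", strip.white = TRUE)

# Hypothetical wrapper (name is an assumption): commas separate fields,
# double quotes protect embedded commas.
read_csv_sketch <- function(file, header = FALSE, ...) {
  read.table(file, header = header, sep = ",", quote = "\"",
             strip.white = TRUE, ...)
}

x <- read_csv_sketch(tmp)
ncol(raw)  # number of columns without quote handling
ncol(x)    # number of columns with quote handling
```

Keeping this as a thin wrapper would leave read.table(sep = ",") with
its literal meaning and put the CSV conventions in one place.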

PS.  Btw, a typical problem when dumping from certain commercial
spreadsheets to csv and trying to read that in using read.table() is
that the dump does not always produce trailing commas.  I use a trivial
Perl script for adding these, but maybe read.table() should be smarter
about that (new option doing if_not_enough_then_add_as_NA?)
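
As a sketch of what such an option could do, assuming a read.table()
with a `fill` argument (current versions of R have one; the file
contents below are made up for illustration): short rows are padded
with blank fields, which end up as NA in numeric columns.

```r
# A made-up CSV dump in which trailing commas for empty cells are missing.
tmp2 <- tempfile(fileext = ".csv")
writeLines(c("a,b,c",
             "1,2,3",
             "4,5",
             "6"), tmp2)

# fill = TRUE pads rows that have too few fields instead of failing.
y <- read.table(tmp2, header = TRUE, sep = ",", fill = TRUE)
y
```

This makes the Perl preprocessing step unnecessary for files of this
shape, at the price of silently accepting rows that are short by
mistake.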

-k
-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
r-devel mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
_._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._