[R] read.table bug
Kurt Hornik
Kurt.Hornik@ci.tuwien.ac.at
Thu, 9 Sep 1999 16:35:05 +0200 (CEST)
>>>>> Peter Dalgaard BSA writes:
> Li Dongfeng <mavip5@inet.polyu.edu.hk> writes:
>> Suppose we have a data file containing:
>>
>> "Smith, John", 120, 90
>> "Thomson, Peter", 110, 85
>>
>> There are 3 variables in it. If we use
>>
>> x <- read.table("tmp.txt", sep=",")
>>
>> to read the data into a data.frame,
>> the result has 4 columns.
>> Splus 4.0 has no problem with this kind
>> of data.
> Splus 3.4 has:
>
> > read.table("data", sep=",")
>               V2  V3 V4
>   "Smith  John\" 120 90
> "Thomson Peter\" 110 85
>
> i.e. 3 variables but with row.names '"Smith' and '"Thomson'
>
> Or, closer to what R does:
>
> > read.table("data", sep=",",row.names=NULL)
>          V1      V2  V3 V4
> 1   \"Smith  John\" 120 90
> 2 \"Thomson Peter\" 110 85
>
> By its definition, this is what sep=',' must do, but it obviously will
> not handle all CSV files properly. Anyone want to write a read.csv()
> function or something of the sort? It would be very useful.
So

    read.table(sep = ",")

would be different from

    read.csv()

?
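
To make the question concrete, here is a small sketch using the two data lines from the original report. It assumes a recent R release in which read.table() accepts `text` and `quote` arguments and a read.csv() function exists; that is an assumption about later versions, not the behaviour reported above.

```r
# Assumption: a recent R with read.table(text =, quote =) and read.csv().
txt <- c('"Smith, John", 120, 90',
         '"Thomson, Peter", 110, 85')

# Quote-aware reading: a comma inside the quoted name is not a separator.
x <- read.table(text = txt, sep = ",", quote = "\"")

# The same lines through read.csv(), telling it there is no header row.
y <- read.csv(text = txt, header = FALSE)

ncol(x)
ncol(y)
```

Under those assumptions both calls should give three columns, so the remaining differences would lie in defaults such as the header handling.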
PS. Btw, a typical problem when dumping from certain commercial
spreadsheets to CSV and trying to read the result with read.table() is
that the dump does not always produce trailing commas, so some rows
have too few fields. I use a trivial Perl script to add them, but maybe
read.table() should be smarter about that (a new option doing
if_not_enough_then_add_as_NA?)
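
The padding step could also be sketched in R itself. The helper below is hypothetical (the name pad_csv_lines and its arguments are made up for illustration, and it is not the Perl script mentioned above). It assumes a recent R that provides strrep(), that quoted fields use double quotes only, and that no quoted field contains an embedded quote or spans several lines.

```r
# Hypothetical helper: append trailing commas so every line has `ncol` fields.
pad_csv_lines <- function(lines, ncol) {
  # Drop double-quoted strings so commas inside them are not counted.
  unquoted <- gsub('"[^"]*"', "", lines)
  # Number of fields = number of remaining commas + 1.
  n_fields <- nchar(gsub("[^,]", "", unquoted)) + 1
  # Lines that are already long enough are left untouched.
  missing <- pmax(ncol - n_fields, 0)
  paste0(lines, strrep(",", missing))
}

lines <- c('a,b,c', '"x, y",1', 'p')
padded <- pad_csv_lines(lines, 3)
print(padded)
```

The padded lines can then be handed to read.table() with sep = ",", which sees the same number of fields on every row.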
-k
r-devel mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html