[R] read.table problems

Heberto Ghezzo heberto at MEAKINS.Lan.McGill.CA
Wed Nov 10 09:03:08 CET 1999


Yesterday I asked for help with read.table and a CSV file, and I
received the replies below. As always, many thanks for the prompt
responses. For now I load my CSV file into a text editor (PFE) and
delete all the spaces.
My original mail is at the end.

From:             Peter Dalgaard BSA <p.dalgaard at biostat.ku.dk>

Do you have spaces before the commas in your file?

> Yes, number , space, comma, new number
> 1 ,98.53 ,98.33 ,99.82 
> don't tell me that the space before the comma changes the behaviour
> of read.table?

The space is not part of the separator, so the fields are "1 " etc.
which contain nondigits, and hence are character variables...

Now this is not the .csv standard (if one exists), but read.table
wasn't really written for that. We do need a real .csv reader, but
for now just get rid of the spaces.
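
A minimal sketch of doing the same clean-up at read time instead of in
a text editor. It assumes a current version of R, where read.table
accepts the text= and strip.white= arguments; the inline data and the
column names id, fvcpp and label are made up for illustration. Current
R documents that numeric fields are always stripped, so the factor
problem described in this thread may not reproduce there; strip.white
still matters for character fields.

```r
# Hypothetical inline data with a space before each comma,
# in the same shape as the file discussed in this thread.
csv_text <- c("id ,fvcpp ,label",
              "1 ,98.53 ,two ",
              "2 ,86.97 ,five")

# strip.white = TRUE removes leading and trailing blanks from
# unquoted fields; it only takes effect when sep is specified.
a <- read.table(text = csv_text, header = TRUE, sep = ",",
                strip.white = TRUE, stringsAsFactors = FALSE)

str(a)          # inspect the column types
mean(a$fvcpp)   # works because fvcpp was read as numeric
```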


From:             Guido Masarotto <guido at hal.stat.unipd.it>

On the other hand, if you have a mix of factor and/or character
variables and numeric variables, things are a bit more complicated
(and perhaps there is a better solution than the following one):

> system("cat Bartok");cat("\n")
1 , two , 3
4 , five, 6
> a <- read.table("Bartok",sep=",")
> var.numeric <- c("V1","V3")
> index <- match(var.numeric,colnames(a))
> names <- colnames(a)
> a <- data.frame(apply(a[,index],2,as.numeric),a[,-index])
> colnames(a) <- c(names[index],names[-index])
> a
  V1 V3    V2
1  1  3  two 
2  4  6  five
> attach(a)
> mean(V1)
[1] 2.5

Hope this helps,
guido
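
A possible variation on the approach above, sketched under the
assumption that the data frame is built by hand rather than read from
the Bartok file: assigning back into the selected columns converts
them in place, so the column order and names do not have to be
restored afterwards. This is not the method from the reply, only an
alternative way to reach a similar result.

```r
# Stand-in for the data frame read from the file: every column is a
# factor because of the stray spaces around the values.
a <- data.frame(V1 = c("1 ", "4 "),
                V2 = c(" two ", " five"),
                V3 = c(" 3", " 6"),
                stringsAsFactors = TRUE)

var.numeric <- c("V1", "V3")

# Convert only the named columns. as.character() comes first so that
# the labels are converted, not the internal factor codes.
a[var.numeric] <- lapply(a[var.numeric],
                         function(x) as.numeric(as.character(x)))

str(a)
mean(a$V1)
```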

From:             Douglas Bates <bates at stat.wisc.edu>

There is a shortcut method for converting all the variables in a data
frame to numeric variables.  If a is your data frame, you use
 newa <- do.call("data.frame",
                 lapply(a, function(x) as.numeric(as.character(x))))
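
The as.character() step in that one-liner matters: applied directly to
a factor, as.numeric() returns the integer level codes rather than the
values printed on screen. A small sketch with made-up values (the
names f, codes and values are illustrative only):

```r
# A factor as read.table would produce it from fields like "99.28 ".
f <- factor(c("99.28 ", "86.97 ", "81.12 "))

codes  <- as.numeric(f)                # level codes, not the data
values <- as.numeric(as.character(f))  # the numbers themselves

print(codes)
print(values)

# The same conversion applied to every column of a data frame.
a <- data.frame(fvcpp = f,
                fevpp = factor(c("97.67 ", "68.56 ", "71.76 ")))
newa <- do.call("data.frame",
                lapply(a, function(x) as.numeric(as.character(x))))
str(newa)
```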

----------my original query follows

Hi, I am using R65.1 in Windows 95.
I have a CSV file from Excel.
> a<-read.table("c:/heberto/mgc/tst.csv",header=T,sep=",")
> attach(a)
> a
  manolo  fvcpp  fevpp fvvcpp   tlcpp    rvpp  rvtlpp plmaxpp
1     1  99.28  97.67  98.38   91.14   102.9  111.25  117.64 
2     1  86.97  68.56  78.89   94.60  112.34  118.53   159.20
3     1  81.12  71.76  88.37   89.16  114.17  126.86   60.71 
4     1  98.12  86.05  87.73  102.34  127.41  123.05  102.13 
5     1  90.50  80.87  89.47   85.60   93.27  107.35   86.03 

--This is correct

> mean(fvcpp)
Error: "sum" not meaningful for factors
> fvcpp
[1] 99.28  86.97  81.12  98.12  90.50 
Levels:  81.12  86.97  90.50  98.12  99.28  

-- it reads the columns as factors, not as numeric

>  rm(a)
> a<-read.table("c:/heberto/mgc/tst.csv",header=T,sep=",",as.is=F)
> attach(a)
> mean(fvcpp)
Error: "sum" not meaningful for factors
> ls()
[1] "a"
> rm(a)
> a<-read.table("c:/heberto/mgc/tst.csv",header=T,sep=",",as.is=T)
> attach(a)
> mean(fvcpp)
Error in sum(..., na.rm = na.rm) : invalid "mode" of argument
>

-- so now, how can I read the file as numeric vectors?

> fvcpp
[1] "99.28 " "86.97 " "81.12 " "98.12 " "90.50 "
> fvcpp<-as.numeric(fvcpp)
> fvcpp
[1] 99.28 86.97 81.12 98.12 90.50
> mean(fvcpp)
[1] 91.198
>

-- converting each variable to numeric one at a time is obviously
not the way to do it.
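
One way to avoid converting variable by variable, sketched here for a
current version of R (trimws() and the as.is argument of
type.convert() are assumed to be available): trim every column and let
type.convert() choose the type. The data frame is a made-up stand-in
for the one read with as.is=T, and the grp column is hypothetical.

```r
# Stand-in for the data frame read with as.is = TRUE: character
# columns whose values carry a trailing space.
a <- data.frame(manolo = c("1 ", "1 "),
                fvcpp  = c("99.28 ", "86.97 "),
                grp    = c("x ", "y "),
                stringsAsFactors = FALSE)

# a[] keeps the data frame structure while every column is replaced.
# Columns that look numeric become numeric; the rest stay character.
a[] <- lapply(a, function(x)
  type.convert(trimws(as.character(x)), as.is = TRUE))

str(a)
mean(a$fvcpp)
```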

Can somebody tell me what I am doing wrong? I used to follow the
same procedure, a<-read.table(...) then attach(a), and have all my
variables as vectors, with or without NAs.
This file has no NAs; it is complete.
Thanks.
..

R. Heberto Ghezzo  Ph.D.
Meakins-Christie Labs
McGill University
Montreal - Canada
heberto at meakins.lan.mcgill.ca