[R] read.table problems
Heberto Ghezzo
heberto at MEAKINS.Lan.McGill.CA
Wed Nov 10 09:03:08 CET 1999
Yesterday I asked for help with read.table and a CSV file, and I
received the following replies. As always, many thanks for the prompt
responses. I now load my CSV file in a text editor (pfe) and delete
all the spaces.
My original mail is at the end.
From: Peter Dalgaard BSA <p.dalgaard at biostat.ku.dk>
Do you have spaces before the commas in your file?
> Yes, number , space, comma, new number
> 1 ,98.53 ,98.33 ,99.82
> don't tell me that the space before the comma changes the behaviour
> of read.table?
The space is not part of the separator, so the fields are "1 " etc.
which contain nondigits, and hence are character variables...
Now this is not the .csv standard (if one exists), but read.table
wasn't really written for that. We do need a real .csv reader, but
for now just get rid of the spaces.
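
A minimal sketch of one way to avoid editing the file by hand. It
assumes a later R release in which read.table accepts the strip.white
and text arguments; neither is confirmed for the version discussed in
this thread, and the inline data simply copies the two-line example
file shown further down.

```r
# Inline copy of a small comma-separated file with spaces around the commas.
txt <- "1 , two , 3\n4 , five, 6"

# strip.white = TRUE only takes effect when sep is given explicitly;
# it trims leading and trailing blanks from each field.
a <- read.table(text = txt, sep = ",", strip.white = TRUE)

str(a)      # inspect the column types
mean(a$V1)  # works because V1 is numeric
```

With a file on disk, the same arguments would apply with the file name
in place of text = txt.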
From: Guido Masarotto <guido at hal.stat.unipd.it>
On the other hand, if you have a mix of factor and/or character
variables and numeric variables, things are a bit more complicated
(and perhaps there is a better solution than the following one):
> system("cat Bartok");cat("\n")
1 , two , 3
4 , five, 6
> a <- read.table("Bartok",sep=",")
> var.numeric <- c("V1","V3")
> index <- match(var.numeric,colnames(a))
> names <- colnames(a)
> a <- data.frame(apply(a[,index],2,as.numeric),a[,-index])
> colnames(a) <- c(names[index],names[-index])
> a
V1 V3 V2
1 1 3 two
2 4 6 five
> attach(a)
> mean(V1)
[1] 2.5
Hope this helps,
guido
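
A possible variant of the approach above, sketched under the assumption
that the named columns were read as factors: it converts only those
columns in place, so the original column order is kept. The data frame
is a hand-built, hypothetical stand-in for the contents of "Bartok".

```r
# Hypothetical stand-in for the data frame read from "Bartok":
# every column is a factor because of the stray spaces.
a <- data.frame(V1 = c("1 ", "4 "),
                V2 = c(" two ", " five"),
                V3 = c(" 3", " 6"),
                stringsAsFactors = TRUE)

var.numeric <- c("V1", "V3")

# Replace just the named columns; the other columns and the
# column order are left untouched.
a[var.numeric] <- lapply(a[var.numeric],
                         function(x) as.numeric(as.character(x)))

mean(a$V1)
```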
From: Douglas Bates <bates at stat.wisc.edu>
There is a shortcut method for converting all the variables in a data
frame to numeric variables. If a is your data frame, you use
newa <- do.call("data.frame",
                lapply(a, function(x) as.numeric(as.character(x))))
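
A short illustration of why the as.character() step matters: calling
as.numeric() directly on a factor returns its internal level codes
rather than the values. The factor below is a hypothetical stand-in
built from a few values in the original query, and the a[] <- lapply()
form is just one alternative way to write the same conversion.

```r
# Hypothetical stand-in for a column that was read as a factor.
f <- factor(c("99.28 ", "86.97 ", "81.12 "))

codes  <- as.numeric(f)                # level codes, not the data
values <- as.numeric(as.character(f))  # the numbers themselves

# The same conversion applied to every column of a data frame;
# assigning into a[] keeps the result a data frame.
a <- data.frame(fvcpp = f,
                fevpp = factor(c("97.67 ", "68.56 ", "71.76 ")))
a[] <- lapply(a, function(x) as.numeric(as.character(x)))

mean(a$fvcpp)
```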
---------- my original query follows
Hi, I am using R65.1 in Windows 95.
I have a CSV file from Excel.
> a<-read.table("c:/heberto/mgc/tst.csv",header=T,sep=",")
> attach(a)
> a
manolo fvcpp fevpp fvvcpp tlcpp rvpp rvtlpp plmaxpp
1 1 99.28 97.67 98.38 91.14 102.9 111.25 117.64
2 1 86.97 68.56 78.89 94.60 112.34 118.53 159.20
3 1 81.12 71.76 88.37 89.16 114.17 126.86 60.71
4 1 98.12 86.05 87.73 102.34 127.41 123.05 102.13
5 1 90.50 80.87 89.47 85.60 93.27 107.35 86.03
--This is correct
> mean(fvcpp)
Error: "sum" not meaningful for factors
> fvcpp
[1] 99.28 86.97 81.12 98.12 90.50
Levels: 81.12 86.97 90.50 98.12 99.28
-- it reads the columns as factors and not as numeric
> rm(a)
> a<-read.table("c:/heberto/mgc/tst.csv",header=T,sep=",",as.is=F)
> attach(a)
> mean(fvcpp)
Error: "sum" not meaningful for factors
> ls()
[1] "a"
> rm(a)
> a<-read.table("c:/heberto/mgc/tst.csv",header=T,sep=",",as.is=T)
> attach(a)
> mean(fvcpp)
Error in sum(..., na.rm = na.rm) : invalid "mode" of argument
>
-- so now, how can I read the file as numeric vectors?
> fvcpp
[1] "99.28 " "86.97 " "81.12 " "98.12 " "90.50 "
> fvcpp<-as.numeric(fvcpp)
> fvcpp
[1] 99.28 86.97 81.12 98.12 90.50
> mean(fvcpp)
[1] 91.198
>
-- this is obviously not the way to do it, changing each variable
to numeric one at a time.
Can somebody tell me what I am doing wrong? I used to follow the
same procedure, a<-read.table(...) then attach(a), and have all my
variables as vectors, with or without NA's.
This file has no NA's; it is all complete.
Thanks.
..
R. Heberto Ghezzo Ph.D.
Meakins-Christie Labs
McGill University
Montreal - Canada
heberto at meakins.lan.mcgill.ca