[R] Dataframes and text identifier columns
Brian Willis
b.h.willis at bham.ac.uk
Wed Jul 2 13:33:10 CEST 2014
Apologies, I was trying to simplify the programme and left out four input
files. The files for Andrew, Burt, Charlie and Dave all have the same format:
one factor and 13 numeric variables with repeated measurements, e.g.
Study v1 v2 v3 v4 v5 v6 v7 v8 v9 v10 v11 v12 v13
A 153 4.0 2.00 2.00 145.00 0.67 0.01 49.00 0.34 0.04 0.96 -3.24 0.04
B 96 33 3.0 13.0 47.0 0.9 0.2 4.2 0.1 0.5 0.5 -0.7 -0.7
Inp_dat is
Case r p SE n
Andrew 0.03 0.01 0.0004 500
Burt 0.08 0.111 0.04 50
Charlie 0.04 0.022 0.0005 200
Dave 0.2 0.028 0.006 85
out_put starts as an empty data frame and rows are added incrementally, one
for Andrew, one for Burt, etc.
If the code is
Andrew<-read.csv("/File /Andrew.csv")
Burt<-read.csv("/File /Burt.csv")
Charlie<-read.csv("/File /Charlie.csv")
Dave<-read.csv("/File /Dave.csv")
Inp_dat<- read.csv("/File/Input data.csv")
out_put<-data.frame(Case=character(), StdL=numeric(), StdPP=numeric(),
StdSE=numeric(), L=numeric(), MRPP=numeric(), MRSE=numeric(),
stringsAsFactors=FALSE)
for(i in 1:4)
{
if (i==1) b<-Andrew
if (i==2) b<-Burt
if (i==3) b<-Charlie
if (i==4) b<-Dave
pr <- Inp_dat$p[i]
SE_pr <- Inp_dat$SE[i]
r<- Inp_dat$r[i]
n<- Inp_dat$n[i]
Case<- Inp_dat$Case[i]
…
out_put[i,]<-data.frame(Case, stdL, stdPP, stdSE, L, PP, PP_SE)
}
out_put
  Case      StdL      StdPP      StdSE         L      MRPP       MRSE
1    1 19.466823 0.16432300 0.03137456 26.002294 0.2080145 0.03804692
2    2  2.334130 0.22566939 0.08962662  5.095703 0.3888451 0.08399101
3    3  2.588678 0.05502765 0.00454159 42.058326 0.4861511 0.02128030
4    4  7.857898 0.18457822 0.04372297  4.705487 0.1193687 0.01921609
The Cases are labelled as integers (1 corresponding to Andrew, 2
corresponding to Burt, etc.) instead of the intended text labels Andrew,
Burt, Charlie and Dave.
Note that all other columns are correct. Furthermore,
str(Case)
Factor w/ 4 levels "Andrew","Burt",..: 4
str(out_put)
'data.frame': 4 obs. of 7 variables:
$ Case : chr "1" "2" "3" "4"
$ StdL : num 19.47 2.33 2.59 7.86
etc
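One possible reading of this result, offered as a sketch rather than a confirmed diagnosis: Case is a factor, and a factor is stored as integer codes with the text labels attached as levels. If the code rather than the label reaches the character column of out_put, that would give "1", "2", "3", "4". The small Inp_dat below is a hypothetical stand-in with only two of the columns, and Case is made a factor explicitly.

```r
# Hypothetical stand-in for Inp_dat; Case is explicitly a factor here
Inp_dat <- data.frame(Case = factor(c("Andrew", "Burt", "Charlie", "Dave")),
                      r = c(0.03, 0.08, 0.04, 0.2))

i <- 4
Case <- Inp_dat$Case[i]

code  <- as.integer(Case)    # the underlying integer code of the factor
label <- as.character(Case)  # the text label attached to that code

print(code)
print(label)
```

Here as.character() recovers the label, whereas as.integer() exposes the code that seems to be ending up in out_put.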
I have tried changing the line
Case<- Inp_dat$Case[i]
to
Case<- levels(Inp_dat$Case)[i]
and this gives the following output
  Case      StdL      StdPP      StdSE         L      MRPP       MRSE
1    1 19.466823 0.16432300 0.03137456 26.002294 0.2080145 0.03804692
2    1  2.334130 0.22566939 0.08962662  5.095703 0.3888451 0.08399101
3    1  2.588678 0.05502765 0.00454159 42.058326 0.4861511 0.02128030
4    1  7.857898 0.18457822 0.04372297  4.705487 0.1193687 0.01921609
str(Case)
chr "Dave"
and
str(out_put)
'data.frame': 4 obs. of 7 variables:
$ Case : chr "1" "1" "1" "1"
$ StdL : num 19.47 2.33 2.59 7.86
etc
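A guess at why every row now shows "1", assuming data.frame() is converting strings to factors inside the loop (which was its default behaviour in R versions before 4.0.0): each one-row data frame would then hold a factor with a single level, whose integer code is always 1. The sketch below sets stringsAsFactors = TRUE explicitly so that it behaves the same in any R version; the stdL value is simply copied from the output above.

```r
# Case is a plain character string at this point
Case <- "Dave"

# Explicit stringsAsFactors = TRUE mimics the pre-4.0.0 default of data.frame()
row <- data.frame(Case, stdL = 7.857898, stringsAsFactors = TRUE)

str(row$Case)                     # a factor with one level, "Dave"
row_code <- as.integer(row$Case)  # the only possible code for a one-level factor
print(row_code)
```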
I’ve also tried adding stringsAsFactors=FALSE, as suggested, to the read.csv
call:
Inp_dat<- read.csv("/File/Input data.csv", stringsAsFactors=FALSE)
This gives the same result as the second output above.
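A minimal sketch of one thing that might be worth trying, assuming the conversion happens in the data.frame() call inside the loop rather than in read.csv(): convert Case with as.character() and pass stringsAsFactors = FALSE to that inner data.frame() as well. The Inp_dat here is rebuilt inline from the values listed above instead of being read from file, out_put is cut down to two columns, and the stdL calculation is a placeholder, not the real computation.

```r
# Inp_dat rebuilt inline from the values in the post (not read from file)
Inp_dat <- data.frame(Case = c("Andrew", "Burt", "Charlie", "Dave"),
                      r  = c(0.03, 0.08, 0.04, 0.2),
                      p  = c(0.01, 0.111, 0.022, 0.028),
                      SE = c(0.0004, 0.04, 0.0005, 0.006),
                      n  = c(500, 50, 200, 85),
                      stringsAsFactors = TRUE)

# Cut-down version of out_put with only two of the seven columns
out_put <- data.frame(Case = character(), StdL = numeric(),
                      stringsAsFactors = FALSE)

for (i in seq_len(nrow(Inp_dat))) {
  Case <- as.character(Inp_dat$Case[i])  # text label, not the factor code
  stdL <- Inp_dat$r[i] * 2               # placeholder, NOT the real calculation
  # Keep Case as character when building the one-row data frame
  out_put[i, ] <- data.frame(Case, stdL, stringsAsFactors = FALSE)
}

str(out_put)
```

If this is the cause, the Case column should then hold the names rather than "1" in every row.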