[R] concatenate values of two columns
kMan
kchamberln at gmail.com
Thu May 6 05:49:58 CEST 2010
Dear n.vialma,
Good question! Your columns are of type factor(), so watch out for surprises
with coercion (and so much for the 3 minute reply)! This solution uses a
pre-allocated vector to store the results, and the approach differs
depending on the data type you want the resulting vector to have.
Say your data.frame() is defined as below:
df<-data.frame(x=c(1:6), var1=c("",1,2,"","",4),var2=c(2,"","",3,4,""),
stringsAsFactors=TRUE) # factor columns, i.e. without stringsAsFactors=FALSE
The coercion gets fishy if you try to jump straight to numeric():
as.numeric(df$var1) # wrong values: a factor converts to its internal
# level codes, not to the numbers shown as its labels
[1] 1 2 3 1 1 4
as.character(df$var1) # works
[1] "" "1" "2" "" "" "4"
as.numeric(as.character(df$var1)) # works
[1] NA 1 2 NA NA 4
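To see where those codes come from, it can help to look at the factor's
levels. The following is only a minimal sketch: the factor is rebuilt by hand
so the block runs on its own, and the names f, via_chr and via_levels are
illustrative, not part of the original data.

```r
# Rebuild the factor column from the example above
f <- factor(c("", 1, 2, "", "", 4))

levels(f)      # the distinct labels, sorted; "" comes first
as.integer(f)  # internal codes: each value's position in levels(f)

# Two ways to recover the numbers that the labels represent
via_chr    <- as.numeric(as.character(f))
via_levels <- as.numeric(levels(f))[f]

# Both routes should agree element by element
identical(via_chr, via_levels)
```

The second form converts each distinct level once and then indexes by the
factor, which may be preferable for long vectors with few levels.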
Starting with factors, and resulting in either character() or numeric():
dfm<-vector("character", 6) # pre-allocate
index1<-df$var1!="" # rows where var1 holds a value
index2<-df$var2!="" # rows where var2 holds a value
dfm[index1]<-as.character(df$var1[index1])
dfm[index2]<-as.character(df$var2[index2])
dfm<-as.numeric(dfm) # skip this line to keep a character() result
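The same merge can probably be written without pre-allocation by using
ifelse(). This is a sketch under the assumption that the data frame is built
with factor columns as above; v1, v2 and var3 are names made up for the
example.

```r
# Same example data, with the columns stored as factors
df <- data.frame(x = 1:6,
                 var1 = c("", 1, 2, "", "", 4),
                 var2 = c(2, "", "", 3, 4, ""),
                 stringsAsFactors = TRUE)

# Convert the factors to their labels first, never straight to numeric
v1 <- as.character(df$var1)
v2 <- as.character(df$var2)

# Take var1 where it is non-empty, otherwise fall back to var2
var3 <- as.numeric(ifelse(v1 != "", v1, v2))
var3
```

One difference to be aware of: if a row had values in both columns, this
version keeps var1, whereas the index-based assignment lets var2 overwrite
var1. In the example data only one column is filled per row, so both give the
same result.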
Starting with character data (stringsAsFactors=FALSE), coercion is more
direct:
df<-data.frame(x=c(1:6), var1=c("",1,2,"","",4),var2=c(2,"","",3,4,""),
stringsAsFactors=FALSE)
as.numeric(df$var1) # works
[1] NA 1 2 NA NA 4
If the data are already numeric, the empty fields are NA rather than "", so
test with is.na() instead:
df<-data.frame(x=c(1:6), var1=c(NA,1,2,NA,NA,4),var2=c(2,NA,NA,3,4,NA))
dfm<-vector("numeric",6)
index1<-!is.na(df$var1)
index2<-!is.na(df$var2)
dfm[index1]<-df$var1[index1]
dfm[index2]<-df$var2[index2]
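For the numeric case, a shorter route to the X / VAR3 layout asked for might
look like the sketch below. It assumes each row has a value in at most one of
the two columns; var3 and result are names chosen for illustration.

```r
# Numeric columns with NA marking the empty cells
df <- data.frame(x = 1:6,
                 var1 = c(NA, 1, 2, NA, NA, 4),
                 var2 = c(2, NA, NA, 3, 4, NA))

# Use var2 wherever var1 is missing, otherwise keep var1
var3 <- ifelse(is.na(df$var1), df$var2, df$var1)

# Assemble the two-column result
result <- data.frame(X = df$x, VAR3 = var3)
result
```

A row that is NA in both columns would stay NA in VAR3.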
Sincerely,
KeithC.
-----Original Message-----
From: n.vialma at libero.it [mailto:n.vialma at libero.it]
Sent: Wednesday, May 05, 2010 3:47 AM
To: r-help at r-project.org
Subject: [R] concatenate values of two columns
Dear list,
I'm trying to concatenate the values of two columns, but I'm not able to do
it. I have a data frame with the following two columns:
X   VAR1   VAR2
1             2
2      1
3      2
4             3
5             4
6      4
What I would like to obtain is:
X VAR3
1 2
2 1
3 2
4 3
5 4
6 4
I tried paste, but what I obtain is:
X VAR3
1 NA2
2 1NA
3 2NA
4 NA3
5 NA4
6 4NA
Thanks a lot!!