[R] Odp: an unsophisticated question about recoding in a data frame with control structure if {}

Petr PIKAL petr.pikal at precheza.cz
Mon Oct 6 15:28:55 CEST 2008


Hi

Your question has two aspects. The first one is simple:

dataframe$thevector[dataframe$factor=='3'] <- (an arithmetic mean)

> zdrz
    doba otac sklon
1  189.0  0.6   110
2  256.0  0.6    80
3  286.0  0.6    50
4  105.0  1.2    50
> # zdrz$otac==0.6 is a logical vector, TRUE for the rows to recode
> zdrz$sklon[zdrz$otac==0.6]<-11

> zdrz
    doba otac sklon
1  189.0  0.6    11
2  256.0  0.6    11
3  286.0  0.6    11
4  105.0  1.2    50
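
The reason if () does not do this is that it looks at a single TRUE/FALSE value, while the comparison gives one value per row. A minimal sketch with a made-up data frame (df, grp and val are illustrative names, not from your data), recoding by logical indexing and, equivalently, with ifelse():

```r
# Hypothetical toy data; 'df', 'grp' and 'val' are illustrative names only
df <- data.frame(grp = factor(c("1", "3", "3", "2")),
                 val = c(10, 20, 30, 40))

# The comparison is evaluated for every row at once
cond <- df$grp == "3"

# ifelse() is the vectorised counterpart of if (): one result per row
val2 <- ifelse(cond, 99, df$val)

# Logical indexing recodes only the rows where cond is TRUE
df$val[cond] <- 99

df$val   # 10 99 99 40, the same values as val2
```

Rows where the condition is FALSE keep their old value in both versions.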

The second part is not so simple and depends on how you want to compute 
the mean. As I understand it, you want to sum 9 differences, but some of 
them can be NA. Then you need to decide what you want to do when some of 
the results are NA:

> sum(c(1:8,NA), na.rm=TRUE)/9
[1] 4
> sum(c(1:8,NA), na.rm=TRUE)/8
[1] 4.5

Which one is correct from your point of view?
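
The same choice can be made row by row for a whole set of difference columns. A small sketch, assuming the differences are collected in a numeric matrix (here a made-up d with three columns standing in for your nine):

```r
# Hypothetical differences for three respondents; 'd' is an illustrative name
d <- cbind(d1 = c(1, 2, NA),
           d2 = c(3, NA, NA),
           d3 = c(5, 6, 9))

# Divide by the full number of items: a missing difference counts as zero
by_all <- rowSums(d, na.rm = TRUE) / ncol(d)

# Divide by the number of non-missing items in each row
by_present <- rowMeans(d, na.rm = TRUE)

# Give NA whenever any difference in the row is missing
strict <- rowMeans(d)
```

For the second respondent these give 8/3, 4 and NA, so the decision matters.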

I would split the data frame according to the condition:

nes.split<-split(nes2004, nes2004$Rrace2 == "1")

Then I would compute the desired mean and make a new column. For 
example, with the iris data:

> iris.spl<-split(iris, iris$Species)
> # colMeans on the four numeric columns; mean() on a whole data frame
> # no longer works in current R
> mmm<-sapply(iris.spl, function(d) colMeans(d[, 1:4]))

Here you can use your own function instead of colMeans.

> mmm
             setosa versicolor virginica
Sepal.Length  5.006      5.936     6.588
Sepal.Width   3.428      2.770     2.974
Petal.Length  1.462      4.260     5.552
Petal.Width   0.246      1.326     2.026
> iris$means<-mmm[1,][match(iris$Species,colnames(mmm))]
> head(iris, 3)
  Sepal.Length Sepal.Width Petal.Length Petal.Width Species means
1          5.1         3.5          1.4         0.2  setosa 5.006
2          4.9         3.0          1.4         0.2  setosa 5.006
3          4.7         3.2          1.3         0.2  setosa 5.006
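
If only one column is needed, the split, sapply and match steps can probably be shortened with ave(). A sketch on a copy of iris (ir is just an illustrative name):

```r
# Work on a copy so the built-in iris data stay untouched
ir <- iris

# ave() returns a vector as long as its input, holding each row's group mean
ir$means <- ave(ir$Sepal.Length, ir$Species, FUN = mean)

head(ir, 3)
```

FUN can be replaced by your own function, for example function(v) mean(v, na.rm = TRUE) if the column contains NA.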


Regards
Petr
 

r-help-bounces at r-project.org wrote on 01.10.2008 11:05:17:

> Hello all, 
> 
> I apologize for a terribly simple question.  I'm used to using Stata and am
> trying to `switch' over to R. 
> 
> I would like to recode a vector in a data frame when the value of it meets the
> following condition:  if (dataframe$factor=='3'){dataframe$thevector<-(an 
> arithmetic mean).  What I would like to result is the creation of a new 
> variable within the data frame for which all observations that =='3' get the 
> mean.  Yet, I receive an error message that R is only reading the first 
> element of the factor: 
> 
> Warning message:
> In if (nes2004$Rrace2 == "1") { :
>   the condition has length > 1 and only the first element will be used
> 
> As the command in question actually appears: 
> if (nes2004$Rrace2=='3') {
> nes2004$Hethno<-(((nes2004$Hint-nes2004$Wint)+(nes2004$Hint-nes2004$Bint)+
> (nes2004$Hint-nes2004$Aint)+(nes2004$Hwork-nes2004$Wwork)+(nes2004$Hwork-
> nes2004$Bwork)+(nes2004$Hwork-nes2004$Awork)+(nes2004$Htrus-nes2004$Wtrus)+
> (nes2004$Htrus-nes2004$Btrus)+(nes2004$Htrus-nes2004$Atrus))/9) }
> 
> The other problem here too, is that each of these variables contain varying 
> counts of NA observations. 
> 
> I've tried to work around this by using ifelse, and other commands, but I can 
> not figure out where to begin re-thinking how I do this.  Any guidance would 
> be appreciated. 
> 
> Thank you. 
> Whitt 
> 
> 
> 
> ****************************************
> H. Whitt Kilburn, Ph.D.
> Assistant Professor
> Grand Valley State University
> Political Science Department
> 1124 AuSable Hall
> 1 Campus Drive
> Allendale, MI 49401-9403
> Phone:(616) 331-8831
> Fax:(616) 331-3550
> http://faculty.gvsu.edu/kilburnw
> 


