[R] Re: an unsophisticated question about recoding in a data frame with control structure if {}
Petr PIKAL
petr.pikal at precheza.cz
Mon Oct 6 15:28:55 CEST 2008
Hi,
your question has two aspects. The first one is simple: assign through a
logical index instead of using if.
dataframe$thevector[dataframe$factor == '3'] <- an.arithmetic.mean
> zdrz
   doba otac sklon
1 189.0  0.6   110
2 256.0  0.6    80
3 286.0  0.6    50
4 105.0  1.2    50
> zdrz$otac==0.6
> zdrz$sklon[zdrz$otac==0.6]<-11
> zdrz
   doba otac sklon
1 189.0  0.6    11
2 256.0  0.6    11
3 286.0  0.6    11
4 105.0  1.2    50
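
The same idea can be sketched on a made-up data frame (the names df, group and score are hypothetical, not from the original data). if() expects a single TRUE or FALSE, which is why it complains when given a whole column; a logical index or ifelse() works element by element:

```r
# Hypothetical toy data; 'group' plays the role of dataframe$factor
df <- data.frame(group = c("1", "3", "3", "2"),
                 score = c(10, 20, 30, 40))

# Logical vector with one TRUE/FALSE per row
sel <- df$group == "3"

# Start with NA everywhere, then fill only the selected rows
df$recoded <- NA_real_
df$recoded[sel] <- mean(df$score[sel])

# The same result with ifelse(), which is vectorized (unlike if)
df$recoded2 <- ifelse(sel, mean(df$score[sel]), NA_real_)
```

Rows that do not meet the condition stay NA in both new columns; replace NA_real_ with another value if a different default is wanted.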
The second part is not so simple and depends on how you want to compute the
mean. As I understand it, you want to sum 9 differences, but some of them can
be NA. You then need to decide what should happen when some of the results
are NA:
> sum(c(1:8,NA), na.rm=TRUE)/9
[1] 4
> sum(c(1:8,NA), na.rm=TRUE)/8
[1] 4.5
Which one is correct from your point of view?
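
For a whole table of differences the two choices can be written row-wise. This is only a sketch on a hypothetical matrix d (one row per respondent, one column per difference); rowSums() and rowMeans() are base R:

```r
# Hypothetical differences: 3 respondents, 3 items, some values missing
d <- cbind(d1 = c(1, 2, NA),
           d2 = c(3, NA, NA),
           d3 = c(5, 6, NA))

# Fixed denominator: a missing difference counts as 0
fixed <- rowSums(d, na.rm = TRUE) / ncol(d)

# Available-case mean: divide by the number of non-missing differences
# (a row with nothing observed gives NaN)
avail <- rowMeans(d, na.rm = TRUE)

# Strict version: any missing difference makes the whole row NA
strict <- rowMeans(d)
```

Which of the three is appropriate is a substantive decision about the data, not an R question.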
I would split the data frame according to the condition
nes.split<-split(nes2004, nes2004$Rrace2 == "1")
then I would compute the desired mean and make a new column. For example,
with the iris data:
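iris.spl<-split(iris, iris$Species)
# mean() no longer works on a whole data frame in current R,
# so take the column means of the numeric columns in each group
mmm<-sapply(iris.spl, function(d) colMeans(d[sapply(d, is.numeric)]))
here you can use your own function instead of colMeans
mmm
             setosa versicolor virginica
Sepal.Length  5.006      5.936     6.588
Sepal.Width   3.428      2.770     2.974
Petal.Length  1.462      4.260     5.552
Petal.Width   0.246      1.326     2.026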
iris$means<-mmm[1,][match(iris$Species,colnames(mmm))]
head(iris)
  Sepal.Length Sepal.Width Petal.Length Petal.Width Species means
1          5.1         3.5          1.4         0.2  setosa 5.006
2          4.9         3.0          1.4         0.2  setosa 5.006
3          4.7         3.2          1.3         0.2  setosa 5.006
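
Put together for the original question, one possible sketch looks like this. The three-row data frame nes is a hypothetical stand-in that only borrows the column names from the post (the real nes2004 is not shown), and the available-case mean via rowMeans(..., na.rm = TRUE) is just one of the NA choices discussed above:

```r
# Hypothetical three-row stand-in for nes2004 (the real data are not shown)
nes <- data.frame(
  Rrace2 = c("1", "3", "3"),
  Hint  = c(5, 6, 7), Wint  = c(4, 4, NA), Bint  = c(3, 5, 5),  Aint  = c(4, 6, 6),
  Hwork = c(5, 5, 6), Wwork = c(5, 4, 5),  Bwork = c(4, 3, NA), Awork = c(5, 5, 6),
  Htrus = c(4, 6, 5), Wtrus = c(4, 5, 4),  Btrus = c(3, 4, 4),  Atrus = c(4, 6, 5)
)

# One column per difference, one row per respondent
diffs <- with(nes, cbind(
  Hint - Wint,   Hint - Bint,   Hint - Aint,
  Hwork - Wwork, Hwork - Bwork, Hwork - Awork,
  Htrus - Wtrus, Htrus - Btrus, Htrus - Atrus
))

# which() drops rows where Rrace2 itself is NA
sel <- which(nes$Rrace2 == "3")

# New column: NA by default, mean of the non-missing differences where selected
nes$Hethno <- NA_real_
nes$Hethno[sel] <- rowMeans(diffs[sel, , drop = FALSE], na.rm = TRUE)
```

To divide by 9 regardless of missing values, replace the last right-hand side with rowSums(diffs[sel, , drop = FALSE], na.rm = TRUE) / 9.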
Regards
Petr
r-help-bounces at r-project.org wrote on 01.10.2008 11:05:17:
> Hello all,
>
> I apologize for a terribly simple question. I'm used to using Stata and am
> trying to `switch' over to R.
>
> I would like to recode a vector in a data frame when the value of it meets the
> following condition: if (dataframe$factor=='3'){dataframe$thevector<-(an
> arithmetic mean). What I would like to result is the creation of a new
> variable within the data frame for which all observations that =='3' get the
> mean. Yet, I receive an error message that R is only reading the first
> element of the factor:
>
> Warning message:
> In if (nes2004$Rrace2 == "1") { :
> the condition has length > 1 and only the first element will be used
>
> As the command in question actually appears:
> if (nes2004$Rrace2=='3') {
> nes2004$Hethno<-(((nes2004$Hint-nes2004$Wint)+(nes2004$Hint-nes2004$Bint)+
> (nes2004$Hint-nes2004$Aint)+(nes2004$Hwork-nes2004$Wwork)+(nes2004$Hwork-
> nes2004$Bwork)+(nes2004$Hwork-nes2004$Awork)+(nes2004$Htrus-nes2004$Wtrus)+
> (nes2004$Htrus-nes2004$Btrus)+(nes2004$Htrus-nes2004$Atrus))/9) }
>
> The other problem here too, is that each of these variables contain varying
> counts of NA observations.
>
> I've tried to work around this by using ifelse, and other commands, but I can
> not figure out where to begin re-thinking how I do this. Any guidance would
> be appreciated.
>
> Thank you.
> Whitt
>
>
>
> ****************************************
> H. Whitt Kilburn, Ph.D.
> Assistant Professor
> Grand Valley State University
> Political Science Department
> 1124 AuSable Hall
> 1 Campus Drive
> Allendale, MI 49401-9403
> Phone:(616) 331-8831
> Fax:(616) 331-3550
> http://faculty.gvsu.edu/kilburnw
>