[R] apply mean function to a subset of data

Pedro Mardones mardones.p at gmail.com
Mon Apr 4 14:07:45 CEST 2016


Thanks, David.
It works perfectly.
Pedro

> On Apr 3, 2016, at 17:44, David L Carlson <dcarlson at tamu.edu> wrote:
> 
> Here are several ways to get there, but your original loop is fine once it is corrected to subset diam by group before taking the first nsel[i] values:
> 
>> for (i in 1:2)  smean[i] <- mean(toy$diam[toy$group==i][1:nsel[i]])
>> smean
> [1] 0.271489 1.117015
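
The difference between the original and the corrected indexing can be checked directly. A minimal sketch, assuming the toy layout from the thread (10 rows of group 1 followed by 8 rows of group 2); the seed and the names original_g2 and corrected_g2 are illustrative and not from the thread:

```r
set.seed(1)  # assumption: fixed seed only so the example is reproducible
toy <- data.frame(group = rep(1:2, times = c(10, 8)), diam = rnorm(18))
nsel <- c(6, 4)

# Original indexing: positions 1:4 of the whole column, which are group 1 rows
original_g2 <- mean(toy$diam[1:nsel[2]])

# Corrected indexing: restrict to group 2 first, then take positions 1:4
corrected_g2 <- mean(toy$diam[toy$group == 2][1:nsel[2]])

# The corrected value corresponds to rows 11 to 14 of the data frame
all.equal(corrected_g2, mean(toy$diam[11:14]))
# The original value silently reused rows 1 to 4
all.equal(original_g2, mean(toy$diam[1:4]))
```

Both comparisons return TRUE, which shows that the uncorrected loop never reaches the rows of group 2.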
> 
> Using sapply() to hide the loop:
>> smean <- sapply(1:2, function(x) mean((toy$diam[toy$group==x])[1:nsel[x]]))
>> smean
> [1] 0.271489 1.117015
> 
> Or use head():
>> smean <- sapply(1:2, function(x) mean(head(toy$diam[toy$group==x], nsel[x])))
>> smean
> [1] 0.271489 1.117015
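
The [1:n] and head() forms agree on the toy data, but they can differ when a group has fewer rows than requested. A small sketch with made-up values (x, n, by_index and by_head are illustrative names):

```r
x <- c(0.5, 1.5, 2.5)  # a group with only 3 values
n <- 5                 # more rows requested than exist

# Indexing past the end pads with NA, so the mean becomes NA
by_index <- mean(x[1:n])

# head() stops at the end of the vector, so the mean uses the 3 values
by_head <- mean(head(x, n))

by_index  # NA
by_head   # 1.5
```

Which behavior is preferable depends on whether a short group should be flagged (NA) or averaged over whatever rows are available.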
> 
> Or use mapply() instead of sapply():
>> smean <- mapply(function(x, y) mean(head(x, y)) , x=split(toy$diam, toy$group), y=nsel)
>> smean
>        1        2
> 0.271489 1.117015
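
One further option, not proposed in the thread, is to number the rows within each group and filter once, so the data are not subset separately for every group. A minimal sketch, assuming rows are taken in their existing order within each group and that nsel is ordered like the sorted group labels; the seed and the names idx and keep are illustrative:

```r
set.seed(42)  # assumption: fixed seed only so the example is reproducible
toy <- data.frame(group = rep(1:2, times = c(10, 8)), diam = rnorm(18))
nsel <- c(6, 4)  # rows to take from group 1 and from group 2

# Position of each row within its own group (1, 2, 3, ... per group)
idx <- ave(seq_along(toy$group), toy$group, FUN = seq_along)

# Keep a row if its within-group position is at most that group's nsel
keep <- idx <= nsel[match(toy$group, sort(unique(toy$group)))]

# Group means over the kept rows only
smean <- tapply(toy$diam[keep], toy$group[keep], mean)
smean
```

Because the row positions are computed per group, this sketch does not require the data frame to be sorted by group.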
> 
> ------------------------------
> David L. Carlson
> Department of Anthropology
> Texas A&M University
> 
> -----Original Message-----
> From: R-help [mailto:r-help-bounces at r-project.org] On Behalf Of Jim Lemon
> Sent: Saturday, April 2, 2016 6:14 PM
> To: Pedro Mardones <mardones.p at gmail.com>
> Cc: r-help mailing list <r-help at r-project.org>
> Subject: Re: [R] apply mean function to a subset of data
> 
> Hi Pedro,
> This may not be much of an improvement, but it was a challenge.
> 
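> # Interleave the counts: rows to keep, then rows to skip, for each group
> selvec<-as.vector(matrix(c(nsel,unlist(by(toy$diam,toy$group,length))-nsel),
> ncol=2,byrow=TRUE))
> # Expand to a row-level TRUE/FALSE mask (assumes rows are sorted by group)
> TFvec<-rep(c(TRUE,FALSE),length.out=length(selvec))
> toynsel<-rep(TFvec,selvec)
> # Group means over the kept rows only
> by(toy[toynsel,]$diam,toy[toynsel,]$group,mean)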
> 
> Jim
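
To see how the selection mask in Jim's code is built, the intermediate values can be printed for the sizes used in the thread (groups of 10 and 8 rows, nsel = c(6, 4)). A small sketch; gsize and mask are illustrative names, and the approach assumes the rows are sorted by group:

```r
nsel <- c(6, 4)
gsize <- c(10, 8)  # group sizes of the toy data, written out by hand here

# Keep/skip counts interleaved per group: keep 6, skip 4, keep 4, skip 4
selvec <- as.vector(matrix(c(nsel, gsize - nsel), ncol = 2, byrow = TRUE))
selvec  # 6 4 4 4

# Alternate TRUE/FALSE and repeat each value by its count
mask <- rep(rep(c(TRUE, FALSE), length.out = length(selvec)), selvec)
which(mask)  # 1 2 3 4 5 6 11 12 13 14
```

The mask has one entry per row (18 in total) and selects the first 6 rows of group 1 and the first 4 rows of group 2.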
> 
>> On 4/3/16, Pedro Mardones <mardones.p at gmail.com> wrote:
>> Dear all;
>> 
>> This must have a rather simple answer, but I haven't been able to figure it
>> out: I have a data frame with, say, 2 groups (group 1 & 2). I want to select
>> "n" rows from group 1 and calculate the mean, then select "m" rows from
>> group 2 and calculate the mean as well. So far I've been using a for loop
>> to do it, but on a large data set it is rather inefficient.
>> Any hint on how to vectorize this would be appreciated.
>> 
>> toy = data.frame(group = c(rep(1,10),rep(2,8)), diam =
>> c(rnorm(10),rnorm(8)))
>> nsel = c(6,4)
>> smean <- c(0,0)
>> for (i in 1:2)  smean[i] <- mean(toy$diam[1:nsel[i]])
>> 
>> Thanks
>> 
>> Pedro
>> 
>> ______________________________________________
>> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
> 


