[R] hairy indexing problem

Bill.Venables at cmis.csiro.au
Wed Jun 5 07:03:06 CEST 2002



>  -----Original Message-----
> From: 	Ian.Saunders at csiro.au [mailto:Ian.Saunders at csiro.au] 
> Sent:	Wednesday, June 05, 2002 2:32 PM
> To:	seniorr at aracnet.com; R-help at stat.math.ethz.ch
> Subject:	RE: [R] hairy indexing problem
> 
> There's probably a better way, but ...
> 
> 	apply(outer(subject,subject,FUN="=="),1,sum)
> 
> will give you a vector of the counts for each value of subject, so would be
> 	2 2 4 4 4 4 2 2 ...
> in your example.
	[WNV]  I think there could be a better way.  

	Make sure subject is a factor:

		subject <- as.factor(data$subject)

	and then your "replications class" factor is

		reps <- factor(table(subject)[subject])

	The next step could be 

		fooMeans <- tapply(data$foo, reps, mean) 
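
A minimal sketch of the steps above, run on only the eight rows shown in the original question (the real data frame is longer, so the numbers are illustrative). The name n_rows is introduced here for readability and is not part of the original reply; as.vector() is added only to strip the table attributes before building the factor.

```r
# The eight sample rows quoted in the question.
data <- data.frame(
  subject = c(2, 2, 3, 3, 3, 3, 5, 5),
  foo     = c(1.7, 2.3, 7.6, 7.1, 7.3, 7.4, 6.2, 3.4),
  bar     = c(3.2, 4.1, 2.3, 3.3, 2.3, 1.3, 6.1, 6.9)
)

subject <- as.factor(data$subject)

# table(subject) counts rows per subject, in level order. Indexing it by
# the factor uses the factor's integer codes, so each row receives the
# row count of its own subject.
n_rows <- as.vector(table(subject)[subject])
reps   <- factor(n_rows)

# Mean of foo within each "replications class".
fooMeans <- tapply(data$foo, reps, mean)
print(fooMeans)

# The outer() approach quoted above yields the same per-row counts.
same_counts <- all(apply(outer(subject, subject, FUN = "=="), 1, sum) == n_rows)
```

On these rows the class "2" mean pools subjects 2 and 5, and the class "4" mean covers subject 3 alone.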
>  
> You could add this as a column of the data frame and use gsummary to get
> the summary statistics.
	[WNV]  Yep, that too.  gsummary is part of the nlme package, which
has to be loaded first with library(nlme).
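
A rough sketch of the "add it as a column" idea, again using only the eight rows shown in the question. Note the substitution: base R's aggregate() stands in here for gsummary(), so that the example runs without nlme; it is not the gsummary call itself. The column name reps and the result name classMeans are assumptions made for illustration.

```r
# The eight sample rows quoted in the question.
data <- data.frame(
  subject = c(2, 2, 3, 3, 3, 3, 5, 5),
  foo     = c(1.7, 2.3, 7.6, 7.1, 7.3, 7.4, 6.2, 3.4),
  bar     = c(3.2, 4.1, 2.3, 3.3, 2.3, 1.3, 6.1, 6.9)
)

# Store the rows-per-subject class as a column of the data frame.
subject   <- as.factor(data$subject)
data$reps <- factor(as.vector(table(subject)[subject]))

# Mean of foo and bar within each class (aggregate() used in place of
# gsummary(); one output row per level of reps).
classMeans <- aggregate(cbind(foo, bar) ~ reps, data = data, FUN = mean)
print(classMeans)
```

Keeping the class in the data frame means any grouped summary function can be pointed at it later, which is the appeal of this variant.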

> Ian.
	[WNV]  Bill.

> > -----Original Message-----
> > From: Russell Senior [mailto:seniorr at aracnet.com]
> > Sent: Wednesday, 5 June 2002 10:31 AM
> > To: R-help at stat.math.ethz.ch
> > Subject: [R] hairy indexing problem
> > 
> > 
> > 
> > I've got a data frame that looks like this:
> > 
> >    subject   foo   bar
> >       2      1.7   3.2
> >       2      2.3   4.1
> >       3      7.6   2.3
> >       3      7.1   3.3
> >       3      7.3   2.3
> >       3      7.4   1.3
> >       5      6.2   6.1
> >       5      3.4   6.9
> >      ...
> > 
> > That is, I've got multiple rows per subject.  I need to compute
> > summaries within categories where the subject has the same number of
> > rows.  For example, subjects 2 and 5 both have two rows.  I need to
> > compute the mean of those four values of foo.  This looks like a good
> > candidate for index vectors, but I need some help.  I've tried
> > something like:
> > 
> >   table(data$subject) -> tmp
> >  
> > and:
> > 
> >   tmp[tmp == 2]
> > 
> > and even:
> > 
> >   as.numeric(attr(tmp[tmp == 2],"names"))
> > 
> > to get a vector of subject numbers that have two rows in the original
> > data frame.  But I am getting stuck there.  I want some kind of
> > "is.member" function to use in a subsequent index vector expression,
> > like:
> > 
> >   i <- as.numeric(attr(tmp[tmp == 2],"names"))
> >   data[is.member($subject,i)]$foo
> > 
> > but there isn't an is.member() function.  Can someone please give me a
> > pointer on the canonical way to do this?
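
For the membership test asked about here, R's built-in %in% operator does what the hypothetical is.member() would. A minimal sketch on the eight rows shown above, assuming tmp is meant to tabulate the subject column; foo2 is a name introduced only for this example.

```r
# The eight sample rows quoted in the question.
data <- data.frame(
  subject = c(2, 2, 3, 3, 3, 3, 5, 5),
  foo     = c(1.7, 2.3, 7.6, 7.1, 7.3, 7.4, 6.2, 3.4),
  bar     = c(3.2, 4.1, 2.3, 3.3, 2.3, 1.3, 6.1, 6.9)
)

# Rows per subject, then the subject numbers that have exactly two rows.
tmp <- table(data$subject)
i   <- as.numeric(names(tmp[tmp == 2]))

# %in% returns TRUE for each row whose subject appears in i.
foo2 <- data$foo[data$subject %in% i]
print(mean(foo2))
```

This handles one row count at a time; the factor-based approach earlier in the thread covers all counts in a single tapply() call.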
> > 
> > Thanks!
> > 
> > -- 
> > Russell Senior         ``The two chiefs turned to each other.        
> > seniorr at aracnet.com      Bellison uncorked a flood of horrible       
> >                          profanity, which, translated meant, `This is
> >                          extremely unusual.' ''                      
> > -.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
> > r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
> > Send "info", "help", or "[un]subscribe"
> > (in the "body", not the subject !)  To: r-help-request at stat.math.ethz.ch
> > _._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._


