[R] Calculation of group summaries
Francisco J. Zagmutt
gerifalte28 at hotmail.com
Tue Jul 12 20:34:39 CEST 2005
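Take a look at ?aggregate, ?ave and ?tapply. aggregate() returns a data
frame with one row per group, tapply() returns an array of per-group
values, and ave() returns the group summary repeated for every row.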
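
As a rough sketch of how the three differ, here they are on the PlantGrowth data from the question below. The object names (grp_mean, grp_df, row_mean) are only illustrative:

```r
# PlantGrowth ships with R: 30 weights in three groups (ctrl, trt1, trt2).
data(PlantGrowth)

# tapply: one value per group, returned as a named array
grp_mean <- tapply(PlantGrowth$weight, PlantGrowth$group, mean)

# aggregate: the same means, returned as a data frame (one row per group)
grp_df <- aggregate(weight ~ group, data = PlantGrowth, FUN = mean)

# ave: the group mean repeated for every row, same length as the input
# (ave() uses mean when no FUN is given)
row_mean <- ave(PlantGrowth$weight, PlantGrowth$group)

print(grp_mean)
print(grp_df)
```

tapply() or aggregate() is probably what you want for a summary table; ave() is handy when the group value has to sit next to each original record.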
Cheers
Francisco
>From: Seeliger.Curt at epamail.epa.gov
>To: R-Help <r-help at stat.math.ethz.ch>
>Subject: [R] Calculation of group summaries
>Date: Tue, 12 Jul 2005 10:51:03 -0700
>
>I know R has a steep learning curve, but from where I stand the slope
>looks like a sheer cliff. I'm pawing through the available docs and
>have come across examples which come close to what I want but are
>proving difficult for me to modify for my use.
>
>Calculating simple group means is fairly straightforward:
> data(PlantGrowth)
> attach(PlantGrowth)
> stack(sapply(unstack(PlantGrowth), mean))
>
>I'd like to do something slightly more complex, using a data frame and
>groups identified by unique combinations of three id variables. There
>may be thousands of such combinations in the data. This is easy in SQL:
>
> select year,
>        site_id,
>        visit_no,
>        mean(undercut) AS meanUndercut,
>        count(undercut) AS nUndercut,
>        std(undercut) AS stdUndercut
> from channelMorphology
> group by year, site_id, visit_no
> ;
>
>Reading a CSV written by SAS and selecting only records expected to have
>values is also straightforward in R, but getting those summary values
>for each site visit is currently beyond me:
>
> sub<-read.csv('c:/data/channelMorphology.csv'
> ,header=TRUE
> ,na.strings='.'
> ,sep=','
> ,strip.white=TRUE
> )
>
> undercut<-subset(sub
>                 ,TRANSDIR %in% c('LF','RT')
>                 ,select=c('YEAR','SITE_ID','VISIT_NO','TRANSECT','TRANSDIR'
>                          ,'UNDERCUT'
>                          )
>                 ,drop=TRUE
>                 )
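
One possible way to get the equivalent of the SQL GROUP BY above, sketched with aggregate() and merge(). The data frame here is a made-up stand-in for your subsetted data (only the column names come from your post; the values are invented), and the result names simply mirror the SQL aliases:

```r
# Toy stand-in for the subsetted data; the values are invented for illustration.
undercut <- data.frame(
  YEAR     = c(2004, 2004, 2004, 2004, 2005, 2005),
  SITE_ID  = c("A", "A", "A", "B", "A", "A"),
  VISIT_NO = c(1, 1, 1, 1, 1, 1),
  UNDERCUT = c(0.10, 0.30, NA, 0.20, 0.40, 0.60)
)

# Grouping variables: one summary row per unique combination
id   <- c("YEAR", "SITE_ID", "VISIT_NO")
keys <- undercut[id]

# Compute each statistic in its own aggregate() call, then merge the
# results on the id variables. Missing values are skipped, like SQL NULLs.
m <- aggregate(undercut["UNDERCUT"], by = keys, FUN = mean, na.rm = TRUE)
n <- aggregate(undercut["UNDERCUT"], by = keys,
               FUN = function(x) sum(!is.na(x)))
s <- aggregate(undercut["UNDERCUT"], by = keys, FUN = sd, na.rm = TRUE)

names(m)[4] <- "meanUndercut"
names(n)[4] <- "nUndercut"
names(s)[4] <- "stdUndercut"

summ <- merge(merge(m, n, by = id), s, by = id)
print(summ)
```

Groups with a single non-missing value get NA for the standard deviation, since sd() needs at least two observations.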
>
>
>Thanks all for your help.
>cur
>--
>Curt Seeliger, Data Ranger
>CSC, EPA/WED contractor
>541/754-4638
>seeliger.curt at epa.gov
>
>______________________________________________
>R-help at stat.math.ethz.ch mailing list
>https://stat.ethz.ch/mailman/listinfo/r-help
>PLEASE do read the posting guide!
>http://www.R-project.org/posting-guide.html