[R] Calculation of group summaries

Frank E Harrell Jr f.harrell at vanderbilt.edu
Tue Jul 12 21:57:06 CEST 2005


See http://biostat.mc.vanderbilt.edu/twiki/bin/view/Main/SasByMeansExample
for one example.

Frank
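
For readers without access to that page, here is a minimal base-R sketch
of the grouped summary asked about below. It is not taken from the linked
example. The data frame `morph` and its values are invented stand-ins for
channelMorphology.csv, and the output names simply mirror the aliases in
the quoted SQL. aggregate() is one of several ways to do this.

```r
# Hypothetical stand-in for channelMorphology.csv; column names follow the
# question, the values are invented for illustration only.
morph <- data.frame(
  YEAR     = c(2004, 2004, 2004, 2004, 2005, 2005),
  SITE_ID  = c("A", "A", "A", "B", "A", "A"),
  VISIT_NO = c(1, 1, 1, 1, 1, 1),
  TRANSDIR = c("LF", "RT", "CT", "LF", "LF", "RT"),
  UNDERCUT = c(0.10, 0.30, 9.99, 0.20, NA, 0.40)
)

# Keep only the left/right readings, as in the question's subset() call.
undercut <- subset(morph, TRANSDIR %in% c("LF", "RT"))

# One row per YEAR/SITE_ID/VISIT_NO combination. The formula interface
# drops rows with missing UNDERCUT first, so n counts non-missing values
# (a group with no non-missing values is omitted entirely).
by_visit <- aggregate(
  UNDERCUT ~ YEAR + SITE_ID + VISIT_NO,
  data = undercut,
  FUN  = function(x) c(mean = mean(x), n = length(x), sd = sd(x))
)

# aggregate() returns the three statistics as one matrix column;
# flatten it into ordinary columns and rename to match the SQL aliases.
by_visit <- do.call(data.frame, by_visit)
names(by_visit)[4:6] <- c("meanUndercut", "nUndercut", "stdUndercut")

print(by_visit)
```

Groups with a single reading get NA for the standard deviation, since
sd() needs at least two values.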


Seeliger.Curt at epamail.epa.gov wrote:
> I know R has a steep learning curve, but from where I stand the slope
> looks like a sheer cliff.  I'm pawing through the available docs and
> have come across examples which come close to what I want but are
> proving difficult for me to modify for my use.
> 
> Calculating simple group means is fairly straightforward:
>   data(PlantGrowth)
>   attach(PlantGrowth)
>   stack(colMeans(unstack(PlantGrowth)))
> 
> I'd like to do something slightly more complex, using a data frame and
> groups identified by unique combinations of three id variables.  There
> may be thousands of such combinations in the data.  This is easy in SQL:
> 
>   select year,
>          site_id,
>          visit_no,
>          mean(undercut) AS meanUndercut,
>          count(undercut) AS nUndercut,
>          std(undercut) AS stdUndercut
>   from channelMorphology
>   group by year, site_id, visit_no
>       ;
> 
> Reading a CSV written by SAS and selecting only records expected to have
> values is also straightforward in R, but getting those summary values
> for each site visit is currently beyond me:
> 
>   sub<-read.csv('c:/data/channelMorphology.csv'
>                ,header=TRUE
>                ,na.strings='.'
>                ,sep=','
>                ,strip.white=TRUE
>                )
> 
>   undercut<-subset(sub
>                   ,TRANSDIR %in% c('LF','RT')
>                   ,select=c('YEAR','SITE_ID','VISIT_NO','TRANSECT','TRANSDIR'
>                            ,'UNDERCUT'
>                            )
>                   ,drop=TRUE
>                   )
> 
> 
> Thanks all for your help.
> cur
> --
> Curt Seeliger, Data Ranger
> CSC, EPA/WED contractor
> 541/754-4638
> seeliger.curt at epa.gov
> 


-- 
Frank E Harrell Jr   Professor and Chair           School of Medicine
                      Department of Biostatistics   Vanderbilt University



