[R] problem with FUN in Hmisc::summarize

Frank E Harrell Jr f.harrell at Vanderbilt.Edu
Fri Apr 16 21:04:33 CEST 2010


arnaud chozo wrote:
> Hi all,
> 
> I'd like to use the Hmisc::summarize function, but it uses a function (FUN)
> of a single vector argument to create the statistical summaries.
> 
> Consider an easy case: I'd like to compute the correlation between two
> variables in my dataframe, grouped according to other variables in the same
> dataframe.
> 
> For example, consider the following dataframe D:
> V1  V2   V3
> A     1    -1
> A     1     1
> A    -1    -1
> B     1     1
> B     1     1
> 
> I'd like to use Hmisc::summarize(X=D, by=llist(myvar=D$V1), FUN=corr.V2.V3)
> 
> where corr.V2.V3 is defined as follows:
> 
> corr.V2.V3 = function(x) {
>   # correlation between columns V2 and V3 of the data passed in
>   out = cor(x$V2, x$V3)
>   names(out) = "CORR"
>   return(out)
> }
> 
> I was not able to use Hmisc::summarize in this case because FUN should be a
> function of a matrix argument. Any idea?
> 
> Thanks in advance,
> Arnaud

See the Hmisc mApply or summary.formula functions. Alternatively, call 
tapply with a vector of possible subscripts (1:n) as its first argument; 
the function then receives the subscripts selected for each group and can 
use them to address multiple variables.
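
A minimal sketch of the tapply approach, using the data frame from the 
question and base R only. The name corr.by.group is made up for 
illustration, and suppressing the warning for constant groups is an 
assumption about what is wanted, not part of the original advice.

```r
# Data frame D from the question
D <- data.frame(V1 = c("A", "A", "A", "B", "B"),
                V2 = c(1, 1, -1, 1, 1),
                V3 = c(-1, 1, -1, 1, 1))

# Pass the row subscripts 1:n as the first argument to tapply.
# For each level of V1, the function receives the subscripts of that
# group's rows and can use them to index as many columns as needed.
corr.by.group <- tapply(seq_len(nrow(D)), D$V1, function(i) {
  # cor() gives NA with a warning when a variable is constant within
  # a group (as in group B here), so the warning is suppressed
  suppressWarnings(cor(D$V2[i], D$V3[i]))
})

print(corr.by.group)
```

With this data, group A gives a correlation of 0.5, and group B gives NA 
because V2 and V3 are both constant there.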

Frank

-- 
Frank E Harrell Jr   Professor and Chairman        School of Medicine
                      Department of Biostatistics   Vanderbilt University


