[R] combining output from several operations
Frank E Harrell Jr
fharrell at virginia.edu
Fri Aug 23 06:04:26 CEST 2002
On Thu, 22 Aug 2002 17:09:34 -0500
Tim Wilson <wilson at visi.com> wrote:
> Hi everyone,
>
> I wonder if there's a patient soul out there who has a minute to look at
> the following.
>
> I've got a set of summary statistics I need to perform many times.
> Naturally, I've looked at writing a function to automate the process as
> much as possible. (These are the data I mentioned recently in my
> question about weighted means.) I'm having trouble figuring out the
> proper syntax for taking the results of several different functions and
> combining them into a single function. I'm pasting an example below of the
> analysis I need to do for each column of a number of dataframes. This
> works perfectly, but repeating this procedure a couple hundred times
> doesn't thrill me.
>
> The only thing that isn't complete below is that I need the describe
> function (from the Hmisc library) to give me the standard deviation
> as well as the mean. Is it possible to do that without modifying the
> describe function directly?
>
> I'd be glad to hear any suggestions from the R gurus on the list.
>
> -Tim
>
> > lapply(split(faculty$Q8, list(faculty$TWOYROR4, faculty$FACULTY)),
> describe)
> $"2.1"
> X[[1]]
>  n missing unique  Mean
> 47       0      3 3.362
>
> 3 (38, 81%), 4 (1, 2%), 5 (8, 17%)
>
> $"4.1"
> X[[2]]
>   n missing unique  Mean
> 147       0      5 1.837
>
>           0  1  2  3 4
> Frequency 1 59 57 23 7
> %         1 40 39 16 5
>
> $"2.2"
> X[[3]]
> n missing unique Mean
> 2       0      1    3
>
> $"4.2"
> X[[4]]
>  n missing unique Mean
> 25       0      5  1.8
>
>           0  1  2  3 4
> Frequency 2  8  9  5 1
> %         8 32 36 20 4
>
> > a <- aggregate(faculty$Q8, list(CETP=faculty$CETP), mean)
>
> NOTE: I'm using the aggregate function to weight the means so that each
> CETP contributes equally to an overall mean and standard deviation. I
> need to use this procedure on each of the four results of lapply above.
> I can't figure that out at all.
>
> > a
>                   CETP        x
> 1                ACEPT 2.521739
> 2               LaCEPT 1.666667
> 3               MASTEP 2.442308
> 4               MMSTEC 1.900000
> 5               NYCETP 1.875000
> 6                 PETE 1.600000
> 7              STEMTEC 2.428571
> 8  Temple/Philadelphia 2.750000
> 9               TxCETP 2.218182
> 10               VCEPT 2.222222
> > mean(a$x)
> [1] 2.162469
> > a <- aggregate(faculty$Q8, list(CETP=faculty$CETP), sd)
> > mean(a$x)
> [1] 1.041506
> >
>
> --
> Tim Wilson | Visit Sibley online: | Check out:
> Henry Sibley HS | http://www.isd197.org | http://www.zope.com
> W. St. Paul, MN | | http://slashdot.org
> wilson at visi.com | <dtml-var pithy_quote> | http://linux.com
Tim - describe takes a weights= argument, but you're right: it does not compute the SD (because of my bias against the SD as a descriptive statistic, especially for skewed data).
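
If the SD is still wanted, one option that leaves describe untouched is a small wrapper applied to each piece of the split. The following is a minimal sketch in base R, not code from this thread: the toy data frame only stands in for faculty, and equal.weight.summary is a hypothetical helper name. Within each TWOYROR4 x FACULTY group it averages the per-CETP means and SDs, mirroring the aggregate() steps in the question.

```r
# Toy data standing in for the `faculty` data frame (values are made up)
faculty <- data.frame(
  Q8       = c(3, 4, 5, 3, 1,   1, 2, 2, 3,   0, 1, 4, 2),
  TWOYROR4 = c(2, 2, 2, 2, 2,   4, 4, 4, 4,   4, 4, 4, 4),
  FACULTY  = c(1, 1, 1, 1, 1,   1, 1, 1, 1,   2, 2, 2, 2),
  CETP     = c("A", "A", "A", "B", "B",
               "A", "A", "B", "B",
               "A", "A", "B", "B")
)

# Hypothetical helper: mean and SD of x within each level of g,
# then averaged across levels so every level counts equally
equal.weight.summary <- function(x, g) {
  m <- tapply(x, g, mean, na.rm = TRUE)
  s <- tapply(x, g, sd, na.rm = TRUE)  # NA for a level with one observation
  c(n    = sum(!is.na(x)),
    mean = mean(m, na.rm = TRUE),
    sd   = mean(s, na.rm = TRUE))
}

# One row of results per TWOYROR4 x FACULTY combination that occurs
groups <- split(faculty, list(faculty$TWOYROR4, faculty$FACULTY), drop = TRUE)
result <- t(sapply(groups, function(d) equal.weight.summary(d$Q8, d$CETP)))
print(result)

# The same mean via weights: each row gets 1 / (size of its CETP level)
d  <- groups[["2.1"]]
w  <- 1 / ave(d$Q8, d$CETP, FUN = length)
wm <- weighted.mean(d$Q8, w)
```

The last three lines show that weighting each row by the reciprocal of its CETP group size reproduces the equal-weight mean. A vector like w is presumably what one would hand to the weights= argument of describe; that use is an assumption and is not checked here.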
Frank Harrell
--
Frank E Harrell Jr Prof. of Biostatistics & Statistics
Div. of Biostatistics & Epidem. Dept. of Health Evaluation Sciences
U. Virginia School of Medicine http://hesweb1.med.virginia.edu/biostat
-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject !) To: r-help-request at stat.math.ethz.ch
_._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._