[R] a question about "by" and "ddply"
David Winsemius
dwinsemius at comcast.net
Wed May 30 06:58:36 CEST 2012
On May 29, 2012, at 6:32 PM, jacaranda tree wrote:
> Hi all,
> I have a data set (df, n=10 for the sake of simplicity here) where I
> have two continuous variables (age and weight) and I also have a
> grouping variable (group, with two levels). I want to run
> correlations for each group separately (kind of similar to "split
> file" in SPSS). I've been experimenting with different functions,
> and I was able to do this correctly using ddply function, but output
> is a little bit difficult to read when I do the cor.test to get all
> the data with p values, df, and pearson r (see below). I also tried
> to do it with by function. Although, with by, it shows the data for
> two groups separately, it seems like it calculates the same r for
> both groups. Here is my code for both ddply and by, and the output
> as well. I was wondering if there is a way to display the output
> better with ddply or run the correlations correctly for each group
> using by.
> Thanks in advance,
>
I would have imagined something along the lines of
lapply( split(df, df$group), function(x) cor.test(x[["age"]],
x[["weight"]]) )
... but without an example it's only a guess.
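
A minimal runnable sketch of that split/lapply idea follows. The data frame
here is made up purely for illustration (the original poster's data were not
included), and the column names r, t, df and p.value in the summary are
arbitrary choices, not anything from the original post.

```r
# Hypothetical data: 10 rows, two groups of 5 (NOT the poster's data)
df <- data.frame(
  group  = rep(c(1, 2), each = 5),
  age    = c(21, 25, 30, 34, 40, 22, 27, 31, 36, 45),
  weight = c(60, 63, 70, 68, 80, 55, 58, 66, 64, 77)
)

# split() gives one data frame per level of group;
# lapply() then runs cor.test() on each piece separately
tests <- lapply(split(df, df$group),
                function(x) cor.test(x[["age"]], x[["weight"]]))

# Pull the pieces of each "htest" result into one row per group
res <- do.call(rbind, lapply(tests, function(ct) {
  data.frame(r       = unname(ct$estimate),   # Pearson r
             t       = unname(ct$statistic),  # t statistic
             df      = unname(ct$parameter),  # degrees of freedom
             p.value = ct$p.value)
}))
res
```

The row names of res are the group levels, so each line of the printed
result can be read off as one group's correlation.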
--
David
> 1. with "ddply"
> r<-ddply(df, .(group), summarise, "corr" = cor.test(age, weight,
> method = "pearson"))
>
> Output:
>    Group  corr
> 1  1      Inf
> 2  1      3
> 3  1      0
> 4  1      1
> 5  1      0
> 6  1      two.sided
> 7  1      Pearson's product-moment correlation
> 8  1      age and weight
> 9  1      1, 1
> 10 2      9.722211
> 11 2      3
> 12 2      0.002311412
> 13 2      0.9844986
> 14 2      0
> 15 2      two.sided
> 16 2      Pearson's product-moment correlation
> 17 2      age and weight
> 18 2      0.7779640, 0.9990233
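
The nine rows per group in the output above are the nine components of the
"htest" list that cor.test() returns, stacked in order: statistic, parameter
(df), p.value, estimate (r), null.value, alternative, method, data.name and
conf.int. One way to get a more readable table, sketched here under the
assumption that plyr is installed and using made-up data, is to have the
function passed to ddply return a one-row data frame holding only the pieces
of interest. The output column names are arbitrary.

```r
library(plyr)  # assumes the plyr package is installed

# Hypothetical data: 10 rows, two groups of 5 (NOT the poster's data)
df <- data.frame(
  group  = rep(c(1, 2), each = 5),
  age    = c(21, 25, 30, 34, 40, 22, 27, 31, 36, 45),
  weight = c(60, 63, 70, 68, 80, 55, 58, 66, 64, 77)
)

# Return one row per group instead of the whole htest list
r <- ddply(df, .(group), function(d) {
  ct <- cor.test(d$age, d$weight, method = "pearson")
  data.frame(r        = unname(ct$estimate),   # Pearson r
             t        = unname(ct$statistic),  # t statistic
             df       = unname(ct$parameter),  # degrees of freedom
             p.value  = ct$p.value,
             ci.lower = ct$conf.int[1],        # 95% CI bounds
             ci.upper = ct$conf.int[2])
})
r
```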
>
> 2. with "by"
> r <- by(df, group, FUN = function(x) cor.test(age, weight, method =
> "pearson"))
>
> Output:
> Group: 1
>
> Pearson's product-moment correlation
>
> data: age and weight
> t = 6.4475, df = 8, p-value = 0.0001988
> alternative hypothesis: true correlation is not equal to 0
> 95 percent confidence interval:
> 0.6757758 0.9802100
> sample estimates:
> cor
> 0.9157592
>
> ------------------------------------------------------------
> Group: 2
>
> Pearson's product-moment correlation
>
> data: age and weight
> t = 6.4475, df = 8, p-value = 0.0001988
> alternative hypothesis: true correlation is not equal to 0
> 95 percent confidence interval:
> 0.6757758 0.9802100
> sample estimates:
> cor
> 0.9157592
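
Both groups above report df = 8, which is what a test on all 10 rows gives,
so by() is not using the subsets. The anonymous function receives each
group's rows as x but never refers to x; age and weight are instead looked
up outside the function (for example in an attached data frame or the
workspace), so every call sees the full vectors. A small sketch of the
likely fix, again on made-up data, is to index the columns through x:

```r
# Hypothetical data: 10 rows, two groups of 5 (NOT the poster's data)
df <- data.frame(
  group  = rep(c(1, 2), each = 5),
  age    = c(21, 25, 30, 34, 40, 22, 27, 31, 36, 45),
  weight = c(60, 63, 70, 68, 80, 55, 58, 66, 64, 77)
)

# x is the subset of df for one group; use x$age and x$weight,
# not bare age and weight
r <- by(df, df$group,
        function(x) cor.test(x$age, x$weight, method = "pearson"))
r

# Individual results can be extracted by group label
r[["1"]]$estimate
r[["2"]]$estimate
```

With five rows per group, each test should now report df = 3 rather than 8.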