[R] a question about "by" and "ddply"
R. Michael Weylandt
michael.weylandt at gmail.com
Wed May 30 07:04:37 CEST 2012
On Wed, May 30, 2012 at 12:58 AM, David Winsemius
<dwinsemius at comcast.net> wrote:
>
> On May 29, 2012, at 6:32 PM, jacaranda tree wrote:
>
>> Hi all,
>> I have a data set (df, n=10 for the sake of simplicity here) where I have
>> two continuous variables (age and weight) and I also have a grouping
>> variable (group, with two levels). I want to run correlations for each group
>> separately (kind of similar to "split file" in SPSS). I've been
>> experimenting with different functions, and I was able to do this correctly
>> using ddply function, but output is a little bit difficult to read when I do
>> the cor.test to get all the data with p values, df, and pearson r (see
>> below). I also tried to do it with by function. Although, with by, it shows
>> the data for two groups separately, it seems like it calculates the same r
>> for both groups. Here is my code for both ddply and by, and the output as
>> well. I was wondering if there is a way to display the output better with
>> ddply or run the correlations correctly for each group using by.
>> Thanks in advance,
>>
>
> I would have imagined something along the lines of
>
> lapply( split( df, df$group, function(x) cor.test(x[["age"]], x[["weight")]
> )
I'd imagine that should be

lapply( split( df, df$group), function(x) cor.test(x[["age"]], x[["weight"]]) )

with split() closed before the function argument, and the ")" and "]" after "weight" swapped back. (I've been hunting down missing parentheses all night, so excuse the pedantry.)

Repeating David's disclaimer: "... but without an example it's only a guess."
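
For concreteness, here is a minimal sketch of the split/lapply idea on made-up data. The values in df are invented for illustration and are not the poster's; only the names age, weight and group come from the original message. It also collects r, the degrees of freedom and the p-value into one row per group, which may be easier to read than the printed test objects:

```r
# Hypothetical data: 10 rows, two groups of 5 (values invented for illustration)
df <- data.frame(
  group  = rep(c(1, 2), each = 5),
  age    = c(21, 25, 30, 34, 40, 22, 27, 31, 36, 45),
  weight = c(60, 63, 70, 72, 80, 55, 66, 64, 75, 82)
)

# One cor.test() per group; x is the sub-data-frame for that group
tests <- lapply(split(df, df$group),
                function(x) cor.test(x[["age"]], x[["weight"]], method = "pearson"))

# Pull selected components of each "htest" result into one row per group
res <- do.call(rbind, lapply(names(tests), function(g) {
  ct <- tests[[g]]
  data.frame(group = g,
             r     = unname(ct$estimate),   # Pearson correlation
             dof   = unname(ct$parameter),  # degrees of freedom (n - 2)
             p     = ct$p.value)
}))
print(res)
```

The full test for one group is still available as, e.g., tests[["1"]].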
Best,
M
>
> ... but without an example it's only a guess.
>
> --
> David
>
>> 1. with "ddply"
>> r<-ddply(df, .(group), summarise, "corr" = cor.test(age, weight, method =
>> "pearson"))
>>
>> Output:
>> Group corr
>> 1 1 Inf
>> 2 1 3
>> 3 1 0
>> 4 1 1
>> 5 1 0
>> 6 1 two.sided
>> 7 1 Pearson's product-moment correlation
>> 8 1 age and weight
>> 9 1 1, 1
>> 10 2 9.722211
>> 11 2 3
>> 12 2 0.002311412
>> 13 2 0.9844986
>> 14 2 0
>> 15 2 two.sided
>> 16 2 Pearson's product-moment correlation
>> 17 2 age and weight
>> 18 2 0.7779640, 0.9990233
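
The long column seems to arise because cor.test() returns a list-like "htest" object with nine components (statistic, parameter, p.value, estimate, null.value, alternative, method, data.name, conf.int), and summarise spreads them down the rows, nine per group. One possible way to get a single readable row per group is to pick out components by name. This is only a sketch on invented data (the values are not the poster's), and it assumes the plyr package is installed:

```r
library(plyr)  # provides ddply() and .()

# Hypothetical data: 10 rows, two groups of 5 (values invented for illustration)
df <- data.frame(
  group  = rep(c(1, 2), each = 5),
  age    = c(21, 25, 30, 34, 40, 22, 27, 31, 36, 45),
  weight = c(60, 63, 70, 72, 80, 55, 66, 64, 75, 82)
)

# Run cor.test() once per group and return a one-row data frame each time;
# ddply() stacks the rows and adds the group column itself
res <- ddply(df, .(group), function(x) {
  ct <- cor.test(x$age, x$weight, method = "pearson")
  data.frame(r   = unname(ct$estimate),   # Pearson correlation
             dof = unname(ct$parameter),  # degrees of freedom (n - 2)
             p   = ct$p.value)
})
print(res)
```

The confidence limits could be added in the same way from ct$conf.int[1] and ct$conf.int[2].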
>>
>> 2. with "by"
>> r <- by(df, group, FUN = function(x) cor.test(age, weight, method =
>> "pearson"))
>>
>> Output:
>> Group: 1
>>
>> Pearson's product-moment correlation
>>
>> data: age and weight
>> t = 6.4475, df = 8, p-value = 0.0001988
>> alternative hypothesis: true correlation is not equal to 0
>> 95 percent confidence interval:
>> 0.6757758 0.9802100
>> sample estimates:
>> cor
>> 0.9157592
>>
>> ------------------------------------------------------------
>> Group: 2
>>
>> Pearson's product-moment correlation
>>
>> data: age and weight
>> t = 6.4475, df = 8, p-value = 0.0001988
>> alternative hypothesis: true correlation is not equal to 0
>> 95 percent confidence interval:
>> 0.6757758 0.9802100
>> sample estimates:
>> cor
>> 0.9157592
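
Both groups report df = 8, which corresponds to all 10 rows rather than one group's share. That suggests the function never used the subset x that by() handed to it: age and weight inside function(x) were presumably found outside x (for example in an attached or global copy of the data), so each group got the full-data correlation. A minimal sketch of the likely fix, again on invented data (the values are not the poster's), is to reach the columns through x:

```r
# Hypothetical data: 10 rows, two groups of 5 (values invented for illustration)
df <- data.frame(
  group  = rep(c(1, 2), each = 5),
  age    = c(21, 25, 30, 34, 40, 22, 27, 31, 36, 45),
  weight = c(60, 63, 70, 72, 80, 55, 66, 64, 75, 82)
)

# x is the per-group subset, so the columns must be taken from x
r_by <- by(df, df$group,
           function(x) cor.test(x$age, x$weight, method = "pearson"))
print(r_by)

# Just the correlation coefficient for each group
r_vals <- sapply(r_by, function(ct) unname(ct$estimate))
print(r_vals)
```

With the subset used, each test should report df = 3 for groups of five rows.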
>>
>> ______________________________________________
>> R-help at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.