# [R] correlation by factor

Rui Barradas
Wed Jul 28 07:51:52 CEST 2021

```Hello,

And here are three more ways. I will put the data, corrected in Bert's
post, in a data.frame.

R <- c(1,8,3,6,7,2,3,7,2,3,3,4,3,7,3)
Day <- c(1,1,1,1,1,2,2,2,2,2,3,3,3,3,3)
Freq <- paste0("a", rep(1:5,3))
df1 <- data.frame(R, Day, Freq)

# Base R; for the anonymous function shorthand \(x), see Bert's post
sapply(split(df1[-3], df1$Freq), \(x) cor(x)[1,2])

# tidyverse
library(dplyr)
df1 %>%
  group_by(Freq) %>%
  summarise(Cor = cor(R, Day))

# data.table
library(data.table)
setDT(df1)[, .(Cor = cor(R, Day)), by = Freq]
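
# A possible extension for the "hundreds of these" case in the original
# question: a minimal sketch, assuming the data sets sit in a named list of
# data frames that all have the columns R, Day and Freq. The names
# cor_by_freq, datasets, set1 and set2 are made up for illustration, and
# set2 is invented data, not something from the thread.
# The data frame is rebuilt here because setDT() above converted df1 to a
# data.table by reference.
cor_by_freq <- function(d) {
  # one correlation between R and Day per level of Freq
  sapply(split(d, d$Freq), function(x) cor(x$R, x$Day))
}

set1 <- data.frame(
  R    = c(1,8,3,6,7,2,3,7,2,3,3,4,3,7,3),
  Day  = c(1,1,1,1,1,2,2,2,2,2,3,3,3,3,3),
  Freq = paste0("a", rep(1:5, 3))
)
set2 <- transform(set1, R = rev(R))  # made-up second data set

datasets <- list(set1 = set1, set2 = set2)
res <- sapply(datasets, cor_by_freq)  # rows: Freq levels, columns: data sets
round(res, 4)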

Hope this helps,

At 03:30 on 28/07/21, Bert Gunter wrote:
> Well, first of all, your example is messed up. You missed the "c" in front
> of the ( in Freq <-; and all of the Freq entries need to be enclosed in
> quotes for proper syntax. A simpler way to do it is just to use paste0() and
> rep():
>
> Freq <- paste0("a", rep(1:5,3))
> (If you are not familiar with such "utility" functions, you should consider
> spending time with a basic R tutorial or two.)
>
> Ordinarily, your individual vectors, R, Day and Freq, would be in a data
> frame or similar (e.g. a tibble or data.table) structure and you would use
> functions like by() in base R; or "tidyverse" or "data.table" package
> equivalents/elaborations of these.
>
> Here is a base R version (you must have version 4.1.0 or later for the
> anonymous function shortcut, \(x)) using by(), but you may prefer tidyverse
> or data.table versions that others may provide:
>
> > ## to just get the off-diagonal of the 2x2 cor matrix
> > out <- by(cbind(R,Day), factor(Freq), FUN = \(x) cor(x)[1,2])
> > as.list(out)
> $a1
> [1] 1
>
> $a2
> [1] -0.7559289
>
> $a3
> [1] 0
>
> $a4
> [1] 0.1889822
>
> $a5
> [1] -0.8660254
>
> See ?by and ?cor for details as needed.
>
>
> Bert Gunter
>
> "The trouble with having an open mind is that people keep coming along and
> sticking things into it."
> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
>
>
> On Tue, Jul 27, 2021 at 5:30 PM Marlin Keith Cox <marlinkcox using gmail.com>
> wrote:
>
>> I am having problems making a correlation/association between two variables
>> by a factor.
>>
>> In the case below, I need to know the correlation between R and Day at each
>> frequency (a1-a5). Each frequency would have a corresponding correlation
>> between R and Day.
>>
>> I have found an lm call that is similar to what I need,
>> lm(R~Day*Freq), but this won't apply to the cor function.
>>
>> Mind you, I have hundreds of these to do, with these same three columns, so if
>> there is an association package, I would be interested in those too.  I did
>> research it, but it quickly went over my head, so I thought I would
>> approach my problem this way.
>>
>> Data is below.
>>
>> Keith
>>
>> R<-c(1,8,3,6,7,2,3,7,2,3,3,4,3,7,3)
>> Day<-c(1,1,1,1,1,2,2,2,2,2,3,3,3,3,3)
>> Freq<-(a1,a2,a3,a4,a5,a1,a2,a3,a4,a5,a1,a2,a3,a4,a5,)
>>
>>
>>
>> M. Keith Cox, Ph.D.
>> Principal
>> MKConsulting
>> 17415 Christine Ave.
>> Juneau, AK 99801
>> U.S. 907.957.4606
>>
>>
>> ______________________________________________
>> R-help using r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>