[R] Pairwise cross correlation from data set

David Winsemius dwinsemius at comcast.net
Sun Jun 13 20:18:05 CEST 2010


On Jun 13, 2010, at 1:47 PM, Claus O'Rourke wrote:

> Dear list,
>
> Following up on an earlier post, I would like to reorder a dataset and
> compute pairwise correlations. But I'm having some real problems
> getting this done.
>
> My data looks something like:
>
> Participant Stimulus Measurement
> p1                 s`1            5
> p1                 s`2            6.1
> p1                 s`3            7
> p2                 s`1            4.8
> p2                 s`2            6
> p2                 s`3            6.5
> p3                 s`1            4
> p3                 s`2            7
> p3                 s`3            6
>
> As a first step I would imagine that I have to rearrange my data into
> a frame more like this

To my mind the re-ordering does not seem necessary, as long as you  
preserve the ordering of the Stimulus variable. Personally, I would  
avoid using back-quotes in values, since the back-quote is what R uses  
to quote non-syntactic names.
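
One way to strip the back-quotes once the data have been read in, as a minimal sketch (the vector stim is a made-up stand-in for the Stimulus column, and clean is just an illustrative name):

```r
# Hypothetical stand-in for the Stimulus column as it appears in the posting
stim <- c("s`1", "s`2", "s`3")

# fixed = TRUE treats the pattern as a literal character, not a regex
clean <- gsub("`", "", stim, fixed = TRUE)
clean
```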
>
> Stimulus   p1    p2    p3
>   s1       5     4.8   4
>   s2       6.1   6     7
>   s3       7     6.5   6
>
> And then do the pairwise correlations between {p1,p2},{p1,p3},{p2,p3}
>
> I can do all of this manually, i.e., using some messy case specific
> code, but can anyone please point out the best way to do this in a
> more generalizable way.

 > rd.txt <- function(txt, header=TRUE, ...) {
+     con <- textConnection(txt)
+     on.exit(close(con))   # close only this connection, not all open ones
+     read.table(con, header=header, ...)
+ }
 > dtat <- rd.txt("Participant Stimulus Measurement
+ p1                 s`1            5
+ p1                 s`2            6.1
+ p1                 s`3            7
+ p2                 s`1            4.8
+ p2                 s`2            6
+ p2                 s`3            6.5
+ p3                 s`1            4
+ p3                 s`2            7
+ p3                 s`3            6", stringsAsFactors=FALSE)


 > str(dtat)
'data.frame':	9 obs. of  3 variables:
  $ Participant: chr  "p1" "p1" "p1" "p2" ...
  $ Stimulus   : chr  "s`1" "s`2" "s`3" "s`1" ...
  $ Measurement: num  5 6.1 7 4.8 6 6.5 4 7 6
 > combn(unique(dtat$Participant), 2)
      [,1] [,2] [,3]
[1,] "p1" "p1" "p2"
[2,] "p2" "p3" "p3"

 > apply(combn(unique(dtat$Participant), 2), 2,  # each column is one pair
+       function(x) {
+           # keep only the rows for the two participants in this pair
+           with(subset(dtat, Participant %in% x),
+                # Stimulus is turned into a factor to get a numeric ordering;
+                # this correlates Measurement with that ordering over the
+                # pooled rows of both participants
+                cor(Measurement, as.numeric(factor(Stimulus))))
+       })
[1] 0.9696635 0.7627701 0.7424791
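
If what is wanted instead is the correlation between each pair of participants' measurement vectors (the wide layout sketched in the question), a minimal base-R sketch might look like the following. It assumes exactly one Measurement per Participant/Stimulus combination; the data frame is re-created here with the back-quotes dropped, and the names wide, cmat and prs are only illustrative. Note that this is a different quantity from the apply() call above, which correlates Measurement with the Stimulus ordering.

```r
# Re-creation of the example data (same values as in the posting,
# back-quotes removed from the Stimulus labels)
dtat <- data.frame(
  Participant = rep(c("p1", "p2", "p3"), each = 3),
  Stimulus    = rep(c("s1", "s2", "s3"), times = 3),
  Measurement = c(5, 6.1, 7, 4.8, 6, 6.5, 4, 7, 6),
  stringsAsFactors = FALSE
)

# Stimulus-by-Participant matrix; mean() is a no-op here because the
# sketch assumes a single value in every cell
wide <- with(dtat, tapply(Measurement, list(Stimulus, Participant), mean))

# Pearson correlations between all participant columns
cmat <- cor(wide)

# List each unordered pair once; a two-column character matrix
# indexes cmat by row name and column name
prs <- t(combn(colnames(cmat), 2))
res <- data.frame(a = prs[, 1], b = prs[, 2], r = cmat[prs])
res
```

cmat is the full symmetric matrix; res holds one row per pair, which generalizes to any number of participants as long as every participant saw every stimulus.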

David Winsemius, MD
West Hartford, CT


