[R] Correlation between matrices

Kaiyin Zhong kindlychung at gmail.com
Sun Nov 6 07:06:50 CET 2011


Thank you, Dennis. Your tips are really helpful.
I don't quite understand the lm(y ~ mouse) part, though. My intention was,
in pseudocode, lm(y(Enzyme) ~ y(each elem)).
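
A minimal sketch of that intention, assuming the long data frame d from the
quoted example: the Enzyme rows are split off and merged back as their own
column, so that each element can be correlated with (and regressed against)
Enzyme within each region. The names enz, metals, d2, conc, enzyme and cors
are hypothetical, the seed is arbitrary, and base R is used throughout so that
no extra packages are needed.

```r
set.seed(1)  # arbitrary seed, only to make the sketch reproducible
regions <- c('cortex', 'hippocampus', 'brain_stem', 'mid_brain', 'cerebellum')
mice <- paste('mouse', 1:5, sep = '')
elem <- c('Cu', 'Fe', 'Zn', 'Ca', 'Enzyme')

# Long data frame, built as in the quoted example
d <- data.frame(expand.grid(mice = mice, regions = regions, elem = elem),
                y = rnorm(125))

# Split off the Enzyme rows and give their values their own column name
enz <- d[d$elem == 'Enzyme', c('mice', 'regions', 'y')]
names(enz)[3] <- 'enzyme'

# Keep the remaining elements and rename y to conc (hypothetical name)
metals <- droplevels(d[d$elem != 'Enzyme', ])
names(metals)[names(metals) == 'y'] <- 'conc'

# Pair each concentration with the Enzyme value of the same mouse and region
d2 <- merge(metals, enz, by = c('mice', 'regions'))

# One correlation and one fitted line per region * element combination
groups <- split(d2, list(d2$regions, d2$elem), drop = TRUE)
cors <- do.call(rbind, lapply(groups, function(g) {
  fit <- coef(lm(enzyme ~ conc, data = g))
  data.frame(regions = g$regions[1], elem = g$elem[1],
             r = cor(g$enzyme, g$conc),
             b0 = fit[[1]], b1 = fit[[2]])
}))
```

With this layout, cors holds one row per region and element, and the b0 and b1
columns could feed a geom_abline() layer in the same way as the quoted coefs.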

In addition, attach(d) seems to be necessary before calling lm(y ~ mouse).
Since d$mouse has length 125, while each elem within each region has
length 5, the call generates the following error:

> coefs = ddply(d, .(regions, elem), coefun)
Error in model.frame.default(formula = y ~ mouse, drop.unused.levels = TRUE) :
  variable lengths differ (found for 'mouse')
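
One likely cause: in the quoted coefun, data = df is passed to coef() rather
than to lm(), so lm() never sees the per-group subset and has to look up y and
mouse elsewhere. The sketch below moves the argument inside lm(), after which
attach(d) should not be needed. It assumes the data frame d from the quoted
example and uses base R split() and lapply() in place of ddply() only to avoid
a package dependency; the seed is arbitrary.

```r
set.seed(1)  # arbitrary seed, only to make the sketch reproducible
regions <- c('cortex', 'hippocampus', 'brain_stem', 'mid_brain', 'cerebellum')
mice <- paste('mouse', 1:5, sep = '')
elem <- c('Cu', 'Fe', 'Zn', 'Ca', 'Enzyme')

d <- data.frame(expand.grid(mice = mice, regions = regions, elem = elem),
                y = rnorm(125))
d$mouse <- as.numeric(d$mice)

# data = df now goes to lm(), so the model is fitted on the 5-row subset
coefun <- function(df) coef(lm(y ~ mouse, data = df))

# With plyr loaded, the quoted call would then read:
#   coefs <- ddply(d, .(regions, elem), coefun)
# Base R equivalent, used here so the sketch has no package dependency:
groups <- split(d, list(d$regions, d$elem), drop = TRUE)
coefs <- do.call(rbind, lapply(groups, function(g) {
  b <- coefun(g)
  data.frame(regions = g$regions[1], elem = g$elem[1],
             b0 = b[[1]], b1 = b[[2]])
}))
```

Each of the 25 region * elem groups then gets its own intercept (b0) and
slope (b1), fitted on the 5 mice in that group.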


On Sun, Nov 6, 2011 at 12:53 PM, Dennis Murphy <djmuser at gmail.com> wrote:
>
> Hi:
>
> I don't think you want to keep these objects separate; it's better to
> combine everything into a data frame. Here's a variation of your
> example - the x variable ends up being a mouse, but you may have
> another variable that's more appropriate to plot so take this as a
> starting point. One plot uses the ggplot2 package, the other uses the
> lattice and latticeExtra packages.
>
> library('ggplot2')
> library('plyr')  # provides ddply(), used below
> regions = c('cortex', 'hippocampus', 'brain_stem', 'mid_brain',
>            'cerebellum')
> mice = paste('mouse', 1:5, sep='')
> elem <- c('Cu', 'Fe', 'Zn', 'Ca', 'Enzyme')
>
> # Generate a data frame from the combinations of
> # mice, regions and elem:
> d <- data.frame(expand.grid(mice = mice, regions = regions,
>                            elem = elem), y = rnorm(125))
> # Create a numeric version of mice
> d$mouse <- as.numeric(d$mice)
>
> # A function to return regression coefficients
> coefun <- function(df) coef(lm(y ~ mouse), data = df)
> # Apply to all regions * elem combinations
> coefs <- ddply(d, .(regions, elem), coefun)
> names(coefs) <- c('regions', 'elem', 'b0', 'b1')
>
> # Generate the plot using package ggplot2:
> ggplot(d, aes(x = mouse, y = y)) +
>   geom_point(size = 2.5) +
>   geom_abline(data = coefs, aes(intercept = b0, slope = b1),
>                             size = 1) +
>   facet_grid(elem ~ regions)
>
> # Same plot in lattice:
> library('lattice')
> library('latticeExtra')
> p <- xyplot(y ~ mouse | elem + regions, data = d, type = c('p', 'r'),
>         layout = c(5, 5))
> print(p)  # display the stored lattice plot
>
>
> HTH,
> Dennis
>
> On Sat, Nov 5, 2011 at 10:49 AM, Kaiyin Zhong <kindlychung at gmail.com> wrote:
> >> regions = c('cortex', 'hippocampus', 'brain_stem', 'mid_brain',
> > 'cerebellum')
> >> mice = paste('mouse', 1:5, sep='')
> >> for (n in c('Cu', 'Fe', 'Zn', 'Ca', 'Enzyme')) {
> > +   assign(n, as.data.frame(replicate(5, rnorm(5))))
> > + }
> >> names(Cu) = names(Zn) = names(Fe) = names(Ca) = names(Enzyme) = regions
> >> row.names(Cu) = row.names(Zn) = row.names(Fe) = row.names(Ca) =
> > row.names(Enzyme) = mice
> >> Cu
> >           cortex hippocampus brain_stem  mid_brain cerebellum
> > mouse1 -0.5436573 -0.31486713  0.1039148 -0.3908665 -1.0849112
> > mouse2  1.4559136  1.75731752 -2.1195118 -0.9894767  0.3609033
> > mouse3 -0.6735427 -0.04666507  0.9641000  0.4683339  0.7419944
> > mouse4  0.6926557 -0.47820023  1.3560802  0.9967562 -1.3727874
> > mouse5  0.2371585  0.20031393 -1.4978517  0.7535148  0.5632443
> >> Zn
> >            cortex hippocampus brain_stem  mid_brain  cerebellum
> > mouse1 -0.66424043   0.6664478  1.1983546  0.0319403  0.41955740
> > mouse2 -1.14510448   1.5612235  0.3210821  0.4094753  1.01637466
> > mouse3 -0.85954416   2.8275458 -0.6922565 -0.8182307 -0.06961242
> > mouse4  0.03606034  -0.7177256  0.7067217  0.2036655 -0.25542524
> > mouse5  0.67427572   0.6171704  0.1044267 -1.8636174 -0.07654666
> >> Fe
> >           cortex hippocampus  brain_stem  mid_brain cerebellum
> > mouse1  1.8337008   2.0884261  0.29730413 -1.6884804  0.8336137
> > mouse2 -0.2734139  -0.5728439  0.63791556 -0.6232828 -1.1352224
> > mouse3 -0.4795082   0.1627235  0.21775206  1.0751584 -0.5581422
> > mouse4  1.7125147  -0.5830600  1.40597896 -0.2815305  0.3776360
> > mouse5 -0.3469067  -0.4813120 -0.09606797  1.0970077 -1.1234038
> >> Ca
> >           cortex hippocampus  brain_stem   mid_brain cerebellum
> > mouse1 -0.7663354   0.8595091  1.33803798 -1.17651576  0.8299963
> > mouse2 -0.7132260  -0.2626811  0.08025079 -2.40924271  0.7883005
> > mouse3 -0.7988904  -0.1144639 -0.65901136  0.42462227  0.7068755
> > mouse4  0.3880393   0.5570068 -0.49969135  0.06633009 -1.3497228
> > mouse5  1.0077684   0.6023264 -0.57387762  0.25919461 -0.9337281
> >> Enzyme
> >           cortex hippocampus  brain_stem  mid_brain cerebellum
> > mouse1  1.3430936   0.5335819 -0.56992947  1.3565803 -0.8323391
> > mouse2  1.0520850  -1.0201124  0.89600005  1.4719880  1.0854768
> > mouse3 -0.2802482   0.6863323 -1.37483570 -0.7790174  0.2446761
> > mouse4 -0.1916415  -0.4566571  1.93365932  1.3493848  0.2130424
> > mouse5 -1.0349593  -0.1940268 -0.07216321 -0.2968288  1.7406905
> >
> > In each anatomic region, I would like to calculate the correlation between
> > Enzyme activity and each of the concentrations of Cu, Zn, Fe, and Ca, and
> > do a scatter plot with a tendency line, organizing those plots into a grid.
> > See the image below for the desired effect:
> > http://postimage.org/image/62brra6jn/
> > How can I achieve this?
> >
> > Thank you in advance.
> >
> > ______________________________________________
> > R-help at r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
> >


