[R] Correlation between matrices
Kaiyin Zhong
kindlychung at gmail.com
Sun Nov 6 07:06:50 CET 2011
Thank you Dennis, your tips are really helpful.
I don't quite understand the lm(y ~ mouse) part, though. What I intended, in
pseudocode, was lm(y(Enzyme) ~ y(each elem)) within each region.
In addition, attach(d) seems to be necessary before lm(y ~ mouse) will run,
and since d$mouse has length 125, while each elem within each region has
length 5, it generates the following error:
> coefs = ddply(d, .(regions, elem), coefun)
Error in model.frame.default(formula = y ~ mouse, drop.unused.levels = TRUE) :
variable lengths differ (found for 'mouse')
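
One way the pseudocode lm(y(Enzyme) ~ y(each elem)) might be written, as a
minimal sketch in base R rather than a definitive answer: give Enzyme its own
column, matched to the other elements by mouse and region, and then fit each
region/elem subset separately. The names enz, conc, d2, pieces and stats are
made up for illustration, and the seed is arbitrary.

```r
set.seed(1)  # arbitrary seed, only so the sketch is reproducible
regions <- c('cortex', 'hippocampus', 'brain_stem', 'mid_brain', 'cerebellum')
mice <- paste('mouse', 1:5, sep = '')
elem <- c('Cu', 'Fe', 'Zn', 'Ca', 'Enzyme')

# Long data frame: one row per mouse/region/element combination
d <- data.frame(expand.grid(mice = mice, regions = regions, elem = elem),
                y = rnorm(125))

# Split Enzyme off into its own column, keyed by mouse and region
enz <- d[d$elem == 'Enzyme', c('mice', 'regions', 'y')]
names(enz)[3] <- 'enzyme'
conc <- droplevels(d[d$elem != 'Enzyme', ])
d2 <- merge(conc, enz, by = c('mice', 'regions'))

# Correlation and regression coefficients for each region/elem pair
pieces <- split(d2, list(d2$regions, d2$elem), drop = TRUE)
stats <- do.call(rbind, lapply(pieces, function(df) {
  fit <- coef(lm(enzyme ~ y, data = df))  # data = df goes inside lm()
  data.frame(regions = df$regions[1], elem = df$elem[1],
             r = cor(df$y, df$enzyme),
             b0 = fit[[1]], b1 = fit[[2]])
}))
```

Each row of stats then holds the correlation r plus the intercept b0 and
slope b1 for one region/elem panel, and d2 could be plotted with enzyme
against y instead of y against mouse.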
On Sun, Nov 6, 2011 at 12:53 PM, Dennis Murphy <djmuser at gmail.com> wrote:
>
> Hi:
>
> I don't think you want to keep these objects separate; it's better to
> combine everything into a data frame. Here's a variation of your
> example - the x variable ends up being a mouse, but you may have
> another variable that's more appropriate to plot so take this as a
> starting point. One plot uses the ggplot2 package, the other uses the
> lattice and latticeExtra packages.
>
> library('ggplot2')
> library('plyr')   # provides ddply()
> regions = c('cortex', 'hippocampus', 'brain_stem', 'mid_brain',
> 'cerebellum')
> mice = paste('mouse', 1:5, sep='')
> elem <- c('Cu', 'Fe', 'Zn', 'Ca', 'Enzyme')
>
> # Generate a data frame from the combinations of
> # mice, regions and elem:
> d <- data.frame(expand.grid(mice = mice, regions = regions,
> elem = elem), y = rnorm(125))
> # Create a numeric version of mice
> d$mouse <- as.numeric(d$mice)
>
> # A function to return regression coefficients;
> # data = df must be an argument of lm(), not of coef()
> coefun <- function(df) coef(lm(y ~ mouse, data = df))
> # Apply to all regions * elem combinations
> coefs <- ddply(d, .(regions, elem), coefun)
> names(coefs) <- c('regions', 'elem', 'b0', 'b1')
>
> # Generate the plot using package ggplot2:
> ggplot(d, aes(x = mouse, y = y)) +
> geom_point(size = 2.5) +
> geom_abline(data = coefs, aes(intercept = b0, slope = b1),
> size = 1) +
> facet_grid(elem ~ regions)
>
> # Same plot in lattice:
> library('lattice')
> library('latticeExtra')
> p <- xyplot(y ~ mouse | elem + regions, data = d, type = c('p', 'r'),
> layout = c(5, 5))
> print(p)   # an assigned lattice object is not drawn until printed
>
>
> HTH,
> Dennis
>
> On Sat, Nov 5, 2011 at 10:49 AM, Kaiyin Zhong <kindlychung at gmail.com> wrote:
> >> regions = c('cortex', 'hippocampus', 'brain_stem', 'mid_brain',
> > 'cerebellum')
> >> mice = paste('mouse', 1:5, sep='')
> >> for (n in c('Cu', 'Fe', 'Zn', 'Ca', 'Enzyme')) {
> > + assign(n, as.data.frame(replicate(5, rnorm(5))))
> > + }
> >> names(Cu) = names(Zn) = names(Fe) = names(Ca) = names(Enzyme) = regions
> >> row.names(Cu) = row.names(Zn) = row.names(Fe) = row.names(Ca) =
> > row.names(Enzyme) = mice
> >> Cu
> >            cortex hippocampus brain_stem  mid_brain cerebellum
> > mouse1 -0.5436573 -0.31486713  0.1039148 -0.3908665 -1.0849112
> > mouse2  1.4559136  1.75731752 -2.1195118 -0.9894767  0.3609033
> > mouse3 -0.6735427 -0.04666507  0.9641000  0.4683339  0.7419944
> > mouse4  0.6926557 -0.47820023  1.3560802  0.9967562 -1.3727874
> > mouse5  0.2371585  0.20031393 -1.4978517  0.7535148  0.5632443
> >> Zn
> >             cortex hippocampus brain_stem  mid_brain  cerebellum
> > mouse1 -0.66424043   0.6664478  1.1983546  0.0319403  0.41955740
> > mouse2 -1.14510448   1.5612235  0.3210821  0.4094753  1.01637466
> > mouse3 -0.85954416   2.8275458 -0.6922565 -0.8182307 -0.06961242
> > mouse4  0.03606034  -0.7177256  0.7067217  0.2036655 -0.25542524
> > mouse5  0.67427572   0.6171704  0.1044267 -1.8636174 -0.07654666
> >> Fe
> >            cortex hippocampus  brain_stem  mid_brain cerebellum
> > mouse1  1.8337008   2.0884261  0.29730413 -1.6884804  0.8336137
> > mouse2 -0.2734139  -0.5728439  0.63791556 -0.6232828 -1.1352224
> > mouse3 -0.4795082   0.1627235  0.21775206  1.0751584 -0.5581422
> > mouse4  1.7125147  -0.5830600  1.40597896 -0.2815305  0.3776360
> > mouse5 -0.3469067  -0.4813120 -0.09606797  1.0970077 -1.1234038
> >> Ca
> >            cortex hippocampus  brain_stem   mid_brain cerebellum
> > mouse1 -0.7663354   0.8595091  1.33803798 -1.17651576  0.8299963
> > mouse2 -0.7132260  -0.2626811  0.08025079 -2.40924271  0.7883005
> > mouse3 -0.7988904  -0.1144639 -0.65901136  0.42462227  0.7068755
> > mouse4  0.3880393   0.5570068 -0.49969135  0.06633009 -1.3497228
> > mouse5  1.0077684   0.6023264 -0.57387762  0.25919461 -0.9337281
> >> Enzyme
> >            cortex hippocampus  brain_stem  mid_brain cerebellum
> > mouse1  1.3430936   0.5335819 -0.56992947  1.3565803 -0.8323391
> > mouse2  1.0520850  -1.0201124  0.89600005  1.4719880  1.0854768
> > mouse3 -0.2802482   0.6863323 -1.37483570 -0.7790174  0.2446761
> > mouse4 -0.1916415  -0.4566571  1.93365932  1.3493848  0.2130424
> > mouse5 -1.0349593  -0.1940268 -0.07216321 -0.2968288  1.7406905
> >
> > In each anatomic region, I would like to calculate the correlation between
> > Enzyme activity and each of the concentrations of Cu, Zn, Fe, and Ca, and
> > do a scatter plot with a tendency line, organizing those plots into a grid.
> > See the image below for the desired effect:
> > http://postimage.org/image/62brra6jn/
> > How can I achieve this?
> >
> > Thank you in advance.