[R] Determining variance components of classed covariates
Stephen Montgomery
sm8 at sanger.ac.uk
Mon Jan 12 17:36:27 CET 2009
Hi -
I am interested in estimating variance components in R for the data
below, with Expression as the response variable.
However, the covariates aren't independent, and each one belongs to a
class. What I am most interested in is the total variance explained by
all of the covariates in a class.
Very naively, I have tried to look at each individual covariate's
variance like this:
> library(lme4)
> lm <- lmer(Expression ~ 1 + (1|rs11834524) + (1|rs7074431),
+            data=input_new)
> lm
Linear mixed-effects model fit by REML
Formula: Expression ~ 1 + (1 | rs11834524) + (1 | rs7074431)
   Data: input
   AIC   BIC logLik MLdeviance REMLdeviance
 108.4 116.5 -51.22      102.5        102.4
Random effects:
 Groups     Name        Variance Std.Dev.
 rs11834524 (Intercept) 0.485538 0.69681
 rs7074431  (Intercept) 0.013720 0.11713
 Residual               0.128853 0.35896
number of obs: 109, groups: rs11834524, 3; rs7074431, 3

Fixed effects:
            Estimate Std. Error t value
(Intercept)   9.9524     0.4098   24.29
My assumption is that this is telling me that rs11834524 explains
0.485538 of the variance and rs7074431 explains 0.013720 of the variance
in Expression when looked at independently.
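
One caution on that reading: the Variance column is on the squared scale of Expression, so the figures are variance estimates rather than fractions. As a minimal sketch, assuming the three components printed above are the whole decomposition, they could be turned into shares of the total like this (the names `vc` and `prop` are only illustrative):

```r
# Variance components copied from the lmer output above.
vc <- c(rs11834524 = 0.485538, rs7074431 = 0.013720, Residual = 0.128853)

# Each component as a share of the total modelled variance.
prop <- vc / sum(vc)
print(round(prop, 3))
```

With only three genotype groups per SNP, each variance estimate rests on very few levels, so these shares are probably best treated as rough.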
However, I would like to know how to write a model that tells me how
much of the total variance in Expression is explained by covariates
rs11834524, rs1682421, rs13383869 and rs9457141 (call it class A) and
how much by covariates rs9459617, rs7074431, rs12450785 and rs592724
(call it class B), assuming an additive model within each class. The
caveats are that there is missing data and, again, that the covariates
may be correlated with one another.
A hypothetical result might be:
Class A: Explains 60% of the total variance in expression (response)
Class B: Explains 10% of the total variance in expression
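
One possible way to get numbers of that shape, sketched under stated assumptions rather than offered as the definitive model: swap the random-effects fit for ordinary linear regression, code each genotype additively as 0/1/2, and compare R-squared between models. The helper name `class_r2` is hypothetical. The sketch assumes the genotype factor levels are ordered homozygote, heterozygote, homozygote, and it handles missing data simply by dropping individuals with any missing genotype. Because the covariates may be correlated, it reports two shares per class: the R-squared of the class fitted alone, and the gain in R-squared when the class is added last. The two generally differ and need not sum to the full-model value.

```r
# Hypothetical helper (not from the original post): share of the variance in
# `response` attributable to each class of SNPs under additive allele coding.
class_r2 <- function(data, response, classes) {
  snps <- unlist(classes, use.names = FALSE)

  # Keep complete cases only, so every model is fitted to the same individuals.
  cols <- c(response, snps)
  d <- data[complete.cases(data[, cols]), cols]

  # Additive coding: factor levels 1, 2, 3 become allele counts 0, 1, 2.
  # Assumes levels are ordered homozygote, heterozygote, homozygote.
  d[snps] <- lapply(d[snps], function(g) as.integer(as.factor(g)) - 1L)

  # R-squared of a linear model of the response on the given SNPs.
  r2 <- function(vars) {
    if (length(vars) == 0) return(0)
    summary(lm(reformulate(vars, response = response), data = d))$r.squared
  }

  full <- r2(snps)
  by_class <- t(sapply(classes, function(vars) {
    c(alone  = r2(vars),                          # class fitted on its own
      unique = full - r2(setdiff(snps, vars)))    # gain when added last
  }))
  list(n = nrow(d), full = full, by_class = by_class)
}

# Usage with the data frame dumped below, if it has been loaded.
if (exists("input_new")) {
  print(class_r2(input_new, "Expression",
                 list(A = c("rs11834524", "rs1682421", "rs13383869", "rs9457141"),
                      B = c("rs9459617", "rs7074431", "rs12450785", "rs592724"))))
}
```

The gap between the "alone" and "unique" columns gives a rough sense of how much of the explained variance the two classes share through correlation.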
Thanks for the help! I am sorry I am R challenged here...I really
appreciate the guidance!
Stephen
> dump("input_new", file=stdout())
input_new <-
structure(list(Individual = structure(1:109, .Label = c("NA06984",
"NA06985", "NA06986", "NA06989", "NA06993", "NA06994", "NA07000",
"NA07022", "NA07037", "NA07045", "NA07051", "NA07055", "NA07056",
"NA07345", "NA07346", "NA07347", "NA07357", "NA07435", "NA11829",
"NA11830", "NA11831", "NA11832", "NA11839", "NA11840", "NA11843",
"NA11881", "NA11882", "NA11892", "NA11893", "NA11894", "NA11917",
"NA11918", "NA11919", "NA11920", "NA11930", "NA11931", "NA11992",
"NA11993", "NA11994", "NA11995", "NA12003", "NA12005", "NA12006",
"NA12043", "NA12044", "NA12056", "NA12057", "NA12144", "NA12145",
"NA12146", "NA12154", "NA12155", "NA12156", "NA12234", "NA12239",
"NA12248", "NA12249", "NA12264", "NA12272", "NA12273", "NA12274",
"NA12275", "NA12282", "NA12283", "NA12286", "NA12287", "NA12340",
"NA12341", "NA12342", "NA12343", "NA12347", "NA12348", "NA12383",
"NA12399", "NA12400", "NA12414", "NA12489", "NA12546", "NA12716",
"NA12718", "NA12748", "NA12749", "NA12750", "NA12751", "NA12760",
"NA12761", "NA12762", "NA12763", "NA12775", "NA12776", "NA12777",
"NA12778", "NA12812", "NA12813", "NA12814", "NA12815", "NA12827",
"NA12828", "NA12829", "NA12830", "NA12842", "NA12843", "NA12872",
"NA12873", "NA12874", "NA12875", "NA12889", "NA12891", "NA12892"
), class = "factor"), Expression = c(9.46026823453575, 10.0788903323991,
9.20330296497174, 10.038741467793, 9.33092349416463, 11.0273957217919,
10.5498875891745, 9.81137299592747, 11.2023261987976, 9.90559354069027,
10.1524696609679, 10.3171767665993, 9.02155519577685, 9.84917871051438,
10.658877473136, 9.88895551011107, 8.62335008726357, 9.21529114100886,
10.7896248923916, 10.1302992505869, 8.64584282787018, 9.56057795866654,
9.89810004078774, 10.2557482141576, 8.95588077688637, 9.56452454115857,
9.26525135092154, 10.5438780642797, 9.8468571349548, 10.7416169225352,
10.5623721612979, 10.6565276881443, 9.67758493445612, 9.75385553511462,
8.997797236767, 11.0106882086179, 10.362578597992, 9.2745507212906,
10.7453355016181, 9.75998268015348, 9.45003620116962, 10.055504292376,
10.7072220720564, 10.0934686444392, 10.0472832129727, 10.1185615033486,
10.3340911031131, 9.70618910683157, 10.5953304905529, 10.4246307909547,
9.91463202635336, 10.249081562168, 10.9252022586474, 10.295544143525,
11.4838109797985, 10.5286570234792, 9.78692800868132, 10.0397050809162,
9.27914623343747, 10.37600233389, 9.27341681588134, 9.40195375611303,
10.8979822929135, 9.03922228977389, 10.3911745662505, 10.4345408213054,
9.8548491618724, 10.1897729275437, 10.2881888849609, 8.9656977165014,
9.81595398472166, 10.1856794532084, 9.3763789479684, 10.1712420020647,
10.2964594680427, 10.3515965292101, 8.94492585275159, 11.2529257614993,
9.25146912450726, 10.1904309237525, 10.7490591053023, 10.3883924463568,
10.097023765247, 10.0824730785217, 10.0828512817661, 10.6371064852226,
10.5831044752098, 10.4484786486601, 8.50264408341596, 10.3468670812262,
9.46061433005316, 8.90027436167269, 9.73630671555279, 9.40555522408144,
10.3220768104446, 8.55132985773453, 10.1678182524815, 10.6145417864386,
10.4169948161073, 10.0253039670548, 10.2568017077865, 10.5045847076951,
9.75993936712448, 8.99997092895909, 10.6742222414794, 10.8640943324257,
10.4295384371541, 10.1987862649656, 10.6744617172313), rs11834524 =
structure(c(1L,
2L, 2L, 3L, 2L, 3L, 3L, 2L, 3L, 2L, 3L, 2L, 1L, 2L, 2L, 2L, 1L,
1L, 3L, 3L, 1L, 1L, 2L, 2L, 1L, 2L, 1L, 2L, 2L, 3L, 3L, 3L, 2L,
1L, 1L, 3L, 2L, 2L, 3L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 3L,
2L, 2L, 2L, 3L, 2L, 3L, 3L, 2L, 3L, 1L, 2L, 1L, 1L, 3L, 1L, 2L,
2L, 2L, 2L, 2L, 1L, 2L, 2L, 1L, 2L, 2L, 2L, 1L, 3L, 1L, 2L, 3L,
2L, 3L, 2L, 1L, 3L, 3L, 3L, 1L, 2L, 2L, 1L, 1L, 1L, 2L, 1L, 2L,
2L, 2L, 2L, 2L, 2L, 1L, 1L, 3L, 3L, 3L, 3L, 3L), .Label = c("AA",
"AG", "GG"), class = "factor"), rs1682421 = structure(c(1L, 2L,
1L, 2L, 2L, 3L, 2L, 2L, 3L, 2L, 1L, 2L, 1L, 2L, 2L, 1L, 1L, 1L,
2L, 1L, 1L, 1L, 2L, 1L, 1L, 2L, 1L, 2L, 2L, 3L, 2L, 3L, 1L, 1L,
1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 2L, 1L, 1L, 1L, 2L, 2L, 2L, 1L,
1L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 2L, 1L, 1L, 3L, 1L, 2L, 2L,
1L, 1L, 1L, 1L, 2L, 1L, 1L, 2L, 1L, 2L, 1L, 1L, 1L, 2L, 2L, 1L,
3L, 1L, 1L, 2L, 3L, 2L, 1L, 1L, 2L, 1L, 1L, 1L, 2L, 1L, 1L, 2L,
1L, 2L, 2L, 1L, 1L, NA, 3L, 2L, 3L, 2L, 2L), .Label = c("CC",
"CT", "TT"), class = "factor"), rs13383869 = structure(c(2L,
2L, 2L, 2L, 2L, NA, 2L, 2L, 1L, 2L, 3L, 3L, 3L, 1L, 2L, 2L, 3L,
2L, 2L, 1L, 2L, 2L, 2L, 2L, 2L, 3L, 2L, 2L, 2L, 1L, 1L, 2L, 2L,
2L, 3L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 3L, 2L, 1L,
1L, 2L, 2L, 1L, 2L, 2L, 1L, 1L, 1L, 3L, 2L, NA, 2L, 2L, 3L, 2L,
2L, 2L, 2L, 1L, 2L, 3L, 2L, 2L, 1L, 1L, 2L, 3L, 2L, 2L, 3L, 2L,
2L, 1L, 1L, 2L, 1L, 1L, 1L, 3L, 1L, 2L, 3L, 2L, 3L, 2L, 3L, 2L,
1L, 1L, 2L, 2L, NA, 2L, 1L, 1L, 2L, 2L, 1L, 1L), .Label = c("AA",
"AG", "GG"), class = "factor"), rs9457141 = structure(c(1L, 2L,
1L, 2L, 2L, 1L, 1L, 1L, 1L, 2L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
1L, 1L, 2L, 2L, 2L, 1L, 2L, 2L, 2L, 1L, 1L, 2L, 1L, 1L, 3L, 1L,
3L, 1L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 1L,
2L, 3L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 1L, 1L, 2L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 2L, 2L, 1L, 1L, 3L, 1L, 1L, 2L, 1L, 2L, 3L, 2L, 1L,
2L, 1L, 2L, 1L, 1L, 1L, 2L, 2L, 2L, 3L, NA, 2L, 1L, 2L, NA, 1L,
1L, 1L, 1L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L), .Label = c("CC",
"CT", "TT"), class = "factor"), rs9459617 = structure(c(1L, 2L,
1L, 2L, 2L, 1L, 1L, 1L, 1L, 2L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
1L, 1L, 2L, 2L, 2L, 1L, 2L, 2L, 2L, 1L, 1L, 2L, 1L, 1L, 3L, 1L,
3L, 1L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 1L,
2L, 3L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 1L, 1L, 2L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 2L, 2L, 1L, 1L, 3L, 1L, 1L, NA, 1L, 3L, 3L, 2L, 1L,
2L, 1L, 2L, 1L, 1L, 1L, 2L, 2L, 2L, 3L, 1L, 2L, 1L, 2L, 2L, 1L,
1L, 1L, 1L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L), .Label = c("CC",
"CT", "TT"), class = "factor"), rs7074431 = structure(c(2L, 3L,
2L, 1L, 3L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 1L, 2L, 1L, 2L, 3L, 2L,
2L, 2L, 2L, 3L, 2L, 3L, 2L, 2L, 1L, 1L, 3L, 2L, 1L, 2L, 3L, 2L,
1L, 2L, 1L, 3L, 2L, 2L, 2L, 2L, 1L, 1L, 2L, 1L, 2L, 3L, 1L, 1L,
2L, 2L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 3L, 3L, 1L, 2L, 2L, 1L,
1L, 1L, 3L, 2L, 1L, 2L, 2L, 2L, 1L, 1L, 3L, 3L, 3L, 1L, 1L, 1L,
1L, 1L, 2L, 1L, 2L, 1L, 3L, 1L, 1L, 1L, 1L, 2L, 1L, 2L, 1L, 2L,
1L, 1L, 1L, 1L, 2L, 2L, 1L, 1L, 2L, 2L, 1L), .Label = c("CC",
"CT", "TT"), class = "factor"), rs12450785 = structure(c(2L,
2L, 2L, 2L, 2L, 2L, 1L, 3L, 1L, 3L, 3L, 2L, 2L, 1L, 2L, 3L, 2L,
3L, 1L, 3L, 3L, 3L, 2L, 2L, 3L, 3L, 2L, 2L, 3L, 2L, 3L, 2L, 2L,
2L, 2L, 2L, 1L, 3L, 3L, 3L, 3L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 2L,
1L, 2L, 2L, 1L, 1L, 2L, 3L, 3L, 2L, 3L, 3L, 3L, 2L, 1L, 3L, 2L,
2L, 3L, 2L, 2L, 3L, 3L, 2L, 2L, 3L, 2L, 2L, 3L, 1L, 3L, 2L, 2L,
1L, 3L, 2L, 3L, 1L, 3L, 2L, 3L, 3L, 2L, 2L, 2L, 3L, 2L, 3L, 1L,
2L, 2L, 3L, 2L, 2L, 1L, 3L, 3L, 3L, 2L, 3L, 2L), .Label = c("AA",
"AG", "GG"), class = "factor"), rs592724 = structure(c(1L, 2L,
1L, 2L, 2L, 2L, 3L, 2L, 2L, 2L, 2L, 1L, 2L, 3L, 2L, 2L, 1L, 2L,
2L, 2L, 1L, 2L, 2L, 1L, 1L, 3L, 1L, 1L, 2L, 2L, 2L, 3L, 1L, 3L,
1L, 3L, 2L, 1L, 1L, 2L, 1L, 2L, 3L, 2L, 2L, 2L, 2L, 1L, 2L, 2L,
3L, 1L, 3L, 1L, 2L, 2L, 1L, 2L, 2L, 2L, 1L, 1L, 3L, 1L, 3L, 3L,
2L, 2L, 1L, 1L, 3L, 2L, 2L, 2L, 1L, 3L, 2L, 3L, 1L, 3L, 3L, 2L,
2L, 1L, 1L, 2L, 2L, 1L, 2L, 2L, 1L, 2L, 1L, 2L, 2L, 1L, 2L, 1L,
1L, 1L, 2L, 1L, 2L, 1L, 2L, 3L, 2L, 2L, 2L), .Label = c("CC",
"CT", "TT"), class = "factor"), Grp = structure(c(1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L), .Label = "1", class =
"factor")), .Names = c("Individual",
"Expression", "rs11834524", "rs1682421", "rs13383869", "rs9457141",
"rs9459617", "rs7074431", "rs12450785", "rs592724", "Grp"), row.names =
c(NA,
-109L), class = "data.frame")
Stephen B. Montgomery
Postdoctoral Researcher, Population and Comparative Genomics
Wellcome Trust Sanger Institute
Hinxton, Cambridge CB10 1SA
Skype: stephen.b.montgomery