[R] Determining variance components of classed covariates

Stephen Montgomery sm8 at sanger.ac.uk
Mon Jan 12 17:36:27 CET 2009


Hi -

I am interested in solving variance components for the data below with
respect to the response variable, Expression within R.

However, the covariates aren't independent and they also have a class
(of which the total variance explained by covariates in that class I am
most interested in).

Very naively, I have tried to look at each individual covariates
variance like this

> lm<-lmer(Expression ~ 1 + (1|rs11834524) + (1|rs7074431),
data=input_new)
> lm
Linear mixed-effects model fit by REML 
Formula: Expression ~ 1 + (1 | rs11834524) + (1 | rs7074431) 
   Data: input 
   AIC   BIC logLik MLdeviance REMLdeviance
 108.4 116.5 -51.22      102.5        102.4
Random effects:
 Groups     Name        Variance Std.Dev.
 rs11834524 (Intercept) 0.485538 0.69681 
 rs7074431  (Intercept) 0.013720 0.11713 
 Residual               0.128853 0.35896 
number of obs: 109, groups: rs11834524, 3; rs7074431, 3

Fixed effects:
            Estimate Std. Error t value
(Intercept)   9.9524     0.4098   24.29

My assumption is that this is telling me that rs11834524 explains
0.485538 of the variance and rs7074431 explains 0.013720 of the variance
in Expression when looked at independently.  

However, I would like to know how to write a model where I know how much
of the total variance (in Expression) is described by covariates
rs11834524, rs1682421, rs13383869 and rs9457141 (call it class A) and
covariates rs9459617, rs7074431, rs12450785, rs592724 (call it class B).
Assuming an additive model within the class.  The caveats are that there
is missing data and again that there may be correlation between all the
covariates.

Such that a theoretical result may be that
Class A: Explains 60% of the total variance in expression (response)
Class B: Explains 10% of the total variance in expression

Thanks for the help!  I am sorry I am R challenged here...I really
appreciate the guidance!

Stephen


> dump("input_new", file=stdout()) 
input_new <-
structure(list(Individual = structure(1:109, .Label = c("NA06984", 
"NA06985", "NA06986", "NA06989", "NA06993", "NA06994", "NA07000", 
"NA07022", "NA07037", "NA07045", "NA07051", "NA07055", "NA07056", 
"NA07345", "NA07346", "NA07347", "NA07357", "NA07435", "NA11829", 
"NA11830", "NA11831", "NA11832", "NA11839", "NA11840", "NA11843", 
"NA11881", "NA11882", "NA11892", "NA11893", "NA11894", "NA11917", 
"NA11918", "NA11919", "NA11920", "NA11930", "NA11931", "NA11992", 
"NA11993", "NA11994", "NA11995", "NA12003", "NA12005", "NA12006", 
"NA12043", "NA12044", "NA12056", "NA12057", "NA12144", "NA12145", 
"NA12146", "NA12154", "NA12155", "NA12156", "NA12234", "NA12239", 
"NA12248", "NA12249", "NA12264", "NA12272", "NA12273", "NA12274", 
"NA12275", "NA12282", "NA12283", "NA12286", "NA12287", "NA12340", 
"NA12341", "NA12342", "NA12343", "NA12347", "NA12348", "NA12383", 
"NA12399", "NA12400", "NA12414", "NA12489", "NA12546", "NA12716", 
"NA12718", "NA12748", "NA12749", "NA12750", "NA12751", "NA12760", 
"NA12761", "NA12762", "NA12763", "NA12775", "NA12776", "NA12777", 
"NA12778", "NA12812", "NA12813", "NA12814", "NA12815", "NA12827", 
"NA12828", "NA12829", "NA12830", "NA12842", "NA12843", "NA12872", 
"NA12873", "NA12874", "NA12875", "NA12889", "NA12891", "NA12892"
), class = "factor"), Expression = c(9.46026823453575, 10.0788903323991,

9.20330296497174, 10.038741467793, 9.33092349416463, 11.0273957217919, 
10.5498875891745, 9.81137299592747, 11.2023261987976, 9.90559354069027, 
10.1524696609679, 10.3171767665993, 9.02155519577685, 9.84917871051438, 
10.658877473136, 9.88895551011107, 8.62335008726357, 9.21529114100886, 
10.7896248923916, 10.1302992505869, 8.64584282787018, 9.56057795866654, 
9.89810004078774, 10.2557482141576, 8.95588077688637, 9.56452454115857, 
9.26525135092154, 10.5438780642797, 9.8468571349548, 10.7416169225352, 
10.5623721612979, 10.6565276881443, 9.67758493445612, 9.75385553511462, 
8.997797236767, 11.0106882086179, 10.362578597992, 9.2745507212906, 
10.7453355016181, 9.75998268015348, 9.45003620116962, 10.055504292376, 
10.7072220720564, 10.0934686444392, 10.0472832129727, 10.1185615033486, 
10.3340911031131, 9.70618910683157, 10.5953304905529, 10.4246307909547, 
9.91463202635336, 10.249081562168, 10.9252022586474, 10.295544143525, 
11.4838109797985, 10.5286570234792, 9.78692800868132, 10.0397050809162, 
9.27914623343747, 10.37600233389, 9.27341681588134, 9.40195375611303, 
10.8979822929135, 9.03922228977389, 10.3911745662505, 10.4345408213054, 
9.8548491618724, 10.1897729275437, 10.2881888849609, 8.9656977165014, 
9.81595398472166, 10.1856794532084, 9.3763789479684, 10.1712420020647, 
10.2964594680427, 10.3515965292101, 8.94492585275159, 11.2529257614993, 
9.25146912450726, 10.1904309237525, 10.7490591053023, 10.3883924463568, 
10.097023765247, 10.0824730785217, 10.0828512817661, 10.6371064852226, 
10.5831044752098, 10.4484786486601, 8.50264408341596, 10.3468670812262, 
9.46061433005316, 8.90027436167269, 9.73630671555279, 9.40555522408144, 
10.3220768104446, 8.55132985773453, 10.1678182524815, 10.6145417864386, 
10.4169948161073, 10.0253039670548, 10.2568017077865, 10.5045847076951, 
9.75993936712448, 8.99997092895909, 10.6742222414794, 10.8640943324257, 
10.4295384371541, 10.1987862649656, 10.6744617172313), rs11834524 =
structure(c(1L, 
2L, 2L, 3L, 2L, 3L, 3L, 2L, 3L, 2L, 3L, 2L, 1L, 2L, 2L, 2L, 1L, 
1L, 3L, 3L, 1L, 1L, 2L, 2L, 1L, 2L, 1L, 2L, 2L, 3L, 3L, 3L, 2L, 
1L, 1L, 3L, 2L, 2L, 3L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 3L, 
2L, 2L, 2L, 3L, 2L, 3L, 3L, 2L, 3L, 1L, 2L, 1L, 1L, 3L, 1L, 2L, 
2L, 2L, 2L, 2L, 1L, 2L, 2L, 1L, 2L, 2L, 2L, 1L, 3L, 1L, 2L, 3L, 
2L, 3L, 2L, 1L, 3L, 3L, 3L, 1L, 2L, 2L, 1L, 1L, 1L, 2L, 1L, 2L, 
2L, 2L, 2L, 2L, 2L, 1L, 1L, 3L, 3L, 3L, 3L, 3L), .Label = c("AA", 
"AG", "GG"), class = "factor"), rs1682421 = structure(c(1L, 2L, 
1L, 2L, 2L, 3L, 2L, 2L, 3L, 2L, 1L, 2L, 1L, 2L, 2L, 1L, 1L, 1L, 
2L, 1L, 1L, 1L, 2L, 1L, 1L, 2L, 1L, 2L, 2L, 3L, 2L, 3L, 1L, 1L, 
1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 2L, 1L, 1L, 1L, 2L, 2L, 2L, 1L, 
1L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 2L, 1L, 1L, 3L, 1L, 2L, 2L, 
1L, 1L, 1L, 1L, 2L, 1L, 1L, 2L, 1L, 2L, 1L, 1L, 1L, 2L, 2L, 1L, 
3L, 1L, 1L, 2L, 3L, 2L, 1L, 1L, 2L, 1L, 1L, 1L, 2L, 1L, 1L, 2L, 
1L, 2L, 2L, 1L, 1L, NA, 3L, 2L, 3L, 2L, 2L), .Label = c("CC", 
"CT", "TT"), class = "factor"), rs13383869 = structure(c(2L, 
2L, 2L, 2L, 2L, NA, 2L, 2L, 1L, 2L, 3L, 3L, 3L, 1L, 2L, 2L, 3L, 
2L, 2L, 1L, 2L, 2L, 2L, 2L, 2L, 3L, 2L, 2L, 2L, 1L, 1L, 2L, 2L, 
2L, 3L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 3L, 2L, 1L, 
1L, 2L, 2L, 1L, 2L, 2L, 1L, 1L, 1L, 3L, 2L, NA, 2L, 2L, 3L, 2L, 
2L, 2L, 2L, 1L, 2L, 3L, 2L, 2L, 1L, 1L, 2L, 3L, 2L, 2L, 3L, 2L, 
2L, 1L, 1L, 2L, 1L, 1L, 1L, 3L, 1L, 2L, 3L, 2L, 3L, 2L, 3L, 2L, 
1L, 1L, 2L, 2L, NA, 2L, 1L, 1L, 2L, 2L, 1L, 1L), .Label = c("AA", 
"AG", "GG"), class = "factor"), rs9457141 = structure(c(1L, 2L, 
1L, 2L, 2L, 1L, 1L, 1L, 1L, 2L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
1L, 1L, 2L, 2L, 2L, 1L, 2L, 2L, 2L, 1L, 1L, 2L, 1L, 1L, 3L, 1L, 
3L, 1L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 1L, 
2L, 3L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 1L, 1L, 2L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 2L, 2L, 1L, 1L, 3L, 1L, 1L, 2L, 1L, 2L, 3L, 2L, 1L, 
2L, 1L, 2L, 1L, 1L, 1L, 2L, 2L, 2L, 3L, NA, 2L, 1L, 2L, NA, 1L, 
1L, 1L, 1L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L), .Label = c("CC", 
"CT", "TT"), class = "factor"), rs9459617 = structure(c(1L, 2L, 
1L, 2L, 2L, 1L, 1L, 1L, 1L, 2L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
1L, 1L, 2L, 2L, 2L, 1L, 2L, 2L, 2L, 1L, 1L, 2L, 1L, 1L, 3L, 1L, 
3L, 1L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 1L, 
2L, 3L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 1L, 1L, 2L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 2L, 2L, 1L, 1L, 3L, 1L, 1L, NA, 1L, 3L, 3L, 2L, 1L, 
2L, 1L, 2L, 1L, 1L, 1L, 2L, 2L, 2L, 3L, 1L, 2L, 1L, 2L, 2L, 1L, 
1L, 1L, 1L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L), .Label = c("CC", 
"CT", "TT"), class = "factor"), rs7074431 = structure(c(2L, 3L, 
2L, 1L, 3L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 1L, 2L, 1L, 2L, 3L, 2L, 
2L, 2L, 2L, 3L, 2L, 3L, 2L, 2L, 1L, 1L, 3L, 2L, 1L, 2L, 3L, 2L, 
1L, 2L, 1L, 3L, 2L, 2L, 2L, 2L, 1L, 1L, 2L, 1L, 2L, 3L, 1L, 1L, 
2L, 2L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 3L, 3L, 1L, 2L, 2L, 1L, 
1L, 1L, 3L, 2L, 1L, 2L, 2L, 2L, 1L, 1L, 3L, 3L, 3L, 1L, 1L, 1L, 
1L, 1L, 2L, 1L, 2L, 1L, 3L, 1L, 1L, 1L, 1L, 2L, 1L, 2L, 1L, 2L, 
1L, 1L, 1L, 1L, 2L, 2L, 1L, 1L, 2L, 2L, 1L), .Label = c("CC", 
"CT", "TT"), class = "factor"), rs12450785 = structure(c(2L, 
2L, 2L, 2L, 2L, 2L, 1L, 3L, 1L, 3L, 3L, 2L, 2L, 1L, 2L, 3L, 2L, 
3L, 1L, 3L, 3L, 3L, 2L, 2L, 3L, 3L, 2L, 2L, 3L, 2L, 3L, 2L, 2L, 
2L, 2L, 2L, 1L, 3L, 3L, 3L, 3L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 2L, 
1L, 2L, 2L, 1L, 1L, 2L, 3L, 3L, 2L, 3L, 3L, 3L, 2L, 1L, 3L, 2L, 
2L, 3L, 2L, 2L, 3L, 3L, 2L, 2L, 3L, 2L, 2L, 3L, 1L, 3L, 2L, 2L, 
1L, 3L, 2L, 3L, 1L, 3L, 2L, 3L, 3L, 2L, 2L, 2L, 3L, 2L, 3L, 1L, 
2L, 2L, 3L, 2L, 2L, 1L, 3L, 3L, 3L, 2L, 3L, 2L), .Label = c("AA", 
"AG", "GG"), class = "factor"), rs592724 = structure(c(1L, 2L, 
1L, 2L, 2L, 2L, 3L, 2L, 2L, 2L, 2L, 1L, 2L, 3L, 2L, 2L, 1L, 2L, 
2L, 2L, 1L, 2L, 2L, 1L, 1L, 3L, 1L, 1L, 2L, 2L, 2L, 3L, 1L, 3L, 
1L, 3L, 2L, 1L, 1L, 2L, 1L, 2L, 3L, 2L, 2L, 2L, 2L, 1L, 2L, 2L, 
3L, 1L, 3L, 1L, 2L, 2L, 1L, 2L, 2L, 2L, 1L, 1L, 3L, 1L, 3L, 3L, 
2L, 2L, 1L, 1L, 3L, 2L, 2L, 2L, 1L, 3L, 2L, 3L, 1L, 3L, 3L, 2L, 
2L, 1L, 1L, 2L, 2L, 1L, 2L, 2L, 1L, 2L, 1L, 2L, 2L, 1L, 2L, 1L, 
1L, 1L, 2L, 1L, 2L, 1L, 2L, 3L, 2L, 2L, 2L), .Label = c("CC", 
"CT", "TT"), class = "factor"), Grp = structure(c(1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L), .Label = "1", class =
"factor")), .Names = c("Individual", 
"Expression", "rs11834524", "rs1682421", "rs13383869", "rs9457141", 
"rs9459617", "rs7074431", "rs12450785", "rs592724", "Grp"), row.names =
c(NA, 
-109L), class = "data.frame")



Stephen B. Montgomery
Postdoctoral Researcher, Population and Comparative Genomics
Wellcome Trust Sanger Institute
Hinxton, Cambridge CB10 1SA
Skype: stephen.b.montgomery
 



-- 
 The Wellcome Trust Sanger Institute is operated by Genome Research 
 Limited, a charity registered in England with number 1021457 and a 
 company registered in England with number 2742969, whose registered 
 office is 215 Euston Road, London, NW1 2BE.




More information about the R-help mailing list