[R-sig-ME] need help with mixed effects model

Tue Mar 4 01:00:45 CET 2008

On 01/03/2008, at 4:29 AM, Mark W Kimpel wrote:

> Doug and other mixed-models aficionados,
>
> I have made some progress on my own on the problem I posted in this
> thread. Briefly, I am analyzing a multifactorial genomic experiment and
> wish to look at gene-gene correlations independent of Strain. Because
> multiple measurements are taken per rat, I wish to use lmer. What seems
> to be working is the following.
>
> mod1 <- lmer(gene2 ~ -1 + Strain + (1|Rat) + gene1)
> mod2 <- lmer(gene2 ~ -1 + Strain + (1|Rat))
> anova.sum <- anova(mod1, mod2)
>
> I look to see whether adding the expression of the other gene of
> interest as a covariate significantly improves the model; if it does,
> I take that as an indicator of gene-gene correlation/dependence.
>
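A minimal, self-contained sketch of the comparison described above, using simulated data because the original data set is not shown in the thread. The data frame `dat`, the strain labels, the effect sizes and the seed are all assumptions made up for illustration. Both models are fitted with `REML = FALSE` here so that the likelihood-ratio test compares maximum-likelihood fits of models that differ only in their fixed effects.

```r
library(lme4)

set.seed(1)
# Hypothetical layout matching the thread: 2 strains x 6 rats x 3 tissues = 36 samples
dat <- expand.grid(Tissue = factor(1:3), RatInStrain = 1:6,
                   Strain = factor(c("A", "B")))
dat$Rat <- factor(paste(dat$Strain, dat$RatInStrain, sep = "_"))

# Simulated expression values (assumed effect sizes, not real data)
rat_effect <- rnorm(nlevels(dat$Rat), sd = 0.5)
dat$gene1 <- rnorm(nrow(dat))
dat$gene2 <- 1 + 0.8 * dat$gene1 + rat_effect[as.integer(dat$Rat)] +
  rnorm(nrow(dat), sd = 0.5)

# Fit both models by maximum likelihood so the likelihood-ratio test applies
mod1 <- lmer(gene2 ~ -1 + Strain + gene1 + (1 | Rat), data = dat, REML = FALSE)
mod2 <- lmer(gene2 ~ -1 + Strain + (1 | Rat), data = dat, REML = FALSE)

# Likelihood-ratio test for adding gene1 as a covariate
anova.sum <- anova(mod2, mod1)
p_gene1 <- anova.sum[["Pr(>Chisq)"]][2]
```

In a loop over gene pairs, `p_gene1` would be the value stored in the matrix of p-values.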

The concern that Doug had, I assume, is that gene1 and gene2 are both
measured with error, whereas this type of model assumes that the
covariates are measured without error or, for practical purposes, with
error much smaller than that in the dependent variable. Ignoring this
problem biases the coefficients towards zero, with a consequent loss of
power. I don't have any idea how important this is; it all depends on
the error of your measurements. The usual solution is structural
equation modelling (SEM). This is something I haven't tried, so I have
no idea how easy it is or how well it will work.
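
The bias towards zero can be seen in a small simulation. This is only a sketch with made-up numbers: the true slope of 0.8, the error standard deviations and the variable names are assumptions, and an ordinary `lm` fit stands in for the mixed model to keep the example short.

```r
set.seed(42)
n <- 5000
true_gene1 <- rnorm(n)                          # unobserved true expression
gene2 <- 0.8 * true_gene1 + rnorm(n, sd = 0.5)  # response, true slope 0.8
gene1_obs <- true_gene1 + rnorm(n, sd = 1)      # covariate observed with error

# Slope when the covariate is known exactly vs. measured with error
slope_clean <- unname(coef(lm(gene2 ~ true_gene1))[2])
slope_noisy <- unname(coef(lm(gene2 ~ gene1_obs))[2])

# Classical attenuation: the expected slope shrinks by
# var(true) / (var(true) + var(measurement error)) = 1 / (1 + 1)
expected_noisy <- 0.8 * 1 / (1 + 1)
```

With equal variance in the true covariate and in its measurement error, the fitted slope is expected to be about half the true one.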

Ken

> I am not, of course, doing this for just two genes; I build an
> adjacency matrix out of the p-values for all gene-gene interactions in
> a list of about 400 significant genes. I then adjust the p-values for
> FDR, pick a suitable FDR (0.001 in this case) as a threshold, and create
> another adjacency matrix with 1's for significant correlation and 0's
> for non-significant. I then visualize this using Rgraphviz.
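
The thresholding step described above might look something like the following sketch. The five gene labels and the raw p-values are invented for illustration, and Benjamini-Hochberg (`method = "BH"`) is an assumption, since the thread does not say which FDR procedure was used. Each gene pair is adjusted once, using the upper triangle only, so that a symmetric matrix does not count every test twice.

```r
genes <- paste0("g", 1:5)  # hypothetical gene labels

# Hypothetical raw p-values for each unique gene pair (upper triangle only)
pmat <- matrix(NA_real_, 5, 5, dimnames = list(genes, genes))
pmat[upper.tri(pmat)] <- c(1e-8, 0.2, 1e-6, 0.5, 0.03, 1e-9, 0.7, 0.4, 1e-5, 0.9)

# Adjust each pair once for FDR (Benjamini-Hochberg assumed)
padj <- p.adjust(pmat[upper.tri(pmat)], method = "BH")

# 1 where the adjusted p-value passes the 0.001 threshold, 0 otherwise
adj <- matrix(0L, 5, 5, dimnames = list(genes, genes))
adj[upper.tri(adj)] <- as.integer(padj < 0.001)
adj <- adj + t(adj)  # mirror to get a symmetric 0/1 adjacency matrix
```

The resulting `adj` has a zero diagonal and could then be handed to a graph-drawing package.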
>
> As I was tearing my hair out trying to make sure this was sensible, it
> occurred to me that within my list of 400 genes I have positive
> controls. About 40 of the genes are represented by 2 or more probesets,
> which should be highly correlated if they are measuring the same thing.
> So, I subjected just genes with duplicate probesets to the above
> procedure and, sure enough, in an overwhelming number of cases,
> probesets from the same gene plot next to each other.
>
> My conclusion from this exercise is that what I am doing is empirically
> correct, although I am open to suggestions as to how it could be
> improved or comments as to how I may be just plain wrong.
>
> Doug, I am reading your book and appreciate your contributions.
>
> Mark
>
> Mark W. Kimpel MD  ** Neuroinformatics ** Dept. of Psychiatry
> Indiana University School of Medicine
>
> 15032 Hunter Court, Westfield, IN  46074
>
> (317) 490-5129 Work, & Mobile & VoiceMail
> (317) 204-4202 Home (no voice mail please)
>
> mwkimpel<at>gmail<dot>com
>
> ******************************************************************
>
>
> Douglas Bates wrote:
>> On Fri, Feb 22, 2008 at 11:57 AM, Mark W Kimpel <mwkimpel at gmail.com> wrote:
>>> This is my first foray into mixed models and, while awaiting the
>>> arrival of:
>>
>>> Extending the Linear Model with R: Generalized Linear, Mixed Effects
>>> and Nonparametric Regression Models
>>> Mixed Effects Models in S and S-Plus
>>
>>> I am in need of some advice.
>>
>>> I would like to look at gene-gene correlations within a
>>> multi-factorial, mixed effects experiment. Here are the factors, with
>>> levels:
>>
>>> Gene Expression: 2 different genes per Animal, continuous variable
>>> Animals: 6 per Strain
>>> Tissues: 3 per animal
>>
>>> Strain: 2
>>
>>> I thus have 6*3*2 = 36 samples
>>
>>> I do not care, for this analysis, about differences between Tissues,
>>> Strains, or Animals; in fact, I want to control for them while
>>> examining the correlation of expression of the two genes. In other
>>> words, I want to look at something very much like the Pearson
>>> correlation coefficient controlled for these other factors.
>>
>>> I guess the first question I should ask is: "is a mixed model the way
>>> to go, and, if not, what would be the correct approach?"
>>
>> Perhaps.  How do you plan to incorporate the two genes?
>>
>>> Assuming mixed models will work, as I see it through my newbie eyes,
>>> Tissue and strain are fixed effects and animals are random effects.
>>
>> If you were interested in just 1 gene then I would say that this looks
>> like a good approach.  I'm just not sure what to do about the multiple
>> genes.
>>
>>> Any suggestions for an approach and a model?
>>
>> The model specification (assuming that each animal has a distinct
>> number) would be something like
>>
>> gene1 ~ Tissue * Strain + (1|Animal)
>>
>> In your earlier message to the Bioconductor list you had a
>> specification that looked like
>>
>> gene1 ~ gene2 + ...
>>
>> which makes me a little queasy because you are assuming that gene2 is
>> "known" relative to the variability in gene1 and most of the time that
>> is not a reasonable approach.
>>