[R-sig-ME] Fitting known var-cov matrix in logistic regression model

Dominick Samperi djsamperi at gmail.com
Sat Apr 23 23:53:52 CEST 2011

On Fri, Apr 22, 2011 at 2:31 PM, Margaret Mackinnon
<mmackinnon at kilifi.kemri-wellcome.org> wrote:
> Dear David
> Thanks for your response.
> I tried geeglm and glmPQL and MCMCglmm.  I still can't work out how to fit the fixed var-cov structure. Here is what I did for MCMCglmm (my preferred option):
>  gpdata<-data.frame(pos=as.integer(freq*n*2),neg=as.integer(n*2-freq*n*2),pop=c(1:15),env)
> prior<-list(R=list(V=diag(1),nu=0.002),G=list(G1=list(V=10,nu=0.002)))
> out<-MCMCglmm(cbind(pos,neg)~1+env,random=~pop,data=gpdata,family="multinomial2",prior=prior,burnin=3000,nitt=20000)
> summary(out$VCV)
> plot(out$VCV)
> plot(out$Sol)
> summary(out$Sol)
> This gives sensible answers for the random effect (pop) variance and the fixed effects (intercept + env), i.e. those given by lmer and glm.  However I can't understand the manual on how to fit a predfined value of the var-cov matrix that describes the different variances for each pop and the covariances between them.  Is it in rcov, or in the priors (R or G?) with the fix option on.

This is an interesting question. I think packages like nlme permit you
to define the
form of the var-cov matrix, but not to specify it completely. Thus
there is a trade-off
between how much domain knowledge you can inject and how much the software
will discover in the data. If I understand the docs correctly any
specified values for
the var-cov params are used as initial conditions, not as fixed values.


> I also could not get the family="zibinomial" to work because it needs idh or us variance functions which I don't know how to set up for this particular problem.
> I would really appreciate some help on this.
> Margaret
>>>> "David Duffy" <davidD at qimr.edu.au> 22 April 2011 04:49 >>>
> On Thu, 21 Apr 2011, Margaret Mackinnon wrote:
>> I want to estimate the relationship between population allele
>> frequencies at a certain locus of interest and an environmental
>> variable.  The hypothesis is that this environmental variable has
>> generated these different allele frequencies through differential
>> selection pressure on a locus under balancing selection (i.e., there is
>> negative selection which balances the differential positive selection on
>> the allele of interest).   The allele frequencies are measured in 15
>> different populations, as are the environmental variables.  However,
>> this environmental variable has been aggregated at the population level
>> so there are only 15 values that it can take.
>>  The populations are genetically related to each other to different
>> degrees and so I want to reflect this structuring by including in the
>> model a variance-covariance matrix which captures this relatedness.  I
>> estimated this variance-covariance matrix from a set of independent data
>> on SNPs at a whole lot of putatively neutral loci that are not expected
> The easiest way might be to use the SNP principal component scores for the
> populations as covariates in an ordinary logistic regression. Otherwise
> a GEE with your specified matrix, glmmPQL with a corSymm as per your model,
> MCMCglmm.
> --
> | David Duffy (MBBS PhD)                                         ,-_|\
> | email: davidD at qimr.edu.au  ph: INT+61+7+3362-0217 fax: -0101  /     *
> | Epidemiology Unit, Queensland Institute of Medical Research   \_,-._/
> | 300 Herston Rd, Brisbane, Queensland 4029, Australia  GPG 4D0B994A v
> ______________________________________________________________________
> This email has been scanned by the MessageLabs Email Security System.
> For more information please visit http://www.messagelabs.com/email
> ______________________________________________________________________
> This e-mail (including any attachment to it) contains information which is confidential. It is intended only for the use of the named recipient. If you have received this e-mail in error, please let us know by replying to the sender, and immediately delete it from your system.  Please note, that in these circumstances, the use, disclosure, distribution or copying of this information is strictly prohibited. We apologize for any inconvenience that may have been caused to you. KEMRI-Wellcome Trust Programme cannot accept any responsibility for the  accuracy or completeness of this message as it has been transmitted over a public network. KEMRI-Wellcome Trust Programme reserves the right to monitor all incoming and outgoing email traffic. Although the Programme has taken reasonable precautions to ensure no viruses are present in emails, it cannot accept responsibility for any loss or damage arising from the use of the email or attachments. Any views expressed in this message are!
>  those of the individual sender, except where the sender specifically states them to be the views of KEMRI-Wellcome Trust Programme.
> _______________________________________________
> R-sig-mixed-models at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models

More information about the R-sig-mixed-models mailing list