[R-sig-ME] MCMCglmm 'undefined columns' error
Jarrod Hadfield
j.hadfield at ed.ac.uk
Mon May 11 16:38:57 CEST 2009
Hi Greg,
There are two ways of fitting multinomial models when the number of
observations per unit is 1 (as in your case). You specified
family="multinomial4" so MCMCglmm is expecting a 4 column response
variable with the counts in each column. In your case each row should
contain 3 zeros and 1 one. The response should be specified something
like:
cbind(category1,category2,category3,category4)~
Alternatively, you can pass a single column of factors as a response
and use family="categorical" and keep your original syntax:
count ~
Both models are parameterised as a log-odds ratio against a baseline
category. In the first model the final category is the baseline
category. I've done this because for binomial traits ("multinomial2")
the standard glm syntax is to pass cbind(successes, failures) and the
results are the log odds of succeeding vs failing (ie failing is the
base-line category). In the second model the first factor level is
the baseline category. I've done this because for binary traits it is
natural to have zero's as the base-line and interpret the parameters
as predicting 1's (successes).
For your data there will be three response variables in both cases
({cat1-cat4, cat2-cat4, cat3-cat4} or {cat2-cat1, cat3-cat1, cat4-
cat1}). You will want to use the reserved variable "trait" so that
different intercepts and effects are fitted for the three responses.
For example count~trait-1 or possibly count~trait+trait:cane-1 etc.
A multivariate error structure must also be defined. You could use:
~ idh(trait):units or ~ us(trait):units
However, for multinomial models with a single count you can't estimate
these from the data so you may as well fix them. I recommend fixing
the residual variances to 1, and the covariances to either 0 or 0.5.
This is specified in the prior:
prior=list(R=list(V=diag(3), n=1, fix=1), G=Gprior)
where V=diag(3) is 3X3 (number of categories -1) identity matrix and
fix=1 fixes the matrix at these values. The random effects should also
have priors which are also passed as a list (Gprior)
Its up to you what type of structure you use for vine. The vine
variance structure is also 3X3 and you could use any of these options:
~ vine # 1 parameter
v1=v2=v3=c12=c13=c23 Gprior=list(G1=list(V=diag(1), n=1))
~ vine:trait # 1 parameter v1=v2=v3
c12=c13=c23=0 Gprior=list(G1=list(V=diag(1), n=1))
~ vine+vine:trait # 2 parameters v1=v2=v3 c12=c13=c23
Gprior=list(G1=list(V=diag(1), n=1)), G2=list(V=diag(1), n=1))
~idh(vine) # 3 parameters v1 v2 v3
c12=c13=c23=0 Gprior=list(G1=list(V=diag(3), n=3))
~idh(vine)+vine:trait # 4 parameters v1 v2 v3 c12=c13=c23
Gprior=list(G1=list(V=diag(3), n=3), G2=list(V=diag(1), n=1))
~ us(vine) # 6 parameters v1 v2 v3 c12 c13
c23 Gprior=list(G1=list(V=diag(3), n=3))
where v1 v2 and v3 are the variances for each logg-odds ratio due to
vine, and c12 c13 c23 are the covarainces between them. I've made
up the prior (co)variances but I've put in the minimum degree of
belief parameter (n) that is required to make the priors proper.
One obvious problem is that the multinomial model in this form looses
the information about the ordering 0<1<2<3.
Cheers,
Jarrod
On 11 May 2009, at 06:00, Greg Lee wrote:
> Dear List,
>
> I am attempting to fit a GLMM using MCMCglmm(), but receive and
> 'undefined columns' error.
>
> The data are counts of bunches of grapes at bud positions (1:10) on
> canes (basal/distal) within vines within (vine type) clones within
> (vineyard) sites. Two treatments relating to degree of pruning have
> been imposed, and there are 5 replicates at each level combination.
> The data are as follows:
>
>> head(budfruit)
>
> site clone treat vine cane bud count
> 1 M 2051 1 1 b 1 1
> 2 M 2051 1 2 b 1 2
> 3 M 2051 1 3 b 1 1
> 4 M 2051 1 4 b 1 2
> 5 M 2051 1 5 b 1 1
> 6 M 2051 2 1 b 1 2
> ...
>
> with counts taking values 0,1,2 or (very rarely) 3.
>
> Applying a Poisson-based GLM, such as:
>
> glm(count ~ treat + cane + bud, family = poisson, data = budfruit)
>
> suggests serious under-dispersion (Residual deviance: 370.09 on 1184
> degrees of freedom), from which I concluded that a multinomial-based
> mixed model might be a better option. However, my initial attempt
> using MCMCglmm
>
> fit <- MCMCglmm(count ~ site + clone + treat + cane + bud,
> random =~ vine, family = 'multinomial4',
> data = budfruit, verbose = TRUE)
>
> produces the error:
>
> Error in `[.data.frame`(data, , match(response.names[0:nJ + nt],
> names(data))) :
> undefined columns selected
>
> Is there a problem with my understanding of the call to MCMCglmm(), or
> is this perhaps a bug?
>
> Any insight appreciated. Are there perhaps other packages within which
> I could fit this model?
>
> Regards,
> Greg
>
>> sessionInfo()
>
> R version 2.9.0 (2009-04-17)
> i386-pc-mingw32
>
> locale:
> LC_COLLATE=English_Australia.1252;LC_CTYPE=English_Australia.
> 1252;LC_MONETARY=English_Australia.
> 1252;LC_NUMERIC=C;LC_TIME=English_Australia.1252
>
> attached base packages:
> [1] stats graphics grDevices utils datasets methods base
>
> other attached packages:
> [1] MCMCglmm_1.09 gtools_2.5.0-1 combinat_0.0-6
> orthopolynom_1.0-1
> [5] polynom_1.3-5 pscl_1.03 mvtnorm_0.9-5 ape_2.3
> [9] coda_0.13-4 Matrix_0.999375-26 lattice_0.17-22
> tensorA_0.31
> [13] corpcor_1.5.2 MASS_7.2-46
>
> loaded via a namespace (and not attached):
> [1] gee_4.13-13 grid_2.9.0 nlme_3.1-91
>
>
> --------------
> Greg Lee
> Biometrician
> Tasmanian Institute of Agricultural Research
> New Town Research Laboratories
> University of Tasmania
> 13 St Johns Avenue,
> New Town, 7008
> Australia
> Ph: +613 6233 6858
> Fax: +613 6233 6145
>
> _______________________________________________
> R-sig-mixed-models at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>
--
The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.
More information about the R-sig-mixed-models
mailing list