[R-sig-ME] increasing nAGQ causes error

Sat May 30 19:25:36 CEST 2009

First, I apologize for the large attachment--I hope it's not a problem.

Considering that Lancaster seems to be the culprit, I think the problem 
could be a result of different degrees of spatial autocorrelation among my 
observations.  I attached a figure to help illustrate my sampling scheme 
that may help to demonstrate the spatial autocorrelation.

I have 22 cities, with from 83 to ~2000 used points and (always) 1000 unused 
points within each city.  The used points are clustered together (with 
regular 30 m spacing in a grid configuration).  In contrast, the 1000 unused 
points are randomly selected points within the contiguous urban area.  In 
the figure, Waterbury has the minimum number of used points (i.e. ~83), 
while Lancaster has the maximum (~2000).  The used points are always more 
spatially autocorrelated than the unused points.  In some cities, I have 
only one observed used area (e.g. Waterbury), while in other cities, I have 
many observed areas (e.g. Lancaster).  A single used area produces the 
minimum # of used points for a city (83-85).

Sets of unused points from different cities can exhibit varying degrees of 
spatial autocorrelation because the contiguous urban area from which the 
unused points are sampled can vary in size.  Thus, on average, unused points 
are closer together (higher spatial autocorrelation) in small cities, and 
farther apart in large cities.  Not ideal, but probably not too terribly 
problematic when compared to the varying degrees of correlation among used 
points.

In cities with only one used area (e.g. Waterbury), all used points are 
approximately equally spatially autocorrelated.  In cities with numerous 
used areas (e.g. Lancaster), used points within the same used area are 
approximately equally (and highly) correlated with each other, but are 
considerably less correlated with used points in separate used areas within 
the city (which, in turn, are ~equally and highly autocorrelated with each 
other).

I'm toying around with the data to see if I can account for spatial 
autocorrelation by modeling it as a random effect.  So far, I haven't had 
any luck, but I'll keep you posted.  Of course, suggestions are more than 
welcome.  I realize that the "experimental" design here is a mess, and I may 
need to make some major changes.  Fortunately, the covariates are all 
generated in a GIS, so I can change the sampling scheme relatively easily. 
That the covariates have difficult distributions may also be problematic, as 
pointed out by Andrew Dolman.  I suppose I could use city as a fixed effect, 
and maybe that's what I'll need to do.  But since I want to make general 
inferences about roost-site selection across cities, using city as a random 
effect seems more suitable to me (?).

I should probably mention that my goal here is to estimate the fixed effects 
so that urban crow roost management programs in cities across the country 
could predict relative probabilities of where crows are likely to roost.  It 
would be very helpful to know what areas crows are likely to use if they are 
"evicted" from problematic urban roosting areas by harassment techniques 
(e.g. pyrotechnics, distress calls, etc.).  It's probably not realistic to 
expect highly accurate predictions, but I think being able to estimate 
relative probabilities of use would be a big improvement.

Thanks!

-Grant

----- Original Message ----- 
From: "Ben Bolker" <bolker at ufl.edu>
To: "Andrew Dolman" <andydolman at gmail.com>
Cc: "Grant T. Stokke" <gts127 at psu.edu>; "R Mixed Models" 
<r-sig-mixed-models at r-project.org>
Sent: Saturday, May 30, 2009 11:29 AM
Subject: Re: [R-sig-ME] increasing nAGQ causes error

>
>  I think the problem is most likely in the higher _intercept_
> in Lancaster, PA (City #15) -- if we fit glm(IMP*CITY) and
> evaluate the mean and sd of the intercepts for the first 14
> cities, Lancaster PA has a Z-score of >5 ... For what it's
> worth, I can get the error to occur just with cities 14 and 15 ...
>
>  With a data set this large, you could also probably simply
> fit with a fixed effect of city (although technically your
> inference would be different): the differences in coefficients
> are very small ...
>
>
> test3<-glmer(USED~1+IMPS+(1+IMPS|CITY),family=binomial,data=subset2)
> test3B <- glm(USED~IMPS*CITY,family=binomial,data=subset2)
>
> revals <- coef(test3)[[1]]
>
> fixvals <- data.frame(int=coef(test3B)[1]+c(0,coef(test3B)[3:16]),
>           IMPS=coef(test3B)[2]+c(0,coef(test3B)[17:30]))
>
> par(mfrow=c(1,2),mar=c(5,3,1,0.5))
> plot(revals[[1]],fixvals[[1]])
> abline(a=0,b=1)
> plot(revals[[2]],fixvals[[2]])
>
>  Ben Bolker
>
>
>
> Andrew Dolman wrote:
>> Could the convergence issues be due to the large number of very low IMP 
>> values?
>>
>> histogram(~IMP|CITY, data=subset2)
>>
>>
>>
>>
>> andydolman at gmail.com<mailto:andydolman at gmail.com>
>>
>>
>> 2009/5/29 Ben Bolker <bolker at ufl.edu<mailto:bolker at ufl.edu>>
>>
>>  I updated to version xxx-31 and am still getting the "downdated X'X
>> not positive definite" error rather than a "not symmetric" error (weird.
>> platform-dependent numerical issues?)
>>
>>  Based on a certain amount more messing around, I think there may
>> be an issue with city #15 (by Murphy's Law?) --  convergence failed
>> if I left out any city *except* #15.  This suggests looking at the
>> data (which of course I should have done earlier!) to see if something
>> suggests itself ...
>>
>> library(ggplot2)
>>
>> ggplot(subset2,aes(x=IMPS,y=USED)) + geom_point() + geom_smooth() +
>> facet_wrap(~CITY)
>>
>>  doesn't suggest anything really odd about Lancaster, PA except the
>> highest overall fraction of sites used ...
>>
>>
>> =================
>> library(lme4)
>> subset2 <- read.table("~/subset2.txt",header=TRUE)
>> subset2$IMPS<-with(subset2,(IMP-mean(IMP))/sd(IMP))
>>
>> if (FALSE) {
>>
>> test2<-glmer(USED~1+IMPS+(1+IMPS|CITY),family=binomial,data=subset2,nAGQ=8)
>>  ## Error in mer_finalize(ans) : Downdated X'X is not positive definite, 
>> 1.
>> }
>> testvals <- list()
>> nlev <- length(levels(subset2$CITY))
>> subvals <- 9:nlev
>> agqvals <- c(2,3,4,5,8,15)
>> for (i in 1:length(subvals)) {
>>  testvals[[i]] <- list()
>>  subval <- subvals[i]
>>  for (j in 1:length(agqvals)) {
>>    agq <- agqvals[j]
>>    cat("***",subval,agq,"\n")
>>    testvals[[i]][[j]]<-try(glmer(USED~1+IMPS+(1+IMPS|CITY),
>>                                    family=binomial,data=subset2,
>>
>> subset=as.numeric(CITY)<=subval,nAGQ=agq))
>>  }
>>  save("testvals",file="agqsubs.RData")
>> }
>>
>>
>> Grant T. Stokke wrote:
>>> I'm using lme4 version lme4_0.999375-30 (as reported in sessionInfo();
>>> that's what you're referring to, right?).  I downloaded lme4 only about 
>>> a
>>> week ago, so I would imagine it's up-to-date.
>>>
>>> Okay, I suppose maybe I should stick to the Laplacian approximation.  I
>>> tried using AGQ not so much for accuracy, but because one of my models
>>> refuses to converge, and I was checking to see if increasing nAGQ would
>>> alleviate that issue.  (Maybe there's no way that would happen--I'll 
>>> readily
>>> admit that my understanding of the approximation methods is poor, but I
>>> thought I'd give it a try.)  I was encouraged by the fact that changing 
>>> nAGQ
>>> didn't give convergence errors, but maybe convergence would still be a
>>> problem anyway if glmer could get past the current error.
>>>
>>> Thanks so much for trying to help me figure this out.  I'm enjoying 
>>> learning
>>> about GLMMs because they seem to be very useful in ecology, and I'm sure
>>> I'll want to use them again in the future.
>>>
>>> -Grant
>>>
>>> ----- Original Message -----
>>> From: "Ben Bolker" <bolker at ufl.edu<mailto:bolker at ufl.edu>>
>>> To: "Grant T. Stokke" <gts127 at psu.edu<mailto:gts127 at psu.edu>>
>>> Cc: 
>>> <r-sig-mixed-models at r-project.org<mailto:r-sig-mixed-models at r-project.org>>
>>> Sent: Friday, May 29, 2009 2:08 PM
>>> Subject: Re: [R-sig-ME] increasing nAGQ causes error
>>>
>>>
>>>>  Hmmm.  What version of lme4 are you using?  I have a
>>>> perhaps-more-recent version which gives me a different error:
>>>>
>>>>  from sessionInfo()
>>>>
>>>> other attached packages:
>>>> [1] lme4_0.999375-29   Matrix_0.999375-25 lattice_0.17-25
>>>>
>>>>> test2<-glmer(USED~1+IMPS+(1+IMPS|CITY),family=binomial,data=subset2)
>>>>> test2A <- update(test2,nAGQ=2)
>>>> Error in mer_finalize(ans) : Downdated X'X is not positive definite, 1.
>>>>
>>>>  This is a more intentional and more standard error message which
>>>> basically means (?) that there has been a numerical/convergence problem
>>>> along the way ... (I'm not sure what the "1." means, haven't looked 
>>>> into
>>>> it).
>>>>
>>>>  If I were in your position I might be satisfied with Laplace -- is
>>>> there a particular reason you need the greater accuracy of AGQ ... ?
>>>>
>>>>  I am running the test code below and have so far figured out that
>>>> it happens for nAGQ=8 (but not for 2, 3, 4, 5) when one uses the
>>>> first 10 cities (but OK for up to 9).  It's OK for 11 cities
>>>> for all nAGQ tried (oddly enough, but this is not terribly surprising
>>>> when things are on th edge of numerical stability).  Haven't yet seen
>>>> how the rest of the pattern plays out.
>>>>
>>>>  -------------------
>>>> testvals <- list()
>>>> nlev <- length(levels(subset2$CITY))
>>>> agqvals <- c(2,3,4,5,8)
>>>> for (i in 9:nlev) {
>>>>  testvals[[i-8]] <- list()
>>>>  for (j in 1:length(agqvals)) {
>>>>    agq <- agqvals[j]
>>>>    cat("***",i,agq,"\n")
>>>>    testvals[[i-8]][[j]]<-try(glmer(USED~1+IMPS+(1+IMPS|CITY),
>>>>                                family=binomial,data=subset2,
>>>>                                subset=as.numeric(CITY)<i,nAGQ=agq))
>>>>
>>>>   ## should have used <=i rather than <i ...
>>>>  }
>>>> }
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> Grant T. Stokke wrote:
>>>>> Dr. Bolker,
>>>>>
>>>>> Thanks for your prompt reply.  I created a subset of my dataset
>>>>> (attached).
>>>>> This subset contains 14 of the 22 cities in the full dataset, and one
>>>>> covariate (% impervious surfaces [IMP]).  I had convergence issues 
>>>>> before
>>>>> standardizing covariates, so I used:
>>>>>
>>>>> subset2$IMPS<-(subset2$IMP-mean(subset2$IMP))/(sqrt(var(subset2$IMP)))
>>>>>
>>>>> Fitting all 3 covariates takes quite a long time, and I get the same
>>>>> error
>>>>> when using just one of them:
>>>>>
>>>>>> test2<-glmer(USED~1+IMPS+(1+IMPS|CITY),family=binomial,data=subset2,nAGQ=8)
>>>>>> test2
>>>>> Error in asMethod(object) : matrix is not symmetric [1,2]
>>>>>
>>>>> Perhaps it's noteworthy that I did not receive the error when using a
>>>>> smaller subset with only the first 8 cities (in alphabetical order).
>>>>>
>>>>> I look forward to your response.  Thanks again...
>>>>>
>>>>> -Grant
>>>>>
>>>>>
>>>>> ----- Original Message -----
>>>>> From: "Ben Bolker" <bolker at ufl.edu<mailto:bolker at ufl.edu>>
>>>>> To: "Grant T. Stokke" <gts127 at psu.edu<mailto:gts127 at psu.edu>>
>>>>> Cc: 
>>>>> <r-sig-mixed-models at r-project.org<mailto:r-sig-mixed-models at r-project.org>>
>>>>> Sent: Friday, May 29, 2009 12:11 PM
>>>>> Subject: Re: [R-sig-ME] increasing nAGQ causes error
>>>>>
>>>>>
>>>>>>  This sounds worth digging into, but it's hard to dig
>>>>>> into without a reproducible example.  I don't get the
>>>>>> problem with the GLMM example in the lme4 package:
>>>>>>
>>>>>> example(lmer)
>>>>>> update(gm1,nAGQ=8)
>>>>>> update(gm1,nAGQ=10)
>>>>>>
>>>>>> etc.
>>>>>>
>>>>>>  Can you post your data set, or a subset or simulated
>>>>>> data set that gives the same problem, somewhere?
>>>>>>
>>>>>>  Ben Bolker
>>>>>>
>>>>>> Grant T. Stokke wrote:
>>>>>>> Hello All,
>>>>>>>
>>>>>>> I'm new to R and new to this mailing list, so I hope I've presented
>>>>>>> the proper info in this post.  I'm using GLMMs to model the 
>>>>>>> selection
>>>>>>> of urban roosting locations by crows.  My dataset consists of 22
>>>>>>> cities, with each city containing 1000 unused locations and from 83
>>>>>>> to 2000 used locations.  I have three covariates for each used or
>>>>>>> unused location which I've standardized across all observations: %
>>>>>>> canopy (CANS), % impervious surfaces (IMPS), and nighttime light
>>>>>>> level (LTS).  Using the default setting nAGQ=1, my full model is
>>>>>>> fitted without error:
>>>>>>>
>>>>>>>> CIL_CIL<-glmer(USED~1+CANS+IMPS+LTS+(1+CANS+IMPS+LTS|CITY),family=binomial,data=sitescale)
>>>>>>>>  CIL_CIL
>>>>>>> Generalized linear mixed model fit by the Laplace approximation
>>>>>>> Formula: USED ~ 1 + CANS + IMPS + LTS + (1 + CANS + IMPS + LTS |
>>>>>>> CITY) Data: sitescale AIC   BIC logLik deviance 15122 15237  -7547
>>>>>>> 15094 Random effects: Groups Name        Variance   Std.Dev. Corr
>>>>>>>  CITY   (Intercept) 321.698184 17.93595 CANS          0.073271
>>>>>>> 0.27069  0.026 IMPS          0.661947  0.81360  0.080 -0.698 LTS
>>>>>>> 455.829122 21.35016 -0.988  0.031 -0.132 Number of obs: 27128,
>>>>>>> groups: CITY, 22
>>>>>>>
>>>>>>> Fixed effects: Estimate Std. Error z value Pr(>|z|) (Intercept)
>>>>>>> -16.21878    4.01227  -4.042 5.29e-05 *** CANS          0.07881
>>>>>>> 0.07162   1.100 0.271147 IMPS          0.63137    0.17750   3.557
>>>>>>> 0.000375 *** LTS          19.50827    4.78822   4.074 4.62e-05 ***
>>>>>>> --- Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
>>>>>>>
>>>>>>> Correlation of Fixed Effects: (Intr) CANS   IMPS CANS  0.020
>>>>>>>  IMPS  0.074 -0.500 LTS  -0.989  0.025 -0.124
>>>>>>>
>>>>>>> When I increase nAGQ to 8, however, I get the following error:
>>>>>>>
>>>>>>>> CIL_CIL.nAGQ8<-glmer(USED~1+CANS+IMPS+LTS+(1+CANS+IMPS+LTS|CITY),family=binomial,data=sitescale,nAGQ=8)
>>>>>>>>  CIL_CIL.nAGQ8
>>>>>>> Error in asMethod(object) : matrix is not symmetric [1,2]
>>>>>>>
>>>>>>> I get the same error message with other values for nAGQ (I tried 
>>>>>>> nAGQ
>>>>>>> = 2, 3, 5, and 50).  Is there anything I can do to fit the model
>>>>>>> using nAGQ > 1 without error?  Thanks in advance for your help!
>>>>>>>
>>>>>>> -Grant Stokke
>>>>>>>
>>>>>>>
>>>>>>>> sessionInfo()
>>>>>>> R version 2.9.0 (2009-04-17) i386-pc-mingw32
>>>>>>>
>>>>>>> locale: LC_COLLATE=English_United 
>>>>>>> States.1252;LC_CTYPE=English_United
>>>>>>> States.1252;LC_MONETARY=English_United
>>>>>>> States.1252;LC_NUMERIC=C;LC_TIME=English_United States.1252
>>>>>>>
>>>>>>> attached base packages: [1] stats     graphics  grDevices utils
>>>>>>> datasets  methods   base
>>>>>>>
>>>>>>> other attached packages: [1] mgcv_1.5-5         lme4_0.999375-30
>>>>>>> Matrix_0.999375-26 lattice_0.17-22
>>>>>>>
>>>>>>> loaded via a namespace (and not attached): [1] grid_2.9.0
>>>>>>> nlme_3.1-90 tools_2.9.0
>>>>>>>> CIL_CIL.nAGQ8
>>>>>>> Error in asMethod(object) : matrix is not symmetric [1,2]
>>>>>>>> traceback()
>>>>>>> 18: .Call(dense_to_symmetric, from, "U", TRUE) 17: asMethod(object)
>>>>>>> 16: as(from, "symmetricMatrix") 15: .class1(object) 14: as(as(from,
>>>>>>> "symmetricMatrix"), "dMatrix") 13: .class1(object) 12: 
>>>>>>> as(as(as(from,
>>>>>>> "symmetricMatrix"), "dMatrix"), "denseMatrix") 11: .class1(object)
>>>>>>> 10: as(as(as(as(from, "symmetricMatrix"), "dMatrix"), 
>>>>>>> "denseMatrix"),
>>>>>>>  "dpoMatrix") 9: asMethod(object) 8: as(sigma(object)^2 *
>>>>>>> chol2inv(object at RX, size = object at dims["p"]), "dpoMatrix") 7:
>>>>>>> vcov(object) 6: vcov(object) 5: summary(x) 4: summary(x) 3:
>>>>>>> printMer(object) 2: function (object) standardGeneric("show")(<S4
>>>>>>> object of class "mer">) 1: function (object)
>>>>>>> standardGeneric("show")(<S4 object of class "mer">)
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> [[alternative HTML version deleted]]
>>>>>>>
>>>>>>> _______________________________________________
>>>>>>> R-sig-mixed-models at r-project.org<mailto:R-sig-mixed-models at r-project.org> 
>>>>>>> mailing list
>>>>>>> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>>>>>> --
>>>>>> Ben Bolker
>>>>>> Associate professor, Biology Dep't, Univ. of Florida
>>>>>> bolker at ufl.edu<mailto:bolker at ufl.edu> / 
>>>>>> www.zoology.ufl.edu/bolker<http://www.zoology.ufl.edu/bolker>
>>>>>> GPG key: 
>>>>>> www.zoology.ufl.edu/bolker/benbolker-publickey.asc<http://www.zoology.ufl.edu/bolker/benbolker-publickey.asc>
>>>>>>
>>>> --
>>>> Ben Bolker
>>>> Associate professor, Biology Dep't, Univ. of Florida
>>>> bolker at ufl.edu<mailto:bolker at ufl.edu> / 
>>>> www.zoology.ufl.edu/bolker<http://www.zoology.ufl.edu/bolker>
>>>> GPG key: 
>>>> www.zoology.ufl.edu/bolker/benbolker-publickey.asc<http://www.zoology.ufl.edu/bolker/benbolker-publickey.asc>
>>>>
>>
>>
>> --
>> Ben Bolker
>> Associate professor, Biology Dep't, Univ. of Florida
>> bolker at ufl.edu<mailto:bolker at ufl.edu> / 
>> www.zoology.ufl.edu/bolker<http://www.zoology.ufl.edu/bolker>
>> GPG key: 
>> www.zoology.ufl.edu/bolker/benbolker-publickey.asc<http://www.zoology.ufl.edu/bolker/benbolker-publickey.asc>
>>
>> _______________________________________________
>> R-sig-mixed-models at r-project.org<mailto:R-sig-mixed-models at r-project.org> 
>> mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>>
>
>
> -- 
> Ben Bolker
> Associate professor, Biology Dep't, Univ. of Florida
> bolker at ufl.edu / www.zoology.ufl.edu/bolker
> GPG key: www.zoology.ufl.edu/bolker/benbolker-publickey.asc
>