[R-sig-ME] GAMM4 error

Ben Bolker bbolker at gmail.com
Tue Aug 15 22:38:24 CEST 2017


  A couple of things to try first.  Not sure if either will work but
both are easy.

  Use na.omit() to remove NA values up front. Do this only on the subset of
columns that you're actually going to use in your model, e.g.

  newdat <- na.omit(olddat[c("y","x1","x2","x3","x4","pupil","neigh")])

(this is irrelevant if your data set is already pared down to the
necessary set of columns).
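
  A minimal sketch of the na.omit() step on toy data, in case it helps.
The data frame below, its values and the `unused` column are made up for
illustration and are not the poster's data. The point it shows is that
rows are dropped only when an NA occurs in a column the model uses.

```r
# Toy data (hypothetical): NAs in x3/x4 and in a column the model never uses
olddat <- data.frame(
  y      = factor(c(0, 1, 0, 1, 0, 0)),
  x1     = c(2.1, 4.0, 7.5, 11.2, 16.9, 3.3),
  x2     = factor(c("F", "M", "M", "F", "M", "F")),
  x3     = c(NA, 12.2, 13.9, 14.4, NA, 13.0),
  x4     = c(NA, 29.0, 30.3, NA, 31.9, 30.1),
  pupil  = factor(c("35", "35", "122", "122", "193", "193")),
  neigh  = factor(c("J0L 1B0", "J0L 1B0", "J0S 1K0",
                    "J0S 1K0", "J0J 1K0", "J0J 1K0")),
  unused = c(1, NA, 3, 4, 5, NA)
)

model_vars <- c("y", "x1", "x2", "x3", "x4", "pupil", "neigh")

# Count NAs per model column before dropping anything
na_counts <- colSums(is.na(olddat[model_vars]))

# Keep only the rows that are complete in the model columns;
# NAs in `unused` do not cost any observations
newdat <- na.omit(olddat[model_vars])
```

Comparing nrow(olddat) with nrow(newdat) afterwards shows how many
observations the model would actually be fitted to.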

  Change your y variable from a factor to numeric 0/1, i.e. data$y <-
as.numeric(as.character(data$y))   *or* as.numeric(data$y)-1 (base R's
glm is very permissive about the type of the response, but GAMM4 might be
stricter).
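
  To illustrate the difference between the two conversions, here is a
small sketch on a made-up response vector (the values are hypothetical).
Note that the second form assumes the factor levels are exactly "0" and
"1", in that order.

```r
# Toy 0/1 response stored as a factor with levels "0","1"
y <- factor(c("0", "1", "0", "0", "1"))

# as.numeric() on a factor returns the internal level codes (1, 2), not 0/1
codes <- as.numeric(y)

# Option 1: convert via the level labels
y_a <- as.numeric(as.character(y))

# Option 2: shift the level codes down by one
y_b <- as.numeric(y) - 1
```

Both options give the same 0/1 vector here. Plain as.numeric(y) without
the "-1" would give 1/2 instead.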



On 17-08-15 04:20 PM, dani wrote:
> Hello Ben,
> 
> 
> Thank you so much for your kind and prompt response. I provided a little
> bit more detail about my data. I really appreciate you taking the time
> to take a look over this information.
> 
> 
> Yes, y is a 0/1 variable.
> 
> 
> I have a total of  185,236 observations nested in n=2,206 pupils and n =
> 2,314 neighbourhoods. This results in a structure with many empty cells.
> Out of a total of 6120860 cells (considering the cross-classification),
> n=6118216 are empty cells. Out of the n=2644  non-empty cells, n=1835
> (69.40%) have 84 observations per cell and n=4 have one observation per
> cell. 
> 
> 
> I suspect the issue is the fact that I have many NA in my data. My two
> variables with the smoothers have the following stats:
> 
> 
> Variable        N    Mean  Std Dev  Minimum  Maximum
> x3         153369   13.01     1.77     0       14.85
> x4         148319   30.28     2.72    18.06    38.32
> 
> 
> summary(newdata)
>  y               x1            x2               x3
>  0:183335   Min.   : 2.000   F: 77972   Min.   : 0.00
>  1:  1901   1st Qu.: 4.195   M:107264   1st Qu.:12.19
>             Median : 7.047              Median :13.85
>             Mean   : 7.866              Mean   :13.01
>             3rd Qu.:11.044              3rd Qu.:14.40
>             Max.   :17.981              Max.   :14.85
>                                         NA's   :31867
>        x4             neigh            STUDYID
>  Min.   :18.06   J0L 1B0:  2000   35     :    84
>  1st Qu.:28.98   J0S 1K0:   526   122    :    84
>  Median :30.29   J0J 1K0:   504   193    :    84
>  Mean   :30.28   H4B 1N2:   480   231    :    84
>  3rd Qu.:31.93   J0P 1P0:   336   248    :    84
>  Max.   :38.33   J7T 2A1:   336   257    :    84
>  NA's   :36917   (Other):181054   (Other):184732
> 
> str(newdata)
> 'data.frame':185236 obs. of  7 variables:
>  $ y      : Factor w/ 2 levels "0","1": 1 1 1 1 1 1 1 1 1 1 ...
>  $ x1     : num  16.9 16.9 16.9 16.9 16.9 ...
>  $ x2     : Factor w/ 2 levels "F","M": 2 2 2 2 2 2 2 2 2 2 ...
>  $ x3     : num  NA NA NA NA NA NA NA NA NA NA ...
>  $ x4     : num  NA NA NA NA NA NA NA NA NA NA ...
>  $ neigh  : Factor w/ 2314 levels "A3J 1A8","A3K 2V9",..: 802 802 802 802 802 802 802 802 802 802 ...
>  $ STUDYID: Factor w/ 2206 levels "35","122","193",..: 1 1 1 1 1 1 1 1 1 1 ...
>  
> Thank you so much for all your help!
> 
> Best regards, everyone!
> ------------------------------------------------------------------------
> *From:* R-sig-mixed-models <r-sig-mixed-models-bounces at r-project.org> on
> behalf of Ben Bolker <bbolker at gmail.com>
> *Sent:* Tuesday, August 15, 2017 12:36:19 PM
> *To:* r-sig-mixed-models at r-project.org
> *Subject:* Re: [R-sig-ME] GAMM4 error
>  
> 
>   We'd love to help, but it's really, really hard without a reproducible
> example.  All the error message really tells us is that somewhere in the
> guts there was something like a divide-by-zero error or an infinity
> produced (because your data were weird, or because some value got really
> small or really large and under/overflowed).
> 
>   A reproducible example would be ideal (see e.g.
> <https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example>
> ), but in its absence, `summary(mydata)` or `str(mydata)` would be
> useful.  For example:
> 
> - is y a 0/1 variable?
> - are all of your x variables numeric, and not super-large in magnitude?
> - do you have NA values in your data?
> - how many distinct values (levels) of pupil and neigh do you have?
> - how many observations overall?
> 
> On 17-08-15 03:07 PM, dani wrote:
>> Hello everyone,
>> 
>> 
>> I am a beginner struggling with GAMM4. I employed a GAMM4 model using
>> a binomial distribution involving two smoothers and two random
>> intercepts (corresponding to a structure involving observations
>> cross-classified into two groups: pupils and neighbourhoods):
>> 
>> 
>> model <- gamm4(y ~ x1+x2+s(x3)+s(x4), random=~ (1|pupil)+(1|neigh),
>> data=mydata, family= binomial)
>> 
>> I received the following error message: Error in
>> smooth.construct.tp.smooth.spec(object, dk$data, dk$knots) : 
>> NA/NaN/Inf in foreign function call (arg 1)
>> 
>> I was wondering if anyone could please help me work out what this
>> might mean.
>> 
>> Best regards, everyone! Nicole-Miki
>> 
>> 
>> _______________________________________________ 
>> R-sig-mixed-models at r-project.org mailing list 
>> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>>


