[R-sig-ME] gamm4 error with large dataset

Daniel Hocking dhocking at umass.edu
Wed Apr 30 18:03:44 CEST 2014

I am trying to predict daily water temperature, primarily from air temperature, but ideally I would also include other factors such as precipitation and landscape characteristics. I have paired air and water temperatures from 600+ sites over a ~10 year period. Some sites have daily temperatures for just a couple of months, others for years, and some for a couple of months sporadically in different years. I am trying to use a generalized additive mixed model (GAMM) so that I can include random effects of site and year and smooth over day of the year (dOY). My dataframe is 

and I get the following error when I run this code

system.time(
  gamm4Full <- gamm4(
    temp ~ airTemp + airTempLagged1 + airTempLagged2 +
      prcp + prcpLagged1 + prcpLagged2 +
      Latitude + Longitude + Forest + Agriculture +
      BasinElevationM + ReachSlopePCNT + CONUSWetland + SurficialCoarseC +
      s(dOY) + prcp*airTemp,
    random = ~(1 | site) + (1 | year),
    data = etS
  )
)

# Error in crossprod(root.phi %*% Zt) : 
# Cholmod error 'problem too large' at file ../Core/cholmod_aat.c, line 173
# In addition: Warning message:
#   In optwrap(optimizer, devfun, getStart(start, rho$lower, rho$pp),  :
#               convergence code 1 from bobyqa: bobyqa -- maximum number of function evaluations exceeded

My plan was to try gamm4 and, if there were autocorrelation issues, to switch to gamm within mgcv. I know bam is designed for large data, but I'm not sure how I would code the random effects using bam. I know in general it's s(dOY, bs = 're'), but I'm not sure how to relate this to site and year. Ideally I would have random slopes for the airTemp effect for each site because of things like groundwater inputs that we don't measure.
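
For concreteness, here is a minimal sketch of what I imagine the bam version might look like. It assumes that mgcv's "re" basis is applied to the grouping factor itself (site, year) rather than to dOY, and that a random slope can be written as s(site, airTemp, bs = "re"). The data are simulated stand-ins for etS, with made-up sizes and coefficients, and the covariates are trimmed to airTemp and dOY, so this is only an illustration and not the model I actually ran.

```r
library(mgcv)

set.seed(1)

# Simulated stand-in for etS (hypothetical sizes, not the real data)
nSite <- 20
nYear <- 5
d <- expand.grid(site = factor(seq_len(nSite)),
                 year = factor(2000 + seq_len(nYear)),
                 dOY  = seq(100, 277, by = 3))

# Air temperature with a seasonal cycle plus noise
d$airTemp <- 15 + 10 * sin(2 * pi * (d$dOY - 100) / 365) +
  rnorm(nrow(d), sd = 3)

# Made-up site intercepts, site-specific airTemp slopes, and year intercepts
siteInt   <- rnorm(nSite, sd = 1)
siteSlope <- rnorm(nSite, sd = 0.1)
yearInt   <- rnorm(nYear, sd = 0.5)

d$temp <- 5 + siteInt[as.integer(d$site)] + yearInt[as.integer(d$year)] +
  (0.6 + siteSlope[as.integer(d$site)]) * d$airTemp +
  rnorm(nrow(d), sd = 1)

# Random intercepts for site and year, plus a random airTemp slope by site.
# site and year must be factors for the "re" basis to treat them as groups.
fit <- bam(temp ~ airTemp + s(dOY) +
             s(site, bs = "re") +
             s(year, bs = "re") +
             s(site, airTemp, bs = "re"),
           data = d, method = "fREML")

summary(fit)
```

As far as I understand it, terms specified this way are treated as independent Gaussian random effects, so the site intercepts and slopes are not allowed to be correlated as they would be with (1 + airTemp | site) in lme4 syntax.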

Any advice would be appreciated,
Daniel Hocking
Department of Environmental Conservation
Northeast Climate Science Center
University of Massachusetts

dhocking at umass.edu
