[R-sig-ME] help with the logistic formula using nlme/nlmer

Hans Ekbrand hans.ekbrand at gmail.com
Mon Jun 8 15:07:52 CEST 2015


Dear list,

I am modelling the effect of child labour on a child's probability of
being in school. The data come from 22 countries. Countries have
different means on the outcome variable, i.e. the probability of being
in school is to a large extent determined by the country in which the
child resides. The sample includes only children aged 7-14 years.

Child labour is a numerical covariate measured in hours, being in
school is a binary variable, and age is a numerical covariate measured
in years.

Data is available here: http://hansekbrand.se/code/cl.df.RData

> str(cl.df)
'data.frame':	345321 obs. of  8 variables:
 $ country           : Factor w/ 23 levels "Armenia","Burkina Faso",..: 1 1 1 1 1 1 1 1 1 1 ...
 $ areaID            : Factor w/ 14584 levels "Armenia.1","Armenia.10",..: 3 3 3 3 3 3 2 2 2 2 ...
 $ school            : Factor w/ 2 levels "No","Yes": 2 2 2 2 2 2 2 2 2 2 ...
 $ age               : num  9 10 8 11 14 10 12 10 14 12 ...
 $ total.hours.worked: num  1 1 1 0 1 0 0 0 0 0 ...
 $ Chorehours        : num  1 1 1 0 1 0 0 0 0 0 ...
 $ Chwkhours         : num  0 0 0 0 0 0 0 0 0 0 ...
 $ Chothwkhours      : num  0 0 0 0 0 0 0 0 0 0 ...

The distribution of child labour, which is indicated by the variable
total.hours.worked, has a positive skew.

> quantile(cl.df$total.hours.worked, probs = seq(from = 0, to = 1, by = 0.1))
  0%  10%  20%  30%  40%  50%  60%  70%  80%  90% 100% 
   0    0    0    0    3    5    7   10   14   28  133 

I have manually defined classes for this variable:

library(car)
cl.df$hours.class <- recode(cl.df$total.hours.worked, recodes = ("lo:7=1;
 7:14=2; 14:21=3; 21:28=4; 28:35=5; 35:42=6; 42:49=7; 49:56=8; 56:63=9;
 63:hi='more than ten'"), as.factor.result=TRUE)
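
(For readers without car installed, roughly the same binning can be
sketched with base R's cut(). This is only a sketch: the toy vector is
made up, and I am assuming that a value on a class boundary, e.g. 7 or
14, should fall in the lower class.)

```r
# Toy values, made up for illustration (not taken from cl.df)
hours <- c(0, 3, 7, 8, 14, 20, 28, 40, 63, 133)

# Right-closed 7-hour bins: [0,7], (7,14], ..., (56,63], (63,Inf)
breaks <- c(seq(0, 63, by = 7), Inf)

# Same labels as in the recode() call: 1..9 plus "more than ten"
labels <- c(as.character(1:9), "more than ten")

hours.class <- cut(hours, breaks = breaks, labels = labels,
                   include.lowest = TRUE, right = TRUE)

# One row per class, showing how the toy values were binned
table(hours.class)
```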

I then used the classes like this:

fm1 <- glmer(school ~ age + hours.class + (1|country) + (1|areaID), data = cl.df, family = binomial)

This works, but I would prefer to fit a non-linear regression with a
polynomial form instead. I think a simple power function of the hours
worked would work.

E.g.

fm1 <- glmer(school ~ age + I(total.hours.worked^2) + (1|country) + (1|areaID), data = cl.df, family = binomial)

However, I *think* nlmer() could be used to find the optimal number
instead of "2" here. But I don't know how to do that. I have searched
the archive, but found rather few posts concerning nlmer(), so any
help is much appreciated.
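
To make the question concrete, here is a minimal sketch of a crude
workaround: refit the glmer() model over a grid of candidate exponents
and compare log-likelihoods. It is not an nlmer() solution. The data
are simulated stand-ins (made-up coefficients, country random effect
only, hours divided by 10 to keep the scale manageable), so every name
except glmer() and the variables from cl.df is an assumption.

```r
library(lme4)

set.seed(1)

# Simulated stand-in for cl.df: made-up numbers, not the real data
n.country <- 20
d <- data.frame(country = factor(rep(seq_len(n.country), each = 200)))
d$age   <- sample(7:14, nrow(d), replace = TRUE)
d$hours <- rexp(nrow(d), rate = 1 / 8)        # positively skewed hours
u       <- rnorm(n.country, sd = 0.5)         # country-level intercepts
true.p  <- 0.5                                # exponent used to simulate
eta     <- 2 + 0.05 * d$age -
           0.6 * (d$hours / 10)^true.p + u[as.integer(d$country)]
d$school <- rbinom(nrow(d), size = 1, prob = plogis(eta))

# Profile the exponent: one GLMM fit per candidate p, keep the logLik
p.grid <- seq(0.25, 2, by = 0.25)
ll <- sapply(p.grid, function(p) {
  d$hp <- (d$hours / 10)^p                    # local copy of d is modified
  fit <- glmer(school ~ age + hp + (1 | country),
               data = d, family = binomial)
  as.numeric(logLik(fit))
})

# Exponent on the grid with the highest log-likelihood
best.p <- p.grid[which.max(ll)]
data.frame(p = p.grid, logLik = ll)
```

All fits on the grid have the same number of parameters, so their
log-likelihoods are directly comparable. The drawback is that standard
errors from the final fit ignore the uncertainty in the exponent, which
is why I would prefer to estimate it inside the model.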

If you can solve the problem with nlme() or anything else for that
matter, that's perfectly fine. I'm used to lme4, but I'm happy to
learn new stuff.


