[R-sig-ME] help with the logistic formula using nlme/nlmer

Thierry Onkelinx thierry.onkelinx at inbo.be
Mon Jun 8 15:37:45 CEST 2015


Dear Hans,

I'd rather use a GAMM (generalized additive mixed model) with a penalized
regression spline for total.hours.worked, with a small basis dimension for
the smoother (k = 3 or k = 4).
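
A minimal sketch of what such a model could look like, assuming gamm() from
the mgcv package (the lme4-based gamm4 package would be another route). The
data below are simulated stand-ins with the same column names as cl.df; the
sample sizes and effect sizes are made up for illustration and are not taken
from the real survey.

```r
library(mgcv)  # provides gamm(); mgcv ships with R as a recommended package

set.seed(1)

# Hypothetical toy data mimicking the structure of cl.df (NOT the real data)
n.country <- 8    # countries
n.area    <- 10   # areas per country
n.child   <- 15   # children per area
sim <- expand.grid(child   = seq_len(n.child),
                   area    = seq_len(n.area),
                   country = seq_len(n.country))
sim$country <- factor(paste0("C", sim$country))
sim$areaID  <- factor(paste(sim$country, sim$area, sep = "."))
sim$age     <- sample(7:14, nrow(sim), replace = TRUE)
# Right-skewed working hours, as in the quantiles reported in the question
sim$total.hours.worked <- round(rexp(nrow(sim), rate = 1 / 8))

# Random intercepts for country and for area (made-up standard deviations)
u.country <- rnorm(nlevels(sim$country), sd = 0.7)
u.area    <- rnorm(nlevels(sim$areaID), sd = 0.4)
eta <- 2 + 0.05 * (sim$age - 10) - 0.08 * sim$total.hours.worked +
  u.country[as.integer(sim$country)] + u.area[as.integer(sim$areaID)]
sim$school <- rbinom(nrow(sim), size = 1, prob = plogis(eta))

# Penalized regression spline for hours with a small basis (k = 4);
# random intercepts for country and for areaID nested within country.
# For non-Gaussian families gamm() fits the model by PQL.
fit <- gamm(school ~ age + s(total.hours.worked, k = 4),
            random = list(country = ~1, areaID = ~1),
            family = binomial, data = sim)

summary(fit$gam)  # parametric terms and the smooth of total.hours.worked
```

plot(fit$gam) would then show the estimated smooth effect of
total.hours.worked on the logit scale, which replaces both the manual
classes and the search for an exponent.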

Best regards,

ir. Thierry Onkelinx
Instituut voor natuur- en bosonderzoek / Research Institute for Nature and
Forest
team Biometrie & Kwaliteitszorg / team Biometrics & Quality Assurance
Kliniekstraat 25
1070 Anderlecht
Belgium

To call in the statistician after the experiment is done may be no more
than asking him to perform a post-mortem examination: he may be able to say
what the experiment died of. ~ Sir Ronald Aylmer Fisher
The plural of anecdote is not data. ~ Roger Brinner
The combination of some data and an aching desire for an answer does not
ensure that a reasonable answer can be extracted from a given body of data.
~ John Tukey

2015-06-08 15:07 GMT+02:00 Hans Ekbrand <hans.ekbrand at gmail.com>:

> Dear list,
>
> I model the effect of child labour on the child's probability of being
> in school. The data come from 22 countries. Countries have different
> means on the outcome variable, i.e. the probability of being in school
> is to a large part determined by the country in which the child
> resides. The sample includes only children aged 7-14 years.
>
> Child labour is a numerical covariate measured in hours, being in
> school is a binary variable, and age is a numerical covariate measured
> in years.
>
> Data is available here: http://hansekbrand.se/code/cl.df.RData
>
> > str(cl.df)
> 'data.frame':   345321 obs. of  8 variables:
>  $ country           : Factor w/ 23 levels "Armenia","Burkina Faso",..: 1 1 1 1 1 1 1 1 1 1 ...
>  $ areaID            : Factor w/ 14584 levels "Armenia.1","Armenia.10",..: 3 3 3 3 3 3 2 2 2 2 ...
>  $ school            : Factor w/ 2 levels "No","Yes": 2 2 2 2 2 2 2 2 2 2 ...
>  $ age               : num  9 10 8 11 14 10 12 10 14 12 ...
>  $ total.hours.worked: num  1 1 1 0 1 0 0 0 0 0 ...
>  $ Chorehours        : num  1 1 1 0 1 0 0 0 0 0 ...
>  $ Chwkhours         : num  0 0 0 0 0 0 0 0 0 0 ...
>  $ Chothwkhours      : num  0 0 0 0 0 0 0 0 0 0 ...
>
> The distribution of child labour, which is indicated by the variable
> total.hours.worked, has a positive skew.
>
> > quantile(cl.df$total.hours.worked, probs = seq(from = 0, to = 1, by = 0.1))
>   0%  10%  20%  30%  40%  50%  60%  70%  80%  90% 100%
>    0    0    0    0    3    5    7   10   14   28  133
>
> I have manually defined classes for this variable,
>
> library(car)
> cl.df$hours.class <- recode(cl.df$total.hours.worked, recodes = ("lo:7=1;
>  7:14=2; 14:21=3; 21:28=4; 28:35=5; 35:42=6; 42:49=7; 49:56=8; 56:63=9;
>  63:hi='more than ten'"), as.factor.result=TRUE)
>
> and used them like this:
>
> fm1 <- glmer(school ~ age + hours.class + (1|country) + (1|areaID), data =
> cl.df, family = binomial)
>
> This works, but I would prefer to fit a non-linear regression with a
> polynomial form instead. I think a simple exponential function would
> work.
>
> E.g.
>
> fm1 <- glmer(school ~ age + I(total.hours.worked^2) + (1|country) +
> (1|areaID), data = cl.df, family = binomial)
>
> However, I *think* nlmer() could be used to find the optimal number
> instead of "2" here. But I don't know how to do that. I have searched
> the archive, but found rather few posts concerning nlmer(), so any
> help is much appreciated.
>
> If you can solve the problem with nlme() or anything else for that
> matter, that's perfectly fine. I'm used to lme4, but I'm happy to
> learn new stuff.
>
> _______________________________________________
> R-sig-mixed-models at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>
