[R-sig-ME] Error with nlme and fixed effects defined with a minus 1

Fri Aug 13 05:06:39 CEST 2021

Thanks for the quick reply.

lm() doesn't have the same issue (at least using what I think is the equivalent construction).  See the first test below for that.  And, when I try to manually add the contrasts, nlme drops the contrasts due to missing levels; see the second example with simdata4.

Other than regenerating the model or using a contrast setup that I don't want (due to the ease of interpreting the contrasts with -1), I don't see a way around it other than augmenting the data with some data that I don't want to use.

Also, I am apparently a consistent person because I had the same issue about 4 years ago (https://bugs.r-project.org/show_bug.cgi?id=17226), and in that bug, I found that someone on stackoverflow had the same issue (https://stats.stackexchange.com/questions/29513/error-in-getting-predictions-from-a-lme-object).  In the stackoverflow answer, there is a patch that may help.  (I hesitate to make that fix into a real fix myself because I don't fully understand the internals of nlme.)

``` r
simdata <-
  merge(
    merge(
      data.frame(treatment=factor(c("A", "B"))),
      data.frame(ID=factor(1:10))
    ),
    data.frame(time=1:10)
  )
set.seed(5)
simdata$obs <- rnorm(nrow(simdata))
model_minus1 <- lm(obs~treatment-1, data=simdata)
model <- lm(obs~treatment, data=simdata)

# Generate new data with the correct treatment factor levels, but only one of
# the levels represented within the data.
simdata2 <-
  merge(
    data.frame(treatment=factor("A", levels=c("A", "B"))),
    data.frame(time=1:10)
  )
# Generate new data with all factor levels represented
simdata3 <-
  merge(
    data.frame(treatment=factor(c("A", "B"))),
    data.frame(time=1:10)
  )
predict(model, newdata=simdata2)
#>          1          2          3          4          5          6          7 
#> 0.02690558 0.02690558 0.02690558 0.02690558 0.02690558 0.02690558 0.02690558 
#>          8          9         10 
#> 0.02690558 0.02690558 0.02690558
predict(model_minus1, newdata=simdata2)
#>          1          2          3          4          5          6          7 
#> 0.02690558 0.02690558 0.02690558 0.02690558 0.02690558 0.02690558 0.02690558 
#>          8          9         10 
#> 0.02690558 0.02690558 0.02690558
predict(model_minus1, newdata=simdata3)
#>          1          2          3          4          5          6          7 
#> 0.02690558 0.02123989 0.02690558 0.02123989 0.02690558 0.02123989 0.02690558 
#>          8          9         10         11         12         13         14 
#> 0.02123989 0.02690558 0.02123989 0.02690558 0.02123989 0.02690558 0.02123989 
#>         15         16         17         18         19         20 
#> 0.02690558 0.02123989 0.02690558 0.02123989 0.02690558 0.02123989
```

``` r
library(nlme)
simdata <-
  merge(
    merge(
      data.frame(treatment=factor(c("A", "B"))),
      data.frame(ID=factor(1:10))
    ),
    data.frame(time=1:10)
  )
set.seed(5)
simdata$obs <- rnorm(nrow(simdata))
model_minus1 <- nlme(obs~e0, data=simdata, fixed=e0~treatment - 1, random=e0~1|ID, start=c(e0=c(0, 0)))
model <- nlme(obs~e0, data=simdata, fixed=e0~treatment, random=e0~1|ID, start=c(e0=c(0, 0)))

# Generate new data with the correct treatment factor levels, but only one of
# the levels represented within the data.
simdata2 <-
  merge(
    data.frame(treatment=factor("A", levels=c("A", "B"))),
    data.frame(time=1:10)
  )
# Generate new data with all factor levels represented
simdata3 <-
  merge(
    data.frame(treatment=factor(c("A", "B"))),
    data.frame(time=1:10)
  )
simdata4 <- simdata2
contrasts(simdata4$treatment) <- contrasts(simdata$treatment)
predict(model, newdata=simdata2, level=0)
#>  [1] 0.02690558 0.02690558 0.02690558 0.02690558 0.02690558 0.02690558
#>  [7] 0.02690558 0.02690558 0.02690558 0.02690558
#> attr(,"label")
#> [1] "Predicted values"
# Without all levels in newdata$treatment, it fails
predict(model_minus1, newdata=simdata2, level=0)
#> Error in pars[, nm] <- f %*% beta[fmap[[nm]]]: number of items to replace is not a multiple of replacement length
# With all levels in newdata$treatment, it works
predict(model_minus1, newdata=simdata3, level=0)
#>  [1] 0.02690558 0.02123989 0.02690558 0.02123989 0.02690558 0.02123989
#>  [7] 0.02690558 0.02123989 0.02690558 0.02123989 0.02690558 0.02123989
#> [13] 0.02690558 0.02123989 0.02690558 0.02123989 0.02690558 0.02123989
#> [19] 0.02690558 0.02123989
#> attr(,"label")
#> [1] "Predicted values"
# Same error as simdata2, but now a warning that factor levels are dropped (when
# I don't think they should be)
predict(model_minus1, newdata=simdata4, level=0)
#> Warning: contrasts dropped from factor treatment due to missing levels
#> Error in pars[, nm] <- f %*% beta[fmap[[nm]]]: number of items to replace is not a multiple of replacement length
```

<sup>Created on 2021-08-12 by the [reprex package](https://reprex.tidyverse.org) (v2.0.0)</sup>

-----Original Message-----
From: Phillip Alday <me using phillipalday.com> 
Sent: Thursday, August 12, 2021 10:08 PM
To: bill using denney.ws; r-sig-mixed-models using r-project.org
Subject: Re: [R-sig-ME] Error with nlme and fixed effects defined with a minus 1

This has more to do with how R expands formulae and contrast coding than anything specific to nlme, I think. I suspect that if you did this with just a normal (non mixed) linear model, the same problem would arise.
When you have -1, the intercept is suppressed, which in the presence of a categorical variable means that all k levels of that variable are represented as coefficients instead of k-1 (with the final contrast thrown in with the intercept). This is of course not ideal behavior and could probably be addressed or worked around in nlme, but it seems like a bit of an edge case. Maybe setting the contrasts manually will help?

contrasts(simdata2$treatment) <- contrasts(simdata$treatment)

(Good call on setting the factor levels in simdata2!)

On 12/8/21 6:59 pm, bill using denney.ws wrote:
> Hello,
> 
>  
> 
> I think that I have found a bug in nlme.  When I generate a model with 
> fixed effects defined with a factor - 1 (see model_minus1 below) and I 
> simulate with level=0 in a data.frame that does not have all the 
> levels (see predict into simdata2 with model_minus1 below), there is 
> an error.  But, if the model is defined without the minus 1 or all 
> factor levels are represented within the simulation data, there is no error.
> 
>  
> 
> Did I find a bug in nlme?
> 
>  
> 
> Thanks,
> 
>  
> 
> Bill
> 
>  
> 
> ``` r
> 
> library(nlme)
> 
> simdata <-
> 
>   merge(
> 
>     merge(
> 
>       data.frame(treatment=factor(c("A", "B"))),
> 
>       data.frame(ID=factor(1:10))
> 
>     ),
> 
>     data.frame(time=1:10)
> 
>   )
> 
> set.seed(5)
> 
> simdata$obs <- rnorm(nrow(simdata))
> 
> model_minus1 <- nlme(obs~e0, data=simdata, fixed=e0~treatment - 1, 
> random=e0~1|ID, start=c(e0=c(0, 0)))
> 
> model <- nlme(obs~e0, data=simdata, fixed=e0~treatment, 
> random=e0~1|ID, start=c(e0=c(0, 0)))
> 
>  
> 
> # Generate new data with the correct treatment factor levels, but only 
> one of
> 
> # the levels represented within the data.
> 
> simdata2 <-
> 
>   merge(
> 
>     data.frame(treatment=factor("A", levels=c("A", "B"))),
> 
>     data.frame(time=1:10)
> 
>   )
> 
> # Generate new data with all factor levels represented
> 
> simdata3 <-
> 
>   merge(
> 
>     data.frame(treatment=factor(c("A", "B"))),
> 
>     data.frame(time=1:10)
> 
>   )
> 
> predict(model, newdata=simdata2, level=0)
> 
> #>  [1] 0.02690558 0.02690558 0.02690558 0.02690558 0.02690558 
> 0.02690558
> 
> #>  [7] 0.02690558 0.02690558 0.02690558 0.02690558
> 
> #> attr(,"label")
> 
> #> [1] "Predicted values"
> 
> # Without all levels in newdata$treatment, it fails
> 
> predict(model_minus1, newdata=simdata2, level=0)
> 
> #> Error in pars[, nm] <- f %*% beta[fmap[[nm]]]: number of items to 
> replace is not a multiple of replacement length
> 
> # With all levels in newdata$treatment, it works
> 
> predict(model_minus1, newdata=simdata3, level=0)
> 
> #>  [1] 0.02690558 0.02123989 0.02690558 0.02123989 0.02690558 
> 0.02123989
> 
> #>  [7] 0.02690558 0.02123989 0.02690558 0.02123989 0.02690558 
> 0.02123989
> 
> #> [13] 0.02690558 0.02123989 0.02690558 0.02123989 0.02690558 
> 0.02123989
> 
> #> [19] 0.02690558 0.02123989
> 
> #> attr(,"label")
> 
> #> [1] "Predicted values"
> 
> ```
> 
>  
> 
> <sup>Created on 2021-08-12 by the [reprex
> package](https://reprex.tidyverse.org) (v2.0.0)</sup>
> 
> 
> 	[[alternative HTML version deleted]]
> 
> _______________________________________________
> R-sig-mixed-models using r-project.org mailing list 
> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>