[R-sig-ME] Unidentifiable model in lmer

Fri Sep 18 17:15:56 CEST 2015

Yes, this works.

library(lme4)
result <- lmer(y ~ A * B + (1|ID), data = dataset)
summary(result) # works fine
library(lmerTest)
summary(result) # works fine
result <- lmer(y ~ A * B + (1|ID), data = dataset)
summary(result) # yields "Model is not identifiable..."

This might be due to the fact that ID is nested in A. I recommend that
you contact the authors of lmerTest about this problem.

Best regards,
ir. Thierry Onkelinx
Instituut voor natuur- en bosonderzoek / Research Institute for Nature
and Forest
team Biometrie & Kwaliteitszorg / team Biometrics & Quality Assurance
Kliniekstraat 25
1070 Anderlecht
Belgium

To call in the statistician after the experiment is done may be no
more than asking him to perform a post-mortem examination: he may be
able to say what the experiment died of. ~ Sir Ronald Aylmer Fisher
The plural of anecdote is not data. ~ Roger Brinner
The combination of some data and an aching desire for an answer does
not ensure that a reasonable answer can be extracted from a given body
of data. ~ John Tukey

2015-09-18 17:08 GMT+02:00 Takahiro Fushimi <taka.6765 op gmail.com>:
> Sorry about it. Does this work?
>
> dataset <- data.frame(y = c(60625.19, 51088.38, 67283.03, 56550.54, NA,
> NA, 58720.44,63939.01, 66298.32, 62662.15, 53604.41, 88338.55, 62818.78,
> 68696.71,57433.98, 59105.99, 44801.97, 68655.48, 62645.16, 69409.15,
> 43568.70,55693.21, 57322.63, 51095.55, 57660.41, 64128.63, 55216.42,
> 59945.17,67284.31, 55401.13, 60471.59, NA, 63633.07, 65939.14,NA, 69488.38,
> 43950.50, 52782.99, 55674.94, 70130.25, 72560.25, 69297.95, 53188.98,
> 75687.53, 68242.23, 60696.15, 65087.04, 59377.71, 55055.45, 66673.40,
> 65933.37, 62636.56, 49606.80, 58668.13, 56694.13, NA, NA, 61928.95,63208.84,
> 83786.71, 72727.53, 78873.45, 47482.32, 65967.84, 53228.25, 65699.59,
> 61792.81, 44816.72, 71048.06, 59671.84, 89513.43, 56972.30, 45439.25,
> 51357.83, NA, 60146.86, 56845.88, 63181.13, 63704.80, 49140.22, 54538.14,
> 49411.32, 66059.73, 73147.55, 55665.96, 66821.27, 66935.12, 54350.63,
> 48116.25, 67664.66, 64278.54, 64555.03, 62463.60, 55831.53, 57392.61,
> 75727.00, 64839.50, 51009.78, 65274.59, 63339.91, 62276.49, 66159.42,
> 58333.76, 60275.09) , A = as.factor(c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
> 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2,
> 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2,
> 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2)) , B = as.factor(c(1, 1, 1,
> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1 ,1 ,1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0,
> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
> 0)), ID = as.factor(c(72,  73,  74,  75,  76, 78,  79,  86,  88,  89,  90,
> 91,  92,  93,  94,  95,  96,  121, 122, 123, 124, 77,  80,  81, 82,  83,
> 84,  98,  116, 117, 118, 119, 120, 125, 128, 100, 101, 102, 103, 104, 105,
> 106, 107, 108, 109, 110, 112, 113, 114, 115, 126, 127, 72, 73, 74, 75, 76,
> 78, 79, 86, 88, 89, 90, 91, 92, 93, 94, 95, 96, 121, 122, 123, 124, 77, 80,
> 81, 82, 83, 84, 98, 116, 117, 118, 119, 120, 125, 128, 100, 101, 102, 103,
> 104, 105, 106, 107, 108, 109, 110, 112, 113, 114, 115, 126, 127)))
>
> Best regards,
> Takahiro
>
>
> On 9/18/15 10:41 AM, Thierry Onkelinx wrote:
>>
>> Dear Takahiro,
>>
>> Please send the output of dput(dataset). That's a lot easier to
>> copy-paste into an R session.
>>
>> Best regards,
>> ir. Thierry Onkelinx
>> Instituut voor natuur- en bosonderzoek / Research Institute for Nature
>> and Forest
>> team Biometrie & Kwaliteitszorg / team Biometrics & Quality Assurance
>> Kliniekstraat 25
>> 1070 Anderlecht
>> Belgium
>>
>> To call in the statistician after the experiment is done may be no
>> more than asking him to perform a post-mortem examination: he may be
>> able to say what the experiment died of. ~ Sir Ronald Aylmer Fisher
>> The plural of anecdote is not data. ~ Roger Brinner
>> The combination of some data and an aching desire for an answer does
>> not ensure that a reasonable answer can be extracted from a given body
>> of data. ~ John Tukey
>>
>>
>> 2015-09-18 16:40 GMT+02:00 Takahiro Fushimi <taka.6765 op gmail.com>:
>>>
>>> The data set is as follows:
>>>>
>>>> dataset
>>>
>>>             y A B  ID
>>> 1   60625.19 0 1  72
>>> 2   51088.38 0 1  73
>>> 3   67283.03 0 1  74
>>> 4   56550.54 0 1  75
>>> 5         NA 0 1  76
>>> 6         NA 0 1  78
>>> 7   58720.44 0 1  79
>>> 8   63939.01 0 1  86
>>> 9   66298.32 0 1  88
>>> 10  62662.15 0 1  89
>>> 11  53604.41 0 1  90
>>> 12  88338.55 0 1  91
>>> 13  62818.78 0 1  92
>>> 14  68696.71 0 1  93
>>> 15  57433.98 0 1  94
>>> 16  59105.99 0 1  95
>>> 17  44801.97 0 1  96
>>> 18  68655.48 0 1 121
>>> 19  62645.16 0 1 122
>>> 20  69409.15 0 1 123
>>> 21  43568.70 0 1 124
>>> 22  55693.21 1 1  77
>>> 23  57322.63 1 1  80
>>> 24  51095.55 1 1  81
>>> 25  57660.41 1 1  82
>>> 26  64128.63 1 1  83
>>> 27  55216.42 1 1  84
>>> 28  59945.17 1 1  98
>>> 29  67284.31 1 1 116
>>> 30  55401.13 1 1 117
>>> 31  60471.59 1 1 118
>>> 32        NA 1 1 119
>>> 33  63633.07 1 1 120
>>> 34  65939.14 1 1 125
>>> 35        NA 1 1 128
>>> 36  69488.38 2 1 100
>>> 37  43950.50 2 1 101
>>> 38  52782.99 2 1 102
>>> 39  55674.94 2 1 103
>>> 40  70130.25 2 1 104
>>> 41  72560.25 2 1 105
>>> 42  69297.95 2 1 106
>>> 43  53188.98 2 1 107
>>> 44  75687.53 2 1 108
>>> 45  68242.23 2 1 109
>>> 46  60696.15 2 1 110
>>> 47  65087.04 2 1 112
>>> 48  59377.71 2 1 113
>>> 49  55055.45 2 1 114
>>> 50  66673.40 2 1 115
>>> 51  65933.37 2 1 126
>>> 52  62636.56 2 1 127
>>> 53  49606.80 0 0  72
>>> 54  58668.13 0 0  73
>>> 55  56694.13 0 0  74
>>> 56        NA 0 0  75
>>> 57        NA 0 0  76
>>> 58  61928.95 0 0  78
>>> 59  63208.84 0 0  79
>>> 60  83786.71 0 0  86
>>> 61  72727.53 0 0  88
>>> 62  78873.45 0 0  89
>>> 63  47482.32 0 0  90
>>> 64  65967.84 0 0  91
>>> 65  53228.25 0 0  92
>>> 66  65699.59 0 0  93
>>> 67  61792.81 0 0  94
>>> 68  44816.72 0 0  95
>>> 69  71048.06 0 0  96
>>> 70  59671.84 0 0 121
>>> 71  89513.43 0 0 122
>>> 72  56972.30 0 0 123
>>> 73  45439.25 0 0 124
>>> 74  51357.83 1 0  77
>>> 75        NA 1 0  80
>>> 76  60146.86 1 0  81
>>> 77  56845.88 1 0  82
>>> 78  63181.13 1 0  83
>>> 79  63704.80 1 0  84
>>> 80  49140.22 1 0  98
>>> 81  54538.14 1 0 116
>>> 82  49411.32 1 0 117
>>> 83  66059.73 1 0 118
>>> 84  73147.55 1 0 119
>>> 85  55665.96 1 0 120
>>> 86  66821.27 1 0 125
>>> 87  66935.12 1 0 128
>>> 88  54350.63 2 0 100
>>> 89  48116.25 2 0 101
>>> 90  67664.66 2 0 102
>>> 91  64278.54 2 0 103
>>> 92  64555.03 2 0 104
>>> 93  62463.60 2 0 105
>>> 94  55831.53 2 0 106
>>> 95  57392.61 2 0 107
>>> 96  75727.00 2 0 108
>>> 97  64839.50 2 0 109
>>> 98  51009.78 2 0 110
>>> 99  65274.59 2 0 112
>>> 100 63339.91 2 0 113
>>> 101 62276.49 2 0 114
>>> 102 66159.42 2 0 115
>>> 103 58333.76 2 0 126
>>> 104 60275.09 2 0 127
>>>
>>> Best regards,
>>> Takahiro
>>>
>>>
>>> On 9/18/15 3:23 AM, Thierry Onkelinx wrote:
>>>>
>>>> The formula shouldn't be the problem. model.matrix() is clever enough
>>>> to handle this. Have a look at the examples below.
>>>>
>>>> model.matrix(~ A + B + A * B, data= data.frame(A = 1, B = 1))
>>>> model.matrix(~ A + B + A * B + B + A:B + B:A, data= data.frame(A = 1, B
>>>> =
>>>> 1))
>>>>
>>>> I suggest to give use a reproducible example of the problem so that we
>>>> can invesigate what when wrong.
>>>> ir. Thierry Onkelinx
>>>> Instituut voor natuur- en bosonderzoek / Research Institute for Nature
>>>> and Forest
>>>> team Biometrie & Kwaliteitszorg / team Biometrics & Quality Assurance
>>>> Kliniekstraat 25
>>>> 1070 Anderlecht
>>>> Belgium
>>>>
>>>> To call in the statistician after the experiment is done may be no
>>>> more than asking him to perform a post-mortem examination: he may be
>>>> able to say what the experiment died of. ~ Sir Ronald Aylmer Fisher
>>>> The plural of anecdote is not data. ~ Roger Brinner
>>>> The combination of some data and an aching desire for an answer does
>>>> not ensure that a reasonable answer can be extracted from a given body
>>>> of data. ~ John Tukey
>>>>
>>>>
>>>> 2015-09-18 7:23 GMT+02:00 Ché Lucero <chelucero op uchicago.edu>:
>>>>>
>>>>> I don't know if this is causing the problem you're seeing, but A*B
>>>>> expands
>>>>> to A + B + A:B, so your model right now is R ~ A + B + A + B + A:B +
>>>>> (1|S).
>>>>>
>>>>> On Thu, Sep 17, 2015 at 5:53 PM Takahiro Fushimi <taka.6765 op gmail.com>
>>>>> wrote:
>>>>>
>>>>>> Hi everyone,
>>>>>>
>>>>>> I have been working on a linear mixed effect model by using lmer()
>>>>>> function, but I got an error saying that the fitted model is not
>>>>>> identifiable.
>>>>>>
>>>>>> The data set includes the following variables:
>>>>>> y = a numeric variable
>>>>>> factorA = a 3-level categorical variable
>>>>>> factorB = a 2-level categorical variable
>>>>>> subjectID = subject id number. 2 measurements of y for each subject
>>>>>>
>>>>>> R code and output are as follows:
>>>>>>    > (result <- lmer(y ~ factorA + factorB + factorA*factorB +
>>>>>> (1|subjectID)))
>>>>>> Linear mixed model fit by REML ['merModLmerTest']
>>>>>> Formula: y ~ factorA + factorB + factorA * factorB + (1 | subjectID)
>>>>>> REML criterion at convergence: 1928.966
>>>>>> Random effects:
>>>>>>     Groups    Name        Std.Dev.
>>>>>>     subjectID (Intercept) 4711
>>>>>>     Residual              7688
>>>>>> Number of obs: 97, groups:  subjectID, 51
>>>>>> Fixed Effects:
>>>>>>          (Intercept)           factorA1           factorA2 factorB1
>>>>>> factorA1:factorB1  factorA2:factorB1
>>>>>>                62411              -2700              -1124
>>>>>> -1037               1279               2482
>>>>>>    > summary(result)
>>>>>> Model is not identifiable...
>>>>>> summary from lme4 is returned
>>>>>> some computational error has occurred in lmerTest
>>>>>> Linear mixed model fit by REML ['lmerMod']
>>>>>> Formula: y ~ factorA + factorB + factorA * factorB + (1 | subjectID)
>>>>>>
>>>>>> REML criterion at convergence: 1929
>>>>>>
>>>>>> Scaled residuals:
>>>>>>         Min       1Q   Median       3Q      Max
>>>>>> -1.93423 -0.62611  0.01837  0.48887  2.73380
>>>>>>
>>>>>> Random effects:
>>>>>>     Groups    Name        Variance Std.Dev.
>>>>>>     subjectID (Intercept) 22194074 4711
>>>>>>     Residual              59108495 7688
>>>>>> Number of obs: 97, groups:  subjectID, 51
>>>>>>
>>>>>> Fixed effects:
>>>>>>                      Estimate Std. Error t value
>>>>>> (Intercept)          62411       2065  30.227
>>>>>> factorA1             -2700       3238  -0.834
>>>>>> factorA2             -1124       3008  -0.374
>>>>>> factorB1             -1037       2512  -0.413
>>>>>> factorA1:factorB1     1279       4013   0.319
>>>>>> factorA2:factorB1     2482       3642   0.681
>>>>>>
>>>>>> Correlation of Fixed Effects:
>>>>>>                (Intr) fctrA1 fctrA2 fctrB1 fA1:B1
>>>>>> factorA1    -0.638
>>>>>> factorA2    -0.687  0.438
>>>>>> factorB1    -0.608  0.388  0.418
>>>>>> fctrA1:fcB1  0.381 -0.601 -0.262 -0.626
>>>>>> fctrA2:fcB1  0.420 -0.268 -0.606 -0.690  0.432
>>>>>>    >
>>>>>>
>>>>>> Could anyone give me some idea of why this unidentifiability problem
>>>>>> happens and how to fix it?
>>>>>> Any help would be appreciated.
>>>>>>
>>>>>> Best regards,
>>>>>> Takahiro
>>>>>>
>>>>>> _______________________________________________
>>>>>> R-sig-mixed-models op r-project.org mailing list
>>>>>> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>>>>>>
>>>>>           [[alternative HTML version deleted]]
>>>>>
>>>>> _______________________________________________
>>>>> R-sig-mixed-models op r-project.org mailing list
>>>>> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>>>
>>>
>>> --
>>> Takahiro Fushimi
>>> Columbia University in the City of New York
>>> Master of Arts in Statistics
>>>
>
> --
> Takahiro Fushimi
> Columbia University in the City of New York
> Master of Arts in Statistics
>