[R-sig-ME] Managing person identifier variable
Marc Schwartz
marc_schwartz at me.com
Wed Oct 5 21:28:31 CEST 2016
> On Oct 5, 2016, at 2:21 PM, Theodore Lytras <thlytras at gmail.com> wrote:
>
> On Wednesday, 5 October 2016, 6:59:30 PM EEST, MACDOUGALL Margaret wrote:
>> I would be most grateful for some advice in relation to the interpretation
>> of a person identifier variable (persID, say), in R. I would like to
>> represent persons, as an independent variable, by a random effect. However,
>> there are over 200 such persons. Each person is allocated a random
>> numerical code as a unique identifier. Currently, R is reading the
>> identifier variable as a numeric variable. Is there a quick way of
>> addressing this problem by recoding the variable? (I do not wish to bin
>> the values into category ranges; rather, I wish to avoid the numerical
>> codes being interpreted literally.)
>
> Just recode it as a factor, i.e. factor(persID).
>
> By the way, lme4 does that implicitly if you specify a numeric variable as a
> random effect in a model formula, i.e. you can just say: y ~ x + (1|persID)
> instead of: y ~ x + (1|factor(persID))
One quick caveat: if the persID values contain leading zeros that are a material part of the unique IDs, such as:
01234
001234
then converting them to a factor after they have already been coerced to numeric values will collapse both of the above into the single level 1234:
> factor(as.numeric("01234"))
[1] 1234
Levels: 1234
> factor(as.numeric("001234"))
[1] 1234
Levels: 1234
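One way to avoid this, sketched below, is to keep the IDs as character strings until they are converted to a factor. The data frame `dat`, the column `y`, and the inline CSV text are made up for illustration; only `persID` comes from the original question.

```r
# Hypothetical IDs that differ only in their leading zeros
ids <- c("01234", "001234")

# Coercing to numeric first drops the leading zeros, so both IDs merge
f_num <- factor(as.numeric(ids))
nlevels(f_num)

# Converting the character strings directly keeps the IDs distinct
f_chr <- factor(ids)
nlevels(f_chr)

# When reading from a file, ask for the ID column as character up front.
# The inline CSV below is an illustrative stand-in for a real data file.
dat <- read.csv(text = "persID,y\n01234,1.2\n001234,3.4",
                colClasses = c(persID = "character"))
dat$persID <- factor(dat$persID)
levels(dat$persID)
```

If the IDs have already been read in as numbers, the leading zeros are gone and can only be recovered by re-reading the original source.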
Food for thought...
Regards,
Marc Schwartz