[R-sig-ME] Managing person identifier variable

Wed Oct 5 21:15:34 CEST 2016

> On Oct 5, 2016, at 1:59 PM, MACDOUGALL Margaret <Margaret.MacDougall at ed.ac.uk> wrote:
> 
> Hello
> 
> I would be most grateful for some advice in relation to the interpretation of a person identifier variable (persID, say) in R. I would like to represent persons, as an independent variable, by a random effect. However, there are over 200 such persons. Each person is allocated a random numerical code as a unique identifier. Currently, R is reading the identifier variable as a numeric variable. Is there a quick way of addressing this problem by recoding the variable? (I do not wish to bin the values into category ranges; rather, I wish to avoid the numerical codes being interpreted literally.)
> 
> Many thanks
> 
> Margaret

Margaret,

How are you reading the data into R? Are you using read.table() or read.csv()?

If so, see ?read.table and note the 'colClasses' argument. 

If that argument is left at its default of NA, type.convert() (see ?type.convert) will be used to convert the columns in the incoming data from the character-based source to numeric, etc., as appropriate.
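
As a rough illustration of that default conversion, here is a minimal sketch calling type.convert() directly on character vectors. The identifier values are made up for the example; as.is = TRUE is set only so that strings which cannot be read as numbers stay character rather than becoming factors.

```r
# Hypothetical identifier codes, as they would arrive from a text file.
ids_numeric_looking <- c("1042", "2381", "3377")
ids_mixed           <- c("1042", "A17", "3377")

# Every value parses as a number, so the result is numeric (integer here).
converted_numeric <- type.convert(ids_numeric_looking, as.is = TRUE)
class(converted_numeric)

# One value does not parse as a number, so the vector stays character.
converted_mixed <- type.convert(ids_mixed, as.is = TRUE)
class(converted_mixed)
```

This is why a column of purely numeric codes ends up numeric unless you say otherwise.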

If you explicitly define colClasses for each column, you can set the persID column to "character" and it will not be coerced to numeric by default.
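
A minimal sketch of what that might look like, assuming a two-column CSV. The column names other than persID, and all of the data values, are invented for the example, and the file is supplied inline via the 'text' argument so that the code runs as is.

```r
# Hypothetical data standing in for the real file.
csv_text <- "persID,score
1042,3.1
1042,2.7
2381,4.0
3377,3.6"

# Default behaviour: persID is converted to a numeric (integer) column.
d_default <- read.csv(text = csv_text)
class(d_default$persID)

# One colClasses entry per column, in column order: persID stays character.
d_char <- read.csv(text = csv_text,
                   colClasses = c("character", "numeric"))
class(d_char$persID)

# "factor" is also an accepted class if a grouping factor is wanted directly.
d_fac <- read.csv(text = csv_text,
                  colClasses = c("factor", "numeric"))
nlevels(d_fac$persID)
```

If the data are already in R with persID read as numeric, something like d_default$persID <- factor(d_default$persID) should have a similar effect without re-reading the file.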

Some of these issues are covered in the R Data Import/Export manual in the section on Spreadsheet-like data:

  https://cran.r-project.org/doc/manuals/r-release/R-data.html#Variations-on-read_002etable

Regards,

Marc Schwartz
