[R] Converting factors back to numbers. Trouble with SPSS importdata

Duncan Murdoch murdoch at stats.uwo.ca
Mon Feb 20 02:12:25 CET 2006


On 2/19/2006 7:53 PM, Paul Johnson wrote:
> On 2/19/06, Robert W. Baer, Ph.D. <rbaer at atsu.edu> wrote:
>> Quoted directly from the FAQ (although, granted, I need to look this up over
>> and over myself; would that it had an easily remembered wrapper function):
>> 7.10 How do I convert factors to numeric?
>> It may happen that when reading numeric data into R (usually, when reading
>> in a file), they come in as factors. If f is such a factor object, you can
>> use
>>
>>      as.numeric(as.character(f))
>> to get the numbers back. More efficient, but harder to remember, is
>>
>>      as.numeric(levels(f))[as.integer(f)]
> 
> I don't think I have that problem described in the FAQ.  I've had that
> before, though.
> 
> Observe. Here's the original thing:
> 
>> eldatfac$HAPPY[1:10]
>  [1] Happy      Happy      Very happy Happy      Very happy Very happy
>  [7] Happy      Very happy Happy      Very happy
> 6 Levels: Not happy at all Not very happy Happy Very happy ... Refused
> 
> Here's the result of the first thing you cite from the FAQ
> 
>> as.numeric(as.character(eldatfac$HAPPY))[1:10]
>  [1] NA NA NA NA NA NA NA NA NA NA
> Warning message:
> NAs introduced by coercion
> 
> Here's the second thing from the FAQ
> 
>> as.numeric(levels(eldatfac$HAPPY))[as.integer(eldatfac$HAPPY)]
>  [1] NA NA NA NA NA NA NA NA NA NA
> Warning message:
> NAs introduced by coercion
> 
> What am I missing here?

You're right, you have a different problem.  The FAQ is talking about 
the situation where the data in a file is numeric but is read as a 
factor, perhaps because of typos in one or two values.
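
To make the FAQ case concrete, here is a minimal sketch.  The vector
`raw` and its typo "3.x" are made up for illustration; they are not from
Paul's data.  It shows why the integer codes alone are not the numbers,
and that both FAQ idioms agree.

```r
# Hypothetical input: numeric values that arrived as text, one with a typo
raw <- c("1.5", "2", "3.x", "4.25")
f <- factor(raw)

# The integer codes are positions in levels(f), not the original values
codes <- as.integer(f)

# Both FAQ idioms recover the numbers; the typo becomes NA.
# suppressWarnings() hides the "NAs introduced by coercion" warning.
x1 <- suppressWarnings(as.numeric(as.character(f)))
x2 <- suppressWarnings(as.numeric(levels(f))[as.integer(f)])
```

Both idioms only help when the level labels themselves look like
numbers, which is why they return all NA for labels such as "Happy".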

In your case, levels(eldatfac$HAPPY) will show you the labels in the
order of R's internal integer codes (1, 2, ...).  Those codes are not the
same as the ones SPSS uses; as far as I know the SPSS coding is lost at
this point.  You'll need to work out the coding you want and apply it
yourself.  For example, if the first four levels should be coded 0:3 and
the others NA, you could use

encoding <- c(0:3, NA, NA)
encoding[as.integer(eldatfac$HAPPY)]
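
A self-contained sketch of the same idea follows, on a toy factor rather
than the real data.  The level "Placeholder level" is hypothetical and
stands in for the level that the printout above elides, and the 0:3
coding is only the example coding assumed here.  The second lookup, by
label instead of by position, is a variation that should keep working if
the level order ever changes.

```r
# Toy stand-in for eldatfac$HAPPY; "Placeholder level" is hypothetical
happy <- factor(
  c("Happy", "Very happy", "Refused", "Not very happy"),
  levels = c("Not happy at all", "Not very happy", "Happy",
             "Very happy", "Placeholder level", "Refused")
)

# Lookup by position: element i of 'encoding' is the code for level i
encoding <- c(0:3, NA, NA)
by_position <- encoding[as.integer(happy)]

# Lookup by label: levels missing from 'codes' come back as NA
codes <- c("Not happy at all" = 0, "Not very happy" = 1,
           "Happy" = 2, "Very happy" = 3)
by_label <- unname(codes[as.character(happy)])
```

Check levels(happy) against the intended coding before relying on the
positional version, since it silently depends on the level order.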

Duncan Murdoch



