[R] spss.read factor reversal

Thomas Lumley tlumley at u.washington.edu
Wed Jul 27 16:32:07 CEST 2005


On Wed, 27 Jul 2005, Peter Dalgaard wrote:

> Adaikalavan Ramasamy <ramasamy at cancer.org.uk> writes:
>
>> I think it is doing what it is supposed to do, but I have never used
>> read.spss, so take this with a pinch of salt.
>>
>> In R, when you use as.integer on a factor, the first level gets a
>> value of 1, the second a value of 2, and so on. The order of the
>> levels can be determined from the levels() function.
>>
>>    f <- factor( c("Green", "Green", "Red", "Blue"),
>>                 levels=c("Red", "Blue", "Green") )
>>    levels(f)
>>    [1] "Red"   "Blue"  "Green"
>>
>>    as.integer(f)
>>    [1] 3 3 1 2
>>
>> But the levels of a factor can be changed
>>
>>    as.integer( factor( f, levels=c("Green", "Blue", "Red" ) ) )
>>    [1] 1 1 3 2
>
> Doesn't explain why  1 2 3 in the input file comes out as Green Blue
> Red, does it?
>
>> You can also try setting use.value.labels=FALSE in the read.spss
>> function and then creating a factor from the result.
>
> Would be interesting to see this. I would suspect that the damage is
> already done at that point though.

It would also be interesting because the raw value label information is 
then available as the "value.labels" attribute of the variable.
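
A rough sketch of what that check could look like. The codes and labels 
here are made up and stand in for one column returned by read.spss with 
use.value.labels=FALSE; it assumes the attribute is a named vector with 
the labels as names and the codes as values, as the factor() call quoted 
below suggests.

```r
# Hypothetical stand-in for one column of the read.spss() result:
# numeric codes plus a "value.labels" attribute (label -> code).
x <- c(3, 3, 1, 2)
attr(x, "value.labels") <- c(Green = 3, Blue = 2, Red = 1)  # labels stored in reverse order

# Inspect the raw code/label pairs before any factor is built.
vl <- attr(x, "value.labels")
print(vl)

# Rebuild the factor by hand, pairing each code with its own label.
f <- factor(x, levels = vl, labels = names(vl))
print(data.frame(code = as.vector(x), label = f))
```

If the pairs printed from a real file already look wrong at the print(vl) 
step, the problem is upstream of factor().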

> I notice that the value labels are in reverse order. Shouldn't matter
> to read.spss which has
>
>            rval[[nm]] <- factor(rval[[nm]], levels = vl[[v]],
>                labels = trim(names(vl[[v]])))
>
> i.e. levels and labels should be in the correct order.
>

yes, something wrong must be coming out of the .C call.
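
For what it is worth, a small sketch with invented codes and labels 
shows why the order of the value labels should not matter to that 
factor() call: levels and labels are matched position by position, so 
reversing both together changes only the internal integer codes. The 
trim() helper from read.spss is left out here since the made-up labels 
have no padding.

```r
# Invented example data: three codes and their labels.
codes  <- c(1, 2, 3, 1)
vl_fwd <- c(Red = 1, Blue = 2, Green = 3)  # labels in ascending code order
vl_rev <- rev(vl_fwd)                      # same pairs, reverse order

# Same construction as in read.spss, minus trim().
f_fwd <- factor(codes, levels = vl_fwd, labels = names(vl_fwd))
f_rev <- factor(codes, levels = vl_rev, labels = names(vl_rev))

# Each observation keeps its label either way ...
print(identical(as.character(f_fwd), as.character(f_rev)))

# ... only the order of the levels, and hence as.integer(), differs.
print(as.integer(f_fwd))
print(as.integer(f_rev))
```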

 	-thomas



