[R] spss.read factor reversal
Thomas Lumley
tlumley at u.washington.edu
Wed Jul 27 16:32:07 CEST 2005
On Wed, 27 Jul 2005, Peter Dalgaard wrote:
> Adaikalavan Ramasamy <ramasamy at cancer.org.uk> writes:
>
>> I think it is doing what it is supposed to do, but I have never used
>> read.spss, so take this with a pinch of salt.
>>
>> In R, when you use as.integer on a factor, the one with the lowest level
>> gets a value of 1 and so on. The lowest level of the factor can be
>> determined from the levels() function.
>>
>> f <- factor( c("Green", "Green", "Red", "Blue"),
>> levels=c("Red", "Blue", "Green") )
>> levels(f)
>> [1] "Red" "Blue" "Green"
>>
>> as.integer(f)
>> [1] 3 3 1 2
>>
>> But the levels of a factor can be changed
>>
>> as.integer( factor( f, levels=c("Green", "Blue", "Red" ) ) )
>> [1] 1 1 3 2
>
> Doesn't explain why 1 2 3 in the input file comes out as Green Blue
> Red, does it?
>
>> You can also try setting use.value.labels=FALSE in read.spss function
>> and then creating a factor out of it.
>
> Would be interesting to see this. I would suspect that the damage is
> already done at that point though.

It would also be interesting because the raw value label information is
then available as the "value.labels" attribute of the variable.

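Roughly, the check would look like the sketch below. It uses a hand-built vector as a stand-in for one variable from read.spss(..., use.value.labels=FALSE), since I don't have the file; the codes and labels are made up, and the labels are stored in reverse order on purpose to mimic what was reported.

```r
# Stand-in for one variable as returned by
# read.spss(..., use.value.labels = FALSE): numeric codes plus a named
# "value.labels" attribute. The data here are hypothetical.
x <- c(1, 2, 3, 1)
attr(x, "value.labels") <- c(Green = 3, Blue = 2, Red = 1)

# Inspect the raw label information.
vl <- attr(x, "value.labels")
print(vl)

# Build the factor by hand: the codes are the levels, the names are the labels.
f <- factor(x, levels = vl, labels = names(vl))
print(as.character(f))  # "Red" "Blue" "Green" "Red"
```

If the printed attribute already pairs the wrong label with a code, the problem is upstream of the factor() call.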
> I notice that the value labels are in reverse order. Shouldn't matter
> to read.spss which has
>
> rval[[nm]] <- factor(rval[[nm]], levels = vl[[v]],
> labels = trim(names(vl[[v]])))
>
> i.e. levels and labels should be in the correct order.
>
yes, something wrong must be coming out of the .C call.
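
A small illustration of why the order alone should be harmless, using made-up value labels and the same factor() construction as above (minus the trim() of the names):

```r
# Hypothetical value labels for codes 1..3, in two storage orders.
vl_fwd <- c(Red = 1, Blue = 2, Green = 3)
vl_rev <- rev(vl_fwd)  # Green = 3, Blue = 2, Red = 1

codes <- c(1, 2, 3)

f_fwd <- factor(codes, levels = vl_fwd, labels = names(vl_fwd))
f_rev <- factor(codes, levels = vl_rev, labels = names(vl_rev))

# The code-to-label mapping is the same either way ...
stopifnot(identical(as.character(f_fwd), as.character(f_rev)))

# ... only the order of the levels, and hence as.integer(), differs.
print(levels(f_fwd))     # "Red" "Blue" "Green"
print(levels(f_rev))     # "Green" "Blue" "Red"
print(as.integer(f_rev)) # 3 2 1
```

So as long as each label stays attached to its own code, reversed storage changes only the level order, not which label a given code receives.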
-thomas