[R] Retrieving Factors with Levels Ordered

H. T. Reynolds htr at udel.edu
Sat Jan 1 06:41:10 CET 2011


Hello (and Happy New Year),

When I create a factor with labels in the order I want, write the data as a text file, and then retrieve them, the factor levels are no longer in the proper order.

Here is what I do (I tried many variations):

# educ is a numeric vector with 1,001 observations.
# There is one NA

# Use educ to create a factor

feducord <- factor(educ, labels = c('Elem', 'Mid', 'HS',
                   'Bus', 'Some', 'Col', 'Post'), ordered = TRUE)

levels(feducord)
[1] "Elem" "Mid"  "HS"   "Bus"  "Some" "Col"  "Post"

table(feducord)
feducord
Elem  Mid   HS  Bus Some  Col Post
  30   90  303  108  236  144   89

# The above is what I want. The frequencies agree with
# the codebook

# Make a data frame and save it. (I want a text file.)

testdf <- data.frame(feducord)
str(testdf)
'data.frame':   1001 obs. of  1 variable:
 $ feducord: Ord.factor w/ 7 levels "Elem"<"Mid"<"HS"<..: 5 6 5 7 3 4 3 3 3 5 ...
write.table(testdf, file = 'Junkarea/test.txt')

# So far, so good.

rm(testdf, feducord)

# Go away.
# Come back later to retrieve the data.

testdf <- read.table(file = 'Junkarea/test.txt')

# But levels are no longer ordered

str(testdf)
'data.frame':   1001 obs. of  1 variable:
 $ feducord: Factor w/ 7 levels "Bus","Col","Elem",..: 7 2 7 6 4 1 4 4 4 7

table(testdf$feducord)
 Bus  Col Elem   HS  Mid Post Some
 108  144   30  303   90   89  236

# The frequencies are correct, but the ordering is wrong.

Clearly I am missing something obvious, but I can't see it. If I save "feducord" and load it, the order of the levels is as it should be. But I don't know why writing to a text file should change anything. Any help would be greatly appreciated.
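
For anyone wanting to reproduce this without the original data, here is a minimal sketch of the round trip described above. The educ vector is made up for illustration (it is not the real 1,001-observation variable), and the names labs, tmp and back are hypothetical. It assumes the cause is that a plain text file stores only the label strings, so read.table has to rebuild the factor and sorts the levels alphabetically. The last step, re-imposing the order with factor(), is one possible workaround and not necessarily the only or best one.

```r
# Made-up stand-in for educ: integer codes 1..7 (illustration only).
set.seed(1)
educ <- sample(1:7, 50, replace = TRUE)
educ[1:7] <- 1:7  # make sure every code occurs at least once

labs <- c("Elem", "Mid", "HS", "Bus", "Some", "Col", "Post")
feducord <- factor(educ, levels = 1:7, labels = labs, ordered = TRUE)

# write.table() writes the label strings; the level order is not stored.
tmp <- tempfile(fileext = ".txt")
write.table(data.frame(feducord), file = tmp)

# stringsAsFactors = TRUE requests the string-to-factor conversion seen in
# the post (it was the default before R 4.0.0). Levels come back sorted.
back <- read.table(tmp, stringsAsFactors = TRUE)
levels(back$feducord)

# Re-impose the intended order after reading the file.
back$feducord <- factor(back$feducord, levels = labs, ordered = TRUE)
levels(back$feducord)
table(back$feducord)
```

If the file does not have to be a flat table, save()/load() keep the factor attributes (as noted above), and dput()/dget() should do the same while still producing a text file.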

(You're right, I don't have anything better to do on New Year's eve.)


