[R] Retrieving Factors with Levels Ordered
H. T. Reynolds
htr at udel.edu
Sat Jan 1 06:41:10 CET 2011
Hello (and Happy New Year),
When I create a factor with labels in the order I want, write the data as a text file, and then retrieve them, the factor levels are no longer in the proper order.
Here is what I do (I tried many variations):
# educ is a numeric vector with 1,001 observations.
# There is one NA
# Use educ to create a factor
feducord <- factor(educ, labels = c('Elem', 'Mid', 'HS',
    'Bus', 'Some', 'Col', 'Post'), ordered = TRUE)
levels(feducord)
[1] "Elem" "Mid" "HS" "Bus" "Some" "Col" "Post"
table(feducord)
feducord
Elem  Mid   HS  Bus Some  Col Post
  30   90  303  108  236  144   89
# The above is what I want. The frequencies agree with
# the codebook
# Make a data frame and save it. (I want a text file.)
testdf <- data.frame(feducord)
str(testdf)
'data.frame': 1001 obs. of 1 variable:
 $ feducord: Ord.factor w/ 7 levels "Elem"<"Mid"<"HS"<..: 5 6 5 7 3 4 3 3 3 5 ...
write.table(testdf, file = 'Junkarea/test.txt')
# So far, so good.
rm(testdf, feducord)
# Go away.
# Come back later to retrieve the data.
testdf <- read.table(file = 'Junkarea/test.txt')
# But levels are no longer ordered
str(testdf)
'data.frame': 1001 obs. of 1 variable:
 $ feducord: Factor w/ 7 levels "Bus","Col","Elem",..: 7 2 7 6 4 1 4 4 4 7
table(testdf$feducord)
 Bus  Col Elem   HS  Mid Post Some
 108  144   30  303   90   89  236
# The frequencies are correct, but the ordering is wrong.
Clearly I am missing something obvious, but I can't see it. If I save "feducord" and load it, the order of the levels is as it should be. But I don't know why writing to a text file should change anything. Any help would be greatly appreciated.
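For anyone who wants to reproduce this without my data, here is a minimal sketch. It rests on some assumptions: the small vector below is a made-up stand-in for educ, the temporary files replace 'Junkarea/test.txt', and stringsAsFactors = TRUE is set explicitly because newer versions of R read text columns as character by default. As far as I can tell, the text file holds only the labels as strings, so read.table() has nothing to rebuild the order from and sorts the levels alphabetically. Naming the levels again after reading appears to restore the order.

```r
# Hypothetical stand-in for educ: codes 1-7 plus one NA (not the real data).
lev <- c("Elem", "Mid", "HS", "Bus", "Some", "Col", "Post")
educ_demo <- c(5, 6, 5, 7, 3, 4, NA, 1, 2)
feducord <- factor(educ_demo, levels = 1:7, labels = lev, ordered = TRUE)
testdf <- data.frame(feducord)

# Round trip through a text file: only the labels are written, as strings.
txt <- tempfile(fileext = ".txt")
write.table(testdf, file = txt)
back <- read.table(txt, stringsAsFactors = TRUE)
alpha <- levels(back$feducord)   # levels rebuilt from the text, sorted alphabetically

# Restore the intended order by naming the levels explicitly.
back$feducord <- factor(back$feducord, levels = lev, ordered = TRUE)

# Round trip through a binary file: the factor's attributes are stored too.
rds <- tempfile(fileext = ".rds")
saveRDS(testdf, rds)
back2 <- readRDS(rds)
```

After this, levels(back$feducord) and levels(back2$feducord) should both match lev, which would be consistent with what I see when I save and load the object instead of writing text.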
(You're right, I don't have anything better to do on New Year's Eve.)