[R] how to sort the levels of a table
drflxms
drflxms at googlemail.com
Wed Aug 17 00:57:52 CEST 2011
Dear colleagues,
I have really heavy problems in sorting the results of a table according
to certain features of the levels in the table.
Prerequisites:
It all starts with a fairly simple data set, which stores observations
of 21 observers (horizontally from 1 to 21; 21 is
reference/goldstandard) who diagnosed 42 videos (vertically from 1 to
42). See dump of R-object "input" below in case you are interested in
the original data.
Each observation is a series of 1 and 0 (different length!) which was
obtained by concatenating answers of a multiple choice questionaire.
I created a confusion matrix (table) displaying the observations of the
20 observers (nr. 1 to 20) versus the reference (nr. 21) via the
following code:
## observations of the reference
obsValues<-factor(unlist(input[,-c(1,ncol(input))]))
## observations of the observers
refValues<-factor(input[,ncol(input)])
## data.frame that relates observations of observers and reference
RefObs<-data.frame(refValues, obsValues)
mtx<-table(
RefObs$obsValues,
RefObs$refValues,
dnn=c("observers", "reference"),
useNA=c("always")
)
And now the problem:
I need to sort the levels/classes of the table. Both axes shall be
ordered 1st according to how often 1 occurs in the string that codes the
observation and 2nd treating the observation string (formed by 0 and 1)
as a number.
i.e. "1" "0" "1010" "10000" "100" "11" should be sorted as
"0" "1" "100" "10000" "11" "1010"
I am aware of the order function and tried a command of the form
mtx[order(row1stsort,row2ndsort),order(col1stsort,col2ndsort)].
Sorting i.e. the levels of observers and the reference works well this way:
## sorted levels generated by the reference choices.
refLevels <- unique(input[,ncol(input)])[order(
as.numeric(sapply(unique(input[,ncol(input)]), FUN=function(x)
sum(as.numeric(unlist(strsplit(x,"")))))),
as.numeric(as.vector(unique(input[,ncol(input)])))
)]
## sorted levels generated by the observers choices.
obsLevels <- unique(unlist(input[,-c(1,ncol(input))]))[order(
as.numeric(sapply(unique(unlist(input[,-c(1,ncol(input))])),
FUN=function(x) sum(as.numeric(unlist(strsplit(x,"")))))),
as.numeric(as.vector(unique(unlist(input[,-c(1,ncol(input))]))))
)]
Unfortunately this code does not the trick in case of the table mtx (see
above).
At least I was not successfull with the following code:
mtx<-table(
RefObs$obsValues,
RefObs$refValues,
dnn=c("observers", "reference"),
useNA=c("always")
)[order(
as.numeric(sapply(as.vector(unique(RefObs$obsValues)), FUN=function(x)
sum(as.numeric(replace(unlist(strsplit(x,"")),which(x=="w"),NA))))), ##
generates numeric vector containing sum of digits of classes chosen by
observers. 1st order of vertical (row) sorting. Structural missings
coded as w are substituted by NA.
as.numeric(as.vector(unique(RefObs$obsValues))) ## generates numeric
vector containing classes chosen by observers. 2nd order of vertical
(row) sorting. Structural missings coded as w are substituted by NA.
),
order(
as.numeric(sapply(as.vector(unique(RefObs$refValues)), FUN=function(x)
sum(as.numeric(replace(unlist(strsplit(x,"")),which(x=="w"),NA))))), ##
generates numeric vector containing sum of digits of classes chosen by
reference. 1st order of horizontal (column) sorting. Structural missings
coded as w are substituted by NA.
as.numeric(as.vector(unique(RefObs$refValues))) ## generates numeric
vector containing classes chosen by the reference. 2nd order of
horizontal (column) sorting.
)
]
Any suggestions? I'd appreaciate any kind of help very much!
Greetings from Munich, Felix
dump of data.frame:
options(width=180) # to view it as a whole
input <- structure(list(video = c("1", "2", "3", "4", "5", "6", "7", "8",
"9", "10", "11", "12", "13", "14", "15", "16", "17", "18", "19",
"20", "21", "22", "23", "24", "25", "26", "27", "28", "29", "30",
"31", "32", "33", "34", "35", "36", "37", "38", "39", "40", "41",
"42"), `1` = c("110", "0", "0", "10000", "0", "0", "0", "10000",
"0", "0", "11110", "10000", "0", "100", "1110", "10110", "110",
"1100", "0", "1000", "11100", "0", "11000", "0", "0", "0", "1110",
"0", "0", "10110", "1000", "10010", "10001", "10000", "100",
"0", "100", "111", "1000", "0", "0", "0"), `2` = c("0", "0",
"0", "0", "0", "0", "0", "0", "0", "0", "100", "0", "0", "0",
"0", "0", "0", "0", "0", "0", "0", "0", "0", "0", "0", "0", "0",
"0", "0", "0", "0", "0", "0", "0", "0", "0", "0", "0", "0", "0",
"0", "0"), `3` = c("0", "0", "0", "0", "0", "0", "0", "0", "0",
"0", "110", "0", "0", "0", "0", "0", "10", "0", "0", "0", "0",
"0", "0", "1000", "0", "0", "0", "0", "0", "0", "0", "0", "0",
"0", "0", "0", "0", "0", "0", "0", "0", "1100"), `4` = c("100",
"10100", "10100", "10010", "10100", "10000", "0", "10100", "1000",
"10000", "10100", "10000", "10100", "10000", "10100", "10000",
"10000", "10000", "0", "11000", "11000", "0", "1000", "1000",
"0", "0", "0", "0", "10000", "1", "10000", "10000", "10100",
"10000", "0", "0", "0", "10010", "0", "0", "0", "100"), `5` = c("10000",
"10100", "10100", "0", "0", "0", "0", "0", "0", "0", "100", "0",
"0", "0", "0", "0", "111", "0", "0", "0", "0", "0", "0", "0",
"0", "0", "0", "0", "11000", "0", "0", "0", "110", "0", "0",
"0", "0", "1", "0", "0", "0", "0"), `6` = c("0", "10100", "10000",
"10000", "0", "0", "0", "10000", "0", "10000", "10000", "0",
"0", "10000", "10", "10000", "10", "0", "0", "0", "0", "0", "0",
"0", "0", "0", "0", "0", "0", "1", "0", "10000", "10000", "10000",
"0", "0", "0", "10000", "0", "0", "0", "0"), `7` = c("0", "0",
"0", "0", "0", "0", "0", "0", "0", "0", "0", "0", "0", "0", "0",
"0", "0", "0", "0", "0", "0", "0", "0", "0", "0", "0", "0", "0",
"0", "0", "0", "10", "0", "100", "0", "0", "0", "10", "0", "0",
"0", "100"), `8` = c("0", "0", "100", "0", "0", "0", "0", "0",
"0", "0", "100", "0", "0", "0", "0", "0", "1", "0", "0", "0",
"0", "0", "1000", "0", "0", "0", "0", "0", "0", "0", "0", "0",
"0", "0", "0", "0", "0", "0", "0", "0", "0", "0"), `9` = c("0",
"110", "1", "0", "0", "0", "0", "0", "0", "0", "100", "0", "0",
"10000", "10000", "10000", "1", "100", "0", "100", "0", "0",
"1000", "1000", "0", "0", "0", "0", "0", "0", "0", "0", "0",
"0", "1000", "0", "11000", "1", "11000", "0", "1000", "0"), `10` = c("0",
"0", "10000", "0", "0", "0", "0", "10000", "0", "0", "100", "10000",
"10000", "10000", "10000", "10000", "0", "0", "0", "0", "100",
"0", "10000", "0", "0", "0", "0", "0", "0", "0", "0", "0", "10000",
"0", "0", "0", "0", "1", "0", "0", "0", "0"), `11` = c("0", "0",
"10100", "0", "0", "0", "100", "0", "0", "0", "100", "0", "0",
"0", "0", "0", "0", "0", "0", "0", "11100", "0", "1000", "0",
"0", "0", "0", "0", "0", "10010", "0", "0", "10100", "0", "0",
"0", "0", "10000", "0", "0", "0", "100"), `12` = c("0", "0",
"0", "10", "0", "0", "0", "1000", "0", "0", "1100", "0", "0",
"0", "0", "0", "110", "100", "0", "0", "0", "0", "1000", "1000",
"0", "0", "0", "0", "0", "10100", "10", "10", "11110", "110",
"0", "0", "0", "11110", "0", "0", "0", "0"), `13` = c("0", "0",
"100", "0", "0", "0", "0", "10100", "1000", "0", "1100", "0",
"0", "1", "110", "10", "110", "1", "0", "1000", "0", "0", "1100",
"0", "0", "0", "0", "0", "0", "10110", "0", "0", "1", "10000",
"0", "0", "0", "1", "0", "0", "1100", "10100"), `14` = c("0",
"0", "110", "100", "100", "10000", "100", "100", "1000", "11100",
"1100", "10000", "0", "111", "111", "11", "1", "10000", "0",
"10000", "10000", "11000", "11000", "1000", "0", "0", "0", "1000",
"1000", "111", "0", "0", "10", "100", "0", "0", "1000", "1",
"0", "0", "0", "1000"), `15` = c("0", "100", "100", "10", "100",
"0", "0", "1000", "0", "10000", "100", "0", "0", "100", "0",
"0", "10", "0", "0", "0", "10000", "0", "0", "1000", "0", "0",
"0", "0", "0", "0", "0", "0", "10000", "0", "0", "0", "0", "1000",
"0", "0", "0", "10000"), `16` = c("0", "0", "100", "0", "0",
"0", "10100", "11100", "0", "0", "0", "0", "0", "10111", "1",
"0", "0", "0", "0", "1100", "10000", "0", "0", "1000", "0", "0",
"0", "1000", "0", "0", "0", "0", "1", "0", "0", "0", "0", "1101",
"0", "0", "0", "0"), `17` = c("0", "100", "100", "110", "100",
"10000", "100", "10000", "0", "100", "100", "0", "0", "10", "1",
"10", "0", "100", "0", "10100", "10000", "1000", "11000", "1000",
"1000", "0", "0", "0", "0", "110", "0", "0", "10110", "0", "0",
"0", "0", "1", "0", "0", "0", "10100"), `18` = c("0", "110",
"101", "110", "0", "1000", "100", "0", "0", "0", "1101", "0",
"0", "0", "110", "0", "10", "1000", "0", "0", "0", "0", "1000",
"1000", "0", "0", "0", "0", "0", "10100", "0", "10010", "100",
"110", "0", "0", "0", "111", "0", "0", "11100", "100"), `19` = c("0",
"110", "10111", "0", "0", "0", "10010", "0", "0", "0", "1", "0",
"0", "0", "100", "0", "0", "100", "0", "11100", "0", "0", "0",
"0", "0", "0", "0", "0", "0", "0", "0", "0", "0", "0", "0", "0",
"0", "1", "0", "0", "1000", "100"), `20` = c("0", "0", "100",
"10010", "0", "0", "100", "0", "0", "0", "111", "0", "0", "0",
"1100", "0", "111", "0", "0", "0", "1100", "0", "1100", "1000",
"0", "0", "0", "0", "0", "111", "0", "0", "0", "0", "0", "0",
"0", "1", "0", "0", "0", "10000"), `21` = c("111", "0", "100",
"110", "0", "11000", "10100", "0", "11000", "11000", "1100",
"11000", "0", "0", "1110", "0", "1", "1100", "1000", "1000",
"11100", "1000", "11100", "1000", "0", "0", "0", "0", "0", "110",
"0", "10", "0", "0", "100", "0", "0", "1110", "0", "0", "11000",
"1100")), .Names = c("video", "1", "2", "3", "4", "5", "6", "7",
"8", "9", "10", "11", "12", "13", "14", "15", "16", "17", "18",
"19", "20", "21"), row.names = c(NA, 42L), idvars = "video", rdimnames =
list(
structure(list(video = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11,
12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26,
27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41,
42)), .Names = "video", row.names = c("1", "2", "3", "4",
"5", "6", "7", "8", "9", "10", "11", "12", "13", "14", "15",
"16", "17", "18", "19", "20", "21", "22", "23", "24", "25",
"26", "27", "28", "29", "30", "31", "32", "33", "34", "35",
"36", "37", "38", "39", "40", "41", "42"), class = "data.frame"),
structure(list(befunder = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10,
11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21)), .Names = "befunder",
row.names = c("1",
"2", "3", "4", "5", "6", "7", "8", "9", "10", "11", "12",
"13", "14", "15", "16", "17", "18", "19", "20", "21"), class =
"data.frame")), class = c("cast_df",
"data.frame"))
More information about the R-help
mailing list