[R] how to sort the levels of a table

Ista Zahn izahn at psych.rochester.edu
Wed Aug 17 01:40:24 CEST 2011


I'm not entirely sure I understand, but (can't let that stop me, I'd
never get anything done) see inline below.

On Tue, Aug 16, 2011 at 6:57 PM, drflxms <drflxms at googlemail.com> wrote:
> Dear colleagues,
>
> I have really heavy problems in sorting the results of a table according
> to certain features of the levels in the table.
>
> Prerequisites:
>
> It all starts with a fairly simple data set, which stores observations
> of 21 observers (horizontally from 1 to 21; 21 is
> reference/goldstandard) who diagnosed 42 videos (vertically from 1 to
> 42). See dump of R-object "input" below in case you are interested in
> the original data.
> Each observation is a series of 1 and 0 (different length!) which was
> obtained by concatenating answers of a multiple choice questionaire.
>
> I created a confusion matrix (table) displaying the observations of the
> 20 observers (nr. 1 to 20) versus the reference (nr. 21) via the
> following code:
>
> ## observations of the reference
> obsValues<-factor(unlist(input[,-c(1,ncol(input))]))
> ## observations of the observers
> refValues<-factor(input[,ncol(input)])
> ## data.frame that relates observations of observers and reference
> RefObs<-data.frame(refValues, obsValues)
> mtx<-table(
>           RefObs$obsValues,
>           RefObs$refValues,
>           dnn=c("observers", "reference"),
>           useNA=c("always")
>           )
>
> And now the problem:
>
> I need to sort the levels/classes of the table. Both axes shall be
> ordered 1st according to how often 1 occurs in the string that codes the
> observation and 2nd treating the observation string (formed by 0 and 1)
> as a number.
>
> i.e. "1" "0" "1010" "10000" "100" "11" should be sorted as
> "0" "1" "100" "10000" "11" "1010"
>
> I am aware of the order function and tried a command of the form
> mtx[order(row1stsort,row2ndsort),order(col1stsort,col2ndsort)].
> Sorting i.e. the levels of observers and the reference works well this way:
>
> ## sorted levels generated by the reference choices.
> refLevels <- unique(input[,ncol(input)])[order(
> as.numeric(sapply(unique(input[,ncol(input)]), FUN=function(x)
> sum(as.numeric(unlist(strsplit(x,"")))))),
> as.numeric(as.vector(unique(input[,ncol(input)])))
> )]
> ## sorted levels generated by the observers choices.
> obsLevels <- unique(unlist(input[,-c(1,ncol(input))]))[order(
> as.numeric(sapply(unique(unlist(input[,-c(1,ncol(input))])),
> FUN=function(x) sum(as.numeric(unlist(strsplit(x,"")))))),
> as.numeric(as.vector(unique(unlist(input[,-c(1,ncol(input))]))))
> )]
>
> Unfortunately this code does not the trick in case of the table mtx (see
> above).
>
> At least I was not successfull with the following code:
>
> mtx<-table(
> RefObs$obsValues,
> RefObs$refValues,
> dnn=c("observers", "reference"),
> useNA=c("always")
> )[order(
> as.numeric(sapply(as.vector(unique(RefObs$obsValues)), FUN=function(x)
> sum(as.numeric(replace(unlist(strsplit(x,"")),which(x=="w"),NA))))), ##
> generates numeric vector containing sum of digits of classes chosen by
> observers. 1st order of vertical (row) sorting. Structural missings
> coded as w are substituted by NA.
> as.numeric(as.vector(unique(RefObs$obsValues))) ## generates numeric
> vector containing classes chosen by observers. 2nd order of vertical
> (row) sorting. Structural missings coded as w are substituted by NA.
> ),
> order(
> as.numeric(sapply(as.vector(unique(RefObs$refValues)), FUN=function(x)
> sum(as.numeric(replace(unlist(strsplit(x,"")),which(x=="w"),NA))))), ##
> generates numeric vector containing sum of digits of classes chosen by
> reference. 1st order of horizontal (column) sorting. Structural missings
> coded as w are substituted by NA.
> as.numeric(as.vector(unique(RefObs$refValues))) ## generates numeric
> vector containing classes chosen by the reference. 2nd order of
> horizontal (column) sorting.
> )
> ]
>
> Any suggestions? I'd appreaciate any kind of help very much!

I think a simple
mtx[obsLevels, refLevels]
should do it. Or more simply
mtx[order(nchar(gsub("[^1]", "", rownames(mtx))),
          as.numeric(rownames(mtx))),
    order(nchar(gsub("[^1]", "", colnames(mtx))),
          as.numeric(colnames(mtx)))]

Best,
Ista

> Greetings from Munich, Felix
>
>
>
> dump of data.frame:
> options(width=180) # to view it as a whole
>
> input <- structure(list(video = c("1", "2", "3", "4", "5", "6", "7", "8",
> "9", "10", "11", "12", "13", "14", "15", "16", "17", "18", "19",
> "20", "21", "22", "23", "24", "25", "26", "27", "28", "29", "30",
> "31", "32", "33", "34", "35", "36", "37", "38", "39", "40", "41",
> "42"), `1` = c("110", "0", "0", "10000", "0", "0", "0", "10000",
> "0", "0", "11110", "10000", "0", "100", "1110", "10110", "110",
> "1100", "0", "1000", "11100", "0", "11000", "0", "0", "0", "1110",
> "0", "0", "10110", "1000", "10010", "10001", "10000", "100",
> "0", "100", "111", "1000", "0", "0", "0"), `2` = c("0", "0",
> "0", "0", "0", "0", "0", "0", "0", "0", "100", "0", "0", "0",
> "0", "0", "0", "0", "0", "0", "0", "0", "0", "0", "0", "0", "0",
> "0", "0", "0", "0", "0", "0", "0", "0", "0", "0", "0", "0", "0",
> "0", "0"), `3` = c("0", "0", "0", "0", "0", "0", "0", "0", "0",
> "0", "110", "0", "0", "0", "0", "0", "10", "0", "0", "0", "0",
> "0", "0", "1000", "0", "0", "0", "0", "0", "0", "0", "0", "0",
> "0", "0", "0", "0", "0", "0", "0", "0", "1100"), `4` = c("100",
> "10100", "10100", "10010", "10100", "10000", "0", "10100", "1000",
> "10000", "10100", "10000", "10100", "10000", "10100", "10000",
> "10000", "10000", "0", "11000", "11000", "0", "1000", "1000",
> "0", "0", "0", "0", "10000", "1", "10000", "10000", "10100",
> "10000", "0", "0", "0", "10010", "0", "0", "0", "100"), `5` = c("10000",
> "10100", "10100", "0", "0", "0", "0", "0", "0", "0", "100", "0",
> "0", "0", "0", "0", "111", "0", "0", "0", "0", "0", "0", "0",
> "0", "0", "0", "0", "11000", "0", "0", "0", "110", "0", "0",
> "0", "0", "1", "0", "0", "0", "0"), `6` = c("0", "10100", "10000",
> "10000", "0", "0", "0", "10000", "0", "10000", "10000", "0",
> "0", "10000", "10", "10000", "10", "0", "0", "0", "0", "0", "0",
> "0", "0", "0", "0", "0", "0", "1", "0", "10000", "10000", "10000",
> "0", "0", "0", "10000", "0", "0", "0", "0"), `7` = c("0", "0",
> "0", "0", "0", "0", "0", "0", "0", "0", "0", "0", "0", "0", "0",
> "0", "0", "0", "0", "0", "0", "0", "0", "0", "0", "0", "0", "0",
> "0", "0", "0", "10", "0", "100", "0", "0", "0", "10", "0", "0",
> "0", "100"), `8` = c("0", "0", "100", "0", "0", "0", "0", "0",
> "0", "0", "100", "0", "0", "0", "0", "0", "1", "0", "0", "0",
> "0", "0", "1000", "0", "0", "0", "0", "0", "0", "0", "0", "0",
> "0", "0", "0", "0", "0", "0", "0", "0", "0", "0"), `9` = c("0",
> "110", "1", "0", "0", "0", "0", "0", "0", "0", "100", "0", "0",
> "10000", "10000", "10000", "1", "100", "0", "100", "0", "0",
> "1000", "1000", "0", "0", "0", "0", "0", "0", "0", "0", "0",
> "0", "1000", "0", "11000", "1", "11000", "0", "1000", "0"), `10` = c("0",
> "0", "10000", "0", "0", "0", "0", "10000", "0", "0", "100", "10000",
> "10000", "10000", "10000", "10000", "0", "0", "0", "0", "100",
> "0", "10000", "0", "0", "0", "0", "0", "0", "0", "0", "0", "10000",
> "0", "0", "0", "0", "1", "0", "0", "0", "0"), `11` = c("0", "0",
> "10100", "0", "0", "0", "100", "0", "0", "0", "100", "0", "0",
> "0", "0", "0", "0", "0", "0", "0", "11100", "0", "1000", "0",
> "0", "0", "0", "0", "0", "10010", "0", "0", "10100", "0", "0",
> "0", "0", "10000", "0", "0", "0", "100"), `12` = c("0", "0",
> "0", "10", "0", "0", "0", "1000", "0", "0", "1100", "0", "0",
> "0", "0", "0", "110", "100", "0", "0", "0", "0", "1000", "1000",
> "0", "0", "0", "0", "0", "10100", "10", "10", "11110", "110",
> "0", "0", "0", "11110", "0", "0", "0", "0"), `13` = c("0", "0",
> "100", "0", "0", "0", "0", "10100", "1000", "0", "1100", "0",
> "0", "1", "110", "10", "110", "1", "0", "1000", "0", "0", "1100",
> "0", "0", "0", "0", "0", "0", "10110", "0", "0", "1", "10000",
> "0", "0", "0", "1", "0", "0", "1100", "10100"), `14` = c("0",
> "0", "110", "100", "100", "10000", "100", "100", "1000", "11100",
> "1100", "10000", "0", "111", "111", "11", "1", "10000", "0",
> "10000", "10000", "11000", "11000", "1000", "0", "0", "0", "1000",
> "1000", "111", "0", "0", "10", "100", "0", "0", "1000", "1",
> "0", "0", "0", "1000"), `15` = c("0", "100", "100", "10", "100",
> "0", "0", "1000", "0", "10000", "100", "0", "0", "100", "0",
> "0", "10", "0", "0", "0", "10000", "0", "0", "1000", "0", "0",
> "0", "0", "0", "0", "0", "0", "10000", "0", "0", "0", "0", "1000",
> "0", "0", "0", "10000"), `16` = c("0", "0", "100", "0", "0",
> "0", "10100", "11100", "0", "0", "0", "0", "0", "10111", "1",
> "0", "0", "0", "0", "1100", "10000", "0", "0", "1000", "0", "0",
> "0", "1000", "0", "0", "0", "0", "1", "0", "0", "0", "0", "1101",
> "0", "0", "0", "0"), `17` = c("0", "100", "100", "110", "100",
> "10000", "100", "10000", "0", "100", "100", "0", "0", "10", "1",
> "10", "0", "100", "0", "10100", "10000", "1000", "11000", "1000",
> "1000", "0", "0", "0", "0", "110", "0", "0", "10110", "0", "0",
> "0", "0", "1", "0", "0", "0", "10100"), `18` = c("0", "110",
> "101", "110", "0", "1000", "100", "0", "0", "0", "1101", "0",
> "0", "0", "110", "0", "10", "1000", "0", "0", "0", "0", "1000",
> "1000", "0", "0", "0", "0", "0", "10100", "0", "10010", "100",
> "110", "0", "0", "0", "111", "0", "0", "11100", "100"), `19` = c("0",
> "110", "10111", "0", "0", "0", "10010", "0", "0", "0", "1", "0",
> "0", "0", "100", "0", "0", "100", "0", "11100", "0", "0", "0",
> "0", "0", "0", "0", "0", "0", "0", "0", "0", "0", "0", "0", "0",
> "0", "1", "0", "0", "1000", "100"), `20` = c("0", "0", "100",
> "10010", "0", "0", "100", "0", "0", "0", "111", "0", "0", "0",
> "1100", "0", "111", "0", "0", "0", "1100", "0", "1100", "1000",
> "0", "0", "0", "0", "0", "111", "0", "0", "0", "0", "0", "0",
> "0", "1", "0", "0", "0", "10000"), `21` = c("111", "0", "100",
> "110", "0", "11000", "10100", "0", "11000", "11000", "1100",
> "11000", "0", "0", "1110", "0", "1", "1100", "1000", "1000",
> "11100", "1000", "11100", "1000", "0", "0", "0", "0", "0", "110",
> "0", "10", "0", "0", "100", "0", "0", "1110", "0", "0", "11000",
> "1100")), .Names = c("video", "1", "2", "3", "4", "5", "6", "7",
> "8", "9", "10", "11", "12", "13", "14", "15", "16", "17", "18",
> "19", "20", "21"), row.names = c(NA, 42L), idvars = "video", rdimnames =
> list(
>    structure(list(video = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11,
>    12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26,
>    27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41,
>    42)), .Names = "video", row.names = c("1", "2", "3", "4",
>    "5", "6", "7", "8", "9", "10", "11", "12", "13", "14", "15",
>    "16", "17", "18", "19", "20", "21", "22", "23", "24", "25",
>    "26", "27", "28", "29", "30", "31", "32", "33", "34", "35",
>    "36", "37", "38", "39", "40", "41", "42"), class = "data.frame"),
>    structure(list(befunder = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10,
>    11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21)), .Names = "befunder",
> row.names = c("1",
>    "2", "3", "4", "5", "6", "7", "8", "9", "10", "11", "12",
>    "13", "14", "15", "16", "17", "18", "19", "20", "21"), class =
> "data.frame")), class = c("cast_df",
> "data.frame"))
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



-- 
Ista Zahn
Graduate student
University of Rochester
Department of Clinical and Social Psychology
http://yourpsyche.org



More information about the R-help mailing list