[R] Trying to fix code that will find highest 5 column names and their associated values for each row in a data frame in R

Tom Woolman twoolm@n @ending from ont@rgettek@com
Mon Dec 17 18:33:47 CET 2018


I have a data frame with 10 integer variables describing various
attributes of each row. For each row, I need to find the 5 variables
with the highest values and write them to a new data frame. In addition
to the names of those 5 variables, I also need the corresponding 5
values for each row.

  A simple code example to generate a sample data frame for this is:

  set.seed(1)
  DF <- matrix(sample(1:9, 90, replace = TRUE), ncol=10, nrow=9)
  DF <- as.data.frame.matrix(DF)


This would result in an example data frame like this:

  #   V1 V2 V3 V4 V5 V6 V7 V8 V9 V10
  # 1  3  2  5  6  5  2  6  8  1   3
  # 2  1  4  7  8  7  7  3  4  2   9
  # 3  2  3  4  7  5  8  9  1  3   5
  # 4  3  8  3  4  5  6  7  4  6   5
  # 5  6  2  3  7  2  1  8  3  2   4
  # 6  8  2  4  8  3  2  9  7  6   5
  # 7  1  5  3  6  8  3  8  9  1   3
  # 8  9  3  5  8  4  9  7  8  1   2
  # 9  1  2  4  8  3  2  1  2  5   6


  My ideal output would be something like this:


  #      V1   V2   V3   V4   V5
  # 1  V2:9 V7:8 V8:7 V4:6 V3:5
  # 2  V9:9 V3:8 V5:7 V7:6 V4:5
  # 3  V5:9 V3:8 V2:7 V9:6 V7:5
  # 4  V8:9 V4:8 V2:7 V5:6 V9:5
  # 5  V9:9 V1:8 V6:7 V3:6 V5:5
  # 6  V8:9 V1:8 V5:7 V9:6 V4:5
  # 7  V2:9 V8:8 V7:7 V5:6 V9:5
  # 8  V4:9 V7:8 V9:7 V2:6 V8:5
  # 9  V3:9 V7:8 V8:7 V4:6 V5:5
  # 10 V6:9 V8:8 V1:7 V9:6 V4:5


  I tried the following code, but it doesn't seem to work:

  out <- t(apply(DF, 1, function(x){
    o <- head(order(-x), 5)
    paste0(names(x[o]), ':', x[o])
  }))
  as.data.frame(out)
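
  To make the goal concrete, here is a minimal sketch of the per-row logic
  on a tiny hand-made data frame where the answer can be checked by eye.
  The names `small`, `top_n_per_row` and `top2` are made up for this
  illustration and are not part of my real code. The sketch assumes every
  column is numeric and that it does not matter how ties are broken.

```r
# Hypothetical helper: for each row of df, return the n largest values
# as "column:value" strings, largest first.
top_n_per_row <- function(df, n = 5) {
  m <- as.matrix(df)  # assumes all columns are numeric
  res <- apply(m, 1, function(x) {
    o <- order(x, decreasing = TRUE)[seq_len(n)]  # positions of the n largest
    paste0(colnames(m)[o], ":", x[o])
  })
  # apply() puts one row's result in each column; force an n-row matrix
  # (this also covers n = 1), then transpose to one output row per input row.
  res <- t(matrix(res, nrow = n))
  res <- as.data.frame(res, stringsAsFactors = FALSE)
  names(res) <- paste0("V", seq_len(n))
  res
}

# Tiny illustrative data frame with no ties within a row
small <- data.frame(V1 = c(3, 1), V2 = c(9, 4), V3 = c(5, 7), V4 = c(6, 8))
top2 <- top_n_per_row(small, n = 2)
print(top2)
```

  For the first row of `small` the two largest values are 9 (V2) and 6 (V4),
  and for the second row 8 (V4) and 7 (V3), so that is the kind of result I
  would expect per row, just with 5 columns on the real data.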



  Thanks everyone!


