[R] Trying to fix code that will find highest 5 column names and their associated values for each row in a data frame in R
PIKAL Petr
petr@pik@l @ending from prechez@@cz
Wed Dec 19 13:14:15 CET 2018
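Hi,

the generated DF is not what you expect it to be. sample(1:9, 9) returns only 9 values, and matrix() recycles them to fill all 90 cells, so every column is identical: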
> set.seed(1)
> DF <- matrix(sample(1:9,9),ncol=10,nrow=9)
> DF
     [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
[1,]    3    3    3    3    3    3    3    3    3     3
[2,]    9    9    9    9    9    9    9    9    9     9
[3,]    5    5    5    5    5    5    5    5    5     5
[4,]    6    6    6    6    6    6    6    6    6     6
[5,]    2    2    2    2    2    2    2    2    2     2
[6,]    4    4    4    4    4    4    4    4    4     4
[7,]    8    8    8    8    8    8    8    8    8     8
[8,]    7    7    7    7    7    7    7    7    7     7
[9,]    1    1    1    1    1    1    1    1    1     1
>
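To see the recycling directly, here is a minimal check. It is only a sketch; the names v and m are made up for illustration and are not part of the original code.

```r
set.seed(1)
v <- sample(1:9, 9)                   # only 9 values are drawn
m <- matrix(v, ncol = 10, nrow = 9)   # 90 cells, so the 9 values are recycled
length(v)                             # 9
length(m)                             # 90
all(m == m[, 1])                      # TRUE: every column equals the first one
```

Since each row then holds a single repeated value, the "top 5" columns of any row are just the first five columns, which is why the result looks wrong.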
With a slight modification of the input (drawing 90 values with replacement, so that the cells differ):
> set.seed(1)
> DF <- matrix(sample(1:9, 90, replace = TRUE), ncol = 10, nrow = 9)
> DF <- as.data.frame.matrix(DF)
>
> out <- t(apply(DF, 1, function(x){
+   o <- head(order(-x), 5)
+   paste0(names(x[o]), ':', x[o])
+ }))
> as.data.frame(out)
    V1   V2    V3   V4    V5
1 V5:8 V6:8 V10:7 V3:4  V4:4
2 V4:8 V3:7  V8:6 V1:4  V9:4
3 V3:9 V5:7  V1:6 V6:5  V9:5
4 V1:9 V9:9  V2:7 V6:7 V10:7
5 V5:8 V9:8  V6:7 V8:7  V3:6
6 V1:9 V2:7 V10:7 V5:6  V4:5
7 V1:9 V7:9  V5:8 V6:8  V8:8
8 V9:9 V4:8  V2:7 V1:6  V5:5
9 V2:9 V8:8  V4:7 V1:6  V5:5
With this input, your code seems to work.
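If you would rather keep the names and the values in separate columns than paste them together, something along these lines might do. It is only a sketch: the function name top_n_per_row and the output column names name1..., value1... are my own invention, and it assumes n is not larger than the number of columns.

```r
# Hypothetical helper: for each row, return the names and values of the
# n largest entries as separate columns.
top_n_per_row <- function(df, n = 5) {
  m <- as.matrix(df)
  # Column positions of the n largest values in each row;
  # order(-x) leaves tied values in their original column order.
  idx <- lapply(seq_len(nrow(m)), function(i) order(-m[i, ])[seq_len(n)])
  idx <- matrix(unlist(idx), nrow = nrow(m), byrow = TRUE)
  # Look up the column names and the values at those positions.
  nms  <- matrix(colnames(m)[idx], nrow = nrow(m))
  vals <- matrix(m[cbind(as.vector(row(idx)), as.vector(idx))], nrow = nrow(m))
  out <- data.frame(nms, vals, stringsAsFactors = FALSE)
  names(out) <- c(paste0("name", seq_len(n)), paste0("value", seq_len(n)))
  out
}

set.seed(1)
DF <- as.data.frame(matrix(sample(1:9, 90, replace = TRUE), ncol = 10, nrow = 9))
top5 <- top_n_per_row(DF, n = 5)
```

The values then stay numeric, so they can be used in further calculations without splitting strings.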
Cheers
Petr
> -----Original Message-----
> From: R-help <r-help-bounces using r-project.org> On Behalf Of Tom Woolman
> Sent: Monday, December 17, 2018 6:34 PM
> To: r-help using r-project.org
> Subject: [R] Trying to fix code that will find highest 5 column names and their
> associated values for each row in a data frame in R
>
>
> I have a data frame with 10 variables of integer data for various
> attributes about each row of data, and I need to find the 5 highest
> variables for each row of this data frame and output them to a new
> data frame. In addition to the 5 highest variable names, I also need
> to know the corresponding 5 highest variable values for each row.
>
> A simple code example to generate a sample data frame for this is:
>
> set.seed(1)
> DF <- matrix(sample(1:9,9),ncol=10,nrow=9)
> DF <- as.data.frame.matrix(DF)
>
>
> This would result in an example data frame like this:
>
> #   V1 V2 V3 V4 V5 V6 V7 V8 V9 V10
> # 1  3  2  5  6  5  2  6  8  1   3
> # 2  1  4  7  8  7  7  3  4  2   9
> # 3  2  3  4  7  5  8  9  1  3   5
> # 4  3  8  3  4  5  6  7  4  6   5
> # 5  6  2  3  7  2  1  8  3  2   4
> # 6  8  2  4  8  3  2  9  7  6   5
> # 7  1  5  3  6  8  3  8  9  1   3
> # 8  9  3  5  8  4  9  7  8  1   2
> # 9  1  2  4  8  3  2  1  2  5   6
>
>
> My ideal output would be something like this:
>
>
> #      V1   V2   V3   V4   V5
> # 1  V2:9 V7:8 V8:7 V4:6 V3:5
> # 2  V9:9 V3:8 V5:7 V7:6 V4:5
> # 3  V5:9 V3:8 V2:7 V9:6 V7:5
> # 4  V8:9 V4:8 V2:7 V5:6 V9:5
> # 5  V9:9 V1:8 V6:7 V3:6 V5:5
> # 6  V8:9 V1:8 V5:7 V9:6 V4:5
> # 7  V2:9 V8:8 V7:7 V5:6 V9:5
> # 8  V4:9 V7:8 V9:7 V2:6 V8:5
> # 9  V3:9 V7:8 V8:7 V4:6 V5:5
> # 10 V6:9 V8:8 V1:7 V9:6 V4:5
>
>
> I was trying to use this code, but it doesn't seem to work:
>
> out <- t(apply(DF, 1, function(x){
>   o <- head(order(-x), 5)
>   paste0(names(x[o]), ':', x[o])
> }))
> as.data.frame(out)
>
>
>
> Thanks everyone!
>