[R] Getting minimum value of a column according to a factor column of a dataframe
Bert Gunter
bgunter@4567 @end|ng |rom gm@||@com
Thu Aug 25 19:26:49 CEST 2022
A slightly slicker solution making use of the handy by() function to
avoid the lapply(split...) construction.
> do.call(rbind,by(df1, df1$Code, \(x)x[which.min(x$Q),]))
       Code  Y  M  D     Q    N    O
41003 41003 81  1 19 0.160 7.17 2.50
41005 41005 79  8 17 0.210 5.50 7.20
41009 41009 79  2 21 0.218 5.56 4.04
41017 41017 79 10 20 0.240 5.30 7.10
This of course ignores the issue of tied minima that Tim Ebert brought
up. Handling ties would require a bit more finagling in the anonymous
function, in place of which.min().
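
For what it's worth, here is a minimal sketch of one way to keep tied
minima, assuming that "keep every row whose Q equals its group's minimum"
is the behaviour wanted. The names tied_ave and tied_by are mine, not from
the thread, and the data are retyped from Rui's example further down. The
comparison with == is exact here only because min() returns one of the
stored values unchanged; it is not a general recipe for computed doubles.

```r
# Example data retyped from the thread (same values as df1 in Rui's post).
df1 <- read.table(text = "
41003 81 1 19 0.16 7.17 2.5
41003 77 9 22 0.197 6.8 2.2
41003 79 7 28 0.21 4.7 6.2
41005 79 8 17 0.21 5.5 7.2
41005 80 10 30 0.21 6.84 2.6
41005 80 12 20 0.21 6.84 2.4
41005 79 6 14 0.217 5.61 3.55
41009 79 2 21 0.218 5.56 4.04
41009 79 5 27 0.218 6.4 3.12
41009 80 11 29 0.22 6.84 2.8
41009 78 5 28 0.232 6 3.2
41009 81 8 20 0.233 6.39 1.6
41009 79 9 30 0.24 5.6 7.5
41017 79 10 20 0.24 5.3 7.1
41017 80 7 30 0.24 6.73 2.6
", col.names = c("Code", "Y", "M", "D", "Q", "N", "O"))
df1$Code <- factor(df1$Code)

# Variant 1: ave() returns the group minimum of Q for every row, so the
# logical index keeps all rows that attain their group's minimum.
tied_ave <- df1[df1$Q == ave(df1$Q, df1$Code, FUN = min), ]

# Variant 2: same idea inside by(), replacing which.min() with a comparison
# against min(), which selects every tied row rather than only the first.
tied_by <- do.call(rbind, by(df1, df1$Code, \(x) x[x$Q == min(x$Q), ]))

nrow(tied_ave)  # 8 rows: 1 + 3 + 2 + 2 across the four codes
```

With these data the tied version returns 8 rows rather than 4, since codes
41005, 41009 and 41017 each have more than one row at the minimum.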
Cheers,
Bert
On Thu, Aug 25, 2022 at 12:22 AM Rui Barradas <ruipbarradas using sapo.pt> wrote:
>
> Hello,
>
> OK, what about
>
>
> res <- lapply(split(df1, df1$Code), \(x) x[which.min(x$Q),])
> do.call(rbind, res)
> #        Code  Y  M  D     Q    N    O
> # 41003 41003 81  1 19 0.160 7.17 2.50
> # 41005 41005 79  8 17 0.210 5.50 7.20
> # 41009 41009 79  2 21 0.218 5.56 4.04
> # 41017 41017 79 10 20 0.240 5.30 7.10
>
>
> A dplyr solution.
>
>
>
> suppressPackageStartupMessages(library(dplyr))
>
> df1 %>%
>   group_by(Code) %>%
>   slice_min(Q) %>%
>   slice_head(n = 1)
> # # A tibble: 4 × 7
> # # Groups:   Code [4]
> #   Code      Y     M     D     Q     N     O
> #   <fct> <int> <int> <int> <dbl> <dbl> <dbl>
> # 1 41003    81     1    19 0.16   7.17  2.5
> # 2 41005    79     8    17 0.21   5.5   7.2
> # 3 41009    79     2    21 0.218  5.56  4.04
> # 4 41017    79    10    20 0.24   5.3   7.1
>
>
>
> Hope this helps,
>
> Rui Barradas
>
>
> At 05:56 on 25/08/2022, javad bayat wrote:
> > Dear all,
> > Many thanks for your suggested methods and code, but unfortunately they
> > did not give the desired result.
> > The code you provided is correct, but it does not return the whole row
> > that contains the minimum of "Q".
> > The result must have 4 rows, each with the minimum value of "Q" and the
> > other column values, as below:
> >
> > Code  Y  M   D  Q     N    O
> > 41003 81 1   19 0.16  7.17 2.5
> > 41005 79 8   17 0.21  5.5  7.2
> > 41009 79 2   21 0.218 5.56 4.04
> > 41017 79 10  20 0.24  5.3  7.1
> >
> > Sincerely
> >
> > On Wed, Aug 24, 2022 at 9:24 PM Rui Barradas <ruipbarradas using sapo.pt> wrote:
> >
> >> Hello,
> >>
> >> Here are two options: the 1st outputs a vector, the 2nd a data.frame.
> >>
> >>
> >> x<-'41003 81 1 19 0.16 7.17 2.5
> >> 41003 77 9 22 0.197 6.8 2.2
> >> 41003 79 7 28 0.21 4.7 6.2
> >> 41005 79 8 17 0.21 5.5 7.2
> >> 41005 80 10 30 0.21 6.84 2.6
> >> 41005 80 12 20 0.21 6.84 2.4
> >> 41005 79 6 14 0.217 5.61 3.55
> >> 41009 79 2 21 0.218 5.56 4.04
> >> 41009 79 5 27 0.218 6.4 3.12
> >> 41009 80 11 29 0.22 6.84 2.8
> >> 41009 78 5 28 0.232 6 3.2
> >> 41009 81 8 20 0.233 6.39 1.6
> >> 41009 79 9 30 0.24 5.6 7.5
> >> 41017 79 10 20 0.24 5.3 7.1
> >> 41017 80 7 30 0.24 6.73 2.6'
> >> df1 <- read.table(textConnection(x))
> >> names(df1) <- scan(what = character(),
> >> text = 'Code Y M D Q N O')
> >> df1$Code <- factor(df1$Code)
> >>
> >> # 1st option
> >> with(df1, tapply(Q, Code, min))
> >> # 41003 41005 41009 41017
> >> # 0.160 0.210 0.218 0.240
> >>
> >> # 2nd option
> >> aggregate(Q ~ Code, df1, min)
> >> #    Code     Q
> >> # 1 41003 0.160
> >> # 2 41005 0.210
> >> # 3 41009 0.218
> >> # 4 41017 0.240
> >>
> >>
> >> Hope this helps,
> >>
> >> Rui Barradas
> >>
> >> At 08:44 on 24/08/2022, javad bayat wrote:
> >>> Dear all;
> >>> I am trying to get the minimum value of a column based on a factor column
> >>> of the same data frame. My data frame is like the below:
> >>> Code Y M D Q N O
> >>> 41003 81 1 19 0.16 7.17 2.5
> >>> 41003 77 9 22 0.197 6.8 2.2
> >>> 41003 79 7 28 0.21 4.7 6.2
> >>> 41005 79 8 17 0.21 5.5 7.2
> >>> 41005 80 10 30 0.21 6.84 2.6
> >>> 41005 80 12 20 0.21 6.84 2.4
> >>> 41005 79 6 14 0.217 5.61 3.55
> >>> 41009 79 2 21 0.218 5.56 4.04
> >>> 41009 79 5 27 0.218 6.4 3.12
> >>> 41009 80 11 29 0.22 6.84 2.8
> >>> 41009 78 5 28 0.232 6 3.2
> >>> 41009 81 8 20 0.233 6.39 1.6
> >>> 41009 79 9 30 0.24 5.6 7.5
> >>> 41017 79 10 20 0.24 5.3 7.1
> >>> 41017 80 7 30 0.24 6.73 2.6
> >>>
> >>> I want to get the minimum value of the "Q" column, together with the
> >>> whole row, for each level of the "Code" column, which is a factor.
> >>> Overall it will give me 4 rows, each with its minimum value of "Q".
> >>> Below is the code that I used, but it did not give me what I wanted.
> >>>
> >>>> x[which(x$Q == min(x$Q)),]
> >>>
> >>> Sincerely
> >>>
> >>>
> >>>
> >>
> >
> >
>
> ______________________________________________
> R-help using r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.