[R] median of binned values
Moshe Olshansky
m_olshansky at yahoo.com
Wed Dec 19 23:22:19 CET 2007
Alternatively, take the first bin whose cumulative frequency reaches
half of the total:

levels(df$binname)[which(cumsum(df$freq) >= 0.5*sum(df$freq))[1]]
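
The same idea can be sketched as a small reusable function. This is only an illustration: `median_bin` is a hypothetical helper name that does not appear in the thread, and it assumes the bins are already listed in their natural order with non-negative counts. It returns the first bin whose cumulative frequency reaches half of the total, so the data never has to be expanded with rep().

```r
# Hypothetical helper (not from the thread): return the first bin whose
# cumulative frequency reaches half of the total count.
# Assumes `bins` is in its natural order and `freq` holds non-negative counts.
median_bin <- function(bins, freq) {
  stopifnot(length(bins) == length(freq), all(freq >= 0), sum(freq) > 0)
  cum <- cumsum(freq)                        # running total per bin
  idx <- which(cum >= 0.5 * sum(freq))[1]    # first bin at or past the halfway point
  as.character(bins)[idx]
}

df <- data.frame(binname = factor(paste("cat", 1:5, sep = "")),
                 freq = c(1, 10, 100, 1000, 10000))

median_bin(df$binname, df$freq)            # "cat5"
median_bin(c("a", "b", "c"), c(3, 3, 3))   # "b"
```

The second call shows a case where no single bin holds half of the observations, which is why the comparison is made against the cumulative sum rather than the individual frequencies.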
--- Chuck Cleland <ccleland at optonline.net> wrote:
> Martin Tomko wrote:
> > Dear list,
> > I have a vector (array, table row, whatever is best) of frequency values
> > for categories (or bins), and I need to find the median category.
> > Trivial to do by hand, but I was wondering if there is a means to do it
> > in R in an elegant way.
> >
> > The obvious median(vector) returns the median frequency for the bins,
> > and that is not what I want, i.e.:
> > freq
> > cat1 1
> > cat2 10
> > cat3 100
> > cat4 1000
> > cat5 10000
> >
> > I want it to return cat5, instead of cat3.
>
> df <- data.frame(binname = as.factor(paste("cat", 1:5, sep="")),
>                  freq = c(1,10,100,1000,10000))
>
> df
>   binname  freq
> 1    cat1     1
> 2    cat2    10
> 3    cat3   100
> 4    cat4  1000
> 5    cat5 10000
>
> with(df, levels(binname)[median(rep(as.numeric(binname), freq))])
> [1] "cat5"
>
> > Thanks a lot
> > Martin
> >
> > ______________________________________________
> > R-help at r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
>
> --
> Chuck Cleland, Ph.D.
> NDRI, Inc.
> 71 West 23rd Street, 8th floor
> New York, NY 10010
> tel: (212) 845-4495 (Tu, Th)
> tel: (732) 512-0171 (M, W, F)
> fax: (917) 438-0894