[R-sig-teaching] subsetting data with the median position

Randall Pruim rpruim at calvin.edu
Tue Dec 22 18:40:33 CET 2015


The issue of ties probably has no answer that is best in all situations.  Here’s what mosaic::ntiles() does:

> x <- c(2,4,8,9,11,11,11,12,15)
> rbind(x, ntiles(x, 2))
  [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9]
x    2    4    8    9   11   11   11   12   15
     1    1    1    1    1    2    2    2    2

The goal is to create groups (two, in this case) of nearly equal size.  That can split tied values across groups somewhat arbitrarily, which might or might not be a good idea in a given context.
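
For two groups, the grouping shown above can be reproduced in base R by ranking the values and cutting at the halfway position. This is only a sketch to illustrate the idea, not the actual implementation of ntiles(); the helper name two_groups_by_rank is made up for this example.

```r
# Hypothetical helper (not part of mosaic): assign each value to group 1 or 2
# by its rank position, breaking ties in order of appearance.
two_groups_by_rank <- function(x) {
  r <- rank(x, ties.method = "first")   # ranks 1..n, ties broken by position
  ifelse(r <= ceiling(length(x) / 2), 1L, 2L)
}

x <- c(2, 4, 8, 9, 11, 11, 11, 12, 15)
groups <- two_groups_by_rank(x)
rbind(x, groups)
```

With this x, the first of the three 11s lands in group 1 and the other two in group 2, purely because of where they sit in the vector.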

The ntiles() function offers several ways of labeling its result, for example:

> ntiles(x, 2, format = "interval")
[1] [2,11]  [2,11]  [2,11]  [2,11]  [2,11]  [11,15] [11,15] [11,15] [11,15]
Levels: [2,11] < [11,15]

--rjp

PS.  If your data are more complex than this (and live in a data frame) and you want to summarize each of the two groups, you might be interested in the tools in the mosaic and dplyr packages.  Here are a couple of examples.

> library(mosaic); library(NHANES)

> favstats( Weight ~ ntiles(Age, 2, format = "interval") | Gender,
     data = NHANES %>% filter( Age >= 18))

          Gender  min   Q1 median     Q3   max     mean       sd    n missing
1 [18,45].female 38.5 59.7   70.8  86.60 198.7 75.28095 21.55570 1837      15
2 [45,80].female 37.0 61.8   71.5  85.60 230.7 75.41856 19.50101 1929      14
3   [18,45].male 46.2 74.2   85.3  99.90 223.0 88.63736 20.71035 1879      10
4   [45,80].male 46.7 75.9   86.3 100.65 203.0 89.08879 18.86193 1775      22
5         female 37.0 61.1   71.3  85.90 230.7 75.35143 20.52634 3766      29
6           male 46.2 75.0   85.7 100.10 223.0 88.85665 19.83255 3654      32

> NHANES %>% filter( Age >= 18) %>%
+   group_by(Gender, ntiles(Age)) %>%
+   summarise(mean.Weight = mean(Weight, na.rm = TRUE))
Source: local data frame [6 x 3]
Groups: Gender [?]

  Gender ntiles(Age) mean.Weight
  (fctr)      (fctr)       (dbl)
1 female         1st    75.12494
2 female         2nd    76.27209
3 female         3rd    74.70114
4   male         1st    86.36225
5   male         2nd    91.67620
6   male         3rd    88.50453

On Dec 22, 2015, at 12:12 PM, Albyn Jones <jones at reed.edu> wrote:

dat1 <- c(2,4,8,9,11,11,11,12,15)

dat1[dat1 > median(dat1)]
[1] 12 15
dat1[dat1 < median(dat1)]
[1] 2 4 8 9

I'm just curious:  do you like this too?
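
To see how many values the strict comparisons leave out, one can count the observations below, equal to, and above the median. A small sketch (the variable names m and counts are only for illustration):

```r
dat1 <- c(2, 4, 8, 9, 11, 11, 11, 12, 15)
m <- median(dat1)

# Values equal to the median are returned by neither dat1[dat1 < m]
# nor dat1[dat1 > m], so they drop out of both halves.
counts <- c(below = sum(dat1 < m),
            equal = sum(dat1 == m),
            above = sum(dat1 > m))
counts
```

For this vector the three 11s fall in the "equal" count, which is why the two halves come out with unequal sizes.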


albyn

On Tue, Dec 22, 2015 at 1:00 AM, Steven Stoline <sstoline at gmail.com> wrote:

Thank you very much for your help.

steve

On Tue, Dec 22, 2015 at 3:53 AM, Peter Meissner <peter.meissner at uni-konstanz.de> wrote:

dat1 <- c(2,4,8,9,11,11,12)

dat1[dat1 > median(dat1)]
dat1[dat1 < median(dat1)]



dat2 <- c(2,4,8,9,11,11,12,15)

dat2[dat2 > median(dat2)]
dat2[dat2 < median(dat2)]


below_median <- function(x){ x[x < median(x)]}
above_median <- function(x){ x[x > median(x)]}

below_median(dat1)
below_median(dat2)
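
The helper functions above compare values, so any observations tied with the median are dropped from both halves. A position-based variant could look like the sketch below. It assumes that the middle element is excluded when n is odd, as in the original request, and the name halves_by_position is made up for this example.

```r
# Hypothetical helper: split a vector into lower and upper halves by position
# in the sorted data, dropping the middle element when n is odd.
halves_by_position <- function(x) {
  x <- sort(x)
  n <- length(x)
  k <- floor(n / 2)                                 # size of each half
  list(lower = x[seq_len(k)],                       # first k sorted values
       upper = x[seq(n - k + 1, length.out = k)])   # last k sorted values
}

dat1 <- c(2, 4, 8, 9, 11, 11, 12)
dat2 <- c(2, 4, 8, 9, 11, 11, 12, 15)
dat3 <- c(2, 4, 8, 9, 11, 11, 11, 12, 15)   # the example with ties at the median

h1 <- halves_by_position(dat1)
h2 <- halves_by_position(dat2)
h3 <- halves_by_position(dat3)
```

Both halves always have the same length here, so for dat3 the tied 11s are split between the dropped middle position and the upper half rather than all being excluded.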




On .12.2015, at 09:36, Steven Stoline <sstoline at gmail.com> wrote:

Dear All:

Is there a way to subset data by the median position?


Example:
------------


Data1: 2, 4, 8, 9, 11, 11, 12  (n is odd)

Data2: 2, 4, 8, 9, 11, 11, 12, 15  (n is even)

For Data1, I want to get:

lower half:  2  4  8

upper half:  11  11  12

For Data2, I want to get:

lower half:  2  4  8  9

upper half:  11  11  12  15


with many thanks
steve



--
Peter Meißner
Workgroup 'Comparative Parliamentary Politics'
Department of Politics and Administration
University of Konstanz
Box 216
78457 Konstanz
Germany

+49 7531 88 5665
http://www.polver.uni-konstanz.de/sieberer/home/
https://github.com/petermeissner
http://pmeissner.com

_______________________________________________
R-sig-teaching at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-sig-teaching




--
Steven M. Stoline
1123 Forest Avenue
Portland, ME 04112
sstoline at gmail.com
