[R-sig-teaching] subsetting data with the median position
Randall Pruim
rpruim at calvin.edu
Tue Dec 22 18:40:33 CET 2015
The issue of ties probably has no answer that is best in all situations. Here’s what mosaic::ntiles() does:
> x <- c(2,4,8,9,11,11,11,12,15)
> rbind(x, ntiles(x, 2))
  [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9]
x    2    4    8    9   11   11   11   12   15
     1    1    1    1    1    2    2    2    2
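The goal is to create groups (two, in this case) of nearly equal size. As a result, tied values can be split between groups somewhat arbitrarily: here one of the three 11s lands in the first group and the other two in the second. That might or might not be a good idea in a given context.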
The ntiles() function includes several methods of labeling its result, for example
> ntiles(x, 2, format = "interval")
[1] [2,11] [2,11] [2,11] [2,11] [2,11] [11,15] [11,15] [11,15] [11,15]
Levels: [2,11] < [11,15]
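For readers without mosaic installed, the idea of position-based groups can be sketched in base R. This is only an illustration, not mosaic's implementation: the helper name groups_by_position() is hypothetical, it assumes there are no missing values, and it may place the boundary one position away from where ntiles() puts it.

```r
# Hypothetical helper (not part of mosaic): assign each value to one of
# n_groups groups of nearly equal size, based on its position in sorted order.
# Assumes x contains no missing values.
groups_by_position <- function(x, n_groups = 2) {
  # ties.method = "first" gives tied values distinct positions,
  # so ties can end up in different groups
  pos <- rank(x, ties.method = "first")
  ceiling(pos * n_groups / length(x))
}

x <- c(2, 4, 8, 9, 11, 11, 11, 12, 15)
rbind(x, group = groups_by_position(x))
table(groups_by_position(x))  # group sizes differ by at most one
```

With this x the sketch puts the first four values in group 1 and the remaining five in group 2, so the three 11s happen to stay together; a different boundary rule would split them, which is the arbitrariness described above.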
---rjp
PS. If your data are more complex than this (and live in a data frame) and you want to create summaries of each of the two groups, you might be interested in tools in the mosaic and dplyr packages. Here are a couple of examples.
> library(mosaic); library(NHANES)
> favstats( Weight ~ ntiles(Age, 2, format = "interval") | Gender,
+           data = NHANES %>% filter( Age >= 18))
          Gender  min   Q1 median     Q3   max     mean       sd    n missing
1 [18,45].female 38.5 59.7   70.8  86.60 198.7 75.28095 21.55570 1837      15
2 [45,80].female 37.0 61.8   71.5  85.60 230.7 75.41856 19.50101 1929      14
3   [18,45].male 46.2 74.2   85.3  99.90 223.0 88.63736 20.71035 1879      10
4   [45,80].male 46.7 75.9   86.3 100.65 203.0 89.08879 18.86193 1775      22
5         female 37.0 61.1   71.3  85.90 230.7 75.35143 20.52634 3766      29
6           male 46.2 75.0   85.7 100.10 223.0 88.85665 19.83255 3654      32
> NHANES %>% filter( Age >= 18) %>%
+ group_by(Gender, ntiles(Age)) %>%
+ summarise(mean.Weight = mean(Weight, na.rm = TRUE))
Source: local data frame [6 x 3]
Groups: Gender [?]
  Gender ntiles(Age) mean.Weight
  (fctr)      (fctr)       (dbl)
1 female         1st    75.12494
2 female         2nd    76.27209
3 female         3rd    74.70114
4   male         1st    86.36225
5   male         2nd    91.67620
6   male         3rd    88.50453
On Dec 22, 2015, at 12:12 PM, Albyn Jones <jones at reed.edu> wrote:
dat1 <- c(2,4,8,9,11,11,11,12,15)
dat1[dat1 > median(dat1)]
[1] 12 15
dat1[dat1 < median(dat1)]
[1] 2 4 8 9
I'm just curious: do you like this too?
albyn
On Tue, Dec 22, 2015 at 1:00 AM, Steven Stoline <sstoline at gmail.com> wrote:
Thank you very much for your help.
steve
On Tue, Dec 22, 2015 at 3:53 AM, Peter Meissner <peter.meissner at uni-konstanz.de> wrote:
dat1 <- c(2,4,8,9,11,11,12)
dat1[dat1 > median(dat1)]
dat1[dat1 < median(dat1)]
dat2 <- c(2,4,8,9,11,11,12,15)
dat2[dat2 > median(dat2)]
dat2[dat2 < median(dat2)]
below_median <- function(x){ x[x < median(x)]}
above_median <- function(x){ x[x > median(x)]}
below_median(dat1)
below_median(dat2)
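Filtering on the value of median(x) drops every observation tied with the median, so it does not always return halves of equal size. A position-based alternative can be sketched in base R, under the assumption that the middle observation should be left out when n is odd; the helper name halves_by_position() is hypothetical.

```r
# Hypothetical helper: split a vector into lower and upper halves by position
# rather than by comparison with the median value.
halves_by_position <- function(x) {
  x <- sort(x)            # sort() drops NAs by default
  k <- length(x) %/% 2    # size of each half; the middle value is left out when n is odd
  list(lower = head(x, k), upper = tail(x, k))
}

dat1 <- c(2, 4, 8, 9, 11, 11, 12)       # n is odd
dat2 <- c(2, 4, 8, 9, 11, 11, 12, 15)   # n is even
halves_by_position(dat1)
halves_by_position(dat2)
```

For dat1 this returns 2 4 8 and 11 11 12; for dat2 it returns 2 4 8 9 and 11 11 12 15, which are the halves asked for in the original question.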
On .12.2015, at 09:36, Steven Stoline <sstoline at gmail.com> wrote:
Dear All:
Is there a way to subset data by the median position?
Example:
------------
*Data1:* 2, 4, 8, 9, 11, 11, 12 *(n is odd)*
*Data2:* 2, 4, 8, 9, 11, 11, 12, 15 *(n is even)*
*for Data1: I want to get:*
*lower half: 2 4 8*
*upper half: 11 11 12*
*for Data2: I want to get:*
*lower half: 2 4 8 9*
*upper half: 11 11 12 15*
with many thanks
steve
--
Peter Meißner
Workgroup 'Comparative Parliamentary Politics'
Department of Politics and Administration
University of Konstanz
Box 216
78457 Konstanz
Germany
+49 7531 88 5665
http://www.polver.uni-konstanz.de/sieberer/home/
https://github.com/petermeissner
http://pmeissner.com
_______________________________________________
R-sig-teaching at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-sig-teaching
--
Steven M. Stoline
1123 Forest Avenue
Portland, ME 04112
sstoline at gmail.com
[[alternative HTML version deleted]]