[R] weight median by count for multiple records

Dimitris Rizopoulos d.rizopoulos at erasmusmc.nl
Thu Jul 30 20:08:19 CEST 2009


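One approach is to repeat each distance as many times as its 'Tag' count
with rep(), and then take the ordinary median of the expanded vector for
each subject:

# one data frame per subject (the uniqueID column is dropped)
sp <- split(dat[-1], dat$SubjectID)
# expand the distances by their counts, then take the median
t(sapply(sp, function (d)
     c(d$SubjectID[1], median(rep(d$Distance_miles, d$Tag)))))

where 'dat' is the name of your data.frame. The result is a matrix with
one row per SubjectID: the first column holds the ID and the second the
count-weighted median. This assumes 'Tag' contains non-negative whole
numbers.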
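
As a self-contained check, the following sketch rebuilds the example data
from the question and applies the same rep() idea to both parts of it:
expanding the table to one row per recorded distance, and taking the median
per subject. The object names 'long' and 'res', the column name
'Median_Distance', and the use of aggregate() are only illustrative choices,
not the only way to do this.

```r
# Example data copied from the question; 'dat' matches the name used above.
dat <- data.frame(
  uniqueID       = 1:10,
  SubjectID      = c(1001, 1001, 1001, 1001, 1002, 1002, 1002, 1003, 1003, 1003),
  Distance_miles = c(5.5, 7, 6.5, 5, 2, 2, 1.5, 15, 17, 18),
  Tag            = c(3, 1, 1, 1, 2, 2, 2, 2, 2, 2)
)

# Question (2): give each recorded distance its own row by repeating
# row i of 'dat' Tag[i] times.
long <- dat[rep(seq_len(nrow(dat)), dat$Tag), c("SubjectID", "Distance_miles")]

# Question (1): on the expanded data, a plain median per subject is the
# median weighted by count.
res <- aggregate(Distance_miles ~ SubjectID, data = long, FUN = median)
names(res)[2] <- "Median_Distance"   # illustrative column name
print(res)
```

With the ten rows shown in the question, 'long' has 18 rows and the medians
agree with the table requested there (5.5, 2 and 17).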

I hope this helps.

Best,
Dimitris


Kirsten Beyer wrote:
> Hello everyone,
> 
> I have a .csv file with the following format:
> 
> uniqueID     SubjectID      Distance_miles     Tag
> 1                  1001                    5.5               3
> 2                  1001                    7                  1
> 3                  1001                    6.5               1
> 4                  1001                    5                  1
> 5                  1002                    2                  2
> 6                  1002                    2                  2
> 7                  1002                    1.5               2
> 8                  1003                    15                2
> 9                  1003                    17                2
> 10                1003                    18                2
> 
> 
> For each SubjectID, I want to calculate the median distance, where the
> Tag variable indicates the number of times that distance was recorded.
>  My final output table would be...
> 
> SubjectID   Median Distance
> 1001                5.5
> 1002                2
> 1003                17
> 
> I have used the following script to calculate the median for a data
> frame where each recorded distance has its own row, and where temp is
> a dataframe containing each unique SubjectID and routes is the file I
> describe above.
> 
> for(i in 1:nrow(temp)){
> temp$mediandistance[i] <-
> median(routes$Distance_miles[routes$SubjectID==temp$SubjectID[i]])
> }
> 
> I am interested to know...
> (1) Is there a way to incorporate a weighted median into this script,
> where the weights are the number of times each distance is recorded?
> (2) Can I transform my current matrix into one that gives each
> distance its own row?
> 
> Help is much appreciated,
> Kirsten Beyer
> 

-- 
Dimitris Rizopoulos
Assistant Professor
Department of Biostatistics
Erasmus University Medical Center

Address: PO Box 2040, 3000 CA Rotterdam, the Netherlands
Tel: +31/(0)10/7043478
Fax: +31/(0)10/7043014



