[R] weight median by count for multiple records
Dimitris Rizopoulos
d.rizopoulos at erasmusmc.nl
Thu Jul 30 20:08:19 CEST 2009
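One approach is to repeat each distance as many times as it was recorded (using rep() with the Tag counts) and then take the ordinary median within each subject:

sp <- split(dat[-1], dat$SubjectID)   # one data frame per subject, uniqueID dropped
t(sapply(sp, function (d)
    c(SubjectID = d$SubjectID[1],
      Median = median(rep(d$Distance_miles, d$Tag)))))

where 'dat' is the name of your data.frame.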
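As a quick check, the sketch below rebuilds the example table from the original post and applies the same idea. Building 'dat' inline and the column labels 'SubjectID' and 'Median' on the result are assumptions made for illustration; in practice the data would come from the .csv file.

```r
# Example data copied from the original post; building 'dat' inline is an
# assumption, normally it would be read from the .csv file.
dat <- data.frame(
  uniqueID       = 1:10,
  SubjectID      = c(1001, 1001, 1001, 1001, 1002, 1002, 1002, 1003, 1003, 1003),
  Distance_miles = c(5.5, 7, 6.5, 5, 2, 2, 1.5, 15, 17, 18),
  Tag            = c(3, 1, 1, 1, 2, 2, 2, 2, 2, 2)
)

# Split by subject (dropping uniqueID), repeat each distance Tag times,
# then take the plain median of the expanded vector.
sp <- split(dat[-1], dat$SubjectID)
res <- t(sapply(sp, function(d)
  c(SubjectID = d$SubjectID[1],
    Median    = median(rep(d$Distance_miles, d$Tag)))))
print(res)
```

For this data the medians come out as 5.5, 2 and 17, which matches the output table requested in the original post.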
I hope this helps.
Best,
Dimitris
Kirsten Beyer wrote:
> Hello everyone,
>
> I have a .csv file with the following format:
>
> uniqueID  SubjectID  Distance_miles  Tag
> 1         1001       5.5             3
> 2         1001       7               1
> 3         1001       6.5             1
> 4         1001       5               1
> 5         1002       2               2
> 6         1002       2               2
> 7         1002       1.5             2
> 8         1003       15              2
> 9         1003       17              2
> 10        1003       18              2
>
>
> For each SubjectID, I want to calculate the median distance, where the
> Tag variable indicates the number of times that distance was recorded.
> My final output table would be...
>
> SubjectID  Median Distance
> 1001       5.5
> 1002       2
> 1003       17
>
> I have used the following script to calculate the median for a data
> frame where each recorded distance has its own row, and where temp is
> a dataframe containing each unique SubjectID and routes is the file I
> describe above.
>
> for(i in 1:nrow(temp)){
>   temp$mediandistance[i] <-
>     median(routes$Distance_miles[routes$Subject_ID==temp$Subject_ID[i]])
> }
>
> I am interested to know...
> (1) Is there a way to incorporate a weighted median into this script,
> where the weights are the number of times each distance is recorded?
> (2) Can I transform my current matrix into one that gives each
> distance its own row?
>
> Help is much appreciated,
> Kirsten Beyer
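On the two numbered questions in the quoted message, a minimal sketch under stated assumptions: 'routes' is rebuilt inline from the posted table, 'temp' is rebuilt as one row per subject, and the ID column is taken to be 'SubjectID' as in the table header (the posted loop spells it 'Subject_ID', so adjust to whatever the real column is called). The names 'long' and 'med_long' are hypothetical.

```r
# Assumed stand-in for the 'routes' file described in the post.
routes <- data.frame(
  uniqueID       = 1:10,
  SubjectID      = c(1001, 1001, 1001, 1001, 1002, 1002, 1002, 1003, 1003, 1003),
  Distance_miles = c(5.5, 7, 6.5, 5, 2, 2, 1.5, 15, 17, 18),
  Tag            = c(3, 1, 1, 1, 2, 2, 2, 2, 2, 2)
)

# (2) One row per recorded distance: repeat each row index Tag times.
long <- routes[rep(seq_len(nrow(routes)), routes$Tag), ]
rownames(long) <- NULL

# (1) Count-weighted median inside the loop: expand the distances of the
# current subject by their counts before calling median().
temp <- data.frame(SubjectID = unique(routes$SubjectID))  # assumed layout of 'temp'
temp$mediandistance <- NA_real_
for (i in seq_len(nrow(temp))) {
  rows <- routes$SubjectID == temp$SubjectID[i]
  temp$mediandistance[i] <-
    median(rep(routes$Distance_miles[rows], routes$Tag[rows]))
}

# The expanded table gives the same medians with a plain grouped median.
med_long <- tapply(long$Distance_miles, long$SubjectID, median)
```

Once the data are in the expanded form, the original unweighted loop can be pointed at 'long' instead of 'routes' without further changes.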
--
Dimitris Rizopoulos
Assistant Professor
Department of Biostatistics
Erasmus University Medical Center
Address: PO Box 2040, 3000 CA Rotterdam, the Netherlands
Tel: +31/(0)10/7043478
Fax: +31/(0)10/7043014