[R] weight median by count for multiple records
Kirsten Beyer
kirsten-beyer at uiowa.edu
Thu Jul 30 19:58:47 CEST 2009
Hello everyone,
I have a .csv file with the following format:
uniqueID  SubjectID  Distance_miles  Tag
1         1001       5.5             3
2         1001       7               1
3         1001       6.5             1
4         1001       5               1
5         1002       2               2
6         1002       2               2
7         1002       1.5             2
8         1003       15              2
9         1003       17              2
10        1003       18              2
For each SubjectID, I want to calculate the median distance, where the
Tag variable indicates the number of times that distance was recorded.
My final output table would be...
SubjectID  Median Distance
1001       5.5
1002       2
1003       17
I have used the following script to calculate the median for a data
frame where each recorded distance has its own row. Here temp is a
data frame containing each unique SubjectID, and routes is the file I
describe above.
# For each subject in temp, take the median of that subject's distances in routes
for (i in seq_len(nrow(temp))) {
  temp$mediandistance[i] <-
    median(routes$Distance_miles[routes$SubjectID == temp$SubjectID[i]])
}
I am interested to know...
(1) Is there a way to incorporate a weighted median into this script,
where the weights are the number of times each distance is recorded?
(2) Can I transform my current data frame into one that gives each
recorded distance its own row?
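
For illustration, here is a minimal sketch of one way both questions might be approached in base R. It assumes that Tag always holds whole-number counts and that the columns are named as in the table above. The names expanded, result and MedianDistance are made up for this example. Repeating each row Tag times gives every recorded distance its own row, and the plain median of the repeated values is then the count-weighted median.

```r
# Data as shown in the post (Tag = number of times the distance was recorded)
routes <- data.frame(
  uniqueID       = 1:10,
  SubjectID      = c(1001, 1001, 1001, 1001, 1002, 1002, 1002, 1003, 1003, 1003),
  Distance_miles = c(5.5, 7, 6.5, 5, 2, 2, 1.5, 15, 17, 18),
  Tag            = c(3, 1, 1, 1, 2, 2, 2, 2, 2, 2)
)

# (2) Repeat each row index Tag times so every recorded distance has its own row
expanded <- routes[rep(seq_len(nrow(routes)), times = routes$Tag),
                   c("SubjectID", "Distance_miles")]
rownames(expanded) <- NULL

# (1) The median of the expanded distances is the count-weighted median
result <- aggregate(Distance_miles ~ SubjectID, data = expanded, FUN = median)
names(result)[2] <- "MedianDistance"  # illustrative column name
print(result)
```

With the sample data this should reproduce the medians in the desired output table (5.5, 2 and 17). Expanding rows is simple but can use a lot of memory when the counts are large, in which case a dedicated weighted-median function may be a better fit.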
Help is much appreciated,
Kirsten Beyer