[R] How to sum one column in a data frame keyed on other columns
George Nachman
gnachman at llamas.org
Wed Dec 13 00:34:46 CET 2006
I have a data frame that looks like this:
url time somethingirrelevant visits
www.foo.com 1:00 xxx 100
www.foo.com 1:00 yyy 50
www.foo.com 2:00 xyz 25
www.bar.com 1:00 xxx 200
www.bar.com 1:00 zzz 200
www.foo.com 2:00 xxx 500
I'd like to write some code that takes this as input and outputs
something like this:
url time total_visits
www.foo.com 1:00 150
www.foo.com 2:00 525
www.bar.com 1:00 400
In other words, I need to calculate the sum of visits for each unique
tuple of (url,time).
I can do it with this code, but it's very slow, and doesn't seem like
the right approach:
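# Accumulate the visit totals in a list keyed by "url,time"
keys = list()
# Build the string key for one row from the given columns
getkey = function(m,cols,index) { paste(m[index,cols],collapse=",") }
# First pass: start every key at zero
for (i in 1:nrow(data)) { keys[[getkey(data,1:2,i)]] = 0 }
# Second pass: add each row's visits (column 4) to its key's total
for (i in 1:nrow(data)) {
  keys[[getkey(data,1:2,i)]] = keys[[getkey(data,1:2,i)]] + data[i,4]
}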
I'm sure there's a more functional-programming approach to this
problem! Any ideas?
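
One possible vectorized approach, sketched here with base R's aggregate(), is to group on (url, time) and sum visits in a single call. The data frame name visits_df and the output column name total_visits are assumptions made for illustration; the rows are the sample data from above.

```r
# Sample data from the post. The name `visits_df` is hypothetical.
visits_df <- data.frame(
  url = c("www.foo.com", "www.foo.com", "www.foo.com",
          "www.bar.com", "www.bar.com", "www.foo.com"),
  time = c("1:00", "1:00", "2:00", "1:00", "1:00", "2:00"),
  somethingirrelevant = c("xxx", "yyy", "xyz", "xxx", "zzz", "xxx"),
  visits = c(100, 50, 25, 200, 200, 500),
  stringsAsFactors = FALSE
)

# Sum `visits` within each unique (url, time) pair.
# Columns not named in the formula (somethingirrelevant) are ignored.
totals <- aggregate(visits ~ url + time, data = visits_df, FUN = sum)

# Rename the summed column to match the desired output
# (assumed name, not something aggregate() produces itself).
names(totals)[names(totals) == "visits"] <- "total_visits"

print(totals)
```

The result has one row per (url, time) pair, though not necessarily in the order the pairs first appear in the input. If a url-by-time matrix is more convenient than a long data frame, tapply(visits_df$visits, visits_df[c("url", "time")], sum) should give the same sums, with NA for combinations that never occur.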