[R] Aggregate counts of observations with times surrounding a time?
Jim Lemon
drjimlemon at gmail.com
Tue May 16 11:43:24 CEST 2017
Hi Mark,
I think you might want something like this:
mndf<-data.frame(st=seq(1483360938,by=1700,length=10),
et=seq(1483362938,by=1700,length=10),
store=c(rep("gap",5),rep("starbucks",5)),
zip=c(94000,94000,94100,94100,94200,94000,94000,94100,94100,94200),
store_id=seq(50,59))
# orders the times and calculates number of simultaneous presences
count_simult<-function(x) {
nrows<-dim(x)[1]
timeorder<-order(unlist(mndf[1:nrows,c("st","et")]))
interval_counts<-data.frame(time=c(x$st,x$et)[timeorder],
startfin=rep(c("st","et"),each=5)[timeorder],count=rep(NA,10))
interval_counts[1,"count"]<-1
for(i in 2:(nrows*2)) {
interval_counts[i,"count"]<-
interval_counts[i-1,"count"]+
ifelse(interval_counts[i,"startfin"]=="st",1,-1)
}
return(interval_counts)
}
gap_counts<-count_simult(mndf[1:5,])
plot(gap_counts$time,gap_counts$count,type="l")
starbucks_counts<-count_simult(mndf[6:10,])
plot(starbucks_counts$time,gap_counts$count,type="l")
There are a lot of ways to plot the counts by time. If you have any
preferences, let me know.
Jim
On Tue, May 16, 2017 at 2:48 PM, Mark Noworolski <jmarkn at gmail.com> wrote:
> I have a data frame that has a set of observed dwell times at a set of
> locations. The metadata for the locations includes things that have varying
> degrees of specificity. I'm interested in tracking the number of people
> present at a given time in a given store, type of store, or zip code.
>
> Here's an example of some sample data (here st=start_time, and et=end_time):
> data.frame(st=seq(1483360938,by=1700,length=10),et=seq(1483362938,by=1700,length=10),store=c(rep("gap",5),rep("starbucks",5)),zip=c(94000,94000,94100,94100,94200,94000,94000,94100,94100,94200),store_id=seq(50,59))
> st et store zip store_id
> 1 1483360938 1483362938 gap 94000 50
> 2 1483362638 1483364638 gap 94000 51
> 3 1483364338 1483366338 gap 94100 52
> 4 1483366038 1483368038 gap 94100 53
> 5 1483367738 1483369738 gap 94200 54
> 6 1483369438 1483371438 starbucks 94000 55
> 7 1483371138 1483373138 starbucks 94000 56
> 8 1483372838 1483374838 starbucks 94100 57
> 9 1483374538 1483376538 starbucks 94100 58
> 10 1483376238 1483378238 starbucks 94200 59
>
> I'd like to be able to:
> a) create aggretages of the number of people present in each store_id at a
> given time
> b) create aggregates of the number of people present - grouped by zip or
> store
>
> I expect to be rolling up to hour or half hour buckets, but I don't think I
> should have to decide this up front and be able to do something clever to
> be able to use ggplot + some other library to plot the time evolution of
> this information, rolled up the way I want.
>
> Any clever solutions? I've trolled stackoverflow and this email list.. to
> no avail - but I'm willing to acknowledge I may have missed something.
>
> [[alternative HTML version deleted]]
>
> ______________________________________________
> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
More information about the R-help
mailing list