[R] Bus stop sequence matching problem
David McPearson
dmcp at webmail.co.za
Sat Aug 30 09:54:02 CEST 2014
Homework? The list has a no homework policy - but perhaps I'll be forgiven for
posting hints.
In general terms, this is how I approached the problem:
* Loop through the rows of stop_onoff - for (idx in ...something...) {...
* For each row, find the first of "ref" in a suitably filtered subset of
stop_sequence, and keep track of these row numbers
* Update columns "on" and "off"
* Use cumsum to calculate the number of passengers on the bus
Note the loop. Someone cleverer than I might be able to vectorise that step,
but I couldn't see how.
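For readers who are not bound by a homework policy, the steps above can be sketched roughly as follows. This is only one possible reading of the hints, not necessarily the code I had in mind: the names "result" and "last_row" are made up for illustration, unmatched stops are assumed to get 0 for "on" and "off", and each row of stop_onoff is assumed to belong to a stop strictly after the previously matched one.

```r
# Input data from the original question, with ref kept as character.
stop_sequence <- data.frame(seq = c(10, 20, 30, 40, 50, 60),
                            ref = c("A", "B", "C", "D", "B", "A"),
                            stringsAsFactors = FALSE)
stop_onoff <- data.frame(ref = c("A", "D", "B", "A"),
                         on  = c(5, 0, 10, 0),
                         off = c(0, 2, 2, 6),
                         stringsAsFactors = FALSE)

# Assumption: stops with no entry in stop_onoff had nobody on or off.
result <- stop_sequence
result$on  <- 0
result$off <- 0

last_row <- 0  # row of stop_sequence matched by the previous count
for (idx in seq_len(nrow(stop_onoff))) {
  # Filter to stops with the same ref that come after the last match.
  candidates <- which(result$ref == stop_onoff$ref[idx] &
                        seq_len(nrow(result)) > last_row)
  if (length(candidates) == 0) {
    stop("no later stop found for ref ", stop_onoff$ref[idx])
  }
  last_row <- candidates[1]  # first such stop
  result$on[last_row]  <- stop_onoff$on[idx]
  result$off[last_row] <- stop_onoff$off[idx]
}

# Passengers on the bus after each stop.
result$load <- cumsum(result$on - result$off)
print(result)
```

Tracking last_row is what makes the second "B" (seq 50) win over the first one, because the "B" count comes after the "D" count in stop_onoff.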
By the way, if this is homework...
Are you sure your desired_output is correct? I would expect something like
  seq ref on off load
1  10   A  5   0    5
2  20   B  0   0    5
3  30   C  0   0    5
4  40   D  0   2    3
5  50   B 10   2   11
6  60   A  0   6    5
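As a quick check of the load column, and of why the '-' placeholders in the original desired_output get in the way, something like the following could be run. The vector names here are just for illustration.

```r
# Mixing numbers with '-' coerces the whole vector to character,
# so it can no longer be summed.
on_chr <- c(5, '-', '-', 0, 10, 0)
is.character(on_chr)  # TRUE

# With 0 for stops where nobody gets on or off, cumsum works directly.
on   <- c(5, 0, 0, 0, 10, 0)
off  <- c(0, 0, 0, 2, 2, 6)
load <- cumsum(on - off)
load  # 5 5 5 3 11 5
```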
Are you aware that your "ref" columns are factors and not characters? If
you use "stringsAsFactors = FALSE" or
stop_onoff <-
data.frame(ref = factor(c('A','D','B','A'),
levels = levels(stop_sequence$ref)),
on = c(5,0,10,0), off = c(0,2,2,6))
it will simplify your analysis (or at least reduce some typing).
Type the following in an R console
?data.frame
?factor
and have a read.
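To see why the factor point matters, here is a small illustration. The object names are made up, and stringsAsFactors = TRUE is set explicitly to reproduce the behaviour in the question (it was the default at the time of writing, but from R 4.0.0 the default is FALSE).

```r
# The two ref columns as factors, each with its own level set.
seq_f   <- data.frame(ref = c("A", "B", "C", "D", "B", "A"),
                      stringsAsFactors = TRUE)
onoff_f <- data.frame(ref = c("A", "D", "B", "A"),
                      stringsAsFactors = TRUE)

levels(seq_f$ref)       # "A" "B" "C" "D"
levels(onoff_f$ref)     # "A" "B" "D"
as.integer(seq_f$ref)   # 1 2 3 4 2 1 -> "D" is stored as 4
as.integer(onoff_f$ref) # 1 3 2 1     -> "D" is stored as 3

# Comparing factors with different level sets is an error.
res <- try(seq_f$ref[4] == onoff_f$ref[2], silent = TRUE)
inherits(res, "try-error")  # TRUE

# With character columns the comparison just works.
seq_c   <- data.frame(ref = c("A", "B", "C", "D", "B", "A"),
                      stringsAsFactors = FALSE)
onoff_c <- data.frame(ref = c("A", "D", "B", "A"),
                      stringsAsFactors = FALSE)
seq_c$ref[4] == onoff_c$ref[2]  # TRUE
```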
Now, if this ain't homework, or you just want someone to do it for you, e-mail
me offline and I'll send you my approach. If it is homework, let me know - I'm
happy to help anyway, but I will be trying to help you solve this for
yourself.
Cheers,
DMcP
On Sat, 30 Aug 2014 12:46:17 +1200 Adam Lawrence <alaw005 at gmail.com> wrote
> I am hoping someone can help me with a bus stop sequencing problem in R,
> where I need to match counts of people getting on and off a bus to the
> correct stop in the bus route stop sequence. I have tried looking
> online/forums for sequence matching but seems to refer to numeric sequences
> or DNA matching and over my head. I am after a simple example if anyone can
> please help.
>
> I have two data series as per below (from database), that I want to
> combine. In this example “stop_sequence” includes the sequence (seq) of bus
> stops and “stop_onoff” is a count of people getting on and off at certain
> stops (there is no entry if no one gets on or off).
>
> stop_sequence <- data.frame(seq=c(10,20,30,40,50,60),
> ref=c('A','B','C','D','B','A'))
> ## seq ref
> ## 1 10 A
> ## 2 20 B
> ## 3 30 C
> ## 4 40 D
> ## 5 50 B
> ## 6 60 A
> stop_onoff <-
> data.frame(ref=c('A','D','B','A'),on=c(5,0,10,0),off=c(0,2,2,6))
> ## ref on off
> ## 1 A 5 0
> ## 2 D 0 2
> ## 3 B 10 2
> ## 4 A 0 6
>
> I need to match the stop_onoff numbers in the right stop sequence, with the
> correctly matched output as follows (load is a cumulative count of on and
> off)
>
> desired_output <- data.frame(seq=c(10,20,30,40,50,60),
> ref=c('A','B','C','D','B','A'),
> on=c(5,'-','-',0,10,0),off=c(0,'-','-',2,2,6), load=c(5,0,0,3,11,5))
> ## seq ref on off load
> ## 1 10 A 5 0 5
> ## 2 20 B - - 0
> ## 3 30 C - - 0
> ## 4 40 D 0 2 3
> ## 5 50 B 10 2 11
> ## 6 60 A 0 6 5
>
> In this example the stop “B” is matched to the second stop “B” in the stop
> sequence and not the first because the onoff data is after stop “D”.
>
> Any guidance much appreciated.
>
> Regards
> Adam