[R] Bilateral matrix
William Dunlap
wdunlap at tibco.com
Wed May 16 17:04:58 CEST 2018
Make current_location and previous_location factors with the same set of
levels. The levels could be the union of the values in the two columns or
a predetermined list. E.g.,
> x <- data.frame(previous_location=c("Mount Vernon","Burlington"),
current_location=c("Sedro Woolley","Burlington"))
> allCities <- levels(factor(unlist(x))) # union of observed values
> allCities
[1] "Burlington" "Mount Vernon" "Sedro Woolley"
> x[] <- lapply(x, factor, levels=allCities)
> xtabs(~previous_location + current_location,data=x)
                 current_location
previous_location Burlington Mount Vernon Sedro Woolley
    Burlington             1            0             0
    Mount Vernon           0            0             1
    Sedro Woolley          0            0             0
or, using an externally determined set of cities:
> allCities <- c("Anacortes","Burlington","Concrete","Mount Vernon","Sedro Woolley")
> x[] <- lapply(x, factor, levels=allCities)
> xtabs(~previous_location + current_location,data=x)
                 current_location
previous_location Anacortes Burlington Concrete Mount Vernon Sedro Woolley
    Anacortes             0          0        0            0             0
    Burlington            0          1        0            0             0
    Concrete              0          0        0            0             0
    Mount Vernon          0          0        0            0             1
    Sedro Woolley         0          0        0            0             0
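
The same idea can be written as a plain script and taken one step further to the movement probabilities asked about in the original question. This is only a sketch: the three-row data frame is made up for illustration, and the names counts, probs and countsNA are hypothetical, not from the thread. prop.table() with margin = 1 divides each row by its row total, so a previous location with no observed moves yields NaN (0/0).

```r
# Made-up example data: one row per person, two columns of city names.
x <- data.frame(
  previous_location = c("Mount Vernon", "Burlington", "Mount Vernon"),
  current_location  = c("Sedro Woolley", "Burlington", "Burlington"),
  stringsAsFactors = FALSE
)

# Give both columns the same set of levels so the table is square and
# every combination appears, observed or not.
allCities <- sort(unique(unlist(x)))
x[] <- lapply(x, factor, levels = allCities)

# Counts of moves: rows are previous locations, columns are current ones.
counts <- xtabs(~ previous_location + current_location, data = x)

# Row-wise proportions: share of moves out of each previous location that
# ended in each current location. Rows with no moves become NaN.
probs <- prop.table(counts, margin = 1)

# Optionally mark unobserved combinations as NA instead of 0.
countsNA <- counts
countsNA[countsNA == 0] <- NA
```

Replacing allCities with a predetermined vector of all 25 cities would give the full 25 x 25 table in the same way.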
Bill Dunlap
TIBCO Software
wdunlap tibco.com
On Wed, May 16, 2018 at 7:49 AM, Miluji Sb <milujisb using gmail.com> wrote:
> Dear Bert and Huzefa,
>
> Apologies for the late reply, my account got hacked and I have just managed
> to recover it.
>
> Thank you very much for your replies and the solutions. Both work well.
>
> I was wondering if there was any way to ensure (force) that all possible
> combinations show up in the output. The full dataset has 25 cities but of
> course people have not moved from Boston to all the other 24 cities. I
> would like all the combinations if possible.
>
> Thank you again!
>
> Sincerely,
>
> Milu
>
> On Tue, May 8, 2018 at 6:28 PM, Bert Gunter <bgunter.4567 using gmail.com>
> wrote:
>
> > or in base R : ?xtabs ??
> >
> > as in:
> > xtabs(~previous_location + current_location,data=x)
> >
> > (You can convert the 0s to NA's if you like)
> >
> >
> > Cheers,
> > Bert
> >
> >
> >
> > Bert Gunter
> >
> > "The trouble with having an open mind is that people keep coming along and
> > sticking things into it."
> > -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
> >
> > On Tue, May 8, 2018 at 9:21 AM, Huzefa Khalil <huzefa.khalil using umich.edu>
> > wrote:
> >
> >> Dear Miluji,
> >>
> >> If I understand correctly, this should get you what you need.
> >>
> >> temp1 <-
> >> structure(list(id = 101:115, current_location = structure(c(2L,
> >> 8L, 8L, 3L, 6L, 5L, 1L, 2L, 7L, 4L, 2L, 8L, 8L, 3L, 6L), .Label =
> >> c("Austin",
> >> "Boston", "Cambridge", "Durham", "Houston", "Lynn", "New Orleans",
> >> "New York"), class = "factor"), previous_location = structure(c(6L,
> >> 2L, 4L, 6L, 7L, 5L, 1L, 3L, 6L, 2L, 6L, 2L, 4L, 6L, 7L), .Label =
> >> c("Atlanta",
> >> "Austin", "Cleveland", "Houston", "New Orleans", "OKC", "Tulsa"
> >> ), class = "factor")), class = "data.frame", row.names = c(NA,
> >> -15L))
> >>
> >> library(reshape2)  # assumed source of dcast(); the original message does not say
> >> dcast(temp1, previous_location ~ current_location)
> >>
> >> On Tue, May 8, 2018 at 12:10 PM, Miluji Sb <milujisb using gmail.com> wrote:
> >> > I have data on current and previous location of individuals. I would like
> >> > to have a matrix with bilateral movement between locations. I would like
> >> > the final output to look like the second table below.
> >> >
> >> > I have tried using crosstab() from the ecodist but I do not have another
> >> > variable to measure the flow. Ultimately I would like to compute the
> >> > probability of movement between cities (movement to city_i/total movement
> >> > from city_j).
> >> >
> >> > Is it possible to aggregate the data in this way? Any guidance would be
> >> > highly appreciated. Thank you!
> >> >
> >> > # Original data
> >> > structure(list(id = 101:115, current_location = structure(c(2L,
> >> > 8L, 8L, 3L, 6L, 5L, 1L, 2L, 7L, 4L, 2L, 8L, 8L, 3L, 6L), .Label =
> >> > c("Austin",
> >> > "Boston", "Cambridge", "Durham", "Houston", "Lynn", "New Orleans",
> >> > "New York"), class = "factor"), previous_location = structure(c(6L,
> >> > 2L, 4L, 6L, 7L, 5L, 1L, 3L, 6L, 2L, 6L, 2L, 4L, 6L, 7L), .Label =
> >> > c("Atlanta",
> >> > "Austin", "Cleveland", "Houston", "New Orleans", "OKC", "Tulsa"
> >> > ), class = "factor")), class = "data.frame", row.names = c(NA,
> >> > -15L))
> >> >
> >> > # Expected output
> >> > structure(list(X = structure(c(3L, 1L, 2L), .Label = c("Austin",
> >> > "Houston", "OKC"), class = "factor"), Boston = c(2L, NA, NA),
> >> > New.York = c(NA, 2L, 2L), Cambridge = c(2L, NA, NA)), class =
> >> > "data.frame", row.names = c(NA,
> >> > -3L))
> >> >
> >> > Sincerely,
> >> >
> >> > Milu
> >> >
> >> > ______________________________________________
> >> > R-help using r-project.org mailing list -- To UNSUBSCRIBE and more, see
> >> > https://stat.ethz.ch/mailman/listinfo/r-help
> >> > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> >> > and provide commented, minimal, self-contained, reproducible code.