[R] Making a markov transition matrix - more progress

Ajay Narottam Shah ajayshah at mayin.org
Mon Jan 23 11:58:18 CET 2006


I solved the problem in one more (and more elegant) way. So here's the
program again.

Where does R stand on the Anderson-Goodman test of 1957? I hunted
around and nobody seems to be doing this in R. Is it that there has
been much progress after 1957 and nobody uses it anymore?

# Problem statement:
#
# You are holding a dataset where firms are observed for a fixed
# (and small) set of years. The data is in "long" format - one
# record for one firm for one point in time. A state variable is
# observed (a factor).
# You wish to build a Markov transition matrix describing how that
# state variable evolves from one year to the next.

set.seed(1001)

# Raw data in long format --
raw <- data.frame(name=c("f1","f1","f1","f1","f2","f2","f2","f2"),
                  year=c(83,   84,  85,  86,  83,  84,  85,  86),
                  state=sample(1:3, 8, replace=TRUE)
                  )
# Shift to wide format --
fixedup <- reshape(raw, timevar="year", idvar="name", v.names="state",
                   direction="wide")
# Now tediously build up records for an intermediate data structure
tmp <- rbind(
             data.frame(prev=fixedup$state.83, new=fixedup$state.84),
             data.frame(prev=fixedup$state.84, new=fixedup$state.85),
             data.frame(prev=fixedup$state.85, new=fixedup$state.86)
             )
# This is a bad method because it is hardcoded to the specific values
# of "year".
markov <- table(tmp$prev, tmp$new)
markov
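
# A sketch of one way to drop the hardcoded years using base R only.
# This is not from the thread: the name count.transitions is made up
# for illustration, and it assumes that a transition means two records
# of the same firm exactly one period apart.
count.transitions <- function(D, timevar="year", idvar="name",
                              statevar="state") {
  # Sort by firm, then by time, so that neighbouring rows are comparable
  D <- D[order(D[[idvar]], D[[timevar]]), ]
  n <- nrow(D)
  # Compare each row with the one after it: same firm, next period
  keep <- D[[idvar]][-1] == D[[idvar]][-n] &
          D[[timevar]][-1] == D[[timevar]][-n] + 1
  # Use a common set of levels so the table is always square
  levs <- levels(factor(D[[statevar]]))
  table(prev=factor(D[[statevar]][-n][keep], levels=levs),
        new=factor(D[[statevar]][-1][keep], levels=levs))
}
# Usage: count.transitions(raw)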

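# Gabor's method --
# Note: despite its name, this returns a table of transition counts
# (rows = previous state, columns = new state), not probabilities.
transition.probabilities <- function(D, timevar="year",
                                     idvar="name", statevar="state") {
  # Pair each record (.x) with the records of the preceding period (.y)
  stage1 <- merge(D, cbind(nextt=D[,timevar] + 1, D),
                  by.x=timevar, by.y="nextt")
  # Keep only the pairs that belong to the same firm
  v1 <- paste(idvar,".x",sep="")
  v2 <- paste(idvar,".y",sep="")
  stage2 <- subset(stage1, stage1[,v1]==stage1[,v2])
  # Cross-tabulate new state (.x) against previous state (.y), then
  # transpose so that rows are the previous state
  v1 <- paste(statevar,".x",sep="")
  v2 <- paste(statevar,".y",sep="")
  t(table(stage2[,v1], stage2[,v2]))
}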

transition.probabilities(raw, timevar="year", idvar="name", statevar="state")

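# The new and improved way --
# statetable.msm() counts transitions between successive records of
# each subject, so raw must be sorted by year within firm (it is here).
library(msm)
statetable.msm(state, name, data=raw)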
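
# The tables above hold transition counts. A sketch of turning them
# into estimated transition probabilities, assuming rows index the
# previous state: divide each row by its total. The helper name
# to.probabilities is made up for illustration; prop.table() is base R.
to.probabilities <- function(counts) {
  # margin=1 normalises within rows; a state with no observed outgoing
  # transitions has a zero row total and yields NaN
  prop.table(counts, margin=1)
}
# Usage: to.probabilities(transition.probabilities(raw))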

-- 
Ajay Shah                                      http://www.mayin.org/ajayshah  
ajayshah at mayin.org                             http://ajayshahblog.blogspot.com
<*(:-? - wizard who doesn't know the answer.



