[R] Markov transition matrices, missing transitions for certain years
Rolf Turner
rolf.turner at xtra.co.nz
Tue Apr 19 12:37:07 CEST 2011
Make two assumptions:
(1) The initial state probability distribution (``ispd'') is *NOT* a
function of the transition probability matrix (``tpm'').
(2) The boxes are stochastically independent of each other.
Both of these assumptions may be dubious. The second assumption is
the crucial one, and I would guess it to be *highly* dubious. However,
without it, you simply can't get anywhere.
Subject to these assumptions the maximum likelihood estimates of the
entries of the tpm may be found as follows:
Count the number of times that any box is in state "i" at time "t" and
in state "j" at time "t+1". Count over all boxes and all times
t = 1, 2, ..., m-1, where you have observations over m years. (You have
to stop at m-1 in order to be able to have observations at time t+1.)
Let this count be c_ij. Let c_i. be the sum over j of c_ij.
Let the tpm be P = [p_ij].
Then the maximum likelihood estimate of p_ij is equal to c_ij/c_i.
[The only time that things can go wrong here is if state "i" never appears
in any box, at any time t < m. In such a case the p_ij (j = 1, 2, 3, ..., K,
where K is the number of states or species) are simply not estimable from
the available data. We never observe state i making a transition to *any*
state, so we cannot estimate the probabilities of such transitions.]
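To make the non-estimable case concrete, here is a minimal sketch in R. The
count matrix is made up purely for illustration (it is not from the poster's
data), and the names counts, row_totals and not_estimable are my own. State 2
never appears at any time t < m, so c_2. = 0 and the corresponding row of the
estimate works out as 0/0, which R reports as NaN.

```r
# Hypothetical transition counts c_ij for K = 3 states (rows = state at time t,
# columns = state at time t+1). State 2 is never observed at any time t < m.
counts <- matrix(c(3, 1, 2,
                   0, 0, 0,
                   1, 0, 4),
                 nrow = 3, byrow = TRUE,
                 dimnames = list(from = c("1", "2", "3"),
                                 to   = c("1", "2", "3")))

row_totals <- rowSums(counts)   # the c_i.

# Dividing a matrix by a vector of length nrow recycles down the columns,
# so entry [i, j] is divided by row_totals[i]; 0/0 gives NaN.
P <- counts / row_totals

# States whose row of P cannot be estimated from the data.
not_estimable <- which(row_totals == 0)
```

Checking row_totals before dividing is a cheap way to flag such states rather
than discovering the NaN entries later on.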
Writing R code to effect this estimation procedure is easy and is left as an
exercise for the reader. :-)
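For anyone who would rather not do the exercise, here is one possible sketch,
subject to the two assumptions above. It uses the four box histories from the
original post; the object names (from, to, counts, P) are my own choices, and
the convention that rows of P index the state at time t and columns the state
at time t+1 is an assumption on my part (it is the transpose of the layout in
the poster's tables). Fixing the factor levels to 1:K gives a K x K table of
counts even when some transitions never occur.

```r
# Box histories from the original post: one row per box, one column per year.
boxes <- rbind(b1 = c(1, 1, 4, 2),
               b2 = c(1, 4, 4, 3),
               b3 = c(4, 4, 1, 2),
               b4 = c(3, 1, 1, 1))
K <- 4            # number of states (species)
m <- ncol(boxes)  # number of years

# State at time t and at time t+1, pooled over all boxes and t = 1, ..., m-1.
# Dropping the last (resp. first) column keeps the two vectors aligned.
from <- as.vector(boxes[, -m])
to   <- as.vector(boxes[, -1])

# c_ij: explicit factor levels keep every state in the table, observed or not.
counts <- table(from = factor(from, levels = 1:K),
                to   = factor(to,   levels = 1:K))

# Estimate of p_ij is c_ij / c_i. ; a row with c_i. = 0 comes out as NaN.
P <- prop.table(counts, margin = 1)
```

In this slimmed-down example state 2 never appears before the last year, so
the second row of P is not estimable. By default table() drops pairs in which
either state is NA, so box-years with no recorded occupant would simply not
contribute to the counts; whether that is appropriate depends on why they are
missing.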
cheers,
Rolf Turner
On 19/04/11 12:47, Abby_UNR wrote:
> Hi all,
> I am working with nest box occupancy data for birds and would like to
> construct a Markov transition matrix, to derive transition probabilities for
> ALL years of the study (not separate sets of transition probabilities for
> each time step). The actual dataset I'm working with is 125 boxes over 14
> years that can be occupied by 7 different species, though I have provided a
> slimmed down portion for this question...
> -
> A box can be in 1 of 4 "states" (i.e. bird species): 1,2,3,4
> Included here are 4 "box histories" over 4 years (y97, y98, y99, y00)
>
> These are the box histories
>> b1<- c(1,1,4,2)
>> b2<- c(1,4,4,3)
>> b3<- c(4,4,1,2)
>> b4<- c(3,1,1,1)
>> boxes<- data.frame(rbind(b1,b2,b3,b4))
>> colnames(boxes)<- c("y97","y98","y99","y00")
>> boxes
>    y97 y98 y99 y00
> b1   1   1   4   2
> b2   1   4   4   3
> b3   4   4   1   2
> b4   3   1   1   1
> My problem is that there are 16 possible transitions, but not all possible
> transitions occur at each time step. Therefore, I don't think I could do
> something easy like create a table for each time step and add them together,
> for example:
>
>> t1.boxes<- table(boxes$y98, boxes$y97)
>> t1.boxes
>
>     1 3 4
>   1 1 1 0
>   4 1 0 1
>> t2.boxes<- table(boxes$y99, boxes$y98)
>> t2.boxes
>
>     1 4
>   1 1 1
>   4 1 1
> t1.boxes and t2.boxes could not be added together to calculate the frequency
> of each transition occurring because they are of different dimensions. I'm
> not quite sure how to deal with this, I have attempted to write a function
> (shown below), though I'm not sure if it is needed; I am a bit new to the
> programming world. If I could get some help either with the function or a
> way around it that would be most appreciated! Thank you!
>
> --------------
> Function requires the commands already listed above:
>
> FMAT<- matrix(0, nrow=4, ncol=4, byrow=TRUE)
> #This is the matrix that will store the frequency of each possible
> #transition occurring over the 4 years
>
> nboxes<- 4
> nyears<- 4
>
> for(row in 1:nboxes)
> {
> for(col in 1:(nyears-1))
> {
> FMAT[boxes[row,col+1], boxes[row,col]]<- boxes[boxes[row, col+1],
> boxes[row,col]]
> #This is the line of code I have been struggling with and am unsure about.
> #I have tried various versions of this and keep getting an assortment of
> #error messages.
> }
> }
>
> FMAT
>
> }