[R-sig-Geo] Mann-kendall

Thu Oct 9 11:22:22 CEST 2014

Yep, it is very cool from the package maintaner.

Ok, I've used your suggestion and it works quite faster than I anticipated,
so for my current work I'll keep to it.

Before this I also calculated a simple linear regression using only the
rasters which was quite effective.

I tested for some random values and I got the slope and intercept with
correct values so I'm happy about it.

I did thought about using calc before but since I had a million pixels and
I didn't manage to make this one work for the whole raster, I ended up just
doing it using raster operations only. (The example on the page I took the
following code works, but not for my own data)

# get multiple layers, e.g. the slope _and_ intercept
fun <- function <http://inside-r.org/r-doc/base/function>(x) { lm
<http://inside-r.org/r-doc/stats/lm>(x ~ time
<http://inside-r.org/r-doc/stats/time>)$coefficients }
x3 <- calc(s <http://inside-r.org/r-doc/mgcv/s>, fun)

On 9 October 2014 03:17, Forrest Stevens <forrest at ufl.edu> wrote:

> You're welcome!  I'm glad you found it useful. I agree that the
> MannKendall package and functions could use slightly more informative
> warnings, though many packages won't give you any such warnings at
> all. At least M-K clues you in that you may be making a statistical
> error rather than letting one run on completely uninformed. Good luck
> getting it all running!
>
> Sincerely,
> Forrest
> --
> Forrest R. Stevens
> Ph.D. Candidate, QSE3 IGERT Fellow
> Department of Geography
> Land Use and Environmental Change Institute
> University of Florida
> www.clas.ufl.edu/users/forrest
>
>
> On Wed, Oct 8, 2014 at 7:58 PM, Nuno Sá <nunocesardesa at gmail.com> wrote:
> > Hello!
> >
> > ah! that "return(unlist(MannKendall(x)))" might have been why i didn't
> > manage to do it with the calc function,  will try it out, thank you!
> >
> > I have a landsat image of a large study area, about 1 092 005 cells by 7
> > "measurements", hence my avoidance of pixel by pixel but I am going to
> try
> > yours. (I also know I shouldn't be using this test for significance of
> the
> > MK here but this is just a test set that I am using for developing the
> > algorithm). Thank you also for the suggestion on how to generate the
> data.
> >
> > I'm going to leave the kendall you sugested running through the night. I
> > have the following warning though:
> >
> > WARNING: Error exit, tauk2, IFAULT =  10
> >
> > I've been reading and it might be related to the low number of n for my
> > testing case.
> > Found online in a response from the package maintainer A.I. McLeod:
> >
> > "Thank you though for your comments.  So I will improve the documentation
> > for Kendall by terminating the program with an error message when n<=3
> > (this case is of no interest to me) and warning message when n<12 that
> the
> > p-values may be inaccurate."
> >
> >
> > Thank you for everything!
> > Nuno
> >
> >
> > On 9 October 2014 00:26, Forrest Stevens <forrest at ufl.edu> wrote:
> >>
> >> Not to dissuade any interesting new algorithm development, and I may
> >> be missing something in the motivation for what you're trying to
> >> accomplish, but have you looked at the Kendall package and tried using
> >> it on your raster object?  I've been pleasantly surprised at its
> >> efficiency for at least moderately sized raster stacks.  I've
> >> implemented something like the following on real MODIS time series and
> >> it works well (simulated data provided here):
> >>
> >>
> >> #install.packages("Kendall")
> >>
> >> require(raster)
> >> require(Kendall)
> >>
> >> ## Simulate a small 25x25x25 raster with uncorrelated but trended
> series:
> >> set.seed(2002)
> >> slopes <- matrix(rnorm(625, 0, 1), nrow=25, ncol=25)
> >>
> >> r <- raster(slopes*1+matrix(rnorm(625, 0, 1), nrow=25, ncol=25))
> >> s <- stack(r)
> >>
> >> for (i in 2:25) {
> >> r <- raster(slopes*i+matrix(rnorm(625, 0, 1), nrow=25, ncol=25))
> >> s <- addLayer(s, r)
> >> }
> >>
> >>
> >> ## Confirm on a series with a high slope:
> >> slopes[1,3]
> >> plot(1:25, s[1,3,])
> >> MannKendall(s[1,3,])
> >>
> >> ## vs. one that has a very small slope:
> >> slopes[1,1]
> >> plot(1:25, s[1,1,])
> >> MannKendall(s[1,1,])
> >>
> >>
> >> ## Mann-Kendall on one pixel:
> >> MannKendall(s[1,1,])
> >>
> >>
> >> ## Perform Mann-Kendall on all pixels:
> >> a <- calc(s, fun=function(x) { return(unlist(MannKendall(x)))})
> >>
> >> ## Plot pixel-level Kendall Tau and p-values:
> >> plot(a$tau)
> >> plot(a$sl)
> >>
> >>
> >>
> >> Again, I might be missing something about your goals but hopefully
> >> this helps someone out there with similar problems.
> >>
> >> Sincerely,
> >> Forrest
> >>
> >> --
> >> Forrest R. Stevens
> >> Ph.D. Candidate, QSE3 IGERT Fellow
> >> Department of Geography
> >> Land Use and Environmental Change Institute
> >> University of Florida
> >> www.clas.ufl.edu/users/forrest
> >>
> >>
> >> On Wed, Oct 8, 2014 at 6:02 PM, Nuno Sá <nunocesardesa at gmail.com>
> wrote:
> >> > Hello
> >> >
> >> > Using the integration by part summation was quite fast (I owe a beer
> to
> >> > the
> >> > ones who've developed the raster package!!)
> >> >
> >> > So, let variable Z.combi be the Z value of the Mann-kendall.
> >> >
> >> > And PHI(x) be the CDF of the Standar normal distribution w/ mean 0 and
> >> > var
> >> > 1.
> >> >
> >> > According to Neeti the p-value is given by:
> >> >
> >> > p = 2 * (1 - PHI(|Z|)
> >> >
> >> > Now, on the code, using the integral expansion:
> >> >
> >> > -----------------------
> >> >
> >> > library(phangorn) #to have double factorials
> >> >
> >> > phi.aprox <- 0.5+exp((-1/2)*abs(Z.combi)^2)/sqrt(2*pi)
> >> > phi.integ <- 0
> >> >
> >> > for (n in 1:100){
> >> >
> >> >   phi.integ <- phi.integ + Z.combi^(2*n+1)/dfactorial(2*n+1)
> >> > }
> >> >
> >> > phi.aprox <- phi-aprox*phi.integ
> >> >
> >> > p.value.rst.neeti <- 2*(1-phi.aprox)
> >> >
> >> > -------------------
> >> >
> >> > The result falls within 0 to 1 which is a good indication. But all
> >> > depends
> >> > on everything I've asked before.
> >> >
> >> > Thank you!
> >> >
> >> >
> >> >
> >> > On 8 October 2014 22:23, Nuno Sá <nunocesardesa at gmail.com> wrote:
> >> >
> >> >> Hello!
> >> >>
> >> >> I've been trying for two days to implement the Mann-Kendall test. It
> is
> >> >> available in IDRISI and it would be cool that it would be available
> >> >> here.
> >> >>
> >> >> I believe I made part of it correctly but I am stuck in two steps.
> >> >>
> >> >> 1. Calculating the ties/NA coefficients. (Algorithm problem..)
> >> >>
> >> >> 2. Estimating the p-value on the last step (Lack of statistical
> >> >> knowledge
> >> >> problem maybe)
> >> >>
> >> >> So here it goes what I have so far. Bear with me that I have tried to
> >> >> keep
> >> >> my variables names with a certain "instant" meaning.
> >> >>
> >> >> ---------------------- Problem here, is identifying the coefficients
> >> >> for
> >> >> the ties/NA -------------------
> >> >>
> >> >> dummy.stack <- raster(stack.ndvi,layer=1) #required to initiate stack
> >> >> within loop
> >> >> mtx <-
> matrix(0,ncol=nlayers(stack.ndvi),nrow=(nlayers(stack.ndvi)-1))
> >> >> #this allows the construction of a matrix with tie values signalled
> to
> >> >> 1
> >> >>
> >> >> for (i in 1:(nlayers(stack.ndvi)-1)){
> >> >>
> >> >>   print("----- i ----")
> >> >>   print(i)
> >> >>   for (j in (i+1):nlayers(stack.ndvi)){
> >> >>
> >> >>     print(j)
> >> >>
> >> >>     dummy.stack <-
> >> >>
> >> >>
> stack(dummy.stack,sign(raster(stack.ndvi,layer=i)-raster(stack.ndvi,layer=j)))
> >> >>     tie.break <-
> >> >> sign(raster(stack.ndvi,layer=i)-raster(stack.ndvi,layer=j)) == 0
> >> >>     cc <- maxValue(tie.break)
> >> >>     if( cc == 1){
> >> >>       mtx[i,j-1]=1
> >> >>     }
> >> >>   }
> >> >> }
> >> >>
> >> >> S.stack <- dropLayer(dummy.stack,1) #removing dummy
> >> >>
> >> >> S.stati <- raster(S.stack,layer=1) #for the somatory - initiating in
> >> >> the
> >> >> second term (can be substituted by in-built overlay function)
> >> >>
> >> >> for (i in 2:nlayers(S.stack)){
> >> >>
> >> >>   S.stati <- S.stati + raster(S.stack,layer=i)
> >> >> }
> >> >>
> >> >>
> -----------------------------------------------------------------------
> >> >>
> >> >> #Ignoring the existence of ties/na (which I've read is acceptable for
> >> >> large enough datasets)
> >> >> #then:
> >> >>
> >> >>   var.mk.sqr <- sqrt(n*(n-1)*(2*n+5)/18)
> >> >>
> >> >>   S.c1 <- ((S.stati > 0)*S.stati - 1 )/ var.mk.sqr
> >> >>   S.c2 <- (S.stati ==0)*S.stati* 0
> >> >>   S.c3 <- ((S.stati < 0)*S.stati + 1 )/ var.mk.sqr
> >> >>
> >> >>   Z.combi <- overlay(S.c1,S.c2,S.c3,fun="sum")
> >> >>
> >> >> #"The P value (probability value p) of the MK statistic S of sample
> >> >> data
> >> >> can
> >> >> be estimated using the normal cumulative distribution function:"
> >> >> (Yue,2002
> >> >> and Neeti, 2011)
> >> >> # So, how to implement the CDF using raster data? Am I forced to do
> >> >> series
> >> >> expansions? would this not be to slow with raster? Or is there a
> >> >> quicker
> >> >> way to iterate pixel by pixel?
> >> >>
> >> >> Sorry for the long email and thank you in advance for any
> >> >> comments/corrections/recommendations!
> >> >>
> >> >> Best regards to all,
> >> >> Nuno Sá
> >> >>
> >> >>
> >> >> --
> >> >>
> >> >> Nuno César de Sá
> >> >> +351 91 961 90 37
> >> >>
> >> >>
> >> >
> >> >
> >> > --
> >> >
> >> > Nuno César de Sá
> >> > +351 91 961 90 37
> >> >
> >> >         [[alternative HTML version deleted]]
> >> >
> >> > _______________________________________________
> >> > R-sig-Geo mailing list
> >> > R-sig-Geo at r-project.org
> >> > https://stat.ethz.ch/mailman/listinfo/r-sig-geo
> >
> >
> >
> >
> > --
> >
> > Nuno César de Sá
> > +351 91 961 90 37
> >
>

-- 

Nuno César de Sá
+351 91 961 90 37

	[[alternative HTML version deleted]]