[R-sig-Geo] model fitting of randomly generated data in spatstat
Rolf Turner
r.turner at auckland.ac.nz
Thu Apr 2 02:02:51 CEST 2015
On 02/04/15 03:09, Robert Leaf wrote:
> I was generating some data for analysis and was curious to see if we could
> fit a “MatClust” model using the function *spatstat*::kppm to some of our
> observed data. As a first cut, and to see if we get values that conform to
> our expectations, I fit models to simulated data and was curious about the
> results. I am hoping that the group can help me understand the departures
> from expectations.
>
> Is it reasonable that the kppm function should return parameter values
> that are similar to those that generated the data?
Sure, provided that the values used to generate the data are not too bizarre.
>
> We are not getting values that are anywhere close to what we would expect.
That appears to be because you are using *bizarre* parameter values to
generate your data. The algorithms used by kppm() can be expected to
return far-out results unless the data to which kppm() is applied have
at least *some* reasonable prospect of conforming to the model that is
being fitted.
> library(*spatstat*)
What are those asterisks doing in that call??? That cannot have been
the call that you actually used.
> (point.vals <- rMatClust(kappa = 2, r = 2, mu = 2000)) # generate random
> points
>
> if (point.vals$n > 0) { # some realizations of the model return .ppp
> variables with no data
I was initially bewildered by this --- the expected number of points is
4000, so how could you possibly get zero points? I asked. Finally I saw
the light; with kappa = 2 you will get zero parent points, and hence an
empty pattern, about 13.5% of the time. I.e. kappa = 2 is just plain
silly-small.
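The 13.5% figure can be checked with a few lines of base R. This is only a sketch of the arithmetic: it assumes the number of parent points falling in the unit-area window is Poisson with mean kappa, and the variable names are made up for illustration.

```r
# Assumption: the parent count in a window of area 1 is Poisson(kappa).
kappa <- 2

# Exact probability of zero parents; identical to exp(-kappa).
p_empty <- dpois(0, lambda = kappa)

# Monte Carlo check of the same quantity.
set.seed(1)
p_empty_sim <- mean(rpois(1e5, lambda = kappa) == 0)

cat("exact:", round(p_empty, 4), " simulated:", round(p_empty_sim, 4), "\n")
```

With kappa = 20 the same calculation gives exp(-20), i.e. an empty pattern is no longer a practical concern.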
Using "r = 2" (these days the syntax is "scale = 2") means that you
are forming clusters in discs of radius 2 .... in the unit square!!!
(You are using the default window.) This makes no sense to me.
Setting mu = 2000 means you are generating an average of 2000 points in
each such disc. I really don't think this is a realistic value for a
Matérn cluster process.
Your simulated pattern (if it is not empty) will have the appearance of
having arisen from a very high intensity Poisson process. Fitting a
Matérn cluster process to such a pattern results in ill-determined
parameter values.
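A rough back-of-envelope check in base R shows why the clustering is invisible. It assumes the default unit-square window; the variable names are only illustrative. The point is that the diagonal of the unit square is shorter than the cluster radius, so a disc centred on any parent inside the window covers the whole window, and the offspring of that parent are spread uniformly over it.

```r
# Parameter values from the original post (r is now called scale).
kappa <- 2
scale <- 2
mu    <- 2000

window_area <- 1          # default window: the unit square
window_diag <- sqrt(2)    # largest distance between two points in the window

disc_area     <- pi * scale^2        # area of one cluster disc, about 12.57
covers_window <- scale > window_diag # TRUE: every such disc contains the window

# Expected number of points in the window for a stationary cluster process.
expected_points <- kappa * mu * window_area

cat("disc area:", round(disc_area, 2),
    " disc covers window:", covers_window,
    " expected points:", expected_points, "\n")
```

By contrast, scale = 0.04 gives discs of area about 0.005, so individual clusters are small relative to the window and can actually be seen.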
Try:
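library(spatstat)
set.seed(42)
# Simulate with moderate parameter values in the default unit square.
X <- rMatClust(kappa = 20, scale = 0.04, mu = 5)
# Fit a stationary Matérn cluster model.
fit <- kppm(X ~ 1, "MatClust")
fit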
....
Fitted cluster parameters:
      kappa       scale
22.37058543  0.04168089
Mean cluster size: 4.514857 points
The estimated parameters are reasonably commensurate with those used
to generate the pattern.
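A single fit could of course be a fluke, so one might repeat the exercise over several seeds. The following is a sketch only: it assumes spatstat is installed, that parameters() applied to a kppm fit returns a list with components kappa, scale and mu (as I believe it does in current versions), and the helper one_fit() and the choice of 20 replicates are mine.

```r
# Sketch: refit the model to 20 independent simulated patterns.
est <- NULL
if (requireNamespace("spatstat", quietly = TRUE)) {
  library(spatstat)

  # Hypothetical helper: simulate one pattern and return the fitted values.
  one_fit <- function(seed) {
    set.seed(seed)
    X <- rMatClust(kappa = 20, scale = 0.04, mu = 5)
    p <- parameters(kppm(X ~ 1, "MatClust"))  # assumed list of fitted values
    c(kappa = as.numeric(p$kappa),
      scale = as.numeric(p$scale),
      mu    = as.numeric(p$mu))
  }

  est <- t(sapply(1:20, one_fit))  # one row per simulated pattern
  print(round(colMeans(est), 3))   # compare with kappa = 20, scale = 0.04, mu = 5
}
```

The estimates will vary from pattern to pattern, since each pattern has only about 100 points, but they should scatter around the generating values rather than wander off to absurd ones.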
<SNIP>
cheers,
Rolf Turner
P.S. If your chosen parameter values (kappa = 2, mu = 2000) were
selected in imitation of parameter estimates obtained from fitting a
Matérn cluster model to real data, then I would suggest that you should
probably *not* fit such a model to those data.
In modelling it is important to try fitting *appropriate* models to data
sets. Otherwise the results you get may well be meaningless.
R. T.
--
Rolf Turner
Technical Editor ANZJS
Department of Statistics
University of Auckland
Phone: +64-9-373-7599 ext. 88276
Home phone: +64-9-480-4619