[R-sig-Geo] Help L Function
Adrian Baddeley
adrian.baddeley at uwa.edu.au
Tue Jan 14 02:23:56 CET 2014
Firstly some comments to the original questioner.
1. If you really need to do a formal test and obtain a p-value,
then before doing this, it is important to verify that the point pattern is homogeneous
(at least that the density of points per unit area is not spatially-varying)
because this is an underlying assumption of the 'L' function.
In spatstat you can use quadrat.test, kstest or bermantest for example.
2. The result of *any* test based on L will depend on the value or values of distance 'r'
that are used. If you choose to use mad.test() or dclf.test() then it is best
to choose the interval of 'r' values to be slightly larger than the range over which
you suspect a spatial interaction is likely.
3. Assuming you want to test CSR (complete spatial randomness) against the
alternative of clustering or repulsion, it would be best to use the argument
simulate=expression(runifpoint(n, W)) in the call to 'envelope',
where n is the number of points in your dataset, and W is the window.
Then the simulated patterns have the same number of points as the dataset.
4. An alternative to a hypothesis test would be a confidence interval for the
true L function. You can get this using the spatstat function 'lohboot' for example
(or using the asymptotic variance estimates provided in the Lest/Kest functions).
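A minimal sketch of points 1 to 4, not a definitive recipe. It assumes the built-in spatstat dataset `cells` as a stand-in for the real data, and the quadrat grid (2 x 2), the number of simulations (99) and the interval of r values [0, 0.15] are illustrative choices that do not come from this message.

```r
library(spatstat)

set.seed(42)                 # only to make the sketch reproducible

X <- cells                   # stand-in data (assumption): replace with your own ppp object
n <- npoints(X)              # number of points in the dataset
W <- Window(X)               # observation window

# Point 1: check homogeneity before relying on any test based on L.
qt <- quadrat.test(X, nx = 2, ny = 2)
print(qt$p.value)

# Point 3: simulate CSR with the same number of points as the data.
# savefuns = TRUE keeps the simulated L functions so they can be reused in a test.
E <- envelope(X, Lest, nsim = 99,
              simulate = expression(runifpoint(n, W)),
              savefuns = TRUE, verbose = FALSE)

# Point 2: DCLF test over a chosen interval of r values.
# The interval [0, 0.15] is a placeholder; pick it slightly larger than
# the suspected range of interaction.
dc <- dclf.test(E, rinterval = c(0, 0.15))
print(dc$p.value)

# Point 4: Loh's bootstrap confidence band for the true L function.
ci <- lohboot(X, "Lest")
```

Here `plot(E)` and `plot(ci)` would display the simulation envelope and the bootstrap band respectively.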
And some comments to the experts:
5. Establishing that the point process is stationary is crucial to the validity of these methods.
6. There is no uniformly most powerful test.
If the alternative hypothesis is a point process with finite range R,
and we use mad.test() or dclf.test() over the interval of r values [0, s],
then the power of both tests is maximised when s is slightly greater than R,
and dclf.test() achieves a higher maximum power than mad.test(),
but the power of dclf.test() often drops off dramatically as s increases beyond R,
while the power of mad.test() never does. See Baddeley et al. (2014).
7. Therefore the optimal choice of test depends on how much we know
about the (maximum) range of interaction R. If we know R = 4 metres,
then we should probably choose dclf.test() with s = 4.5. If we don't know
anything about the spatial interactions, then we should use mad.test().
8. Monte Carlo tests based on fitted models are typically conservative
(i.e. true probability of type I error is smaller than nominal probability alpha;
true p-value is smaller than calculated p-value).
In this context the conservatism can be extreme.
This can be avoided in the case where CSR is the null hypothesis,
by conditioning on n, as advised in #3 above. Otherwise it is a problem.
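To illustrate the mechanics behind points 6 and 7, the following sketch applies mad.test() and dclf.test() to the same set of simulations over several intervals [0, s]. The dataset `cells` and the values of s are assumptions made for illustration only. P-values from one dataset do not measure power, so this shows only how the choice of s enters the calculation.

```r
library(spatstat)

set.seed(1)                  # only to make the sketch reproducible

X <- cells                   # stand-in data (assumption)
n <- npoints(X)
W <- Window(X)

# One set of CSR simulations conditioned on n, reused by both tests.
E <- envelope(X, Lest, nsim = 99,
              simulate = expression(runifpoint(n, W)),
              savefuns = TRUE, verbose = FALSE)

# Hypothetical upper limits s for the interval of r values [0, s].
s_values <- c(0.05, 0.10, 0.15, 0.20)

# For each s, record the p-value of the MAD test and of the DCLF test.
pvals <- t(sapply(s_values, function(s) {
  c(s    = s,
    mad  = mad.test(E,  rinterval = c(0, s))$p.value,
    dclf = dclf.test(E, rinterval = c(0, s))$p.value)
}))
print(pvals)
```

Passing the saved envelope object to both tests means they are compared on identical simulated patterns, so any difference between the two columns is due to the test statistic and the interval, not to simulation noise.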
Reference:
@Article{bdhlmn14,
author = {A. Baddeley and P.J. Diggle and A. Hardegen and
T. Lawrence and R. Milne and G.M. Nair},
title = {On tests of spatial pattern based on simulation
envelopes},
journal = {Ecological Monographs},
year = 2014,
note = {In press}
}
Prof Adrian Baddeley FAA
University of Western Australia
________________________________________
From: Rolf Turner [r.turner at auckland.ac.nz]
Sent: Tuesday, 14 January 2014 3:40 AM
To: Marcelino de la Cruz
Cc: r-sig-geo at r-project.org; Adrian Baddeley
Subject: Re: [R-sig-Geo] Help L Function
I would like to add:
* If you perform a pointwise test, the value of "r" at which you conduct
the test must be chosen a priori, that is, before collecting, or at least
before "looking at", the data. Otherwise the associated p-value is
meaningless. It is *very* unlikely that you had an a priori value of r
in mind!
* A test based on the global envelope will not have very much power.
* My guess is that the dclf.test() route is your best bet.
cheers,
Rolf Turner
On 14/01/14 05:09, Marcelino de la Cruz wrote:
> Yes, it is possible and very easy. How you extract your p-value
> depends on whether you are making a pointwise or a global test. See the
> help page of envelope(). You can also try a maximum absolute deviation
> test with mad.test(), or a Diggle-Cressie-Loosmore-Ford test with dclf.test().
>
>
> Cheers,
>
> Marcelino
>
>
>
>
> El 13/01/2014 16:52, Francesco Carrer escribió:
>> Hi,
>>
>> I have a distribution of artifacts within an archaeological surface
>> (dataset: ID, DIMENSION, X, Y), and I need to determine the degree of
>> aggregation of these artifacts at different scales. I applied the L
>> function (Lest in spatstat), and plotted the resulting observed values of
>> L(r) against the highest and lowest simulated values of L(r). Is it
>> possible to extract a p-value that assesses whether the aggregation of my data
>> is significantly higher (or lower) than the simulated values?