[R] Random Cluster Generation Question

Jason L. Simms jlsimms at gmail.com
Fri Apr 10 01:28:22 CEST 2009


Hello,

Thanks for your note.  I recognize that the number of points per
cluster is random, and also that the mean number of points per cluster
can be set through the function.  What I was hoping for was a way to
specify a maximum number of points overall across all clusters, but
conceptually I don't know how that could even be implemented.  I ended
up adjusting the parameters of the function until it produced right
around 2,000 points in a 10x10 box, and then I just multiplied all the
coordinates by 100.  I'm not sure whether it's perfect, but I suspect
that it will work for my current needs.
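
For what it's worth, the trial and error can probably be shortened with
a little arithmetic.  For a Matern cluster process the expected total
number of points is the parent intensity (kappa) times the mean number
of offspring per parent (mu) times the area of the window.  The sketch
below is only that, a sketch: expected_points() is a helper name made up
for illustration, the mean of 20 points per cluster is an assumed value,
and the output file name is arbitrary.  By the same arithmetic my
original call asks for about 5 million points, which may well be why it
never finished.

```r
# Expected total number of points for a Matern cluster process:
# parent intensity (kappa) * mean offspring per parent (mu) * window area.
# expected_points() is a hypothetical helper, not part of spatstat.
expected_points <- function(kappa, mu, width, height) {
  kappa * mu * width * height
}

# The original call, rMatClust(1, 50, 5, win = owin(c(0,1000), c(0,1000))):
expected_points(kappa = 1, mu = 5, width = 1000, height = 1000)

# Choose kappa so that a 10 x 10 box averages about 2,000 points,
# assuming (for illustration only) a mean of 20 points per cluster.
target <- 2000
mu     <- 20
kappa  <- target / (mu * 10 * 10)

# Only run the simulation if spatstat is installed.
if (requireNamespace("spatstat", quietly = TRUE)) {
  library(spatstat)
  clust <- rMatClust(kappa, 0.5, mu, win = owin(c(0, 10), c(0, 10)))
  # Rescale to a 1,000 x 1,000 box and export the X,Y coordinates.
  xy <- data.frame(x = 100 * clust$x, y = 100 * clust$y)
  write.csv(xy, file = file.path(tempdir(), "clusters.csv"),
            row.names = FALSE)
}
```

The realised count will still vary from run to run around the target;
this only fixes the average.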

I'll look into the rThomas() function, too.  I am much more of an
applied stats person, so the subtle (or even not-so-subtle)
differences between a Thomas process and a Matern process, and their
respective advantages and disadvantages, are unclear to me at the
moment.
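
As far as I can tell, the core difference is how offspring are
scattered around each parent: uniformly inside a disc of fixed radius
for a Matern cluster process, and with isotropic Gaussian displacements
for a Thomas process.  A rough base-R illustration for a single parent
at the origin (the values of r and sigma are arbitrary, and this mimics
the idea rather than calling spatstat):

```r
set.seed(1)
n <- 1000

# Matern-style offspring: uniform in a disc of radius r around the parent.
r     <- 0.5
rad   <- r * sqrt(runif(n))      # sqrt makes the density uniform over the disc
theta <- runif(n, 0, 2 * pi)
matern <- cbind(x = rad * cos(theta), y = rad * sin(theta))

# Thomas-style offspring: independent Gaussian displacements with sd sigma.
sigma  <- 0.25
thomas <- cbind(x = rnorm(n, sd = sigma), y = rnorm(n, sd = sigma))

# Distance of each offspring from its parent.
d_matern <- sqrt(rowSums(matern^2))
d_thomas <- sqrt(rowSums(thomas^2))

all(d_matern <= r)   # Matern offspring never fall outside the disc
range(d_thomas)      # Thomas offspring have no hard cutoff

# Side-by-side plots make the sharp Matern edge visible.
op <- par(mfrow = c(1, 2))
plot(matern, asp = 1, main = "Matern-style")
plot(thomas, asp = 1, main = "Thomas-style")
par(op)
```

That hard edge is presumably what produces the sharp cluster boundaries
mentioned below.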

Jason

On Thu, Apr 9, 2009 at 7:06 PM, David Winsemius <dwinsemius at comcast.net> wrote:
>
> On Apr 9, 2009, at 5:01 PM, Jason L. Simms wrote:
>
>> Hello,
>>
>> I am fairly new to R, but I am not new to programming at all.  I want
>> to generate random clusters in a 1,000x1,000 box such that I end up
>> with a total of about 2,000 points.  Once done, I need to export the
>> X,Y coordinates of the points.
>>
>> I have looked around, and it seems that the spatstat package has what
>> I need.  The rMatClust() function can generate random clusters, but I
>> have run into some problems.
>>
>> First, I can't seem to specify that I want x number of points.
>
> The number of points per cluster IS random.
>
>> So, right now it appears that if I want around 2,000 total points that I
>> must play around with the parameters of the function (e.g., mean
>> number of points per cluster, cluster radius, etc.) until I end up
>> with roughly 2,000 points.
>>
>> More problematic, however, is that specifying a 1,000x1,000 box is too
>> much to handle.  I have been running the following function for over
>> 24 hours straight on a decent computer and it has not stopped yet:
>>
>> clust <- rMatClust(1, 50, 5, win=owin(c(0,1000),c(0,1000)))
>
> It might well be due to the 1000 x 1000 dimensions, but it is also because
> of your parameters. With them it took a significant amount of time to yield
> 4-10 points on a 1 x 1 window, whereas this particular invocation much more
> quickly produced 2707 points with a mean of 100 points per uniform cluster
> within a 1 x 1 square:
>
> Y <- rMatClust(20, 0.05, 100)
>
> If you wanted the x and y dimensions to be in the range of 0-1000, couldn't
> you just multiply the x and y values inside Y by 1000?
>  Y$x <- 1000*Y$x
>  Y$y <- 1000*Y$y
>  plot(Y)  # cannot see any points, probably because the plot method is using
>           # internal ranges inside that Y object. So you might lose the
>           # ability to use other functions in that package
>  plot(Y$x, Y$y)  # as expected and took seconds at most.
>
> I would think that the most important task would be deciding on the function
> that controls the intensity process of the "offspring points". The points in
> this simple example clearly violate my notions of randomness because of the
> sharp edges at the cluster boundaries. So, you may want to examine
> rThomas(...) in the same package.
>
> There is, of course, a SIG spatial stats mailing list full of people better
> qualified than I am on such questions.
>>
>> Clearly, I need to rethink my strategy.  Could I generate the points
>> in a 10x10 box with a radius of .5 and then multiply out the resulting
>> point coordinates by 100?  Is there another package that might meet my
>> needs better than spatstat for easy cluster generation?
>>
>> Any suggestions are appreciated.
>> --
>> Jason L. Simms, M.A.
>> USF Graduate Multidisciplinary Scholar
>
> David Winsemius, MD
> Heritage Laboratories
> West Hartford, CT
>
>



-- 
Jason L. Simms, M.A.
USF Graduate Multidisciplinary Scholar
Co-President, Graduate Assistants United
Ph.D. / M.P.H. Student
Departments of Anthropology and Environmental and Occupational Health
University of South Florida



