[R] Splitting Area under curve into equal portions
Daniel Nordlund
djnordlund at verizon.net
Thu Mar 26 09:39:56 CET 2009
> -----Original Message-----
> From: r-help-bounces at r-project.org
> [mailto:r-help-bounces at r-project.org] On Behalf Of Nathan S.
> Watson-Haigh
> Sent: Wednesday, March 25, 2009 10:59 PM
> To: milton ruser
> Cc: r-help at r-project.org
> Subject: Re: [R] Splitting Area under curve into equal portions
>
> Hi Milton,
>
> Not quite, that would be an equal number of data points in
> each colour group.
> What I want is an unequal number of points in each group such that:
> sum(work[group.members]) is approximately the same for each
> group of data points.
>
> In the meantime, I came up with the following, and took a leaf out of
> your book with the colouring, for example:
>
> <code>
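> n <- 2002
> work <- vector()
> for(x in 1:(n-2)) {
>   work[x] <- ((n-1-x)*(n-x))/2
> }
> plot(work)
>
> tasks <- vector('list')
> # n_slaves was not defined in the post as sent; 4 is assumed here to
> # match the 4 groups described further down
> n_slaves <- 4
> tasks_per_slave <- 1
> work_per_task <- sum(work) / (n_slaves * tasks_per_slave)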
>
> # Now define ranges of x of equal "work"
> block_start <- 1
> for(x in (1:(length(work)))) {
>   if(x == length(work)) {
>     # this will be the last block
>     tasks[[length(tasks)+1]] <- list(x=block_start:length(work))
>     break
>   }
>   work_in_block_to_x <- sum(work[block_start:(x)])
>
>   if(work_in_block_to_x > work_per_task) {
>     # use this value of x as the chunk end
>     tasks[[length(tasks)+1]] <- list(x=block_start:x)
>
>     # move the block_start position
>     block_start <- x+1
>   }
> }
>
> colours <- vector()
> for(i in 1:length(tasks)) {
>   colours <- append(colours,rep(i,length(tasks[[i]]$x)))
> }
>
> plot(work, col=colours)
> </code>
>
> Essentially, the area under the line for each of the coloured groups
> (i.e. the total work associated with those values of x) should be
> approximately equal, and I believe the above code achieves this. I just
> found the cumsum() function. You could look at it this way:
>
> <code>
> plot(cumsum(work), col=colours)
> </code>
>
> The coloured groupings coincide with splitting the cumulative total
> (y-axis) into 4 approximately equal bits.
>
> There must be a nicer way to do this!
> Nathan
>
Nathan,

Someone will probably come up with a more elegant way, but does this help?
slice() will partition work into n groups where the sum in each group is
approximately the same. slice() returns the index of the last element of
work[] for each group (except the last group). With p <- slice(work, n),
the first group can be indexed by 1:p[1], the second by (p[1]+1):p[2], ...,
and the n-th group by (p[n-1]+1):N, where N <- length(work).

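slice <- function(v, n){
  subtot <- floor(sum(v)/n)   # target sum for each group
  cumtot <- cumsum(v)         # running total of v
  p <- rep(0, n-1)
  # p[i] is the last index whose running total is still below i*subtot
  for(i in 1:(n-1)) p[i] <- max(which(cumtot < (subtot*i)))
  p
}

# to break work into ten groups
slice(work, 10)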
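
As a rough check, the breakpoints can be expanded into one group label per element and the group totals compared. This is only a sketch: the names p, group and group_sums are made up for illustration, work is rebuilt in vectorised form from the formula in the quoted message, and slice() is repeated so the snippet runs on its own.

```r
# Rebuild the example data from the quoted message (n = 2002 there).
n <- 2002
x <- 1:(n - 2)
work <- ((n - 1 - x) * (n - x)) / 2

# slice() as defined above, repeated so this sketch is self-contained.
slice <- function(v, n) {
  subtot <- floor(sum(v) / n)
  cumtot <- cumsum(v)
  p <- rep(0, n - 1)
  for (i in 1:(n - 1)) p[i] <- max(which(cumtot < (subtot * i)))
  p
}

p <- slice(work, 10)

# Group sizes are the gaps between consecutive breakpoints, with 0 and
# length(work) added as the outer boundaries.
sizes <- diff(c(0, p, length(work)))

# One label per element: 1 for 1:p[1], 2 for (p[1]+1):p[2], and so on.
group <- rep(seq_along(sizes), times = sizes)

# Total work per group, relative to the ideal share sum(work)/10.
group_sums <- tapply(work, group, sum)
print(round(group_sums / (sum(work) / 10), 3))
```

Each ratio should come out close to 1, since a group's total can differ from the ideal share by roughly one element of work at most. The group vector can also be passed straight to plot(work, col=group) to get the colouring from the quoted message.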

Hope this is helpful,

Dan

Daniel Nordlund
Bothell, WA USA