> -----Original Message-----
> From: r-help-boun...@r-project.org
> [mailto:r-help-boun...@r-project.org] On Behalf Of Nathan S. Watson-Haigh
> Sent: Wednesday, March 25, 2009 10:59 PM
> To: milton ruser
> Cc: r-help@r-project.org
> Subject: Re: [R] Splitting Area under curve into equal portions
>
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> Hi Milton,
>
> Not quite, that would be an equal number of data points in each colour group.
> What I want is an unequal number of points in each group such that
> sum(work[group.members]) is approximately the same for each group of data
> points.
>
> In the meantime, I came up with the following, and took a leaf out of your
> book with the colouring, for example:
>
> <code>
> n <- 2002
> work <- vector()
> for(x in 1:(n-2)) {
>   work[x] <- ((n-1-x)*(n-x))/2
> }
> plot(work)
>
> tasks <- vector('list')
> n_slaves <- 4  # not set in the original post; 4 assumed from the "4 approximately equal bits" remark
> tasks_per_slave <- 1
> work_per_task <- sum(work) / (n_slaves * tasks_per_slave)
>
> # Now define ranges of x of equal "work"
> block_start <- 1
> for(x in (1:(length(work)))) {
>   if(x == length(work)) {
>     # this will be the last block
>     tasks[[length(tasks)+1]] <- list(x=block_start:length(work))
>     break
>   }
>   work_in_block_to_x <- sum(work[block_start:(x)])
>
>   if(work_in_block_to_x > work_per_task) {
>     # use this value of x as the chunk end
>     tasks[[length(tasks)+1]] <- list(x=block_start:x)
>
>     # move the block_start position
>     block_start <- x+1
>   }
> }
>
> colours <- vector()
> for(i in 1:length(tasks)) {
>   colours <- append(colours,rep(i,length(tasks[[i]]$x)))
> }
>
> plot(work, col=colours)
> </code>
>
> Essentially, the area under the line for each of the coloured groups (i.e. the
> total work associated with those values of x) should be approximately equal,
> and I believe the above code achieves this. Just found the cumsum() function.
> You could look at it this way:
>
> <code>
> plot(cumsum(work), col=colours)
> </code>
>
> The coloured groupings coincide with splitting the cumulative total (y-axis)
> into 4 approximately equal bits.
>
> There must be a nicer way to do this!
> Nathan
>
Nathan,

Someone will probably come up with a more elegant way, but does this help?
slice() partitions work into n groups such that the sum in each group is
approximately the same. It returns the index of the last element of work[]
for each group except the last. The first group can be indexed by 1:p[1],
the second by (p[1]+1):p[2], and so on; the n-th group is (p[n-1]+1):N,
where N <- length(work).

    slice <- function(v, n){
      subtot <- floor(sum(v)/n)   # target amount of work per group
      cumtot <- cumsum(v)         # running total of work
      p <- rep(0, n-1)
      # p[i] is the last index whose running total is still below i * subtot
      for(i in seq_len(n-1)) p[i] <- max(which(cumtot < (subtot*i)))
      p
    }

    # to break work into ten groups
    slice(work, 10)

Hope this is helpful,

Dan

Daniel Nordlund
Bothell, WA USA
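
For what it is worth, the same idea can probably be written without the loop. The following is only a minimal sketch: slice_vec is a hypothetical name (not from this thread or from any package), and it assumes every value in work is non-negative, so that cumsum() is non-decreasing as findInterval() requires. Unlike slice(), it compares with <= rather than < and does not floor the per-group target, so the cut points can differ slightly in edge cases.

```r
# Hypothetical helper: loop-free variant of the slice() idea.
# Assumes v >= 0 everywhere and n >= 2.
slice_vec <- function(v, n) {
  stopifnot(n >= 2, all(v >= 0))
  cumtot  <- cumsum(v)
  # cumulative-work targets for the n - 1 cut points
  targets <- sum(v) * seq_len(n - 1) / n
  # for each target, count how many running totals are <= that target;
  # that count is the index of the last element in the group
  findInterval(targets, cumtot)
}

# Same work vector as in the quoted message, built without a loop
n    <- 2002
x    <- seq_len(n - 2)
work <- (n - 1 - x) * (n - x) / 2

n_groups <- 4                      # assumed, matching the "4 bits" in the quoted post
p        <- slice_vec(work, n_groups)

# Turn the cut points into one group label per element of work
group      <- rep(seq_len(n_groups), times = diff(c(0, p, length(work))))
group_sums <- tapply(work, group, sum)

print(p)
print(group_sums)
plot(cumsum(work), col = group)
```

Each group sum should land within one element's worth of work of sum(work)/n_groups. If a target is smaller than the first element of work, the corresponding cut point is 0 and that group is empty.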