[R] Create new data frame with conditional sums
Leonard Mada
|eo@m@d@ @end|ng |rom @yon|c@eu
Mon Oct 16 13:41:48 CEST 2023
Dear Jason,
The code could look something like:
dummyData = data.frame(Tract=seq(1, 10, by=1),
    Pct = c(0.05,0.03,0.01,0.12,0.21,0.04,0.07,0.09,0.06,0.03),
    Totpop = c(4000,3500,4500,4100,3900,4250,5100,4700,4950,4800))
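# Define the cutoffs
# - a coarse step, so that several tracts share the same cutoff
#   (duplicate entries);
# - Pct = 0.21 lies above the last break, so cut() returns NA for it
#   (NA sorts last);
by = 0.03; # by = 0.01;
cutoffs <- seq(0, 0.20, by = by)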
# Create a new column with cutoffs
dummyData$Cutoff <- cut(dummyData$Pct, breaks = cutoffs,
    labels = cutoffs[-1], ordered_result = TRUE)
# Sort data
# - we could actually order only the columns:
# Totpop & Cutoff;
dummyData = dummyData[order(dummyData$Cutoff), ]
# Result: running total of Totpop in cutoff order
cs = cumsum(dummyData$Totpop)
# Keep only the last entry of each cutoff level:
# - I do not have a nice one-liner, but this should do it:
isLast = rev(! duplicated(rev(dummyData$Cutoff)))
data.frame(Total = cs[isLast],
    Cutoff = dummyData$Cutoff[isLast])
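
A possibly tidier form of the isLast step, as a minimal sketch: duplicated() in base R accepts a fromLast argument, so the double rev() may not be needed. The vector x below is a made-up stand-in for the sorted dummyData$Cutoff column, not the real data.

```r
# Hypothetical stand-in for the sorted Cutoff column
# (NA = a value that fell outside all breaks)
x <- factor(c("0.03", "0.03", "0.06", "0.09", "0.09", NA),
            levels = c("0.03", "0.06", "0.09"), ordered = TRUE)

# Approach from the message: reverse, flag duplicates, reverse back
isLast1 <- rev(!duplicated(rev(x)))

# Same flags without the double reversal
isLast2 <- !duplicated(x, fromLast = TRUE)

print(identical(isLast1, isLast2))
print(which(isLast2))
```

Both variants treat NA as a level of its own, so the rows that fall outside the breaks still yield one final entry with Cutoff = NA.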
Sincerely,
Leonard
On 10/15/2023 7:41 PM, Leonard Mada wrote:
> Dear Jason,
>
>
> I do not think that the solution based on aggregate offered by GPT was
> correct. That quasi-solution only sums within each individual level; it
> does not accumulate across levels.
>
>
> As I understand, you want the cumulative sum. The idea was proposed by
> Bert; you need only to sort first based on the cutoff (e.g. using an
> ordered factor). And then only extract the last value for each level.
> If Pct is unique, then you can skip this last step and use cumsum
> directly (but on the sorted data set).
>
>
> Alternatives: see the solutions with loops or with sapply.
>
>
> Sincerely,
>
>
> Leonard
>
>
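
The sapply alternative mentioned in the quoted message could be sketched roughly as follows. This is only one possible reading of that suggestion, not the code posted earlier in the thread; it reuses the example data and cutoffs from the reply above, and the names totals and result are introduced here for illustration.

```r
dummyData <- data.frame(
    Tract  = 1:10,
    Pct    = c(0.05, 0.03, 0.01, 0.12, 0.21, 0.04, 0.07, 0.09, 0.06, 0.03),
    Totpop = c(4000, 3500, 4500, 4100, 3900, 4250, 5100, 4700, 4950, 4800))

# Upper bounds of the bins; drop the 0 lower bound
cutoffs <- seq(0, 0.20, by = 0.03)[-1]

# For each cutoff, sum the population of all tracts with Pct at or below it
totals <- sapply(cutoffs, function(k) {
    sum(dummyData$Totpop[dummyData$Pct <= k])
})

result <- data.frame(Cutoff = cutoffs, Total = totals)
print(result)
```

Unlike the cut/cumsum version, this sketch reports every cutoff, including bins that contain no tract (their total repeats the previous one), and it leaves out tracts above the largest cutoff instead of collecting them under NA. It scans the data once per cutoff, which should not matter for a handful of cutoffs.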