[R] Reference factors inside split
Naresh Gurbuxani
naresh_gurbuxani using hotmail.com
Mon Jul 11 16:33:31 CEST 2022
This is what I was looking for. Thanks for your quick response and elegant solution.
Naresh
> On Jul 11, 2022, at 10:00 AM, Ben Tupper <btupper using bigelow.org> wrote:
>
> Hi,
>
> split() keeps the grouping column in each subgroup, but inside lapply()
> the function receives only the data frame, not the name of its group.
> Instead of iterating over the elements of the split list, you can
> iterate over the **names** of the elements. In your case the account
> name is the grouping variable.
>
>
> ##start
>
> library(lattice)
> mydf <- data.frame(
>   date = rep(seq.Date(from = as.Date("2022-06-01"), by = 1, length.out = 10), 4),
>   account = c(rep("ABC", 20), rep("XYZ", 20)),
>   client = c(rep("P", 10), rep("Q", 10), rep("R", 10), rep("S", 10)),
>   profit = round(runif(40, 2, 5), 2),
>   sale = round(runif(40, 10, 20), 2))
>
> account.names <- data.frame(
>   account = c("ABC", "DEF", "XYZ"),
>   corp = c("ABC Corporation", "DEF LLC", "XYZ Incorporated"))
>
> mydf.split <- split(mydf, mydf$account)
>
> # Iterate over the names of the split list so each call knows its group
> myplots <- sapply(names(mydf.split),
>   function(name, x = NULL) {
>     df <- x[[name]]  # look up this group's data frame by name
>     myts <- aggregate(sale ~ date, FUN = sum, data = df)
>     xyplot(sale ~ date, data = myts, main = name)
>   }, x = mydf.split, USE.NAMES = TRUE, simplify = FALSE)
>
> myplots[["ABC"]]
> myplots[["XYZ"]]
>
> ## end
>
> Does that help?
>
>> On Mon, Jul 11, 2022 at 9:14 AM Naresh Gurbuxani
>> <naresh_gurbuxani using hotmail.com> wrote:
>>
>>
>> I want to split my dataframe according to a list of factors. Then, in
>> the resulting list, I want to reference the factors used in split. Is
>> it possible?
>>
>> Thanks,
>> Naresh
>>
>> library(lattice)
>>
>> mydf <- data.frame(
>>   date = rep(seq.Date(from = as.Date("2022-06-01"), by = 1, length.out = 10), 4),
>>   account = c(rep("ABC", 20), rep("XYZ", 20)),
>>   client = c(rep("P", 10), rep("Q", 10), rep("R", 10), rep("S", 10)),
>>   profit = round(runif(40, 2, 5), 2),
>>   sale = round(runif(40, 10, 20), 2))
>>
>> account.names <- data.frame(
>>   account = c("ABC", "DEF", "XYZ"),
>>   corp = c("ABC Corporation", "DEF LLC", "XYZ Incorporated"))
>>
>> mydf.split <- split(mydf, mydf$account)
>>
>> # This does not work: `account` is not defined inside the function
>> myplots <- lapply(mydf.split, function(df) {
>>   myts <- aggregate(sale ~ date, FUN = sum, data = df)
>>   xyplot(sale ~ date, data = myts, main = account)})
>>
>> # This works, but the merge may add a large overhead
>> mydf <- merge(mydf, account.names, by = "account", all.x = TRUE)
>> mydf.split <- split(mydf, mydf$account)
>> myplots <- lapply(mydf.split, function(df) {
>>   myts <- aggregate(sale ~ date, FUN = sum, data = df)
>>   # corp lives in df; myts only has the date and sale columns
>>   xyplot(sale ~ date, data = myts, main = unique(df$corp))})
>>
>> # Now I can print one plot at a time
>> myplots[["ABC"]]
>> myplots[["XYZ"]]
>>
>
>
>
> --
> Ben Tupper (he/him)
> Bigelow Laboratory for Ocean Science
> East Boothbay, Maine
> http://www.bigelow.org/
> https://eco.bigelow.org
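
A possible variation on the name-based approach in the quoted reply, sketched under the assumption that the plot title should show the corporate name from account.names rather than the account code. The lookup vector corp.lookup and the set.seed() call are not from the thread; they are illustrative additions. Map() is used here in place of sapply(..., simplify = FALSE) so that each data frame and its name are passed to the function together, which also avoids the merge.

```r
library(lattice)

set.seed(1)  # illustrative only: makes the random example data reproducible

mydf <- data.frame(
  date = rep(seq.Date(from = as.Date("2022-06-01"), by = 1, length.out = 10), 4),
  account = c(rep("ABC", 20), rep("XYZ", 20)),
  client = c(rep("P", 10), rep("Q", 10), rep("R", 10), rep("S", 10)),
  profit = round(runif(40, 2, 5), 2),
  sale = round(runif(40, 10, 20), 2))

account.names <- data.frame(
  account = c("ABC", "DEF", "XYZ"),
  corp = c("ABC Corporation", "DEF LLC", "XYZ Incorporated"))

# Hypothetical helper: named character vector mapping account code -> corp name
corp.lookup <- setNames(as.character(account.names$corp),
                        as.character(account.names$account))

mydf.split <- split(mydf, mydf$account)

# Map() walks the data frames and their names in parallel;
# the result keeps the names of the split list
myplots <- Map(function(df, name) {
  myts <- aggregate(sale ~ date, FUN = sum, data = df)
  xyplot(sale ~ date, data = myts, main = corp.lookup[[name]])
}, mydf.split, names(mydf.split))

# Print one plot at a time, as before
myplots[["ABC"]]
myplots[["XYZ"]]
```

Because the lookup is a plain named vector, accounts that appear in account.names but not in the data (here "DEF") are simply never requested.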