[R] Simple vector question.

jim holtman jholtman at gmail.com
Sat Jul 26 14:07:57 CEST 2008


You would add Category to the grouping list, changing the statement as follows:

aggregate(x$Quantity, list(DayOfYear=x$DayOfYear, Category=x$Category), FUN=sum)
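
For anyone who wants to try this without the original file, here is a minimal sketch. It assumes a toy data frame holding only the eleven rows shown in the quoted output below; the names by_day_cat, by_day_cat2, holiday_day1 and day_totals are made up for illustration. It also shows one possible way (not necessarily the only one) to get the plain length-365 vector asked about in the first message, using tapply with explicit factor levels.

```r
# Toy data: only the columns needed, copied from the rows shown in the thread.
x <- data.frame(
  DayOfYear = c(1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2),
  Quantity  = rep(1, 11),
  Category  = c("HOLIDAY", "PET COSTUMES", "HATS, WIGS & MASKS", "HOLIDAY", "MEN",
                "GIRLS", "BOYS", "BOYS", "PLUS", "BOYS", "(Unknown)"),
  stringsAsFactors = FALSE
)

# Sum of Quantity for every (DayOfYear, Category) combination that occurs.
# The summed column is named 'x', as in the output quoted below.
by_day_cat <- aggregate(x$Quantity,
                        list(DayOfYear = x$DayOfYear, Category = x$Category),
                        FUN = sum)

# The formula interface (available in current versions of R) gives the same
# sums, with the result column keeping the name 'Quantity'.
by_day_cat2 <- aggregate(Quantity ~ DayOfYear + Category, data = x, FUN = sum)

# Look up a single cell: HOLIDAY on day 1.
holiday_day1 <- by_day_cat$x[by_day_cat$DayOfYear == 1 &
                             by_day_cat$Category == "HOLIDAY"]

# A plain vector of length 365, indexed by day. Days with no rows come back
# as NA from tapply, so they are set to 0 here (an assumption about what is
# wanted for days without sales).
day_totals <- as.vector(tapply(x$Quantity,
                               factor(x$DayOfYear, levels = 1:365),
                               sum))
day_totals[is.na(day_totals)] <- 0
```

With this toy data, holiday_day1 is 2, which matches the value expected in the quoted question, and day_totals[1] and day_totals[2] are 5 and 6.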

On Sat, Jul 26, 2008 at 2:30 AM,  <rkevinburton at charter.net> wrote:
> Thank you this is exactly what I wanted.
>
> I was unaware of the 'aggregate' function.
>
> Now what if I want to know the sum of the sales per day AND per Category? So for the data below the DayOfYearSales for HOLIDAY and day 1 would be 2.
>
> Thank you.
>
> Kevin
>
> ---- jim holtman <jholtman at gmail.com> wrote:
>> Is this what you want:
>>
>> > x
>>     Year DayOfYear    Sku Quantity CatId            Category      SubCategory
>> 1   2007         1 100091        1 10862             HOLIDAY        Christmas
>> 2   2007         1 100138        1 11160 PET COSTUMES Famous       (Licensed)
>> 3   2007         1 100194        1 10749  HATS, WIGS & MASKS   Wigs - Women's
>> 4   2007         1 100432        1 10865             HOLIDAY           Easter
>> 5   2007         1 100911        1 10120                 MEN  Superheroes Men
>> 600 2007         2 139002        1 10413               GIRLS Historical Girls
>> 601 2007         2 138959        1 10322                BOYS TV & Movies Boys
>> 602 2007         2 139005        1 10334                BOYS    Toddlers Boys
>> 603 2007         2 139052        1 10517                PLUS         Plus Men
>> 604 2007         2 138906        1 10322                BOYS TV & Movies Boys
>> 605 2007         2 138860        1     0           (Unknown)        (Unknown)
>> > aggregate(x$Quantity, list(DayOfYear=x$DayOfYear), FUN=sum)
>>   DayOfYear x
>> 1         1 5
>> 2         2 6
>> >
>>
>>
>> On Sat, Jul 26, 2008 at 12:06 AM,  <rkevinburton at charter.net> wrote:
>> > I have some data that I read in via read.csv:
>> >
>> >  sales2007 <- read.csv("Total2007.dat", header=TRUE)
>> >
>> > The data looks like:
>> >
>> >> sales2007[1:605,]
>> >     Year DayOfYear    Sku Quantity CatId           Category       SubCategory
>> > 1   2007         1 100091        1 10862            HOLIDAY         Christmas
>> > 2   2007         1 100138        1 11160       PET COSTUMES Famous (Licensed)
>> > 3   2007         1 100194        1 10749 HATS, WIGS & MASKS    Wigs - Women's
>> > 4   2007         1 100432        1 10865            HOLIDAY            Easter
>> > 5   2007         1 100911        1 10120                MEN   Superheroes Men
>> > . . . .
>> > 600 2007         2 139002        1 10413              GIRLS  Historical Girls
>> > 601 2007         2 138959        1 10322               BOYS  TV & Movies Boys
>> > 602 2007         2 139005        1 10334               BOYS     Toddlers Boys
>> > 603 2007         2 139052        1 10517               PLUS          Plus Men
>> > 604 2007         2 138906        1 10322               BOYS  TV & Movies Boys
>> > 605 2007         2 138860        1     0          (Unknown)         (Unknown)
>> >>
>> >
>> > DayOfYear runs from 1 to 365. I would like to form a vector of length 365 from this data, where the value at each index is the sum of the Quantity column over the rows where DayOfYear equals that index. For example, using just the sample above, this new vector (call it 'DayOfYearSales') would be:
>> >     DayOfYearSales[1] = 5
>> >     DayOfYearSales[2] = 6
>> > These are the only two values because the snippet above shows only DayOfYear = 1:2. I want to continue the sum for the whole data frame. I am sure this is fairly easy; I just cannot find out how to do it. Once I figure this out, it should be relatively straightforward to apply the same principle to columns like Category, SKU, or SubCategory.
>> >
>> > Something like:
>> >
>> > table(Category)
>> >
>> > would give me the number of entries for each unique value, but I want the tabulation to sum the Quantity column instead, using it like a frequency.
>> >
>> > Thank you.
>> >
>> > Kevin
>> >
>>
>>
>>
>> --
>> Jim Holtman
>> Cincinnati, OH
>> +1 513 646 9390
>>
>> What is the problem you are trying to solve?
>
>



-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?


