[R] Counting with multiple criteria using data table

David Winsemius dwinsemius at comcast.net
Thu Jun 22 02:58:38 CEST 2017


> On Jun 21, 2017, at 2:50 PM, Ek Esawi <esawiek at gmail.com> wrote:
> 
> I have a data.table which is shown below. I want to count combinations of
> columns on i and count on j with by. A few examples are given below the
> table.
> 
> 
> 
> I want to:
> 
> all months to show on the output including those that they have zero value
> 
> I want the three statements combined in on if possible so the output will
> be one data table; that is the outputs are next to each other as manually
> illustrated on the last part (desired output).
> 
> 
> 
> 
> 
> Thanks--EK
> 
> 
> 
> 
> 
>> Test
> 
>     Color Grade Value  Month Day
> 
> 1: yellow     A    20    May   1
> 
> 2:  green     B    25   June   2
> 
> 3:  green     A    10  April   3
> 
> 4:  black     A    17 August   3
> 
> 5:    red     C     5    May   5
> 
> 6: orange     D     0   June  13
> 
> 7: orange     E    12  April   5
> 
> 8: orange     F    11 August   8
> 
> 9: orange     F    99  April  23
> 
> 10: orange     F    70    May   7
> 
> 11:  black     A    77   June  11
> 
> 12:  green     B    87 August  33
> 
> 13:  black     A    79  April   9
> 
> 14:  green     A    68    May  14
> 
> 15:  black     C    90   June  31
> 
> 16:  green     D    79 August  11
> 
> 17:  black     E   101  April  17
> 
> 18:    red     F    90   June  21
> 
> 19:    red     F   112 August  13
> 
> 20:    red     F   101  April  20

You should have offered the output of:

 dput(Test)

> 
>> Test[Color=="green"&Grade=="A", .N, by=Month]
> 
>   Month N
> 
> 1: April 1
> 
> 2:   May 1
> 
>> Test[Color=="orange"&Grade=="F", .N, by=Month]
> 
>    Month N
> 
> 1: August 1
> 
> 2:  April 1
> 
> 3:    May 1
> 
> 
> 
>> Test[Color=="orange"&Grade=="F", .N, by=Month]
> 
>    Month     N
> 
> 1: August     1
> 
> 2:  April     1
> 
> 3:    May     1
> 
>> Test[Color=="red"&Grade=="F", .N, by=Month]
> 
>    Month N
> 
> 1:   June 1
> 
> 2: August 1
> 
> 3:  April 1
> 
> 
> 
> Desired output
> 
>                N1           N2           N3
> 
> April       1              1              1
> 
> May       1              1              1
> 
> June        0              0              0
> 
> August 0                1              1

I count 4 data.tables and a total of 11 items so why only 3 columns and 9 items?

Were you tabulating colors by month?

> Test[ (Color=="green"&Grade=="A") | 
(Color=="red"&Grade=="F") |
(Color=="orange"&Grade=="F")|
(Color=="orange"&Grade=="F")|
(Color=="red"&Grade=="F") ,        table(Month, Color)]
        Color
Month    green orange red
  April      1      1   1
  August     0      1   1
  June       0      0   1
  May        1      1   0
> 


> 
> 	[[alternative HTML version deleted]]

Rhelp is plain-text. Do read the Posting Guide:

Another possibility:

 Test[(Color=="green"&Grade=="A") |    These are the conditions separated by logical OR's in the first argument to `[data.table`
      (Color=="red"&Grade=="F") |
      (Color=="orange"&Grade=="F")|
      (Color=="orange"&Grade=="F")|
      (Color=="red"&Grade=="F") ,          table(Month, Grade)]


        Grade
Month    A F
  April  1 2
  August 0 2
  June   0 1
  May    1 1


> ______________________________________________
> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

David Winsemius
Alameda, CA, USA



More information about the R-help mailing list