[R] nested for loop with data table

Jeff Newmiller jdnewmil at dcn.davis.ca.us
Thu May 4 06:04:44 CEST 2017


You seem to be unaware of the "aggregate" data processing concept. There are many ways to accomplish aggregation. I am not fluent in data.table methods, but knowing the concept is the first step.

Perhaps look closely at [1], or Google for data table aggregation yourself? 
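
To make the concept concrete, here is a minimal sketch, not a definitive data.table recipe. It assumes the subsets can be kept in a named list rather than in separate variables; the names ssets, stacked, counts, wide and the "sset" column are my own placeholders, and the toy data is typed in from three columns of the posted sample. One grouped count then replaces the per-subset loops.

```r
library(data.table)

# Toy data: the Num, Grade and Day columns of the posted sample
DT <- data.table(
  Num   = 1:20,
  Grade = c("A", "B", "A", "A", "C", "D", "E", "F", "F", "F",
            "A", "B", "A", "A", "C", "D", "E", "F", "F", "F"),
  Day   = c(1, 2, 3, 3, 5, 13, 5, 8, 23, 7,
            11, 33, 9, 14, 31, 11, 17, 21, 13, 20)
)

# Keep the subsets in one named list (hypothetical name: ssets)
ssets <- list(
  sset1 = DT[Num < 10 & Day < 10],
  sset2 = DT[Num > 10 & Day < 15]
)

# Stack the subsets; idcol stores each list name in a new column "sset"
stacked <- rbindlist(ssets, idcol = "sset")

# The aggregation step: count rows per subset and Grade
counts <- stacked[, .N, by = .(sset, Grade)]
print(counts)

# Optional: one row per subset, one column per Grade, 0 where a Grade is absent
wide <- dcast(counts, sset ~ Grade, value.var = "N", fill = 0L)
print(wide)
```

Adding a subset then means adding one entry to the list; the counting code does not change.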

[1] https://www.r-bloggers.com/efficient-aggregation-and-more-using-data-table/amp/
-- 
Sent from my phone. Please excuse my brevity.

On May 3, 2017 8:17:21 AM PDT, Ek Esawi <esawiek at gmail.com> wrote:
>Thank you both Boris and Jim. Thank you, Boris, for advising to read the
>posting guide; I had and I just did.
>
>Jim’s idea is exactly what I want; however, I could not pass sset1, sset2,
>etc. to the j nested loop and collect the results in a vector.
>
>Attached are my code, my file, and my question, which should be clear now.
>The question again: instead of using a separate loop for each of sset1 and
>sset2, can I use one nested loop? I have at least 10 subsets
>(sset1,sset2,sset3…..sset10).
>
>Thanks again, EK
>
>
>-------The code------
>
>install.packages("data.table")
>library(data.table)
>File1 <-  "C:/Users/SampleData.csv"
>DT <- fread(File1)
>sset1 <- DT[Num<10&Day<10]
>sset2 <- DT[Num>10&Day<15]
>
># Count how many combinations of A,B,C,D,E,F in each subset
>for ( i in 1:length(sset1)){
>  aa <- c(sset1[Grade=="A",.N],sset1[Grade=="D",.N])
>  bb <- c(sset1[Grade=="B",.N],sset1[Grade=="F",.N])
>  cc <- c(sset1[Grade=="C",.N],sset1[Grade=="A",.N])
>  counts <- c(aa, bb,cc)
>}
>
>for ( i in 1:length(sset2)){
>  aa1 <- c(sset2[Grade=="A",.N],sset2[Grade=="D",.N])
>  bb1 <- c(sset2[Grade=="B",.N],sset2[Grade=="F",.N])
>  cc1 <- c(sset2[Grade=="C",.N],sset2[Grade=="A",.N])
>  counts <-  c(aa1,bb1,cc1)
>}
>
>-----------The File------------
>
>   Num  Color Grade Value    Month Day
> 1:   1 yellow     A    20      May   1
> 2:   2  green     B    25     June   2
> 3:   3  green     A    10    April   3
> 4:   4  black     A    17   August   3
> 5:   5    red     C     5 December   5
> 6:   6 orange     D     0  January  13
> 7:   7 orange     E    12  January   5
> 8:   8 orange     F    11 February   8
> 9:   9 orange     F    99     July  23
>10:  10 orange     F    70      May   7
>11:  11  black     A    77     June  11
>12:  12  green     B    87    April  33
>13:  13  black     A    79   August   9
>14:  14  green     A    68 December  14
>15:  15  black     C    90  January  31
>16:  16  green     D    79  January  11
>17:  17  black     E   101 February  17
>18:  18    red     F    90     July  21
>19:  19    red     F   112 February  13
>20:  20    red     F   101     July  20
>
>On Tue, May 2, 2017 at 12:35 PM, Ek Esawi <esawiek at gmail.com> wrote:
>
>> I have a huge data file; a sample is listed below. I am using the package
>> data table to process the file and I am stuck on one issue and need some
>> feedback. I used fread to create a data table. Then I divided the data
>> table (named File1) into 10 general subsets using common table commands
>> such as:
>>
>> AAA <- File1[Num<5&day>15]
>> BBB <- File1[Num>15&day<10]
>> …..
>>
>> I wanted to divide and count each of the above subsets based on a set of
>> parameters common to all subsets. I did the following to go through each
>> subset and it works:
>>
>> for (i in 1:length(AAA)) {
>>   aa <- c(AAA[color=="green" & grade=="a" & month=="January", .N],
>>           AAA[color=="green" & grade=="b" & month=="June", .N])
>> }
>>
>>
>>
>> The question: I don’t want to have a separate loop for each subset (10
>> loops). Instead, I was hoping to have 2 nested loops in the form below:
>>
>>
>>
>> for (i in 1:N) {
>>   for (j in 1:M) {
>>
>>   }
>> }
>>
>>
>>
>>  Sample
>>
>> Num  Color   Grade  Value  Month     Day
>>   1  yellow  A         20  May         1
>>   2  green   B         25  June        2
>>   3  green   A         10  April       3
>>   4  black   A         17  August      3
>>   5  red     C          5  December    5
>>   6  orange  D          0  January    13
>>   7  orange  E         12  January     5
>>   8  orange  F         11  February    8
>>   9  orange  F         99  July       23
>>  10  orange  F         70  May         7
>>  11  black   A         77  June       11
>>  12  green   B         87  April      33
>>  13  black   A         79  August      9
>>  14  green   A         68  December   14
>>  15  black   C         90  January    31
>>  16  green   D         79  January    11
>>  17  black   E        101  February   17
>>  18  red     F         90  July       21
>>  19  red     F        112  February   13
>>  20  red     F        101  July       20
>>
>>
>
>	[[alternative HTML version deleted]]
>
>______________________________________________
>R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
>https://stat.ethz.ch/mailman/listinfo/r-help
>PLEASE do read the posting guide
>http://www.R-project.org/posting-guide.html
>and provide commented, minimal, self-contained, reproducible code.


