[R] Help understanding loop behaviour

There is something wrong here I believe -- see inline below:

> For column J, ave/seq_along seems to be the simplest. For column I, ave
> is also a good option, it avoids split/lapply.
> xx$I <- ave(xx$NUMBER_OF_YEARS, xx$COMPANY_NUMBER, FUN = function(x){
>    c(rep(1, length(x) - 1), max(length(x)))  ### ???
> })
**********
length() returns a single integer, so max(length(x)) is just length(x);
the max() call does nothing and can be dropped.
************************************

> xx$J <- ave(xx$NUMBER_OF_YEARS, xx$COMPANY_NUMBER, FUN = seq_along)
>
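
To make the point concrete, here is a minimal sketch with max() dropped.
The data frame is rebuilt by hand from the values quoted further down the
thread, so its construction is my assumption and not the original
poster's object.

```r
# Rebuild the example data by hand (assumed: one row per company-year)
sizes <- c(3, 4, 3, 4, 3, 4)
xx <- data.frame(
  COMPANY_NUMBER  = rep(c(70837, 1000403, 10029943,
                          10037980, 10057418, 1009550), times = sizes),
  NUMBER_OF_YEARS = rep(sizes, times = sizes)
)

# I: 1 on every row of a group except the last, which gets the group size.
# length(x) is already a scalar, so no max() is needed.
xx$I <- ave(xx$NUMBER_OF_YEARS, xx$COMPANY_NUMBER,
            FUN = function(x) c(rep(1, length(x) - 1), length(x)))

# J: running counter within each group
xx$J <- ave(xx$NUMBER_OF_YEARS, xx$COMPANY_NUMBER, FUN = seq_along)

print(xx)
```

ave() writes the results back in the original row order, so no factor
reordering is needed here.
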
At 11:49 on 30/04/21, PIKAL Petr wrote:
> > Hallo,
> >
> > Sorry, my suggestion did not work correctly in your case, because split()
> > orders the groups by the natural (sorted) factor levels.
> >
> > So, using Jim's data, this gives the desired output.
> >
> > # prepare factor in original ordering
> > ff <- factor(xx[,1], levels=unique(xx[,1]))
> > lll <- split(xx$COMPANY_NUMBER, ff)
> > xx$I <- unlist(lapply(lll, function(x) c(rep(1, length(x)-1),
> >                                          max(length(x)))), use.names=FALSE)
> > xx$J <- unlist(lapply(lll, seq_along), use.names=FALSE)
> > > xx
> >     COMPANY_NUMBER NUMBER_OF_YEARS I J
> > 1           70837               3 1 1
> > 2           70837               3 1 2
> > 3           70837               3 3 3
> > 4         1000403               4 1 1
> > 5         1000403               4 1 2
> > 6         1000403               4 1 3
> > 7         1000403               4 4 4
> > 8        10029943               3 1 1
> > 9        10029943               3 1 2
> > 10       10029943               3 3 3
> > 11       10037980               4 1 1
> > 12       10037980               4 1 2
> > 13       10037980               4 1 3
> > 14       10037980               4 4 4
> > 15       10057418               3 1 1
> > 16       10057418               3 1 2
> > 17       10057418               3 3 3
> > 18        1009550               4 1 1
> > 19        1009550               4 1 2
> > 20        1009550               4 1 3
> > 21        1009550               4 4 4
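
The ordering issue Petr describes can be seen in a small sketch. The
vector `id` below is made-up illustration data, not the thread's data
frame. split() returns the groups in the order of the factor levels,
which are sorted by default, so unlist() hands the values back in
sorted-group order and not in row order.

```r
# Hypothetical IDs: note that 1009550 sorts before 10029943
id <- c(70837, 70837, 1000403, 1000403, 10029943, 1009550)

# Default grouping: levels are sorted, so the last two rows swap places
by_default <- unname(unlist(split(seq_along(id), id)))

# Levels in order of first appearance keep the groups in row order
ff <- factor(id, levels = unique(id))
by_first <- unname(unlist(split(seq_along(id), ff)))

print(by_default)
print(by_first)
```

With the default levels, the row indices come back permuted, which is why
values assigned straight from unlist() can land on the wrong rows.
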
> >> Hi email,
> >> If you want what you described, try this:
> >>
> >> xx<-read.table(text="COMPANY_NUMBER NUMBER_OF_YEARS
> >> 0070837  3
> >> 0070837  3
> >> 0070837  3
> >> 1000403  4
> >> 1000403  4
> >> 1000403  4
> >> 1000403  4
> >> 10029943  3
> >> 10029943  3
> >> 10029943  3
> >> 10037980  4
> >> 10037980  4
> >> 10037980  4
> >> 10037980  4
> >> 10057418  3
> >> 10057418  3
> >> 10057418  3
> >> 1009550  4
> >> 1009550  4
> >> 1009550  4
> >> 1009550  4",
> >> header=TRUE)
> >> xx$I<-NA
> >> xx$J<-NA
> >> row_count<-1
> >> for(row in 1:nrow(xx)) {
> >>   # same company continues on the next row (never true for the final row)
> >>   if(row < nrow(xx) &&
> >>      xx$COMPANY_NUMBER[row]==xx$COMPANY_NUMBER[row+1]) {
> >>    xx$I[row]<-1
> >>    xx$J[row]<-row_count
> >>    row_count<-row_count+1
> >>   } else {
> >>    # last row of a company: both columns get the number of years
> >>    xx$I[row]<-xx$J[row]<-xx$NUMBER_OF_YEARS[row]
> >>    row_count<-1
> >>   }
> >> }
> >> xx
> >> Like Petr, I am assuming that you want company 10057418 treated the same
> >> as the others. If not, let us know why. I am also assuming that the first
> >> three rows should _not_ have a "#" at the beginning, which means that they
> >> will be
> >>
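
The same run-based logic can also be sketched without an explicit loop
using rle(). This is my own substitution, not something proposed in the
thread. Like the loop, it assumes that rows of the same company are
adjacent. The small data frame is rebuilt by hand for illustration.

```r
# Hand-built subset of the example data (assumed layout)
xx <- data.frame(
  COMPANY_NUMBER = rep(c(70837, 1000403, 10029943), times = c(3, 4, 3))
)

runs    <- rle(xx$COMPANY_NUMBER)$lengths     # length of each consecutive run
xx$J    <- sequence(runs)                     # 1..n within each run
run_len <- rep(runs, times = runs)            # run length repeated on every row
xx$I    <- ifelse(xx$J == run_len, run_len, 1) # run length on last row, else 1

print(xx)
```

Unlike ave(), this depends on row adjacency and not on the company
value, so a company that reappears later would start a new run.
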
> >> On Fri, Apr 30, 2021 at 1:41 AM e-mail ma015k3113 via R-help
> >> <r-help using r-project.org> wrote:
> >>> I am trying to understand how loops in R operate. I have a simple
> >>> data frame xx, which is as follows
> >>>
> >>> COMPANY_NUMBER   NUMBER_OF_YEARS
> >>>
> >>> #0070837                             3
> >>> #0070837                             3
> >>> #0070837                             3
> >>> 1000403                               4
> >>> 1000403                               4
> >>> 1000403                               4
> >>> 1000403                               4
> >>> 10029943                             3
> >>> 10029943                             3
> >>> 10029943                             3
> >>> 10037980                             4
> >>> 10037980                             4
> >>> 10037980                             4
> >>> 10037980                             4
> >>> 10057418                             3
> >>> 10057418                             3
> >>> 10057418                             3
> >>> 1009550                               4
> >>> 1009550                               4
> >>> 1009550                               4
> >>> 1009550                               4
> >>> The code I have written is
> >>>
> >>> while (i <= nrow(xx1))
> >>> {
> >>>   for (j in 1:xx1$NUMBER_OF_YEARS[i])
> >>>   {
> >>>     xx1$I[i] <- i
> >>>     xx1$J[j] <- j
> >>>     xx1$NUMBER_OF_YEARS_j[j] <- xx1$NUMBER_OF_YEARS[j]
> >>>   }
> >>>   i = i + (xx1$NUMBER_OF_YEARS[i])
> >>> }
> >>>
> >>> After running the code I want my dataframe to look like
> >>>
> >>> |COMPANY_NUMBER |NUMBER_OF_YEARS| | I| |J|
> >>>
> >>> |#0070837 |3| |1| |1|
> >>> |#0070837 |3| |1| |2|
> >>> |#0070837 |3| |3| |3|
> >>> |1000403 |4| |1| |1|
> >>> |1000403 |4| |1| |2|
> >>> |1000403 |4| |1| |3|
> >>> |1000403 |4| |4| |4|
> >>> |10029943 |3| |1| |1|
> >>> |10029943 |3| |1| |2|
> >>> |10029943 |3| |3| |3|
> >>> |10037980 |4| |1| |1|
> >>> |10037980 |4| |1| |2|
> >>> |10037980 |4| |1| |3|
> >>> |10037980 |4| |4| |4|
> >>> |10057418 |3| |1| |1|
> >>> |10057418 |3| |1| |1|
> >>> |10057418 |3| |1| |1|
> >>> |1009550 |4| |1| |1|
> >>> |1009550 |4| |1| |2|
> >>> |1009550 |4| |1| |3|
> >>> |1009550 |4| |4| |4|
> >>> I get the correct value of I, but in the wrong row. The value of J
> >>> is correct in the first iteration, and then it goes back to 1.
> >>>
> >>> Any help will be greatly appreciated
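
One reading of the symptoms in the original loop: the inner loop indexes
with j, which restarts at 1 for every company, so each pass writes to
rows 1..n again and never to the rows of the current company. Below is a
minimal sketch of a corrected while loop. It assumes that
NUMBER_OF_YEARS equals the number of rows for each company and that i
starts at 1 (the posted code does not show its initialisation). The data
frame is a hand-built subset for illustration.

```r
# Hand-built subset of the example data (assumed layout)
xx1 <- data.frame(
  COMPANY_NUMBER  = rep(c(70837, 1000403, 10029943), times = c(3, 4, 3)),
  NUMBER_OF_YEARS = rep(c(3, 4, 3), times = c(3, 4, 3))
)
xx1$I <- NA
xx1$J <- NA

i <- 1                                 # first row of the current company
while (i <= nrow(xx1)) {
  n <- xx1$NUMBER_OF_YEARS[i]          # rows belonging to this company
  for (j in seq_len(n)) {
    r <- i + j - 1                     # absolute row, not the counter j
    xx1$J[r] <- j
    xx1$I[r] <- if (j == n) n else 1   # group size on the last row only
  }
  i <- i + n                           # jump to the next company's first row
}

print(xx1)
```

The key change is the row index r: j only counts within a company, while
i marks where that company starts in the data frame.
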
