[R] R first.id last.id function error
jim holtman
jholtman at gmail.com
Sat Sep 8 03:30:59 CEST 2007
This function should do it for you:
> file1 <- read.table(textConnection(" id rx week dv1
+ 1 1 1 1 1
+ 2 1 1 2 1
+ 3 1 1 3 2
+ 4 2 1 1 3
+ 5 2 1 2 4
+ 6 2 1 3 1
+ 7 3 1 1 2
+ 8 3 1 2 3
+ 9 3 1 3 4
+ 10 4 1 1 2
+ 11 4 1 2 6
+ 12 4 1 3 5
+ 13 5 2 1 7
+ 14 5 2 2 8
+ 15 5 2 3 5
+ 16 6 2 1 2
+ 17 6 2 2 4
+ 18 6 2 3 6
+ 19 7 2 1 7
+ 20 7 2 2 8
+ 21 8 2 1 9
+ 22 9 2 1 4
+ 23 9 2 2 5"), header=TRUE)
>
> mark.function <-
+ function(df){
+ df <- df[order(df$id, df$week),]
+ # create 'diff' of 'id' to determine where the breaks are
+ breaks <- diff(df$id)
+ # the first row is TRUE, plus every row where 'breaks' is non-zero (the id changed)
+ df$first.id <- c(TRUE, breaks != 0)
+ # the last row is TRUE, plus every row just before the id changes
+ df$last.id <- c(breaks != 0, TRUE)
+ df
+ }
>
> mark.function(file1)
id rx week dv1 first.id last.id
1 1 1 1 1 TRUE FALSE
2 1 1 2 1 FALSE FALSE
3 1 1 3 2 FALSE TRUE
4 2 1 1 3 TRUE FALSE
5 2 1 2 4 FALSE FALSE
6 2 1 3 1 FALSE TRUE
7 3 1 1 2 TRUE FALSE
8 3 1 2 3 FALSE FALSE
9 3 1 3 4 FALSE TRUE
10 4 1 1 2 TRUE FALSE
11 4 1 2 6 FALSE FALSE
12 4 1 3 5 FALSE TRUE
13 5 2 1 7 TRUE FALSE
14 5 2 2 8 FALSE FALSE
15 5 2 3 5 FALSE TRUE
16 6 2 1 2 TRUE FALSE
17 6 2 2 4 FALSE FALSE
18 6 2 3 6 FALSE TRUE
19 7 2 1 7 TRUE FALSE
20 7 2 2 8 FALSE TRUE
21 8 2 1 9 TRUE TRUE
22 9 2 1 4 TRUE FALSE
23 9 2 2 5 FALSE TRUE
>
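As an aside, diff() only works when id is numeric. A minimal sketch of an alternative, assuming the same id and week columns as file1, uses duplicated(): its negation marks the first row of each id, and fromLast = TRUE does the same scanning from the end. The name mark.dup and the small data frame d are made up for illustration.

```r
# Hypothetical alternative to mark.function (the name is illustrative only).
mark.dup <- function(df) {
    # sort so that rows of the same id are adjacent and ordered by week
    df <- df[order(df$id, df$week), ]
    # first row of each id: not a repeat when scanning from the top
    df$first.id <- !duplicated(df$id)
    # last row of each id: not a repeat when scanning from the bottom
    df$last.id <- !duplicated(df$id, fromLast = TRUE)
    df
}

# small made-up example
d <- data.frame(id = c(1, 1, 2, 3, 3, 3), week = c(1, 2, 1, 1, 2, 3))
mark.dup(d)
```

Because duplicated() compares values rather than subtracting them, this should also work for character or factor ids.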
>
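If you do want to pass the id and sort columns as arguments, as in your first.last attempt, one option is to pass the column names as character strings. Two things in the original call cannot work as written: df$idvar looks for a column literally named "idvar" rather than the column you passed in, and c(file1$id, file1$week) glues the two columns into a single vector of length 46, so order() no longer sorts by id and then week. The function also ends in an assignment, so its value is returned invisibly and nothing prints. The sketch below is only an illustration under those assumptions; the name first.last2 and its arguments are hypothetical, and it assumes the data frame has at least one row.

```r
# Hypothetical generalization; idvar and sortvars are column names (strings).
first.last2 <- function(df, idvar, sortvars = idvar) {
    # order() takes one argument per sort key, so pass the columns as a list;
    # unname() keeps column names from being matched to order()'s own arguments
    df <- df[do.call(order, unname(as.list(df[sortvars]))), ]
    id <- df[[idvar]]   # [[ ]] looks up the column named by the string
    n <- length(id)
    # TRUE where the id differs from the id on the previous row
    change <- id[-1] != id[-n]
    df[[paste("first", idvar, sep = ".")]] <- c(TRUE, change)
    df[[paste("last", idvar, sep = ".")]] <- c(change, TRUE)
    df
}

# small made-up example, deliberately out of order
d <- data.frame(id = c(2, 1, 1, 2, 3), week = c(2, 2, 1, 1, 1))
first.last2(d, "id", c("id", "week"))
```

With your data the call would be first.last2(file1, "id", c("id", "week")). No descending sort is needed, because the same vector of id changes gives both flags.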
On 9/7/07, Gerard Smits <g_smits at verizon.net> wrote:
> Hi R users,
>
> I have a test dataframe ("file1", shown below) for which I am trying
> to create a flag for the first and last ID record (equivalent to SAS
> first.id and last.id variables).
>
> Dump of file1:
>
> > file1
> id rx week dv1
> 1 1 1 1 1
> 2 1 1 2 1
> 3 1 1 3 2
> 4 2 1 1 3
> 5 2 1 2 4
> 6 2 1 3 1
> 7 3 1 1 2
> 8 3 1 2 3
> 9 3 1 3 4
> 10 4 1 1 2
> 11 4 1 2 6
> 12 4 1 3 5
> 13 5 2 1 7
> 14 5 2 2 8
> 15 5 2 3 5
> 16 6 2 1 2
> 17 6 2 2 4
> 18 6 2 3 6
> 19 7 2 1 7
> 20 7 2 2 8
> 21 8 2 1 9
> 22 9 2 1 4
> 23 9 2 2 5
>
> I have written code that correctly assigns the first.id and last.id variables:
>
> require(Hmisc) #for Lags
> #ascending order to define first dot
> file1<- file1[order(file1$id, file1$week),]
> file1$first.id <- (Lag(file1$id) != file1$id)
> file1$first.id[1]<-TRUE #force NA to TRUE
>
> #descending order to define last dot
> file1<- file1[order(-file1$id,-file1$week),]
> file1$last.id <- (Lag(file1$id) != file1$id)
> file1$last.id[1]<-TRUE #force NA to TRUE
>
> #resort to original order
> file1<- file1[order(file1$id,file1$week),]
>
>
>
> I am now trying to get the above code to work as a function, and am
> clearly doing something wrong:
>
> > first.last <- function (df, idvar, sortvars1, sortvars2)
> + {
> + #sort in ascending order to define first dot
> + df<- df[order(sortvars1),]
> + df$first.idvar <- (Lag(df$idvar) != df$idvar)
> + #force first record NA to TRUE
> + df$first.idvar[1]<-TRUE
> +
> + #sort in descending order to define last dot
> + df<- df[order(-sortvars2),]
> + df$last.idvar <- (Lag(df$idvar) != df$idvar)
> + #force last record NA to TRUE
> + df$last.idvar[1]<-TRUE
> +
> + #resort to original order
> + df<- df[order(sortvars1),]
> + }
> >
>
> Function call:
>
> > first.last(df=file1, idvar=file1$id,
> sortvars1=c(file1$id,file1$week), sortvars2=c(-file1$id,-file1$week))
>
> R Error:
>
> Error in as.vector(x, mode) : invalid argument 'mode'
> >
>
> I am not sure about the passing of the sort strings. Perhaps this is
> where things are off. Any help greatly appreciated.
>
> Thanks,
>
> Gerard
>
--
Jim Holtman
Cincinnati, OH
+1 513 646 9390
What is the problem you are trying to solve?