[R] who can give me some hint?

William Dunlap wdunlap at tibco.com
Thu Mar 12 19:30:48 CET 2009


I think I answered a very similar question from you yesterday
but perhaps the mail went astray.  The subject line is not
informative.

It may make it easier to think about if you use a function like
   isFirstInRun <- function(x) c(TRUE, x[-1] != x[-length(x)])
Given a vector x (without NA's in it) it tells you if a given
element of x is the first in a run of identical values.  E.g.,
   x <- c(1,2,2,1,1,3)
   isFirstInRun(x)
   [1] TRUE TRUE FALSE TRUE FALSE TRUE
You don't have to understand why this works or why it works
quickly or have this idiom in your working set yet.  You do need to
know how to use logical values as subscripts to extract elements of
interest from vectors or rows of interest from data frames.  E.g.,
   act_2[ with(act_2, isFirstInRun(Rep)), ]
should return rows 51, 52, 58, and 60 of your example.  If you
want to return only the first of each Hour/Min combination you could
use either
   isFirstInRun(interaction(Hour,Min))
or
   isFirstInRun(Hour)|isFirstInRun(Min)
as the row subscript to act_2 to pull out rows 51 and 60.
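
To make this concrete, here is a small self-contained sketch.  The
data frame is retyped by hand from the example quoted below, so the
column types (character Rep, integer Hour and Min, character row
names) are my assumptions and not something the original post states.

```r
# isFirstInRun as defined above (x must not contain NA's)
isFirstInRun <- function(x) c(TRUE, x[-1] != x[-length(x)])

# act_2 retyped from the quoted example; column types are assumed
act_2 <- data.frame(
  Date   = rep("2006-02-22", 6),
  Dtime  = c("14:52:18", "14:52:18", "14:52:49",
             "14:52:51", "14:52:52", "14:54:42"),
  Hour   = rep(14L, 6),
  Min    = c(52L, 52L, 52L, 52L, 52L, 54L),
  Second = c(18L, 18L, 49L, 51L, 52L, 42L),
  Rep    = c("useractivity_act", "4", "4", "4", "3", "useractivity_idle"),
  row.names = c("51", "52", "55", "57", "58", "60"),
  stringsAsFactors = FALSE
)

# Keep the first row of each run of identical Rep values
firstRep <- act_2[with(act_2, isFirstInRun(Rep)), ]

# Keep the first row of each run of identical Hour/Min combinations,
# written both ways suggested above
firstHourMin  <- act_2[with(act_2, isFirstInRun(Hour) | isFirstInRun(Min)), ]
firstHourMin2 <- act_2[with(act_2, isFirstInRun(interaction(Hour, Min))), ]
```

With this data, firstRep holds rows 51, 52, 58 and 60, and both
Hour/Min versions hold rows 51 and 60.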

If this were to become a standard function it could be modified
to handle NA's, 0-long arguments, and multiple arguments.  (If
it accepted multiple arguments then rle() ought to be modified
in the same way, as they are closely related.) 
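
A rough sketch of what such a generalised version might look like
follows.  The name isFirstInRun2 and the convention that two adjacent
NA's belong to the same run are my own assumptions, not an existing
function or an agreed behaviour; the last lines illustrate the
relationship to rle() for an NA-free vector.

```r
# Hypothetical generalisation; the name and the NA convention are assumptions.
# Adjacent NA's are treated as equal, so a run of NA's has a single "first".
isFirstInRun2 <- function(...) {
  args <- list(...)
  if (length(args) == 0) stop("need at least one argument")
  n <- length(args[[1]])
  stopifnot(all(vapply(args, length, 0L) == n))  # all the same length
  if (n == 0) return(logical(0))                 # 0-long input
  changed <- logical(n - 1)
  for (x in args) {
    a <- x[-1]
    b <- x[-n]
    differs <- a != b                      # NA wherever either side is NA
    bothNA <- is.na(a) & is.na(b)
    differs[is.na(differs)] <- TRUE        # NA next to a non-NA: a change
    differs[bothNA] <- FALSE               # NA next to NA: same run
    changed <- changed | differs           # a change in any argument counts
  }
  c(TRUE, changed)
}

# Relationship to rle() for an NA-free vector: the run starts are the
# cumulative sums of the preceding run lengths.
x <- c(1, 2, 2, 1, 1, 3)
startsFromRle <- cumsum(c(1, head(rle(x)$lengths, -1)))
startsFromFirst <- which(isFirstInRun2(x))
```

Here isFirstInRun2(Hour, Min) plays the role of
isFirstInRun(Hour)|isFirstInRun(Min) above.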

Bill Dunlap
TIBCO Software Inc - Spotfire Division
wdunlap tibco.com 

----------------------------------------------------------------------
Tammy Ma metal_licaling at live.com 
Thu Mar 12 11:25:56 CET 2009

> act_2
         Date    Dtime Hour Min Second               Rep
51 2006-02-22 14:52:18   14  52     18  useractivity_act
52 2006-02-22 14:52:18   14  52     18                 4
55 2006-02-22 14:52:49   14  52     49                 4
57 2006-02-22 14:52:51   14  52     51                 4
58 2006-02-22 14:52:52   14  52     52                 3
60 2006-02-22 14:54:42   14  54     42 useractivity_idle

I want to change act_2 to 
         Date    Dtime Hour Min Second               Rep
51 2006-02-22 14:52:18   14  52     18  useractivity_act
52 2006-02-22 14:52:18   14  52     18                 4
58 2006-02-22 14:52:52   14  52     52                 3
60 2006-02-22 14:54:42   14  54     42 useractivity_idle

In other words, I want to keep only the first row when a value is
repeated several times in a row. I wrote this program:


rm_r<-function(act_2){
  dm<-dim(act_2)[1]-1
  for(i in 2:dm){
    if(act_2$Rep[i+1]==act_2$Rep[i]){
      act_2<-act_2[-(i+1),]
    }else{
      act_2<-act_2
    }
  }
  return(act_2)
}

When it removes a row on the 1st pass of the loop, i should still be 2
on the next pass, but it becomes 3. If I add i<-i-1, then i goes to 1,
which does not seem reasonable. How should I modify it?

Tammy



