[R] removing dropouts (setting the values to NA)
Petr Pikal
petr.pikal at precheza.cz
Fri Feb 14 07:49:03 CET 2003
Dear all
I hope there is somebody who has encountered a similar problem and
can give me a hint on how to do it or where to look.
I have several data sets in DBF format. I can transfer them to R
data frames and then I want to perform aggregation or some
other computations, but there are values in my data which I can
call drop-outs and I want them to be discarded (see example).
Usually I find either a row of zeros (the measuring device is out of
order or does not obtain any data) or a gradual decrease of some
measured values due to a real interruption of the process. I would
like to do some automatic evaluation to set a logical vector
where, for instance, TRUE will stand for "correct" values and
FALSE for "drop-outs" (or vice versa).
Preferably I would like to ***discard a few values before and after
the actual drop-out***. Then I will set all "wrong" values in
my variables to NA and continue with further computations.
Here is some foo code for making artificial drop-outs similar to those
in my actual data:
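x <- seq(0, 100, .1)                              # 1001 time points
y <- sin(x) + rnorm(length(x), mean = 0, sd = 1)  # noisy signal
# gradual drop-out: subtract a growing exponential at positions 201 to 231
y1 <- y - c(rep(0, 200), exp(x[20:50]), rep(0, 770))
y <- y1 + 50                                      # shift to a level around 50
y <- y * (y > 0)                                  # clamp negative values to zero
y[600:700] <- 0                                   # device failure: run of exact zeros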
My actual data looks like:
Date, Time, Var1, Var2, Var3, ......
01.01.01, 03:05:00, 12, 27, 0.53, .....
01.01.01, 03:05:15, 12.2, 29, 1.2, .....
01.01.01, 03:05:30, 12.2, 29, 0, .....
.........
in several data sets.
I can simply put
idx1<-y==0
I can set an arbitrary limit under or over which the value is
considered a drop-out
idx2<-y<45
and I can combine both indexes
idx<-as.logical(idx1+idx2)
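The combination and the later replacement by NA could be sketched as follows. This is only an illustration: the small data frame and the limit of 10 are made up, and only the column names Var1 and Var2 are taken from the layout shown above.

```r
# Small made-up data frame; values are invented for illustration only.
dat <- data.frame(Var1 = c(12, 12.2, 0, 3, 12.1),
                  Var2 = c(27, 29, 0, 28, 30))

idx1 <- dat$Var1 == 0   # exact zeros (device delivered no data)
idx2 <- dat$Var1 < 10   # below an arbitrary limit (10 is an assumption)
idx  <- idx1 | idx2     # logical OR, same result as as.logical(idx1 + idx2)

# Set the flagged rows to NA in every measured column
dat[idx, c("Var1", "Var2")] <- NA
```

Using `|` keeps the result logical without the detour through numeric addition.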
But I do not know how to easily extend the TRUE parts of the index
vector forwards and backwards from where the actual drop-out occurred.
The only way I am able to accomplish it is
changes<-which(diff(idx)!=0)+1
then select the odd and even values from changes, subtract a certain
value from the odd ones and add a value to the even ones, and construct
something like this:
c(rep(F,odd[1]),rep(T,even[1]-odd[1]),rep(F,odd[2]-even[1]),
  rep(T,even[2]-odd[2]),rep(F,length(x)-even[2]))
which is a little bit complicated and not a very general solution.
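One possibly more general way to widen the flagged regions is sketched below. It is only a sketch under assumptions: the helper name widen and its arguments before and after are hypothetical, and the idea is simply to mark every position that lies within a fixed window around any flagged value, however many drop-outs there are.

```r
# Hypothetical helper: extend every TRUE run of a logical vector
# by `before` positions to the left and `after` positions to the right.
widen <- function(idx, before = 5, after = 5) {
  n <- length(idx)
  out <- rep(FALSE, n)
  hits <- which(idx)                 # positions of the flagged values
  if (length(hits) == 0) return(out)
  # each flagged position marks the window (hit - before):(hit + after)
  pos <- outer(hits, (-before):after, "+")
  pos <- unique(pos[pos >= 1 & pos <= n])  # drop positions outside the vector
  out[pos] <- TRUE
  out
}

# Usage on a short made-up series: one zero at position 4
yy   <- c(50, 51, 49, 0, 48, 50, 51, 50)
wide <- widen(yy == 0, before = 1, after = 2)
yy[wide] <- NA                       # positions 3 to 6 become NA
```

Overlapping windows merge automatically, so no bookkeeping of odd and even change points is needed.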
Can somebody please help me find a better procedure or
function for such drop-out filtering?
Thank you.
Petr Pikal
Precheza a.s., Nabř.Dr.E.Beneše 24, 750 62 Přerov
tel: +420581 252 257 ; 724 008 364
petr.pikal at precheza.cz; p.pik at volny.cz
fax +420581 252 561