[R] how to change values in data frame to NA inside a function
ripley at stats.ox.ac.uk
Tue Feb 25 08:58:03 CET 2003
You are mis-using <<-. I don't know what you think it does, so please
look it up. Using <<- in R/S programming is normally a sign of incorrect
thinking (but not quite always). (Also, it behaves differently in R and
in S which can be a cause of confusion to those who know only one of the
definitions.)
On Tue, 25 Feb 2003, Petr Pikal wrote:
> Thank you for your answers. It works OK but my real question is
> why my function behaves differently used on vector and data
> frame (or matrix or list).
> I attached a full version below with some foo data, but basically
> the function returns the correct index if applied correctly on any
> type (list, data frame, matrix, vector) but it changes values of
> operand only if operand is a vector.
Not so. It always alters an object called `y'. It just so happens that
your vector argument was called `y' and the other cases you tried were
not.
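A minimal sketch of that behaviour, assuming a made-up function `f` whose threshold test stands in for the real index computation (the names `f` and `z` are hypothetical, not from the original code). The index is computed from the argument, but the assignment lands on whatever object named `y' is found in the enclosing environments:

```r
# Hypothetical stand-in for dropout(): flags values above 2
# (not the original index logic).
f <- function(y) {
  index <- y > 2
  # `<<-` skips the local copy and looks for an object named `y`
  # starting in the enclosing environment (here the workspace).
  y[index] <<- NA
  invisible(index)
}

y <- c(1, 5, 2)   # an unrelated object that happens to be called y
z <- c(9, 1, 1)   # the data we actually pass in

f(z)

print(z)  # z is left untouched
print(y)  # the workspace y has been altered at the positions flagged in z
```

If no object called `y' exists in an enclosing environment, the call fails instead of silently changing something.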
> Why please?
(Because that is what you asked it to do ....)
I can see a way to do what I think is your intention (to change the
object which was passed as the y argument from the parent environment),
but it is convoluted and against the spirit of a functional language, so I
won't describe it.
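The functional alternative is to have the function return the modified values and let the caller assign them, much as in the suggestions quoted below. A hedged sketch, in which `find_dropouts` and `replace_dropouts` are hypothetical names and the zero test is only a placeholder for the real index computation:

```r
# Hypothetical simplified detector standing in for dropout():
# it flags zeros only, not the original median/mad logic.
find_dropouts <- function(y) y == 0

# Returns a modified copy; nothing outside the function is touched.
replace_dropouts <- function(y, detector = find_dropouts) {
  y[detector(y)] <- NA
  y
}

# The caller decides where the result is stored, so the same
# function works for a data frame column, a matrix column, or a list.
df <- data.frame(y = c(3, 0, 4, 0))
df$y <- replace_dropouts(df$y)

mat <- matrix(c(3, 0, 4, 0), ncol = 1)
mat[, 1] <- replace_dropouts(mat[, 1])

mylist <- list(y = c(3, 0, 4, 0))
mylist$y <- replace_dropouts(mylist$y)
```

Because the assignment happens at the call site, the name of the object being changed no longer matters.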
> On 21 Feb 2003 at 10:23, Spencer Graves wrote:
>
> > Thomas Blackwell's solution will also work if dropout(df$y) returns a
> > logical vector of length = length(df$y). This also allows more
> > general conditions, e.g.,
> >
> > select1 <- df[,1] > 0
> > select2 <- (select1) & (df[,2] > 0)
> >
> > df[select2, "y"] <- NA
> >
> > Spencer Graves
> >
> > Thomas W Blackwell wrote:
> > > Petr -
> > >
> > > Does your function return "index" or return "y" after modifying y ?
> > > In the email, it looks as though it returns "index". If so, the
> > > following should work:
> > >
> > >
> > >>df$y[ dropout(df$y) ] <- NA
> > >
> > >
> > > - tom blackwell - u michigan medical school - ann arbor -
> > >
> > >
> > >
> > > On Fri, 21 Feb 2003, Petr Pikal wrote:
> > >
> > >
> > >>Dear all
> > >>
> > >>I have a function in which I would like to change some values to NA
> > >>according to some condition.
> > >>
> > >>dropout<-function(y, nahr=FALSE,...) {
> > >>
> > >><some stuff for computing an index>
> > >>
> > >>if (nahr) y[index]<<-NA
> > >>invisible(index)
> > >>
> > >>}
> > >>
> > >>in case y is a vector all works OK but if it is a part of data frame
> > >>by calling
> > >>
> > >>dropout(df$y) or dropout(df[,number]) no change is done.
> > >>
> > >>Please can you help me what is wrong with my code?
> > >>
> > >>By the way
> > >>
> > >>idx<-dropout(df$y)
> > >>df$y[idx]<-NA
> > >>
> > >>works OK
> > >>
> > >>Thanks a lot beforehand
> > >>
> > >>Best regards.
> > >>
> > >>Petr Pikal
>
>
> #foo data
>
> x<-seq(0,100,.1)
> y<-sin(x)+rnorm(length(x),mean=0,sd=1)
> y1<-y-c(rep(0,200),exp(x[20:50]),rep(0,770))
> y<-y1+50
> y<-y*(y>0)
> y[600:700]<-0
> df<-data.frame(y)
> mat<-as.matrix(df)
> mylist<-as.list(df)
>
> #vector
>
> plot(x,y)
> ddd<-dropout(y)
> points(x[ddd],y[ddd],col=2)
> ddd<-dropout(y,nahr=T)
> plot(x,y)
> rm(ddd)
>
> #data frame
>
> plot(x,df$y)
> ddd<-dropout(df$y)
> points(x[ddd],df$y[ddd],col=2)
> ddd<-dropout(df$y,nahr=T)
> plot(x,df$y)
> rm(ddd)
>
> #matrix
>
> plot(x,mat[,1])
> ddd<-dropout(mat[,1])
> points(x[ddd],mat[ddd,1],col=2)
> ddd<-dropout(mat[,1],nahr=T)
> plot(x,mat[,1])
> rm(ddd)
>
> #list
>
> plot(x,mylist$y)
> ddd<-dropout(mylist$y)
> points(x[ddd],mylist$y[ddd],col=2)
> ddd<-dropout(mylist$y,nahr=T)
> plot(x,mylist$y)
>
> #this is full function
>
> dropout<-function(y,span=21, mez=NULL, p=0.99995,
> nahradit=FALSE, ...) {
>
> ### this part is just computing the logical index vector with length = length(y)
> ### and TRUE values where dropout occurs
>
> # check that span is odd
> if(span/2-span%/%2<.4|span<2) span<-
> ceiling(span+floor(1/span)+.1)
>
> n<-length(y)
> s<-span%/%2
>
>
> idx1<-y==0
> prumer<-median(y[!idx1],na.rm=T)
>
> if (is.null(mez))
> {
> mez<-mad(y[!idx1],na.rm=T)
> dm<-prumer-mez*qnorm(p)
> hm<-prumer+mez*qnorm(p)
> } else {
>
> dm<-prumer-mez
> hm<-prumer+mez
> }
>
>
> idx2<-y<dm
> idx3<-y>hm
>
> idx<-as.logical(idx1+idx2+idx3)
> z <- embed(idx,span)
> rowSums(z)
> length(rowSums(z))
> sumy<-rowSums(z)>0
> index<-c(rep(sumy[1],s),sumy,rep(sumy[n-span+1],s))
>
> ### index is a returned logical vector and it is OK
>
> if (nahradit) y[index]<<-NA
> ### this is the ghastly line which does not work as I expected :-(
>
> invisible(index)
>
> }
>
>
> Thank you
>
> Best regards
> Petr Pikal
> petr.pikal at precheza.cz
> p.pik at volny.cz
>
> ______________________________________________
> R-help at stat.math.ethz.ch mailing list
> http://www.stat.math.ethz.ch/mailman/listinfo/r-help
>
--
Brian D. Ripley, ripley at stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595