[R] dataframe operation

Marc Schwartz marc_schwartz at comcast.net
Wed Jan 24 21:56:48 CET 2007


On Wed, 2007-01-24 at 14:16 -0600, Marc Schwartz wrote:
> On Wed, 2007-01-24 at 14:10 -0600, Marc Schwartz wrote:
> > On Wed, 2007-01-24 at 20:27 +0100, Indermaur Lukas wrote:
> > > hi
> > > i have a dataframe "a" which looks like:
> > >  
> > > column1, column2, column3
> > > 10,12, 0
> > > NA, 0,1
> > > 12,NA,50
> > >  
> > > i want to replace all values in column1 to column3 which do not contain "NA" with values of vector "b" (100,200,300).
> > >  
> > > any idea i can do it?
> > >  
> > > i appreciate any hint
> > > regards
> > > lukas
> > >  
> > 
> > Here is one possibility:
> > 
> > > sapply(seq(along = colnames(DF)), 
> >          function(x) ifelse(is.na(DF[[x]]), 100 * x, DF[[x]]))
> >      [,1] [,2] [,3]
> > [1,]   10   12    0
> > [2,]  100    0    1
> > [3,]   12  200   50
> > 
> > 
> > Note that the returned object will be a matrix, so if you need a data
> > frame, just coerce the result with as.data.frame().
> 
> OK....that's what I get for pulling the trigger too fast.
> 
> Just reverse the logic in the function:
> 
> > sapply(seq(along = colnames(DF)), 
>          function(x) ifelse(!is.na(DF[[x]]), 100 * x, DF[[x]]))
>      [,1] [,2] [,3]
> [1,]  100  200  300
> [2,]   NA  200  300
> [3,]  100   NA  300
> 
> 
> I misread the query initially.

Here is another possibility, which may be faster depending on the
size and dimensions of your initial data frame.

Preallocate a matrix of replacement values:

Mat <- matrix(rep(seq(along = colnames(DF)) * 100, each = nrow(DF)),
              ncol = ncol(DF))

> Mat
     [,1] [,2] [,3]
[1,]  100  200  300
[2,]  100  200  300
[3,]  100  200  300


Now do the replacement (as before, the result is a matrix, not a data
frame):

> ifelse(!is.na(DF), Mat, NA)
  column1 column2 column3
1     100     200     300
2      NA     200     300
3     100      NA     300
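
For anyone who wants to run this from scratch, here is a minimal
self-contained sketch. It assumes the data frame from the original
question is called DF and that the replacement values sit in a vector b
(the name used in the question; the code above builds them as 100 times
the column index instead). DF2 and keep are names made up for this
sketch. Indexing with a logical matrix, rather than calling ifelse(), is
a swapped-in variation that should leave the result as a data frame:

```r
# Rebuild the example data from the original question (assumed name: DF)
DF <- data.frame(column1 = c(10, NA, 12),
                 column2 = c(12, 0, NA),
                 column3 = c(0, 1, 50))

# Replacement values, one per column (assumed name: b)
b <- c(100, 200, 300)

# Matrix of replacement values with the same shape as DF
Mat <- matrix(rep(b, each = nrow(DF)), ncol = ncol(DF))

# Logical matrix marking the non-NA cells (hypothetical name: keep)
keep <- !is.na(DF)

# Overwrite only the non-NA cells; NAs are left alone and
# DF2 (hypothetical name) remains a data frame
DF2 <- DF
DF2[keep] <- Mat[keep]
DF2
```

If a matrix is acceptable, ifelse(keep, Mat, NA) gives the same values.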


In some quick testing, the above can be about 10 times faster than the
sapply() approach in my first solution, though again the difference
will depend on the structure of your DF.
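
If you want to check the timing on your own data, something along these
lines is a rough sketch. The data size, the NA fraction and the names
n_row, n_col, vals, res1, res2, t1 and t2 are all arbitrary choices for
illustration, and the timings will vary with your machine and R version:

```r
# Hypothetical test data: 10,000 rows x 50 columns, roughly 10% NAs
set.seed(1)
n_row <- 10000
n_col <- 50
vals <- rnorm(n_row * n_col)
vals[sample(length(vals), length(vals) %/% 10)] <- NA
DF <- as.data.frame(matrix(vals, nrow = n_row))

# Approach 1: column by column with sapply()
t1 <- system.time(
  res1 <- sapply(seq(along = colnames(DF)),
                 function(x) ifelse(!is.na(DF[[x]]), 100 * x, DF[[x]]))
)

# Approach 2: preallocated matrix and a single ifelse() call
t2 <- system.time({
  Mat <- matrix(rep(seq(along = colnames(DF)) * 100, each = nrow(DF)),
                ncol = ncol(DF))
  res2 <- ifelse(!is.na(DF), Mat, NA)
})

# The two results should hold the same values (only dimnames differ)
all.equal(res1, res2, check.attributes = FALSE)

# Elapsed seconds for each approach
c(sapply = t1[["elapsed"]], matrix = t2[["elapsed"]])
```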

HTH,

Marc


