[R] dataframe operation
Marc Schwartz
marc_schwartz at comcast.net
Wed Jan 24 21:56:48 CET 2007
On Wed, 2007-01-24 at 14:16 -0600, Marc Schwartz wrote:
> On Wed, 2007-01-24 at 14:10 -0600, Marc Schwartz wrote:
> > On Wed, 2007-01-24 at 20:27 +0100, Indermaur Lukas wrote:
> > > hi
> > > i have a dataframe "a" which looks like:
> > >
> > > column1, column2, column3
> > > 10,12, 0
> > > NA, 0,1
> > > 12,NA,50
> > >
> > > i want to replace all values in column1 to column3 which do not contain "NA" with values of vector "b" (100,200,300).
> > >
> > > any idea how i can do it?
> > >
> > > i appreciate any hint
> > > regards
> > > lukas
> > >
> >
> > Here is one possibility:
> >
> > > sapply(seq(along = colnames(DF)),
> >          function(x) ifelse(is.na(DF[[x]]), 100 * x, DF[[x]]))
> >      [,1] [,2] [,3]
> > [1,]   10   12    0
> > [2,]  100    0    1
> > [3,]   12  200   50
> >
> >
> > Note that the returned object will be a matrix, so if you need a data
> > frame, just coerce the result with as.data.frame().
>
> OK....that's what I get for pulling the trigger too fast.
>
> Just reverse the logic in the function:
>
> > sapply(seq(along = colnames(DF)),
>          function(x) ifelse(!is.na(DF[[x]]), 100 * x, DF[[x]]))
>      [,1] [,2] [,3]
> [1,]  100  200  300
> [2,]   NA  200  300
> [3,]  100   NA  300
>
>
> I misread the query initially.

Here is another possibility, which may be faster depending on the
size and dimensions of your initial data frame.

Preallocate a matrix of replacement values:

Mat <- matrix(rep(seq(along = colnames(DF)) * 100, each = nrow(DF)),
              ncol = ncol(DF))

> Mat
     [,1] [,2] [,3]
[1,]  100  200  300
[2,]  100  200  300
[3,]  100  200  300

Now do the replacement:

> ifelse(!is.na(DF), Mat, NA)
  column1 column2 column3
1     100     200     300
2      NA     200     300
3     100      NA     300
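
The example above hard-codes the replacement values as 100 * (column
index), which happens to match the vector b = (100, 200, 300) from the
original question. A minimal sketch of the same idea for an arbitrary
vector b, assuming b has one value per column, might look like the
following. The data frame is rebuilt from the question; the name `keep`
and the in-place assignment are illustrative choices, not part of the
original thread.

```r
# Rebuild the example data frame from the original question
DF <- data.frame(column1 = c(10, NA, 12),
                 column2 = c(12, 0, NA),
                 column3 = c(0, 1, 50))

# Replacement values, assumed to be one per column
b <- c(100, 200, 300)

# Repeat b down the rows so that cell [i, j] holds b[j]
Mat <- matrix(b, nrow = nrow(DF), ncol = ncol(DF), byrow = TRUE)

# Logical matrix marking the non-NA cells (hypothetical helper name)
keep <- !is.na(DF)

# Assign in place: NA cells are left alone and DF stays a data frame
DF[keep] <- Mat[keep]
DF
```

Assigning through a logical matrix keeps the result as a data frame, so
no as.data.frame() step is needed afterwards, whereas ifelse() returns a
matrix.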

In some testing, the above can be about 10 times faster than the
sapply() approach from my first solution, again depending on the
structure of your DF.
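
To check the relative speed on your own data, a rough timing sketch
along these lines could be used. The test data frame here (10000 rows by
20 columns of random values with some NAs) is made up purely for
illustration, and the timings will vary with the machine, the R version,
and the shape of the data, so no particular ratio should be expected.

```r
# Hypothetical test data, not from the thread
set.seed(1)
n <- 10000
p <- 20
DF <- as.data.frame(matrix(sample(c(1:5, NA), n * p, replace = TRUE),
                           ncol = p))

# First approach: column-by-column ifelse() via sapply()
t_sapply <- system.time(
  res1 <- sapply(seq_along(DF),
                 function(x) ifelse(!is.na(DF[[x]]), 100 * x, DF[[x]]))
)

# Second approach: preallocated matrix, one vectorised ifelse()
t_matrix <- system.time({
  Mat  <- matrix(rep(seq_along(DF) * 100, each = nrow(DF)),
                 ncol = ncol(DF))
  res2 <- ifelse(!is.na(DF), Mat, NA)
})

# Confirm both approaches give the same values (dimnames aside)
all.equal(unname(res1), unname(res2))

# Side-by-side timings
rbind(sapply = t_sapply, matrix = t_matrix)
```

seq_along(DF) is used here in place of seq(along = colnames(DF)); both
give the column indices 1 to ncol(DF).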
HTH,
Marc