[R] How to replace NAs in a vector of factors?

hadley wickham h.wickham at gmail.com
Wed Jul 22 03:18:39 CEST 2009


On Tue, Jul 21, 2009 at 7:39 PM, Gene Leynes<gleynes+r at gmail.com> wrote:
> # Just when I thought I had the basic stuff mastered....
> # This has been quite perplexing, thanks for any help
>
>
> ## Here's the example:
>
> db1=data.frame(
>    olditems=c('soup','','','','nuts'),
>    prices=c(4.45, 3.25, 4.42, 2.25, 3.98))
> db2=data.frame(
>    newitems=c('stew','crackers','tofu','goatsmilk','peanuts'))
>
> str(db1)    #factors and prices
> str(db2)    #new names, but I want *only* the updates
>
> is.na(db1$olditems)  #a little surprising that '' is not equal to NA

Why is that surprising?
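
For illustration, a minimal sketch of the difference: an empty string is an ordinary value in R, while NA marks a value that is missing, so is.na() reports FALSE for ''. The vectors `x` and `y` below are made up for this example and are not part of the original data.

```r
# Hypothetical vector, not part of the original data
x <- c("soup", "", NA)

is.na(x)   # FALSE FALSE TRUE: "" is a value, only NA is missing
x == ""    # FALSE TRUE NA: comparing NA with anything gives NA

# Blanks can be turned into NA explicitly if that is what is wanted
y <- x
y[!is.na(y) & y == ""] <- NA
is.na(y)   # FALSE TRUE TRUE
```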

> db1$olditems==''     #oh good, at least I can get to the blanks this way
> db1$olditems[db1$olditems=='']  #wait, only one item is returned?

length(db1$olditems[db1$olditems==''])
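
On the example data this should give 3. All three blanks are selected, but factor values print without quotes, so three empty strings look like a single blank line. A small self-contained sketch, where `olditems` is a stand-in for db1$olditems and the factor is built explicitly because newer versions of R no longer convert strings to factors by default:

```r
# Stand-in for db1$olditems, built as a factor on purpose
olditems <- factor(c("soup", "", "", "", "nuts"))

blanks <- olditems[olditems == ""]
length(blanks)         # 3
as.character(blanks)   # "" "" "" : the quotes make the blanks visible
levels(blanks)         # subsetting keeps all levels, including unused ones
```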

> db1[db1$olditems=='',]  #somehow this works!
>
> #how would I get the new item names into the old items column of db1??
> # I was expecting that this would work:
> #    db1$olditems[db1$olditems=='']=
> #        db2$newitems[db1$olditems=='']

Try working with characters instead of factors.

db1$olditems <- as.character(db1$olditems)
db2$newitems <- as.character(db2$newitems)
db1$olditems[db1$olditems==''] <- db2$newitems[db1$olditems=='']
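
To see why the conversion matters, here is a self-contained sketch of both routes. It assumes the columns are factors (stringsAsFactors = TRUE is set explicitly, since it is no longer the default in recent R); the names `blank`, `f`, `newnames` and `fixed` are only for illustration. Assigning a value that is not among a factor's levels stores NA and gives a warning, which is presumably where the NAs in the subject line came from. Character vectors have no such restriction.

```r
db1 <- data.frame(
    olditems = c("soup", "", "", "", "nuts"),
    prices   = c(4.45, 3.25, 4.42, 2.25, 3.98),
    stringsAsFactors = TRUE)
db2 <- data.frame(
    newitems = c("stew", "crackers", "tofu", "goatsmilk", "peanuts"),
    stringsAsFactors = TRUE)

blank <- db1$olditems == ""

# Factor route: the new names are not levels of olditems, so R warns
# about invalid factor levels and stores NA instead
f <- db1$olditems
f[blank] <- db2$newitems[blank]
sum(is.na(f))   # 3

# Character route: plain strings accept any replacement
fixed    <- as.character(db1$olditems)
newnames <- as.character(db2$newitems)
fixed[blank] <- newnames[blank]
fixed   # "soup" "crackers" "tofu" "goatsmilk" "nuts"
```

If a factor is needed afterwards, factor(fixed) rebuilds one from the completed names.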

Hadley

-- 
http://had.co.nz/



