[R] How to create an ifelse statement where it matches a different data.frame variable

Jeff Newmiller jdnewmil at dcn.davis.ca.us
Thu Mar 27 03:10:14 CET 2014


Please keep the mailing list included by using "reply-all"; I am not 
offering private consultation.

Your sample data is a step forward, but it is still not reproducible: we 
cannot paste it into R and run it. Googling "R reproducible example" will 
lead you to 
<http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example>.
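
As a rough sketch of what "reproducible" means in practice: build or load a 
small data frame, then paste the output of dput() into your message so that 
readers can recreate the object exactly. The object names df and df2 below 
are made up for illustration and are not from the original post.

```r
# Hypothetical small data frame standing in for the real data
df <- data.frame(Name = c("aaa", NA, "ddd"), Code = c(1L, 3L, 4L),
                 stringsAsFactors = FALSE)

# dput() prints R code that rebuilds the object, including column types
dput(df)

# Round trip: evaluating the printed code should rebuild the same object
df2 <- eval(parse(text = paste(capture.output(dput(df)), collapse = "\n")))
identical(df, df2)
```

The point of the round trip is that whatever you paste into the email is 
enough for someone else to reconstruct your data, column types included.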

Try running the following R code, and decipher each step using the R help 
documentation (such as typing ?dput at the R command line) and the 
Introduction to R document (particularly about indexing):

# What you should have provided

# dput( PastData )
PastData <- structure(list(Name = c("aaa", "ccc", "ddd"), Code = c(1L, 3L, 
4L)), .Names = c("Name", "Code"), class = "data.frame", row.names = c(NA, 
-3L))

# dput( CurrentData )
CurrentData <- structure(list(Name = c("aaa", "bbb", NA, "ddd"), Code = 
1:4), .Names = c("Name", "Code"), class = "data.frame", row.names = c(NA, 
-4L))

# Want to fix CurrentData to be like NewData

# dput( NewData )
NewData <- structure(list(Name = c("aaa", "bbb", "ccc", "ddd"), Code = 
1:4), .Names = c("Name", "Code"), row.names = c(NA, -4L), class = 
"data.frame")

# What the answer might look like if you had provided the above

# Learning sequence... what is the current code vector?
CurrentData$Code
# Which indexes in PastData have the codes from CurrentData?
match( CurrentData$Code, PastData$Code )
# How do we look up the corresponding Name values?
PastData[ match( CurrentData$Code, PastData$Code ), "Name" ]
# At this point, we have a vector of names from PastData corresponding to 
# codes in CurrentData
# Pick only those names from PastData where the CurrentData$Name is NA
ifelse( is.na( CurrentData$Name )
      , PastData[ match( CurrentData$Code, PastData$Code ), "Name" ]
      , CurrentData$Name
      )

# Proposed new data frame
MyNewData <- CurrentData
MyNewData$Name <- ifelse( is.na( CurrentData$Name )
                        , PastData[ match( CurrentData$Code, PastData$Code ), "Name" ]
                        , CurrentData$Name
                        )

# Check whether we achieved your goal
identical( MyNewData, NewData )
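
The same fill-in can also be written as an indexed assignment that touches 
only the NA rows, in the spirit of the idx approach in the earlier message 
quoted below. This is only a sketch: the names idx and MyNewData2 are made 
up for illustration, and it assumes Name is a character vector. Any missing 
Name whose Code has no match in PastData simply stays NA.

```r
# Same sample data as above, rebuilt with data.frame() for brevity
PastData <- data.frame(Name = c("aaa", "ccc", "ddd"), Code = c(1L, 3L, 4L),
                       stringsAsFactors = FALSE)
CurrentData <- data.frame(Name = c("aaa", "bbb", NA, "ddd"), Code = 1:4,
                          stringsAsFactors = FALSE)

MyNewData2 <- CurrentData
# Logical index of the rows whose Name is missing
idx <- is.na(MyNewData2$Name)
# Look up only those rows' codes in PastData and copy the names over
MyNewData2$Name[idx] <- PastData$Name[match(MyNewData2$Code[idx], PastData$Code)]
MyNewData2
```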

---

Please note that the Name column in your data frame may already have been 
converted to a factor when the data were imported. If so, the above code 
will need to be adapted, or you might be better off re-importing your data 
so that the Name column is a character vector instead of a factor. If you 
had read about and created a reproducible example we would already know 
whether this was going to be a problem.
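
If Name did come in as a factor, one possible adaptation is to convert with 
as.character() before calling ifelse(). This is a sketch that assumes you 
are content to end up with a character column. The FactorData object and 
the CurName/PastName helpers are invented here just to simulate that 
situation.

```r
# Simulate data whose Name column was imported as a factor
PastData <- data.frame(Name = factor(c("aaa", "ccc", "ddd")), Code = c(1L, 3L, 4L))
FactorData <- data.frame(Name = factor(c("aaa", "bbb", NA, "ddd")), Code = 1:4)
is.factor(FactorData$Name)

# Convert to character first; ifelse() applied directly to factors
# would typically yield the underlying integer codes instead of labels
CurName <- as.character(FactorData$Name)
PastName <- as.character(PastData$Name)

FactorData$Name <- ifelse(is.na(CurName),
                          PastName[match(FactorData$Code, PastData$Code)],
                          CurName)
str(FactorData)
```

When re-importing instead, passing stringsAsFactors = FALSE to read.csv() or 
read.table() keeps text columns as character vectors from the start.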

On Wed, 26 Mar 2014, Megan Weigel wrote:

> I apologize... I will make time to read the Posting Guide.
> 
> The data looks similar to the following:
> 
> PastData:
> Name   Code
> aaa      1
> ccc      3
> ddd      4
> 
> CurrentData:
> Name   Code
> aaa      1
> bbb      2
> NA       3
> ddd      4
> 
> It should look like this...
> 
> NewData:
> Name   Code
> aaa      1
> bbb      2
> ccc      3
> ddd      4
> 
> The code has to replace NA with the name that corresponds to the same code number in the PastData. They
> also do not have the same number of rows.
> 
> Thank you very much,
> 
> Johnson
> 
> 
> On Wed, Mar 26, 2014 at 1:36 PM, Jeff Newmiller <jdnewmil at dcn.davis.ca.us> wrote:
>       Please read and in the future follow the Posting Guide, which requests that you provide a
>       reproducible example... that is, a series of R statements that we can run to get us to your
>       problem point with a small sample data set that resembles yours. Forging on anyway...
>
>       The ifelse function applies to vectors, not data frames. That is, as long as both data
>       frames have the same number of rows, you should be able to do things like
>
>       CurrentDataFrame$Name <- ifelse( CurrentDataFrame$Name=="NA", PastDataFrame$Name,
>       CurrentDataFrame$Name)
>
>       Please note that NA is completely different than "NA" (read the Introduction to R document
>       that comes with R if you need a refresher). If you are really trying to weed out NA values
>       then you would need to do something like
>
>       CurrentDataFrame$Name <- ifelse( is.na(CurrentDataFrame$Name), PastDataFrame$Name,
>       CurrentDataFrame$Name)
>
>       Also, if you need speed or are pushing the limits of your RAM, the following approach
>       avoids replacing the entire vector:
>
>       idx <- is.na(CurrentDataFrame$Name)
>       CurrentDataFrame[idx,"Name"] <- PastDataFrame[idx,"Name"]
>
>       ---------------------------------------------------------------------------
>       Jeff Newmiller                        The     .....       .....  Go Live...
>       DCN:<jdnewmil at dcn.davis.ca.us>        Basics: ##.#.       ##.#.  Live Go...
>                                             Live:   OO#.. Dead: OO#..  Playing
>       Research Engineer (Solar/Batteries            O.O#.       #.O#.  with
>       /Software/Embedded Controllers)               .OO#.       .OO#.  rocks...1k
>       ---------------------------------------------------------------------------
>       Sent from my phone. Please excuse my brevity.
>
>       On March 26, 2014 9:44:06 AM PDT, Megan Weigel <mw5wags at gmail.com> wrote:
>       >Hello,
>       >
>       >Hopefully there is an answer for this, but I need an ifelse statement
>       >that
>       >replaces and returns a value based on a different dataframe. For
>       >example:
>       >
>       >CurrentDataFrame<-ifelse(CurrentDataFrame$Name=="NA",match(CurrentDataFrame$Code
>       >with PastDataFrame$Code),replace(CurrentDataFrame$Name) with
>       >(PastDataFrame$Name)
>       >
>       >
>       >I hope that makes sense.
>       >
>       >Thank you very much,
>       >
>       >Johnson
>       >
> 
> 
> 
> 
> --
> Megan Weigel
> (865)924-2124
> mw5wags at gmail.com
> 
>

---------------------------------------------------------------------------
Jeff Newmiller                        The     .....       .....  Go Live...
DCN:<jdnewmil at dcn.davis.ca.us>        Basics: ##.#.       ##.#.  Live Go...
                                       Live:   OO#.. Dead: OO#..  Playing
Research Engineer (Solar/Batteries            O.O#.       #.O#.  with
/Software/Embedded Controllers)               .OO#.       .OO#.  rocks...1k
---------------------------------------------------------------------------

