[R] replacing values of rows with identical row names in two dataframes
Jeff Newmiller
jdnewmil at dcn.davis.ca.us
Sat May 7 04:20:11 CEST 2016
Please use reply-all to keep the mailing list in the loop, and use plain text rather than HTML to make sure that your message gets through uncorrupted.
?merge
?lapply
# untested
# align rows by date
df1a <- merge( df1, df2, by="date", all.x=TRUE )
# like-named columns have .x or .y appended
df1an0 <- grep( "\\.x$", names( df1a ), value=TRUE )
df1an <- substr( df1an0, 1, nchar( df1an0 ) - 2 )
# make a list of updated columns
df1b <- lapply( df1an, function(nm) {
  nmx <- paste0( nm, ".x" )
  nmy <- paste0( nm, ".y" )
  # keep the df1 value unless it is NA, in which case take the df2 value
  ifelse( is.na( df1a[[ nmx ]] ), df1a[[ nmy ]], df1a[[ nmx ]] )
} )
# set the names of the fixed columns
df1b <- setNames( df1b, df1an )
# figure out the names of the non-duped columns
df1an1 <- grep( "\\.[xy]$", names( df1a ), invert =TRUE )
# make a new data frame
df1c <- data.frame( df1a[ , df1an1, drop=FALSE ], df1b )
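
If the merge bookkeeping feels heavy, here is a shorter sketch of a different technique: look up the matching rows with match() and fill only the NA cells. It assumes "date" is an ordinary column and that the value columns have the same, unique names in both data frames. The names A1, A2, B1, C1 and C2 are made up for illustration, since the sample in the question shows duplicated names, and the last row of df2 is read as NA 145 160 NA NA.

```r
# Sample data from the question. Column names A1..C2 are hypothetical
# stand-ins for the duplicated names ("11A", "3CC") shown in the post.
df1 <- data.frame(
  date = c(20040101, 20040115, 20040131, 20040205,
           20040228, 20040301, 20040315, 20040331),
  A1 = c(100, 200, NA, NA, NA, 150, NA, NA),
  A2 = c(150, NA, 165, NA, NA, 155, NA, NA),
  B1 = c(NA, 200, 180, NA, NA, 170, 180, NA),
  C1 = c(NA, NA, 190, NA, NA, 150, 190, 175),
  C2 = c(140, NA, 190, NA, NA, 160, 200, 180)
)
df2 <- data.frame(
  date = c(20040131, 20040228, 20040331),
  A1 = c(170, 140, NA),
  A2 = c(NA, 145, 145),
  B1 = c(NA, 165, 160),
  C1 = c(NA, 150, NA),
  C2 = c(NA, 155, NA)
)

# position in df1 of each df2 date (NA where a df2 date is absent from df1)
idx <- match( df2$date, df1$date )
found <- !is.na( idx )

# every column except the key is a value column
valcols <- setdiff( names( df1 ), "date" )

df3 <- df1
for ( nm in valcols ) {
  old <- df3[[ nm ]][ idx[ found ] ]
  new <- df2[[ nm ]][ found ]
  # existing df1 values win; df2 only fills the gaps
  df3[[ nm ]][ idx[ found ] ] <- ifelse( is.na( old ), new, old )
}
```

The loop here runs over columns only, never over individual cells, so it should remain reasonable for 1566 columns. Dates in df2 that do not occur in df1 are silently ignored.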
--
Sent from my phone. Please excuse my brevity.
On May 6, 2016 4:32:15 PM PDT, Saba Sehrish <sabasehrish at yahoo.com> wrote:
>No. If there is some other way, I would like to go for it.
>Regards, Saba
>
>On Saturday, 7 May 2016, 11:30, Jeff Newmiller
><jdnewmil at dcn.davis.ca.us> wrote:
>
>
> Why would you want to use a for loop? Is this homework?
>--
>Sent from my phone. Please excuse my brevity.
>
>On May 6, 2016 4:15:09 PM PDT, Saba Sehrish via R-help
><r-help at r-project.org> wrote:
>
>
>Hi
>
>I have two dataframes (df1, df2) with an equal number of columns (1566)
>but fewer rows in df2 (2772 in df1 and 40 in df2). Row names are
>identical in both dataframes (date). I want to replace NAs of df1 with
>the values of df2 for all those rows having identical row names (date),
>but without affecting already existing values in those rows of df1.
>
>Please see below:
>
>df1:
>date 11A 11A 21B 3CC 3CC
>20040101 100 150 NA NA 140
>20040115 200 NA 200 NA NA
>20040131 NA 165 180 190 190
>20040205 NA NA NA NA NA
>20040228 NA NA NA NA NA
>20040301 150 155 170 150 160
>20040315 NA NA 180 190 200
>20040331 NA NA NA 175 180
>
>df2:
>date 11A 11A 21B 3CC 3CC
>20040131 170 NA NA NA NA
>20040228 140 145 165 150 155
>20040331 NA 145 160 NA NA
>
>I want the resulting dataframe to be:
>
>df3:
>date 11A 11A 21B 3CC 3CC
>20040101 100 150 NA NA 140
>20040115 200 NA 200 NA NA
>20040131 170 165 180 190 190
>20040205 NA NA NA NA NA
>20040228 140 145 165 150 155
>20040301 150 155 170 150 160
>20040315 NA NA 180 190 200
>20040331 NA 145 160 175 180
>
>If it is possible, I would prefer to use "for loop" and "which"
>function to achieve the result.
>
>Please guide me in this regard.
>
>Thanks
>Saba
>
>
>R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
>https://stat.ethz.ch/mailman/listinfo/r-help
>PLEASE do read the posting guide
>http://www.R-project.org/posting-guide.html
>and provide commented, minimal, self-contained, reproducible code.
>
>
>
>