[R] merge data
Chuck White
chuckwhite8 at charter.net
Wed Nov 11 06:18:29 CET 2009
David -- thank you for your response.
merge does work but it creates another dataframe. df1 is very large and I did not want another copy created. What I ended up doing is:
df1 <- merge(df1, df2, by="week")
In terms of memory allocation, will memory for two dataframes be allocated or will the additional column be added to df1?
Thanks.
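
On the memory question, a minimal sketch using made-up toy data (the column names week, value and index stand in for the real ones, which the thread does not show). As far as I understand it, merge() builds a complete new data frame, and the old df1 is only released once the name has been rebound and the garbage collector has run, so both objects coexist for a while. Assigning a single column via match() avoids building that merged copy, although how much R copies internally during the assignment depends on the R version.

```r
# Toy stand-ins for the data described in the thread (values are invented).
df1 <- data.frame(week = c(3, 1, 2, 3, 1), value = c(10, 20, 30, 40, 50))
df2 <- data.frame(week = 1:3, index = c("a", "b", "c"))

# Look up each df1$week in df2$week and pull the corresponding index.
# No merged copy of df1 is built, and the row order of df1 is preserved.
df1$index <- df2$index[match(df1$week, df2$week)]

# For comparison: merge() returns a new data frame, sorted by the key by default.
merged <- merge(df1[c("week", "value")], df2, by = "week")
```

Weeks in df1 that have no entry in df2 would get NA here, whereas merge() with its default all=FALSE would drop those rows.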
---- David Winsemius <dwinsemius at comcast.net> wrote:
>
> On Nov 10, 2009, at 12:36 PM, Chuck White wrote:
>
> > df1 -- dataframe with column date and several other columns.
> > #rows >40k. Several of the dates are repeated.
> > df2 -- dataframe with two columns date and index. #rows ~130.
> > This is really a map from date to index.
> >
> > I would like to create a column called index in df1 which has the
> > corresponding index from df2.
> >
> > The following works:
> > index <- NULL
> > for(wk in df1$week){
> > index <- c(index,df2$index[df2$week==wk])
> > }
> > and then add index to df1.
> >
> > Can you please suggest a better way of doing this? I didn't think
> > merge was suitable for this...is it? THANKS.
>
> I think merge should work, but if you really have looked at the
> various arguments, tested reasonable examples and are still convinced
> it wouldn't, then see what you get with:
>
> > df1 <- data.frame(dt = Sys.Date() - sample(100:120, 30, replace=TRUE), 1:30)
> > df2 <- data.frame(dt2 = Sys.Date() -100:120, index=LETTERS[1:21])
>
> > df1$index <- df2[ match(df1$dt,df2$dt2), "index"]
> > df1
>            dt X1.30 index
> 1  2009-07-30     1     D
> 2  2009-07-16     2     R
> 3  2009-07-23     3     K
> 4  2009-07-29     4     E
> 5  2009-07-15     5     S
> 6  2009-08-02     6     A
> 7  2009-07-18     7     P
> 8  2009-07-21     8     M
> 9  2009-07-27     9     G
> 10 2009-07-26    10     H
> 11 2009-07-31    11     C
> 12 2009-07-26    12     H
> 13 2009-07-18    13     P
> 14 2009-07-23    14     K
> 15 2009-07-21    15     M
> 16 2009-07-19    16     O
> 17 2009-07-14    17     T
> 18 2009-07-16    18     R
> 19 2009-07-15    19     S
> 20 2009-07-13    20     U
> 21 2009-07-28    21     F
> 22 2009-07-20    22     N
> 23 2009-07-24    23     J
> 24 2009-07-20    24     N
> 25 2009-07-16    25     R
> 26 2009-07-30    26     D
> 27 2009-07-14    27     T
> 28 2009-08-02    28     A
> 29 2009-07-19    29     O
> 30 2009-07-26    30     H
>
> I tried merge(df1, df2, by.x=1, by.y=1) and got the same result modulo
> the order of the output.
>
>
> --
>
> David Winsemius, MD
> Heritage Laboratories
> West Hartford, CT
>