[R] efficiency in merging two data frames

Gabor Grothendieck ggrothendieck at gmail.com
Mon May 1 13:47:58 CEST 2006


Some functions that may be of help:

?aggregate.ts
?cbind
?merge

and in the zoo package

?as.yearmon
?as.yearqtr
?aggregate.zoo
?merge.zoo
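
For instance, here is a rough sketch of how merge could replace both
loops, using base R only.  The toy data frames, their values, and the
name ret (standing in for return) are made up for illustration.  It
assumes each firm has a single FYR (the first one found is used, as in
the original loop), that a fiscal year is labelled by the calendar year
in which it ends, and that quarters are numbered 1 to 4.

```r
# Toy stand-ins for the three data frames in the question (values made up).
id <- data.frame(PERMNO = c(10001, 10002), GVKEY = c(1001, 1002))
account <- data.frame(
  GVKEY = c(1001, 1001, 1002),
  FYR   = c(12, 12, 6),              # fiscal year-end month
  fyy   = c(2005, 2005, 2006),
  fyq   = c(1, 2, 1),
  sales = c(10, 11, 20)              # hypothetical accounting variable
)
ret <- data.frame(                   # called `return` in the question
  PERMNO = c(10001, 10001, 10002),
  year   = c(2005, 2005, 2005),
  month  = c(2, 5, 8),
  r      = c(0.01, -0.02, 0.03)      # hypothetical monthly return
)

# First FYR seen for each firm, mirroring the [[1]] in the original loop.
fyr <- account[!duplicated(account$GVKEY), c("GVKEY", "FYR")]
id2 <- merge(id, fyr, by = "GVKEY")

# Attach GVKEY and FYR to every monthly row in one pass instead of a loop.
ret2 <- merge(ret, id2, by = "PERMNO")

# Vectorized fiscal year and quarter.
m <- (ret2$month - ret2$FYR - 1) %% 12           # months into fiscal year, 0..11
ret2$fyq <- m %/% 3 + 1                           # quarters 1..4
ret2$fyy <- ret2$year + (ret2$month > ret2$FYR)   # months after FYR fall in next fiscal year
ret2$FYR <- NULL                                  # avoid a duplicate FYR column below

returnnew <- merge(ret2, account, by = c("GVKEY", "fyy", "fyq"))
```

Each merge is a single vectorized join, so the per-row comparisons such
as id$PERMNO==return$PERMNO[[i]] go away, and those are likely what
makes the loop version slow.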

On 5/1/06, Guojun Zhu <shmilylemon at yahoo.com> wrote:
> I have two data sets of stock and fiscal data for many
> companies.  One is monthly with about 144,000 lines, and
> the other is quarterly with about 56,000.  Each data set
> uses a different company code.  I need to merge the two.
> I read both in as csv, along with a third file that maps
> the firm codes to each other.  Now I have three data
> frames: return (keyed by return$PERMNO), account (keyed
> by account$GVKEY), and id, which holds the correspondence
> and has both id$PERMNO and id$GVKEY.  I also need to
> convert the month in return into a quarter and finally
> merge the two data frames (return and account).  I ended
> up writing a short program for this, but it runs very
> slowly: 15+ minutes.  Is there a quicker way to do it?
> Here is my original code.
>
>
>
> id$fy=rep(0,length(id$PERMNO))
> for (i in 1:length(id$PERMNO))
>
> id$fy[[i]]<-account$FYR[id$GVKEY[[i]]==account$GVKEY][[1]]
>
> return$GVKEY=rep(0,length(return$PERMNO))
> return$fyy=rep(0,length(return$PERMNO))
> return$fyq=rep(0,length(return$PERMNO))
> for (i in 1:length(return$PERMNO)) {
>    temp<-id$PERMNO==return$PERMNO[[i]];
>    tempmon<-id$fy[temp][[1]];
>    if (return$month[[i]]<=tempmon) {
>        return$fyy[[i]]<-return$year[[i]];
>        return$fyq[[i]]<-4-(tempmon-return$month[[i]])%/%3;
>        }
>      else{
>        return$fyy[[i]]<-return$year[[i]]+1;
>        return$fyq[[i]]<-(return$month[[i]]-tempmon-1)%/%3+1;
>        }
>    return$GVKEY[[i]]<-id$GVKEY[temp][[1]];
> }
>
> returnnew=merge(return,account,by.x=c("GVKEY","fyy","fyq"),by.y=c("GVKEY","fyy","fyq"))
>
> ______________________________________________
> R-help at stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
>



