[R] Computing growth rate

Berend Hasselman bhh at xs4all.nl
Thu Dec 15 13:18:08 CET 2016


> On 15 Dec 2016, at 04:40, Brijesh Mishra <brijeshkmishra at gmail.com> wrote:
> 
> Hi,
> 
> I am trying to calculate growth rate (say, sales, though it is to be
> computed for many variables) in a panel data set. Problem is that I
> have missing data for many firms for many years. To put it simply, I
> have created this short data frame (the original df is much bigger)
> 
> df1<-data.frame(co_code1=rep(c(1100, 1200, 1300), each=7),
> fyear1=rep(1990:1996, 3), sales1=rep(seq(1000,1600, by=100),3))
> 
> # this gives me
> co_code1 fyear1 sales1
> 1      1100   1990   1000
> 2      1100   1991   1100
> 3      1100   1992   1200
> 4      1100   1993   1300
> 5      1100   1994   1400
> 6      1100   1995   1500
> 7      1100   1996   1600
> 8      1200   1990   1000
> 9      1200   1991   1100
> 10     1200   1992   1200
> 11     1200   1993   1300
> 12     1200   1994   1400
> 13     1200   1995   1500
> 14     1200   1996   1600
> 15     1300   1990   1000
> 16     1300   1991   1100
> 17     1300   1992   1200
> 18     1300   1993   1300
> 19     1300   1994   1400
> 20     1300   1995   1500
> 21     1300   1996   1600
> 
> # I am now removing a couple of rows
> df1<-df1[-c(5, 8), ]
> # the result is
>   co_code1 fyear1 sales1
> 1      1100   1990   1000
> 2      1100   1991   1100
> 3      1100   1992   1200
> 4      1100   1993   1300
> 6      1100   1995   1500
> 7      1100   1996   1600
> 9      1200   1991   1100
> 10     1200   1992   1200
> 11     1200   1993   1300
> 12     1200   1994   1400
> 13     1200   1995   1500
> 14     1200   1996   1600
> 15     1300   1990   1000
> 16     1300   1991   1100
> 17     1300   1992   1200
> 18     1300   1993   1300
> 19     1300   1994   1400
> 20     1300   1995   1500
> 21     1300   1996   1600
> # so 1994 for co_code1 1100 and 1990 for co_code1 1200 have been
> removed. If I try,
> d<-ddply(df1,"co_code1",transform, growth=c(NA,exp(diff(log(sales1)))-1)*100)
> 
> # this apparently gives wrong results for the year 1995 (as shown
> below) as growth rates are computed considering yearly increment.
> 
>   co_code1 fyear1 sales1    growth
> 1      1100   1990   1000        NA
> 2      1100   1991   1100 10.000000
> 3      1100   1992   1200  9.090909
> 4      1100   1993   1300  8.333333
> 5      1100   1995   1500 15.384615
> 6      1100   1996   1600  6.666667
> 7      1200   1991   1100        NA
> 8      1200   1992   1200  9.090909
> 9      1200   1993   1300  8.333333
> 10     1200   1994   1400  7.692308
> 11     1200   1995   1500  7.142857
> 12     1200   1996   1600  6.666667
> 13     1300   1990   1000        NA
> 14     1300   1991   1100 10.000000
> 15     1300   1992   1200  9.090909
> 16     1300   1993   1300  8.333333
> 17     1300   1994   1400  7.692308
> 18     1300   1995   1500  7.142857
> 19     1300   1996   1600  6.666667
> # I thought of applying the formula only where fyear1 increases by
> exactly 1 within a co_code1, using this code
> 
> d<-ddply(df1,
>         "co_code1",
>         transform,
>         if(diff(fyear1)==1){
>           growth=(exp(diff(log(df1$sales1)))-1)*100
>         } else{
>           growth=NA
>         })
> 
> But, this doesn't work. I am getting the following error.
> 
> In if (diff(fyear1) == 1) { :
>  the condition has length > 1 and only the first element will be used
> (repeated a few times).
> 
> # I have searched for a solution, but somehow couldn't get one. Hope
> that some kind soul will guide me here.
> 

In your case, use ifelse() as explained by Rui: if() tests a single condition, whereas ifelse() works element by element.
But it can be done more easily, since fyear1 and co_code1 are synchronized.
Add a new column to df1 like this:

df1$growth <- c(NA,
                ifelse(diff(df1$fyear1) == 1,
                       (exp(diff(log(df1$sales1))) - 1) * 100,
                       NA))

and display df1. From your request I cannot determine if this is what you want.
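
The one-liner above takes diff() over the whole data frame, so it relies on the year sequence restarting at each new co_code1. In a larger panel that may not always hold (for example, if one firm's last year is immediately followed by another firm's first year that happens to be one year later). A minimal sketch that also checks the firm code is given below. It assumes the rows are sorted by co_code1 and then fyear1; the helper names same_firm and next_year are made up for this illustration.

```r
# Rebuild the example data from the original post
df1 <- data.frame(co_code1 = rep(c(1100, 1200, 1300), each = 7),
                  fyear1   = rep(1990:1996, 3),
                  sales1   = rep(seq(1000, 1600, by = 100), 3))
df1 <- df1[-c(5, 8), ]   # drop 1994 for firm 1100 and 1990 for firm 1200

# Assumption: rows are sorted by co_code1, then fyear1.
# A growth rate is only meaningful when the previous row belongs to
# the same firm and is exactly one year earlier.
same_firm <- diff(df1$co_code1) == 0
next_year <- diff(df1$fyear1) == 1

df1$growth <- c(NA,
                ifelse(same_firm & next_year,
                       (exp(diff(log(df1$sales1))) - 1) * 100,
                       NA))
print(df1)
```

On the example data this should leave growth as NA for the first row of each firm and for 1995 of co_code1 1100, where the preceding year is missing.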

regards,

Berend Hasselman


