[R] Computing growth rate
Berend Hasselman
bhh at xs4all.nl
Thu Dec 15 13:18:08 CET 2016
> On 15 Dec 2016, at 04:40, Brijesh Mishra <brijeshkmishra at gmail.com> wrote:
>
> Hi,
>
> I am trying to calculate growth rate (say, sales, though it is to be
> computed for many variables) in a panel data set. Problem is that I
> have missing data for many firms for many years. To put it simply, I
> have created this short dataframe (original df is much bigger)
>
> df1<-data.frame(co_code1=rep(c(1100, 1200, 1300), each=7),
> fyear1=rep(1990:1996, 3), sales1=rep(seq(1000,1600, by=100),3))
>
> # this gives me
> co_code1 fyear1 sales1
> 1 1100 1990 1000
> 2 1100 1991 1100
> 3 1100 1992 1200
> 4 1100 1993 1300
> 5 1100 1994 1400
> 6 1100 1995 1500
> 7 1100 1996 1600
> 8 1200 1990 1000
> 9 1200 1991 1100
> 10 1200 1992 1200
> 11 1200 1993 1300
> 12 1200 1994 1400
> 13 1200 1995 1500
> 14 1200 1996 1600
> 15 1300 1990 1000
> 16 1300 1991 1100
> 17 1300 1992 1200
> 18 1300 1993 1300
> 19 1300 1994 1400
> 20 1300 1995 1500
> 21 1300 1996 1600
>
> # I am now removing a couple of rows
> df1<-df1[-c(5, 8), ]
> # the result is
> co_code1 fyear1 sales1
> 1 1100 1990 1000
> 2 1100 1991 1100
> 3 1100 1992 1200
> 4 1100 1993 1300
> 6 1100 1995 1500
> 7 1100 1996 1600
> 9 1200 1991 1100
> 10 1200 1992 1200
> 11 1200 1993 1300
> 12 1200 1994 1400
> 13 1200 1995 1500
> 14 1200 1996 1600
> 15 1300 1990 1000
> 16 1300 1991 1100
> 17 1300 1992 1200
> 18 1300 1993 1300
> 19 1300 1994 1400
> 20 1300 1995 1500
> 21 1300 1996 1600
> # so 1994 for co_code1 1100 and 1990 for co_code1 1200 have been
> removed. If I try,
> d<-ddply(df1,"co_code1",transform, growth=c(NA,exp(diff(log(sales1)))-1)*100)
>
> # this apparently gives wrong results for the year 1995 (as shown
> below) as growth rates are computed considering yearly increment.
>
> co_code1 fyear1 sales1 growth
> 1 1100 1990 1000 NA
> 2 1100 1991 1100 10.000000
> 3 1100 1992 1200 9.090909
> 4 1100 1993 1300 8.333333
> 5 1100 1995 1500 15.384615
> 6 1100 1996 1600 6.666667
> 7 1200 1991 1100 NA
> 8 1200 1992 1200 9.090909
> 9 1200 1993 1300 8.333333
> 10 1200 1994 1400 7.692308
> 11 1200 1995 1500 7.142857
> 12 1200 1996 1600 6.666667
> 13 1300 1990 1000 NA
> 14 1300 1991 1100 10.000000
> 15 1300 1992 1200 9.090909
> 16 1300 1993 1300 8.333333
> 17 1300 1994 1400 7.692308
> 18 1300 1995 1500 7.142857
> 19 1300 1996 1600 6.666667
> # I thought of using the formula only when the increment of fyear1 is
> only 1 while in a co_code1, by using this formula
>
> d<-ddply(df1,
> "co_code1",
> transform,
> if(diff(fyear1)==1){
> growth=(exp(diff(log(df1$sales1)))-1)*100
> } else{
> growth=NA
> })
>
> But, this doesn't work. I am getting the following warning.
>
> In if (diff(fyear1) == 1) { :
> the condition has length > 1 and only the first element will be used
> (repeated a few times).
>
> # I have searched for a solution, but somehow couldn't get one. Hope
> that some kind soul will guide me here.
>
In your case use ifelse() as explained by Rui.
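Rui's reply is not quoted in this message, so the following is only a minimal sketch of how ifelse() could replace the if/else inside a per-firm computation. It uses base R's split() and lapply() instead of ddply(), and the helper name growth_by_firm is made up for illustration.

```r
# Rebuild the example data from the original post
df1 <- data.frame(co_code1 = rep(c(1100, 1200, 1300), each = 7),
                  fyear1   = rep(1990:1996, 3),
                  sales1   = rep(seq(1000, 1600, by = 100), 3))
df1 <- df1[-c(5, 8), ]

# Hypothetical helper: growth within one firm, NA where the year gap is not 1
growth_by_firm <- function(d) {
  d <- d[order(d$fyear1), ]
  # ifelse() is vectorised, so it tests every year gap, not just the first
  d$growth <- c(NA, ifelse(diff(d$fyear1) == 1,
                           (exp(diff(log(d$sales1))) - 1) * 100,
                           NA))
  d
}

# Apply the helper firm by firm and stack the pieces back together
d2 <- do.call(rbind, lapply(split(df1, df1$co_code1), growth_by_firm))
rownames(d2) <- NULL
d2
```

The if() in the original attempt only looks at the first element of diff(fyear1) == 1, which is why the warning appears; ifelse() evaluates the condition element by element.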
But it can be done more easily here, because the rows are ordered by co_code1 and fyear1 and the year difference across a firm boundary is never 1 in your data.
Add a new column to df1 like this:
df1$growth <- c(NA,
                ifelse(diff(df1$fyear1) == 1,
                       (exp(diff(log(df1$sales1))) - 1) * 100,
                       NA
                )
)
and display df1. From your request I cannot determine if this is what you want.
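As a hedged variation on the above: the single-column version relies on the year difference never being exactly 1 where one firm ends and the next begins. A sketch that also checks that the previous row belongs to the same firm, assuming the rows are sorted by co_code1 and then fyear1 (same_firm, next_year and pct_change are names introduced here, not part of the original code):

```r
# Rebuild the example data from the original post
df1 <- data.frame(co_code1 = rep(c(1100, 1200, 1300), each = 7),
                  fyear1   = rep(1990:1996, 3),
                  sales1   = rep(seq(1000, 1600, by = 100), 3))
df1 <- df1[-c(5, 8), ]

# Assumption: sort by firm, then by year, before differencing
df1 <- df1[order(df1$co_code1, df1$fyear1), ]

same_firm  <- diff(df1$co_code1) == 0   # previous row is the same firm
next_year  <- diff(df1$fyear1) == 1     # previous row is the preceding year
pct_change <- (exp(diff(log(df1$sales1))) - 1) * 100

# Keep the growth rate only when both conditions hold
df1$growth <- c(NA, ifelse(same_firm & next_year, pct_change, NA))
df1
```

With the example data this gives NA for the first year of each firm and for 1995 of co_code1 1100, where 1994 is missing.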
regards,
Berend Hasselman