[R] Computing growth rate
Brijesh Mishra
brijeshkmishra at gmail.com
Thu Dec 15 13:34:39 CET 2016
Dear Mr Hasselman,
I missed your mail while I was typing my own reply to Mr. Barradas's
suggestion. In fact, I implemented your suggestion even before reading
it. But I have a concern that I have noted (though it is only
hypothetical; such a scenario is very unlikely to occur). Is there a
way to restrict such calculations to within each co_code1?
Many thanks,
Brijesh
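
A minimal base-R sketch of one way to restrict the calculation to each
co_code1, offered as an illustration rather than as the method proposed
in this thread. It rebuilds the example data from the quoted message and
computes growth only when the previous row belongs to the same firm and
is exactly one year earlier. The helper names prev_firm, prev_year,
prev_sales and ok are assumptions introduced here, and sorting by firm
and year is assumed to be acceptable.

```r
# Rebuild the example data from the quoted message
df1 <- data.frame(co_code1 = rep(c(1100, 1200, 1300), each = 7),
                  fyear1   = rep(1990:1996, 3),
                  sales1   = rep(seq(1000, 1600, by = 100), 3))
df1 <- df1[-c(5, 8), ]

# Sort so that each firm's rows are adjacent and in year order
df1 <- df1[order(df1$co_code1, df1$fyear1), ]

# Values from the previous row (NA for the very first row)
n          <- nrow(df1)
prev_firm  <- c(NA, df1$co_code1[-n])
prev_year  <- c(NA, df1$fyear1[-n])
prev_sales <- c(NA, df1$sales1[-n])

# TRUE only if the previous row is the same firm, one year earlier
ok <- !is.na(prev_firm) &
      prev_firm == df1$co_code1 &
      (df1$fyear1 - prev_year) == 1

# Percentage growth where allowed, NA otherwise
df1$growth <- ifelse(ok, (df1$sales1 / prev_sales - 1) * 100, NA)
df1
```

Because the firm is compared explicitly, a firm whose last year is
immediately followed by the next firm's first year (the hypothetical
case) still gets NA on the first row of the next firm.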
On Thu, Dec 15, 2016 at 5:48 PM, Berend Hasselman <bhh at xs4all.nl> wrote:
>
>> On 15 Dec 2016, at 04:40, Brijesh Mishra <brijeshkmishra at gmail.com> wrote:
>>
>> Hi,
>>
>> I am trying to calculate growth rate (say, sales, though it is to be
>> computed for many variables) in a panel data set. Problem is that I
>> have missing data for many firms for many years. To put it simply, I
>> have created this short data frame (the original df is much bigger)
>>
>> df1<-data.frame(co_code1=rep(c(1100, 1200, 1300), each=7),
>> fyear1=rep(1990:1996, 3), sales1=rep(seq(1000,1600, by=100),3))
>>
>> # this gives me
>> co_code1 fyear1 sales1
>> 1 1100 1990 1000
>> 2 1100 1991 1100
>> 3 1100 1992 1200
>> 4 1100 1993 1300
>> 5 1100 1994 1400
>> 6 1100 1995 1500
>> 7 1100 1996 1600
>> 8 1200 1990 1000
>> 9 1200 1991 1100
>> 10 1200 1992 1200
>> 11 1200 1993 1300
>> 12 1200 1994 1400
>> 13 1200 1995 1500
>> 14 1200 1996 1600
>> 15 1300 1990 1000
>> 16 1300 1991 1100
>> 17 1300 1992 1200
>> 18 1300 1993 1300
>> 19 1300 1994 1400
>> 20 1300 1995 1500
>> 21 1300 1996 1600
>>
>> # I am now removing a couple of rows
>> df1<-df1[-c(5, 8), ]
>> # the result is
>> co_code1 fyear1 sales1
>> 1 1100 1990 1000
>> 2 1100 1991 1100
>> 3 1100 1992 1200
>> 4 1100 1993 1300
>> 6 1100 1995 1500
>> 7 1100 1996 1600
>> 9 1200 1991 1100
>> 10 1200 1992 1200
>> 11 1200 1993 1300
>> 12 1200 1994 1400
>> 13 1200 1995 1500
>> 14 1200 1996 1600
>> 15 1300 1990 1000
>> 16 1300 1991 1100
>> 17 1300 1992 1200
>> 18 1300 1993 1300
>> 19 1300 1994 1400
>> 20 1300 1995 1500
>> 21 1300 1996 1600
>> # so 1994 for co_code1 1100 and 1990 for co_code1 1200 have been
>> # removed. If I try,
>> library(plyr)  # provides ddply()
>> d<-ddply(df1,"co_code1",transform, growth=c(NA,exp(diff(log(sales1)))-1)*100)
>>
>> # this gives a wrong result for 1995 in co_code1 1100 (as shown
>> # below), because growth is computed as if consecutive rows were
>> # always one year apart.
>>
>> co_code1 fyear1 sales1 growth
>> 1 1100 1990 1000 NA
>> 2 1100 1991 1100 10.000000
>> 3 1100 1992 1200 9.090909
>> 4 1100 1993 1300 8.333333
>> 5 1100 1995 1500 15.384615
>> 6 1100 1996 1600 6.666667
>> 7 1200 1991 1100 NA
>> 8 1200 1992 1200 9.090909
>> 9 1200 1993 1300 8.333333
>> 10 1200 1994 1400 7.692308
>> 11 1200 1995 1500 7.142857
>> 12 1200 1996 1600 6.666667
>> 13 1300 1990 1000 NA
>> 14 1300 1991 1100 10.000000
>> 15 1300 1992 1200 9.090909
>> 16 1300 1993 1300 8.333333
>> 17 1300 1994 1400 7.692308
>> 18 1300 1995 1500 7.142857
>> 19 1300 1996 1600 6.666667
>> # I thought of applying the formula only when fyear1 increases by
>> # exactly 1 within a co_code1, using this code
>>
>> d<-ddply(df1,
>> "co_code1",
>> transform,
>> if(diff(fyear1)==1){
>> growth=(exp(diff(log(df1$sales1)))-1)*100
>> } else{
>> growth=NA
>> })
>>
>> But this doesn't work. I am getting the following warning (repeated
>> a few times):
>>
>> In if (diff(fyear1) == 1) { :
>> the condition has length > 1 and only the first element will be used
>>
>> # I have searched for a solution, but somehow couldn't get one. Hope
>> that some kind soul will guide me here.
>>
>
> In your case use ifelse() as explained by Rui.
> But it can be done more easily since the fyear1 and co_code1 are synchronized.
> Add a new column to df1 like this
>
> df1$growth <- c(NA,
>                 ifelse(diff(df1$fyear1)==1,
>                        (exp(diff(log(df1$sales1)))-1)*100,
>                        NA
>                 )
>               )
>
> and display df1. From your request I cannot determine if this is what you want.
>
> regards,
>
> Berend Hasselman
>
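
For readers wondering why if() produced the message quoted above while
ifelse() works: if() expects a single TRUE or FALSE, whereas
diff(fyear1) == 1 yields one logical value per pair of consecutive rows.
ifelse() evaluates the test element by element. A small sketch using
the years and sales of co_code1 1100 from the example (the variable
names gap and growth are chosen here for illustration):

```r
# Years and sales for co_code1 1100 after removing 1994
fyear1 <- c(1990, 1991, 1992, 1993, 1995, 1996)
sales1 <- c(1000, 1100, 1200, 1300, 1500, 1600)

# One logical value per consecutive pair of rows (length 5, not 1)
gap <- diff(fyear1) == 1

# if (gap) { ... } would only look at gap[1] and warn in older R;
# since R 4.2.0 a condition of length > 1 is an error.

# ifelse() is vectorised: growth where the gap is one year, NA otherwise
growth <- c(NA, ifelse(gap, (exp(diff(log(sales1))) - 1) * 100, NA))
growth
```

The leading NA pads the result to the same length as the input, since
the first row has no previous year to compare against.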