[R] Computing growth rate
Brijesh Mishra
brijeshkmishra at gmail.com
Thu Dec 15 13:35:48 CET 2016
This was ensured while using ddply()...
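
For reference, a minimal sketch of the group-wise version. It assumes the plyr package is installed and reuses the example data from the quoted thread below. The ifelse() guard is the one suggested in the thread, inlined into the ddply() call so that diff() never sees rows from two different co_code1 values.

```r
library(plyr)  # provides ddply()

# Rebuild the example data from the thread and drop the same two rows.
df1 <- data.frame(co_code1 = rep(c(1100, 1200, 1300), each = 7),
                  fyear1   = rep(1990:1996, 3),
                  sales1   = rep(seq(1000, 1600, by = 100), 3))
df1 <- df1[-c(5, 8), ]

# Sort by firm and year so that diff() compares neighbouring years.
df1 <- df1[order(df1$co_code1, df1$fyear1), ]

# Within each co_code1, compute growth (in percent) only where the
# previous row is exactly one year earlier; otherwise leave NA.
d <- ddply(df1, "co_code1", transform,
           growth = c(NA,
                      ifelse(diff(fyear1) == 1,
                             (exp(diff(log(sales1))) - 1) * 100,
                             NA)))
print(d)
```

With the example data, the 1995 row for co_code1 1100 should come out as NA rather than 15.38, because 1994 is missing for that firm.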
On Thu, Dec 15, 2016 at 6:04 PM, Brijesh Mishra
<brijeshkmishra at gmail.com> wrote:
> Dear Mr Hasselman,
>
> I missed your mail while I was typing my own reply to Mr. Barradas's
> suggestion. In fact, I implemented your suggestion even before
> reading it. But I have a concern that I have noted (though it is
> only hypothetical; such a scenario is very unlikely to occur). Is
> there a way to restrict such calculations co_code1-wise?
>
> Many thanks,
>
> Brijesh
>
> On Thu, Dec 15, 2016 at 5:48 PM, Berend Hasselman <bhh at xs4all.nl> wrote:
>>
>>> On 15 Dec 2016, at 04:40, Brijesh Mishra <brijeshkmishra at gmail.com> wrote:
>>>
>>> Hi,
>>>
>>> I am trying to calculate growth rate (say, sales, though it is to be
>>> computed for many variables) in a panel data set. Problem is that I
>>> have missing data for many firms for many years. To put it simply, I
>>> have created this short dataframe (original df is much bigger)
>>>
>>> df1<-data.frame(co_code1=rep(c(1100, 1200, 1300), each=7),
>>> fyear1=rep(1990:1996, 3), sales1=rep(seq(1000,1600, by=100),3))
>>>
>>> # this gives me
>>> co_code1 fyear1 sales1
>>> 1 1100 1990 1000
>>> 2 1100 1991 1100
>>> 3 1100 1992 1200
>>> 4 1100 1993 1300
>>> 5 1100 1994 1400
>>> 6 1100 1995 1500
>>> 7 1100 1996 1600
>>> 8 1200 1990 1000
>>> 9 1200 1991 1100
>>> 10 1200 1992 1200
>>> 11 1200 1993 1300
>>> 12 1200 1994 1400
>>> 13 1200 1995 1500
>>> 14 1200 1996 1600
>>> 15 1300 1990 1000
>>> 16 1300 1991 1100
>>> 17 1300 1992 1200
>>> 18 1300 1993 1300
>>> 19 1300 1994 1400
>>> 20 1300 1995 1500
>>> 21 1300 1996 1600
>>>
>>> # I am now removing a couple of rows
>>> df1<-df1[-c(5, 8), ]
>>> # the result is
>>> co_code1 fyear1 sales1
>>> 1 1100 1990 1000
>>> 2 1100 1991 1100
>>> 3 1100 1992 1200
>>> 4 1100 1993 1300
>>> 6 1100 1995 1500
>>> 7 1100 1996 1600
>>> 9 1200 1991 1100
>>> 10 1200 1992 1200
>>> 11 1200 1993 1300
>>> 12 1200 1994 1400
>>> 13 1200 1995 1500
>>> 14 1200 1996 1600
>>> 15 1300 1990 1000
>>> 16 1300 1991 1100
>>> 17 1300 1992 1200
>>> 18 1300 1993 1300
>>> 19 1300 1994 1400
>>> 20 1300 1995 1500
>>> 21 1300 1996 1600
>>> # so 1994 for co_code1 1100 and 1990 for co_code1 1200 have been
>>> removed. If I try,
>>> d<-ddply(df1,"co_code1",transform, growth=c(NA,exp(diff(log(sales1)))-1)*100)
>>>
>>> # this apparently gives a wrong result for the year 1995 of co_code1
>>> 1100 (as shown below), because the growth rate is computed against
>>> 1993 as if the two rows were one year apart.
>>>
>>> co_code1 fyear1 sales1 growth
>>> 1 1100 1990 1000 NA
>>> 2 1100 1991 1100 10.000000
>>> 3 1100 1992 1200 9.090909
>>> 4 1100 1993 1300 8.333333
>>> 5 1100 1995 1500 15.384615
>>> 6 1100 1996 1600 6.666667
>>> 7 1200 1991 1100 NA
>>> 8 1200 1992 1200 9.090909
>>> 9 1200 1993 1300 8.333333
>>> 10 1200 1994 1400 7.692308
>>> 11 1200 1995 1500 7.142857
>>> 12 1200 1996 1600 6.666667
>>> 13 1300 1990 1000 NA
>>> 14 1300 1991 1100 10.000000
>>> 15 1300 1992 1200 9.090909
>>> 16 1300 1993 1300 8.333333
>>> 17 1300 1994 1400 7.692308
>>> 18 1300 1995 1500 7.142857
>>> 19 1300 1996 1600 6.666667
>>> # I thought of applying the formula only when the increment of fyear1
>>> is exactly 1 within a co_code1, using this code
>>>
>>> d<-ddply(df1,
>>> "co_code1",
>>> transform,
>>> if(diff(fyear1)==1){
>>> growth=(exp(diff(log(df1$sales1)))-1)*100
>>> } else{
>>> growth=NA
>>> })
>>>
>>> But this doesn't work. I am getting the following warning (repeated
>>> a few times):
>>>
>>> In if (diff(fyear1) == 1) { :
>>> the condition has length > 1 and only the first element will be used
>>>
>>> # I have searched for a solution, but somehow couldn't get one. Hope
>>> that some kind soul will guide me here.
>>>
>>
>> In your case use ifelse() as explained by Rui.
>> But it can be done more easily since the fyear1 and co_code1 are synchronized.
>> Add a new column to df1 like this
>>
>> df1$growth <- c(NA,
>>                 ifelse(diff(df1$fyear1)==1,
>>                        (exp(diff(log(df1$sales1)))-1)*100,
>>                        NA
>>                       )
>>                )
>>
>> and display df1. From your request I cannot determine if this is what you want.
>>
>> regards,
>>
>> Berend Hasselman
>>
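
The hypothetical concern raised above (one firm's last year being immediately followed by the next firm's first year) can be illustrated in base R. This is only a sketch under stated assumptions: df2 and its values are made up for illustration, and the extra same_firm check is one possible way to keep the ungrouped calculation from crossing co_code1 boundaries.

```r
# Hypothetical data: firm 1100 ends in 1995 and firm 1200 starts in 1996,
# so the year difference across the firm boundary is also 1.
df2 <- data.frame(co_code1 = c(1100, 1100, 1200, 1200),
                  fyear1   = c(1994, 1995, 1996, 1997),
                  sales1   = c(1000, 1100, 2000, 2200))

# The approach relies on rows being sorted by firm and then by year.
df2 <- df2[order(df2$co_code1, df2$fyear1), ]

same_firm <- diff(df2$co_code1) == 0   # previous row belongs to the same firm
next_year <- diff(df2$fyear1) == 1     # previous row is the preceding year

# Growth in percent, NA whenever either condition fails.
df2$growth <- c(NA,
                ifelse(same_firm & next_year,
                       (exp(diff(log(df2$sales1))) - 1) * 100,
                       NA))
print(df2)
```

Without the same_firm condition, the first row of firm 1200 would receive a growth rate computed from the last row of firm 1100.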