[R] Drop firms in unbalanced panel if not more than 5 observations in consecutive years for all variables
Gabor Grothendieck
ggrothendieck at gmail.com
Thu Jul 22 13:40:14 CEST 2010
On Thu, Jul 22, 2010 at 5:18 AM, Christian Schoder
<schoc152 at newschool.edu> wrote:
> Dear R-users,
>
> a few weeks ago I asked the list a similar question. However, my task
> has since changed slightly, but enough that I am lost again. So I would
> appreciate any help with the following issue.
>
> I use the plm package and work with firm-level data in a panel. I would
> like to eliminate all firms that do not fulfill the requirement of
> having an observation in every variable used for at least x consecutive
> years.
>
> For illustration of the problem assume the following data set
> > data
>    id year  y  z
> 1   a 2000  1  1
> 2   b 2000 NA  2
> 3   b 2001  3  3
> 4   c 1999  1  1
> 5   c 2000  2  2
> 6   c 2001  4 NA
> 7   c 2002  5  4
> 8   d 1998  6  5
> 9   d 1999  5 NA
> 10  d 2000  6  6
> 11  d 2001  7  7
> 12  d 2002  3  6
> where id is the index of the firm, year the index for the year, and y
> and z are variables. Now, I would like to get rid of all firms with,
> let's say, fewer than 3 consecutive years in which there are observations
> for every variable. Hence, the procedure should yield
> > data.reduced
>   id year y  z
> 1  d 1998 6  5
> 2  d 1999 5 NA
> 3  d 2000 6  6
> 4  d 2001 7  7
> 5  d 2002 3  6
>
Try this:
# DF is the data frame from the question (named 'data' there)
do.call(rbind, by(DF, DF$id, function(x)
  if (length(na.contiguous(x$y * x$z)) >= 3) x))
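
A possible variant, not part of the original reply, in case it helps to see the steps spelled out. It rebuilds the example data, marks complete rows with complete.cases(), and uses rle() to find the longest run of complete rows per firm. The helper name longest_complete_run and the variable min_years are made up for illustration. Like the one-liner, this sketch assumes the rows are sorted by year within each firm and that there are no gaps in the years, so that consecutive rows mean consecutive years.

```r
# Rebuild the example data from the question (rows sorted by id, then year).
DF <- data.frame(
  id   = c("a", "b", "b", "c", "c", "c", "c", "d", "d", "d", "d", "d"),
  year = c(2000, 2000, 2001, 1999, 2000, 2001, 2002,
           1998, 1999, 2000, 2001, 2002),
  y    = c(1, NA, 3, 1, 2, 4, 5, 6, 5, 6, 7, 3),
  z    = c(1, 2, 3, 1, 2, NA, 4, 5, NA, 6, 7, 6)
)

# Hypothetical helper: length of the longest run of consecutive rows
# in which every variable listed in 'vars' is observed.
longest_complete_run <- function(x, vars) {
  ok <- complete.cases(x[, vars, drop = FALSE])  # TRUE where no NA in vars
  r  <- rle(ok)                                  # runs of TRUE / FALSE
  if (any(r$values)) max(r$lengths[r$values]) else 0L
}

min_years <- 3  # required number of consecutive complete years (assumed)

# One value per firm: does its longest complete run reach min_years?
keep <- sapply(split(DF, DF$id), longest_complete_run,
               vars = c("y", "z")) >= min_years

# Keep every row (including incomplete ones) of the qualifying firms.
data.reduced <- DF[DF$id %in% names(keep)[keep], ]
print(data.reduced)
```

Two differences from the one-liner: the variables are listed by name, so adding more columns does not mean multiplying them together, and a firm with no complete row simply gets a run length of 0. The original row names (8 to 12) are kept in the result; rownames(data.reduced) <- NULL would renumber them.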