[R-SIG-Finance] panel data in R

Richard Herron richard.c.herron at gmail.com
Sat May 5 21:08:12 CEST 2012


What kind of models are you estimating? I would use plm if I were
estimating models with firm fixed effects (FE), but I rarely see firm
FE with daily observations. I usually see firm FE at the annual
level.
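
For reference, a minimal sketch of a firm FE model with plm on annual
data. It uses the Grunfeld data set that ships with plm (10 firms, 20
years), not the poster's data, and the regression itself is only an
illustration. The dummy-variable regression is included just to show
that the "within" slopes are the same thing.

```r
# Sketch only: firm fixed effects on annual data with plm.
library(plm)

# Grunfeld ships with plm: 10 firms, 20 annual observations each.
data("Grunfeld", package = "plm")

# model = "within" demeans each variable by firm, i.e. firm FE.
fe <- plm(inv ~ value + capital,
          data  = Grunfeld,
          index = c("firm", "year"),
          model = "within")

# The same slopes from least squares with one dummy per firm (LSDV).
lsdv <- lm(inv ~ value + capital + factor(firm), data = Grunfeld)

coef(fe)
coef(lsdv)[c("value", "capital")]
```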

If you're either estimating time series models or aggregating daily
observations to the monthly level for cross-sectional models, then a
list of firm-level time series would be best (or, if you're only using
the return series, you could put them all in one wide xts or zoo object).
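
A rough sketch of that layout, assuming a long data frame with
made-up column names (date, firmid, ret) and simulated returns. The
AR(1) fit and the compounding rule are placeholders for whatever
model and aggregation are actually needed.

```r
library(xts)  # also attaches zoo

set.seed(1)
dates <- seq(as.Date("2012-01-02"), as.Date("2012-03-30"), by = "day")
dates <- dates[!as.POSIXlt(dates)$wday %in% c(0, 6)]  # drop weekends

# Hypothetical long panel: one row per (date, firmid).
panel <- data.frame(
  date   = rep(dates, times = 3),
  firmid = rep(c("A", "B", "C"), each = length(dates)),
  ret    = rnorm(3 * length(dates), mean = 0, sd = 0.01)
)

# One xts object per firm, stored in a named list.
by_firm <- lapply(split(panel, panel$firmid),
                  function(d) xts(d$ret, order.by = d$date))

# Time series model per firm: an AR(1) on each return series.
ar1 <- lapply(by_firm,
              function(x) arima(as.numeric(coredata(x)), order = c(1, 0, 0)))

# Daily to monthly: compound the daily returns within each month.
monthly <- lapply(by_firm, function(x)
  apply.monthly(x, function(r) prod(1 + r) - 1))

# Alternative container: one wide object, one column per firm.
wide <- do.call(merge, by_firm)
colnames(wide) <- names(by_firm)
```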

Re: missing data. zoo provides na.locf (it also works on xts objects)
for carrying forward the last non-missing observation. I tend to leave
missing observations as missing.
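
A small illustration of the two choices, using made-up prices. It
only shows what each option does to the series, not which one is
right for a given model.

```r
library(xts)  # attaches zoo, which defines the na.locf generic

# Made-up daily prices with gaps (NA = no observation that day).
px <- xts(c(100, NA, NA, 101, NA, 102),
          order.by = as.Date("2012-05-01") + 0:5)

# Option 1: carry the last non-missing price forward.
filled <- na.locf(px)

# Option 2: leave the gaps alone and drop them only when computing.
observed <- na.omit(px)

# Filling creates zero returns on the gap days; dropping does not.
ret_filled   <- diff(log(filled))
ret_observed <- diff(log(observed))
```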

Could you provide an example of what you would like to estimate?

Richard Herron


On Sat, May 5, 2012 at 11:30 AM, Alexander Chernyakov
<alexander.chernyakov at gmail.com> wrote:
> Hi Richard,
> Thanks for your response.  One issue I have run into with plm is that
> it seems to be fairly slow with large data sets (14 million (date,
> firm) points).  Any tricks for this? Also, it does not seem to handle
> irregularly spaced time points: it fills in the missing ones with NA,
> so lagging and differencing don't work correctly.  Do
> you have any advice on fixing this?
>
> Thanks,
> Alex
>
> On Sat, May 5, 2012 at 8:43 AM, Richard Herron
> <richard.c.herron at gmail.com> wrote:
>> What kind of models do you plan on using?
>>
>> If you plan on using time series models, then I suggest generating a
>> list where each entry is one firm. This will make it easy to fit
>> models with lapply.
>>
>> If you plan on using panel models, then I suggest using plm. It is
>> easy enough to manually code within and between estimators, but if you
>> use clustered standard errors or dynamic panel models, then plm will
>> make your life a lot easier.
>>
>> Richard Herron
>>
>>
>> On Fri, May 4, 2012 at 6:30 PM, Alexander Chernyakov
>> <alexander.chernyakov at gmail.com> wrote:
>>>
>>> Hi,
>>> This question is of a general nature: How do people handle panel data
>>> in R?  For example, I have returns of firms and each firm has daily
>>> observations.  One way is to use the plm package; another is to use
>>> plyr and just do the operations on (date, firmid) units, using
>>> something like zoo as a container for each firm so that lagging and
>>> differencing can be done.  For regression it seems that plm might be
>>> the better option?  Just curious if somebody has a well worked out
>>> system for this.
>>>
>>> Thanks
>>> Alex
>>>
>>> _______________________________________________
>>> R-SIG-Finance at r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-sig-finance
>>> -- Subscriber-posting only. If you want to post, subscribe first.
>>> -- Also note that this is not the r-help list where general R questions should go.


