[R] Problems with Panel Data estimation
JBrettas
jcosta at marketdata.com.br
Wed Jan 18 14:14:34 CET 2012
Hi everybody,
I have some questions about panel data estimation and would really appreciate
some help. Please ask me if anything isn't clear.
I have a database with this structure (panel data structure):
> head(dados_2)
  Tempo Safra   Data   Resposta Perc_Resg_Acum Alta_Temporada Flexi Promo
1     1     1 200701 0.04223216              0              1     0     0
2     1     2 200702 0.02801536              0             -1     0     0
3     1     3 200703 0.02786171              0              0     0     0
4     1     4 200704 0.02913633              0              0     0     0
5     1     5 200705 0.03953217              0              0     0     0
6     1     6 200706 0.05084010              0              0     0     0
  Promo_Ponto_Frio Parceiros
1                0         0
2                0         0
3                0         0
4                0         0
5                0         0
6                0         0
where I have 25 levels of "Tempo" and 34 for "Safra".
I want to obtain the confidence intervals of the regression coefficients,
and also forecast the "Resposta" variable with prediction intervals.
But I have run into some problems:
- When "Tempo" = 1 (the time index), the variable "Perc_Resg_Acum" is 0.
- I have several databases of the same kind (panel data structure), and in some
of them the variable "Promo" is zero in the entire column.
I'm fitting the model with the functions pvcm() and lmList() (which are
equivalent), but instead of giving 0 as the coefficient for the variable
"Promo", the function drops the entire column from the model and then computes
the estimates. How can I keep the columns of zeros in the regression model and
get a zero coefficient instead of NA?
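A minimal sketch of the behaviour I mean, using plain lm() on made-up data. The data frame `toy` and its values are hypothetical and only borrow two column names from my real database; I am assuming lmList() behaves the same way as lm() within each group.

```r
# Hypothetical toy data, not the real database: Promo is zero in every row.
set.seed(1)
toy <- data.frame(
  Resposta = rnorm(10),
  Flexi    = rep(c(0, 1), 5),
  Promo    = 0
)

# Promo is constant, hence collinear with the intercept,
# so lm() cannot estimate a separate coefficient for it.
fit <- lm(Resposta ~ Flexi + Promo, data = toy)
coefs <- coef(fit)  # the Promo entry comes back as NA rather than 0
print(coefs)
```

The intercept and Flexi are estimated as usual, while Promo is reported as NA.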
To illustrate, here is part of my code:
model_within1 <- pvcm(Resposta ~ Perc_Resg_Acum + Alta_Temporada + Flexi + Promo +
                        Promo_Ponto_Frio + Parceiros,
                      data = dados_2, model = "within")
model_within2 <- lmList(Resposta ~ Perc_Resg_Acum + Alta_Temporada + Flexi + Promo +
                          Promo_Ponto_Frio + Parceiros | Tempo,
                        data = dados_2)
When I run the first model, I get this:
> model_within1 <- pvcm(Resposta ~ Perc_Resg_Acum + Alta_Temporada + Flexi + Promo +
+                         Promo_Ponto_Frio + Parceiros,
+                       data = dados_2, model = "within")
serie Promo_Ponto_Frio is constant and has been removed
Error in eval(expr, envir, enclos) :
  object 'Promo_Ponto_Frio' not found
(I would rather not remove the constant column from the model just to get it to run.)
With the lmList() function I get no error message, but the output is:
> model_within2
Call:
  Model: Resposta ~ Perc_Resg_Acum + Alta_Temporada + Flexi + Promo +
         Promo_Ponto_Frio + Parceiros | Tempo
   Data: dados_2

Coefficients:
   (Intercept) Perc_Resg_Acum Alta_Temporada         Flexi         Promo
 1  0.05575606             NA   0.0094899066            NA            NA
 2  0.02767265    0.602756910   0.0097098374            NA            NA
 3  0.01493001    0.216359571   0.0072083199            NA            NA
 4  0.01702644    0.130147260   0.0073664874            NA            NA
 5  0.02199162    0.077860221   0.0072053502            NA            NA
 6  0.02574624    0.049635548   0.0062181048            NA            NA
 7  0.02672193    0.035288194   0.0064811866            NA            NA
 8  0.03620546    0.001001478   0.0056185695  0.0117215422            NA
 9  0.03834693   -0.007266645   0.0060674586  0.0107572932            NA
10  0.03851210   -0.011103720   0.0059166792  0.0099111959            NA
11  0.03877860   -0.011788541   0.0052854680  0.0085353595            NA
12  0.04213484   -0.017576921   0.0049738941  0.0084101158            NA
13  0.04217531   -0.017615294   0.0057095338  0.0094572949            NA
14  0.04591170   -0.027457745   0.0052304802  0.0091305518            NA
15  0.05575347   -0.047174244   0.0043227892  0.0070854218            NA
16  0.06751835   -0.068053502   0.0041113756  0.0035637901            NA
17  0.06743575   -0.066419074   0.0035628714  0.0027136668            NA
18  0.08494492   -0.092279778   0.0027045102  0.0025033917  0.0056991774
19  0.10540605   -0.122592396   0.0034576115  0.0007773916  0.0043499539
20  0.09374987   -0.102578612   0.0022937536  0.0003246327 -0.0033912104
21  0.09477620   -0.103937511   0.0018046064 -0.0019254150  0.0038385416
22  0.07984309   -0.081880920   0.0031004004  0.0012212949 -0.0001274436
23  0.04209354   -0.027693308   0.0033759713  0.0012759561 -0.0005342256
24  0.02248439    0.001793878   0.0019126335  0.0016330109 -0.0012942994
25 -0.04124712    0.093798787  -0.0009151255  0.0026952764  0.0002002742
   Promo_Ponto_Frio     Parceiros
 1               NA  0.0085825438
 2               NA -0.0040152859
 3               NA -0.0015317053
 4               NA -0.0016866579
 5               NA -0.0014183949
 6               NA -0.0016753846
 7               NA -0.0012411159
 8               NA -0.0016690746
 9               NA -0.0018987163
10               NA -0.0016922052
11               NA -0.0017404386
12               NA -0.0017259225
13               NA -0.0014849246
14               NA -0.0014719829
15               NA -0.0016265977
16               NA -0.0015527121
17               NA -0.0014492467
18               NA -0.0016308425
19               NA -0.0013443498
20               NA -0.0012088912
21               NA -0.0006974880
22               NA -0.0006981946
23               NA -0.0006599528
24               NA -0.0004592202
25               NA -0.0020974059
Because of these NAs, when I run summary(model_within2) I only get estimates,
standard errors and t statistics for 3 of the variables. Is there a way to
solve this problem, so that the constant columns are also kept in my model?
Any help would be much appreciated!