[R] Constrained Poisson model / Bayesian Poisson model

David Winsemius dwinsemius at comcast.net
Sat Jan 23 18:40:33 CET 2016


> On Jan 23, 2016, at 5:24 AM, mara.pfleiderer at uni-ulm.de wrote:
> 
> Hi David,
> 
> I'm sorry. I'm not familiar with posting problems on helppages.
> As the data I deal with is confidential I can't provide all details,
> but I will try to be as precise as possible about my problem:
> 
> As I said I'm working on a Poisson regression model with a linear
> predictor and the identity as link function.

Then that is not what most people would call a "Poisson regression model".

In particular: I have data which consists of observations of two random variables X and Y over the period of about 3 months. I have clustered this data into hours, so I received 2088 observations of those variables and each observation represents the values of the variables in one hour. Now I assume Y to be Poisson distributed (Y_i~Poi(lambda_i)) and that the values of Y come from two different impacts, so I split Y_i into two Poisson-distributed variables Y_i1 and Y_i2. I assume that the values of X have a long-term effect (of at least one day) on the part Y_i1 and I have estimators for the parameters lambda_i2, but I have no exact values of the variables of Y_i1 and Y_i2 but only of Y_i as a whole.

> So my model looks as follows:
> fml=Y_i1 ~ b_1*X_i+....+b_n*X_(i-n+1) - lambda_i2 -1 with n>=24
> That means that the last 24 (or more) values of X influence the value of Y_i1 in an additive way and I have no intercept(b_0=0).
> Now I made a matrix whose rows represent Y_i, all of its 24 (or more) regressors for each observation of Y_i and the corresponding estimator lambda_i2.
> Then I used glm(fml, family=poisson(link="identity"), data=matrix) and tried it for different values of n (=24,48,36,...).

I worry that might not construct the model you described above. The use of `poisson` constructs Poisson distributed _errors_ with an additive link, and to my understanding does _not_ assume Poisson distributed Y values.


> But always some of the coefficients received negative values which doesnt't make sense in interpretation. (The values of X represent certain events which can only have a positive or none effect on the value of Y.)
> 
> Now I want to use the constraint b_i >=0 in my model, but I don't know how I can do this.
> 
> I also thought of analyzing this model in a Bayesian way, but yet I haven't found a Bayesian version of glm() for the Poisson distribution such that I can specify the prior for the b_i on my own. (Then I could include the positivity in the prior.) Do you have a hint for me?

The R blogoshere has been heralding the arrival of the `rstanarm` package which provides a glm-like interface to an MCMC (Bayesian) estimation engine. Version 2.90 is on CRAN. The list of authors in the description file is impressive: 

Authors at R: c(person("Jonah", "Gabry", email = "jsg2201 at columbia.edu", role = "aut"),
             person("Trustees of", "Columbia University", role = "cph"),
             person("R Core", "Deveopment Team", role = "cph", 
                     comment = "R/pp_data.R, R/stan_aov.R"),
             person("Douglas", "Bates", role = "cph", comment = "R/pp_data.R"),
             person("Martin", "Maechler", role = "cph", comment = "R/pp_data.R"),
             person("Ben", "Bolker", role = "cph", comment = "R/pp_data.R"),
             person("Steve", "Walker", role = "cph", comment = "R/pp_data.R"),
             person("Brian", "Ripley", role = "cph", 
                    comment = "R/stan_aov.R, R/stan_polr.R"),
             person("William", "Venables", role = "cph", comment = "R/stan_polr.R"),
             person("Ben", "Goodrich", email = "benjamin.goodrich at columbia.edu", 
                    role = c("cre", "aut")))

> 
> I'm sorry if I went too much into detail now...
> I hope you understand my point now and have some answers for me!
> 
> Best,
> Mara
> 
> 
> Zitat von David Winsemius <dwinsemius at comcast.net>:
> 
>> 
>>> On Jan 22, 2016, at 7:01 AM, mara.pfleiderer at uni-ulm.de wrote:
>>> 
>>> Hi all,
>>> 
>>> I am dealing with a problem about my linear Poisson regression   model (link function=identity).
>>> 
>>> I am using the glm()-function which results in negative   coefficients, but a negative influence of the regressors wouldn't   make sense.
>> 
>> Negative coefficients merely indicate a lower relative rate. You   need to be more specific about the exactly data and model output   before you can raise our concern to a level where further comment   can be made.
>> 
>> 
>>> 
>>> (i) Is there a possibility to set constraints on the regression   parameters in glm() such that all coefficients are positive? Or is   there another function in R for which this is possible?
>>> 
>>> (ii) Is there a Bayesian version of the glm()-function where I can   specify the prior distribution for my regression parameters? (e.g.   a Dirichlet prior s.t. the parameters are positive)
>>> 
>>> All this with respect to the linear Poisson model...
>> 
>> As I implied above, the word "linear" means something different than   "additive" when the link is log().
>> 
>> --
>> 
>> David Winsemius
>> Alameda, CA, USA
>> 

David Winsemius
Alameda, CA, USA



More information about the R-help mailing list