[R-pkg-devel] Formula modeling

pik@pp@@devei m@iii@g oii gm@ii@com
Thu Oct 7 23:20:20 CEST 2021


Dear R-package-devel subscribers,

 

My question concerns a package design issue relating to the usage of
formulas.

 

I am interested in describing via formulas systems of the form:

 

d = p + x + y 

s = p + w + y

p = z + y

q = min(d,s).

 

The context in which I am working is that of market models with, primarily,
panel data. In the above system, one may think of the first equation as
demand, the second as supply, and the third as an equation (co-)determining
prices. The fourth equation is implicitly used by the estimation method, and
it does not need to be specified when programming the R formula. If you need
more information about the system, you may check the package diseq.

Currently, I am using constructors to build market model objects. In a
constructor call, I pass [i] the right-hand sides of the first three
equations as strings, [ii] an argument indicating whether the equations of
the system have correlated shocks, [iii] the identifiers of the dataset
(one for the subjects of the panel and one for time), and [iv] the quantity
(q) and price (p) variables. These four arguments contain all the necessary
information for constructing a model.
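
To make the current approach concrete, here is a minimal sketch of what such a
constructor-style call could look like. The function name market_model_spec
and all of its argument names are hypothetical and chosen only for
illustration; this is not the actual diseq interface.

```r
# Hypothetical constructor; names are illustrative, not the diseq API.
market_model_spec <- function(demand, supply, price_eq, correlated_shocks,
                              subject, time, quantity, price) {
  # Turn each right-hand-side string into a one-sided formula.
  to_rhs <- function(s) stats::as.formula(paste("~", s))
  list(
    demand = to_rhs(demand),
    supply = to_rhs(supply),
    price_eq = to_rhs(price_eq),
    correlated_shocks = correlated_shocks,
    subject = subject, time = time,
    quantity = quantity, price = price
  )
}

spec <- market_model_spec(
  demand = "p + x + y", supply = "p + w + y", price_eq = "z + y",
  correlated_shocks = TRUE,
  subject = "subject", time = "time", quantity = "q", price = "p"
)

# Variables appearing on the right-hand side of the demand equation.
all.vars(spec$demand)
```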

 

I would now like to re-implement model construction using formulas, which
would be more familiar to most R users. I am currently considering passing
all of the above information in a single formula of the form:

 

q | p | subject | time | rho ~ p + x + y | p + w + y | z + y 

 

where subject and time are the identifiers, and rho indicates whether
correlated or independent shocks should be used.
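
As a rough sketch of how such a formula could be taken apart, the base R
snippet below splits both sides on the `|` operator. It relies on `|` binding
more tightly than `~` and less tightly than `+` in R's parser. The helper
split_bars is a hypothetical name used only for illustration, and the sketch
assumes that no part contains a nested or parenthesised `|`.

```r
# Hypothetical helper: split an expression on its top-level `|` operators.
# `|` is left-associative, so the left operand is split recursively.
split_bars <- function(expr) {
  if (is.call(expr) && identical(expr[[1]], as.name("|"))) {
    c(split_bars(expr[[2]]), list(expr[[3]]))
  } else {
    list(expr)
  }
}

f <- q | p | subject | time | rho ~ p + x + y | p + w + y | z + y

lhs <- split_bars(f[[2]])  # quantity, price, identifiers, shock indicator
rhs <- split_bars(f[[3]])  # demand, supply, and price equation parts

lhs_names <- vapply(lhs, deparse, character(1))
rhs_parts <- vapply(rhs, deparse, character(1))
```

Here lhs_names holds the five left-hand-side elements in the order they were
written, and rhs_parts holds the three right-hand sides as strings.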

 

I am unaware of other packages that use formulas in this way (for instance,
passing the identifiers in the formula), and I wonder if this would go
against any good practices. Would it be better to exclude some of the
necessary elements for constructing the model? This might make the resulting
formulas more similar to those of models with multiple responses or multiple
parts. I am not sure, though, how one would use such model formulas without
all the relevant information. Is there any suggested design alternative that
I could check?

 

I would appreciate any suggestions and discussion!

 

Kind regards,

Pantelis

