[R] model syntax processed --- probably common

David Winsemius dwinsemius at comcast.net
Mon Aug 19 22:02:01 CEST 2013


On Aug 19, 2013, at 12:48 PM, David Winsemius wrote:

> 
> On Aug 19, 2013, at 9:45 AM, ivo welch wrote:
> 
>> dear R experts---I was programming a fama-macbeth panel regression (a
>> fama-macbeth regression is essentially T cross-sectional regressions, with
>> statistics then obtained from the time-series of coefficients), partly
>> because I wanted faster speed than plm, partly because I wanted some
>> additional features.
>> 
>> my function starts as
>> 
>> fama.macbeth <- function( formula, din ) {
>>  names <- terms( formula )
>> ## omitted : I want an immediate check that the formula refers to
>> ## existing variables in the data frame with English error messages
>> 
> 
> Look at the structure of a terms result from a formula argument with str():
> 
> fama.macbeth <- function( formula, din ) {
>   fnames <- terms( formula ) ; str(fnames)
> }
> 
>> fama.macbeth( x ~ y, data.frame(x=rnorm(10), y=rnorm(10) ) )
> Classes 'terms', 'formula' length 3 x ~ y
> (output snipped)

> 
> Then extract the dimnames from the "factors" attribute to compare to the names in the data object:
> 
>> fama.macbeth <- function( formula, din ) {
>  fnames <- terms( formula ) ;  dnames <- names( din)
>  dimnames(attr(fnames, "factors"))[[1]] %in%  dnames
> }
> #[1] TRUE TRUE
> 

This might be more economical:

?all.names
 fama.macbeth <- function( formula, din ) {
   fnames <- all.names( formula ) ; str(fnames)
   dnames <- names( din)
   fnames[-1] %in%  dnames
 }
 fama.macbeth( x ~ y, data.frame(x=rnorm(10), y=rnorm(10) ) )
 chr [1:3] "~" "x" "y"
[1] TRUE TRUE
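
One caveat with all.names(): it returns operators and function names as well as variables, so x ~ y + z yields "~" "x" "+" "y" "z", and dropping the first element still leaves "+" to be matched against the column names. all.vars() returns only the variable names. A minimal sketch of the up-front check with a plain-English error message follows; check.formula.vars is a hypothetical helper name used only for illustration, and a formula containing "." would need extra handling that is not shown here.

```r
# Hypothetical helper: stop early if the formula names a variable
# that is not a column of the data frame.
check.formula.vars <- function(formula, din) {
  fvars <- all.vars(formula)                  # variable names only, no operators
  missing.vars <- setdiff(fvars, names(din))  # names with no matching column
  if (length(missing.vars) > 0) {
    stop("formula refers to variable(s) not in the data frame: ",
         paste(missing.vars, collapse = ", "), call. = FALSE)
  }
  invisible(TRUE)
}

d <- data.frame(x = rnorm(10), y = rnorm(10))
check.formula.vars(x ~ y, d)            # passes silently
try(check.formula.vars(x ~ y + z, d))   # error message names z
```

Calling such a check as the first line of fama.macbeth() would give the immediate failure the original post asks for, before any regression is run.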



> 
> I couldn't tell if this was the main thrust of your question. It seems to meander a bit.
> 
> -- 
> David.
> 
>> monthly.regressions <- by( din, as.factor(din$month), function(dd)
>> coef(lm(model.frame( formula, data=dd ))))
>>  as.m <- do.call("rbind", monthly.regressions)
>>  colMeans(as.m)  ## or something like this.
>> }
>> say my data frame mydata has columns named month, r, laggedx and ... .  I
>> can call this function
>> 
>>  fama.macbeth( r ~ laggedx, din=mydata )
>> 
>> but it fails
> 
> What fails?
> 
> 
>> if I want to compute my x variables.  for example,
>> 
>>  myx <- d[,"laggedx"]
>>  fama.macbeth( r ~ myx)
>> 
>> I also wish that the computed myx still remembered that it was really
>> laggedx.  it's almost as if I should not create a vector myx but a data
>> frame myx to avoid losing the column name.
> 
> I wouldn't say "almost"... rather that is exactly what you should do. R regression methods almost always work better when formulas are interpreted in the environment of the data argument.
> 
>> I wonder why such vectors don't
>> keep a name attribute of some sort.
>> 
>> there is probably an "R way" of doing this.  is there?
>> 
>> /iaw
>> 
>> ----
>> Ivo Welch (ivo.welch at gmail.com)
>> 
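
To make the data-frame advice concrete, here is a sketch on simulated data: the computed regressor is stored as a named column instead of a free-standing vector, and the month-by-month regressions follow the outline in the original post. The column names month, r and laggedx come from that post; the simulated values, the factor of 2, and the name fama.macbeth.sketch are assumptions for illustration only.

```r
set.seed(1)
# Simulated panel: 6 months, 20 observations per month (made-up data)
mydata <- data.frame(month   = rep(1:6, each = 20),
                     r       = rnorm(120),
                     laggedx = rnorm(120))

fama.macbeth.sketch <- function(formula, din) {
  # one cross-sectional regression per month, each giving a coefficient vector
  monthly <- lapply(split(din, din$month),
                    function(dd) coef(lm(formula, data = dd)))
  as.m <- do.call("rbind", monthly)   # one row per month, one column per coefficient
  colMeans(as.m)                      # time-series average of each coefficient
}

# Keep the computed regressor as a column so the formula can find it by name
mydata$myx <- 2 * mydata$laggedx
fama.macbeth.sketch(r ~ myx, din = mydata)
```

Because myx lives in the same data frame as r and month, every monthly subset carries it along, and the coefficient keeps a meaningful name in the output.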

David Winsemius
Alameda, CA, USA


