[R] A comment about R:
Frank E Harrell Jr
f.harrell at vanderbilt.edu
Thu Jan 5 18:42:59 CET 2006
John Fox wrote:
> Dear Peter,
>
>
>>-----Original Message-----
>>From: r-help-bounces at stat.math.ethz.ch
>>[mailto:r-help-bounces at stat.math.ethz.ch] On Behalf Of Peter
>>Muhlberger
>>Sent: Wednesday, January 04, 2006 2:43 PM
>>To: rhelp
>>Subject: [R] A comment about R:
>>
>
>
> . . .
>
>
>>Ex. 1) Wald tests of linear hypotheses after max. likelihood
>>or even after a regression. "Wald" does not even appear in
>>my standard R package on a search. There's no comment in the
>>lm help or optim help about what function to use for
>>hypothesis tests. I know that statisticians prefer
>>likelihood ratio tests, but Wald tests are still useful and
>>indeed crucial for first-pass analysis. After searching with
>>Google for some time, I found several Wald functions in
>>various contributed R packages I did not have installed. One
>>confusion was which one would be relevant to my needs. This
>>took some time to resolve. I concluded, perhaps on
>>insufficient evidence, that package car's Wald test would be
>>most helpful. To use it, however, one has to put together a
>>matrix for the hypotheses, which can be arduous for a
>>many-term regression or a complex hypothesis.
>>In comparison,
>>in Stata one simply states the hypothesis in symbolic terms.
>>I also don't know for certain that this function in car will
>>work or work properly w/ various kinds of output, say from lm
>>or from optim. To be sure, I'd need to run time-consuming
>>tests comparing it with Stata output or examine the
>>function's code. In Stata the test is easy to find, and
>>there's no uncertainty about where it can be run or its
>>accuracy. Simply having a comment or "see also" in lm help
>>or mle or optim help pointing the user to the right Wald
>>function would be of enormous help.
The Design package's anova.Design and contrast.Design make many Wald
tests very easy. contrast( ) will allow you to test all kinds of
hypotheses by stating which differences in predicted values you are
interested in.
Frank Harrell
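A minimal sketch of the kind of call described here, assuming the Design package is installed (the data frame d and the model are invented for illustration; the same functions now live in the rms package):

  library(Design)

  d   <- data.frame(y   = rnorm(100),
                    age = rnorm(100, 50, 10),
                    sex = factor(sample(c("female", "male"), 100, replace = TRUE)))
  fit <- ols(y ~ age + sex, data = d)

  anova(fit)    # Wald tests for every term in the model

  # Wald test for the effect of a 20-year age difference, holding sex fixed
  contrast(fit, list(age = 60, sex = "male"),
                list(age = 40, sex = "male"))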
>>
>
>
>
> The reference, I believe, is to the linear.hypothesis() function, which has
> methods for lm and glm objects. [To see what kinds of objects
> linear.hypothesis is suitable for, use the command
> methods(linear.hypothesis).] For lm objects, you get an F-test by default.
> Note that the Anova() function, also in car, can more conveniently compute
> Wald tests for certain kinds of hypotheses. More generally, however, I'd be
> interested in your suggestions for an alternative method of specifying
> linear hypotheses. There is currently no method for mle objects, but adding
> one is a good idea, and I'll do that when I have a chance. (In the meantime,
> it's very easy to compute Wald tests from the coefficients and the
> hypothesis and coefficient-covariance matrices. Writing a small function to
> do so, without the bells and whistles of something like linear.hypothesis(),
> should not be hard. Indeed, the ability to do this kind of thing easily is
> what I see as the primary advantage of working in a statistical computing
> environment like R -- or Stata.)
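A rough sketch of the small function John describes, computed directly from the coefficient vector and its covariance matrix; the name wald, the lm example, and the hypothesis matrix are illustrative only:

  # Wald chi-square test of H0: L %*% beta == rhs
  wald <- function(model, L, rhs = rep(0, nrow(L))) {
      b    <- coef(model)
      V    <- vcov(model)
      d    <- L %*% b - rhs
      stat <- as.numeric(t(d) %*% solve(L %*% V %*% t(L)) %*% d)
      df   <- nrow(L)
      c(chisq = stat, df = df, p = pchisq(stat, df, lower.tail = FALSE))
  }

  # e.g. test that the coefficients of wt and hp are jointly zero
  m <- lm(mpg ~ wt + hp, data = mtcars)
  L <- rbind(c(0, 1, 0),
             c(0, 0, 1))
  wald(m, L)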
>
>
>>Ex. 2) Getting neat output of a regression with Huberized
>>variance matrix.
>>I frequently have to run regressions w/ robust variances. In
>>Stata, one simply adds the word "robust" to the end of the
>>command or "cluster(cluster.variable)" for a cluster-robust
>>error. In R, there are two functions, robcov and hccm. I
>>had to run tests to figure out what the relationship is
>>between them and between them and Stata (robcov w/o cluster
>>gives hccm's hc0; hccm's hc1 is equivalent to Stata's
>>'robust' w/o cluster; etc.). A single sentence in hccm's
>>help saying something to the effect that statisticians prefer
>>hc3 for most types of data might save me from having to
>>scramble through the statistical literature to try to figure
>>out which of these I should be using. A few sentences on
>>what the differences are between these methods would be even
>>better. Then, there's the problem of output. Given that hc1
>>or hc3 are preferred for non-clustered data, I'd need to be
>>able to get regression output of the form summary(lm) out of
>>hccm, for any practical use. Getting this, however, would
>>require programming my own function. Huberized t-stats for
>>regressions are a commonplace need; an R oriented a little
>>more toward everyday needs would not require programming
>>them. Also, I'm not sure yet how well any of the
>>existing functions handle missing data.
>>
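A small sketch of the comparison described above, assuming the car and Design packages and using mtcars purely as a stand-in data set:

  library(car)      # hccm()
  library(Design)   # ols(), robcov()

  fit.lm  <- lm(mpg ~ wt + hp, data = mtcars)
  fit.ols <- ols(mpg ~ wt + hp, data = mtcars, x = TRUE, y = TRUE)

  sqrt(diag(hccm(fit.lm, type = "hc0")))   # White's original (hc0) standard errors
  sqrt(diag(hccm(fit.lm, type = "hc3")))   # hc3, the hccm() default
  sqrt(diag(robcov(fit.ols)$var))          # Design's robcov() without a cluster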
>
>
> I think that we have a philosophical difference here: I don't like giving
> advice in documentation. An egregious extended example of this, in my
> opinion, is the SPSS documentation. The hccm() function uses hc3 as the
> default, which is an implicit recommendation, but more usefully, in my view,
> points to Long and Ervin's American Statistician paper on the subject, which
> does give advice and which is quite accessible. As well, and more generally,
> the car package is associated with a book (my R and S-PLUS Companion to
> Applied Regression), which gives advice, though, admittedly, tersely in this
> case.
>
> The Anova() function with argument white=TRUE will give you F-tests
> corresponding to the t-tests to which you refer (though it will combine df
> for multiple-df terms in the model). To get the kind of summary you
> describe, you could use something like
>
> mysummary <- function(model){
>     coef <- coef(model)
>     se   <- sqrt(diag(hccm(model)))
>     t    <- coef/se
>     p    <- 2*pt(abs(t), df=model$df.residual, lower.tail=FALSE)
>     table <- cbind(coef, se, t, p)
>     rownames(table) <- names(coef)
>     colnames(table) <- c("Estimate", "Std. Error", "t value", "Pr(>|t|)")
>     table
> }
>
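A quick usage sketch for the function above (the lm fit on mtcars is illustrative only; hccm() comes from the car package):

  library(car)
  m <- lm(mpg ~ wt + hp, data = mtcars)
  mysummary(m)   # coefficient table with hc3 (the hccm default) standard errors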
> Again, it's not time-consuming to write simple functions like this for one's
> own use, and the ability to do so is a strength of R, in my view.
>
> I'm not sure what you mean about handling missing data: functions like
> hccm(), linear.hypothesis(), and Anova() start with a model object for which
> missing data have already been handled.
>
> Regards,
> John
>
> ______________________________________________
> R-help at stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
>
--
Frank E Harrell Jr   Professor and Chair           School of Medicine
                     Department of Biostatistics   Vanderbilt University