[Rd] Enhanced version of plot.lm()
John Fox
jfox at mcmaster.ca
Thu Apr 28 14:39:04 CEST 2005
Dear John et al.,
Curiously, Georges Monette (at York University in Toronto) and I were just
talking last week about influence-statistic contours, and I wrote a couple
of functions to show these for Cook's D and for covratio as functions of
hat-values and studentized residuals. These differ a bit from the ones
previously discussed here in that they show rule-of-thumb cut-offs for D and
covratio, along with Bonferroni critical values for studentized residuals.
I've attached a file with these functions, even though they're not that
polished.
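
For readers without the attachment, here is a minimal sketch of the quantities involved. It is not the attached functions, only an illustration in base R. It assumes the built-in `cars` data as a stand-in model. The cut-offs 4/(n - p) for Cook's D and |covratio - 1| > 3p/n are common rules of thumb chosen here for illustration; the message does not say which ones the attached functions use. The variable names (`D.cut`, `CR.cut`, `t.crit`, and so on) are likewise made up for this example.

```r
# Illustrative sketch only: Cook's D and covratio written as closed-form
# functions of hat-values and studentized residuals, checked against stats.
fit <- lm(dist ~ speed, data = cars)   # stand-in model; 'cars' ships with R
n <- nobs(fit)
p <- length(coef(fit))                 # number of coefficients, incl. intercept
h <- hatvalues(fit)
rstd <- rstandard(fit)                 # internally studentized residuals
rstu <- rstudent(fit)                  # externally studentized residuals

# Cook's D depends only on (rstd, h); covratio only on (rstu, h)
D  <- (rstd^2 / p) * h / (1 - h)
CR <- 1 / ((1 - h) * ((n - p - 1 + rstu^2) / (n - p))^p)
stopifnot(isTRUE(all.equal(D, cooks.distance(fit))),
          isTRUE(all.equal(CR, covratio(fit))))

# Assumed rule-of-thumb cut-offs and a two-sided Bonferroni critical value
D.cut  <- 4 / (n - p)                  # flag D > D.cut
CR.cut <- 3 * p / n                    # flag |covratio - 1| > CR.cut
t.crit <- qt(1 - 0.05 / (2 * n), df = n - p - 1)

# Studentized residuals vs. hat-values with the Cook's D contour overlaid
plot(h, rstu, xlab = "Hat-values", ylab = "Studentized residuals",
     ylim = range(rstu, -t.crit, t.crit))
abline(h = c(-t.crit, t.crit), lty = 2)          # Bonferroni limits
hh <- seq(min(h), max(h), length.out = 200)
r0 <- sqrt(D.cut * p * (1 - hh) / hh)            # contour on the rstandard scale
t0 <- r0 * sqrt((n - p - 1) / (n - p - r0^2))    # converted to the rstudent scale
lines(hh,  t0, col = "red")
lines(hh, -t0, col = "red")
```

The conversion from `r0` to `t0` is only defined where `r0^2 < n - p`, which holds for this example but would need a guard for very small hat-values in other data.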
More generally, I wonder whether it's not best to supply plots like these as
separate functions rather than as a do-it-all plot method for lm objects.
Regards,
John
--------------------------------
John Fox
Department of Sociology
McMaster University
Hamilton, Ontario
Canada L8S 4M4
905-525-9140x23604
http://socserv.mcmaster.ca/jfox
--------------------------------
> -----Original Message-----
> From: r-devel-bounces at stat.math.ethz.ch
> [mailto:r-devel-bounces at stat.math.ethz.ch] On Behalf Of John
> Maindonald
> Sent: Wednesday, April 27, 2005 7:54 PM
> To: Martin Maechler
> Cc: David Firth; Werner Stahel; r-devel at stat.math.ethz.ch;
> Peter Dalgaard
> Subject: Re: [Rd] Enhanced version of plot.lm()
>
>
> On 28 Apr 2005, at 1:30 AM, Martin Maechler wrote:
>
> >>>>>> "PD" == Peter Dalgaard <p.dalgaard at biostat.ku.dk>
> >>>>>> on 27 Apr 2005 16:54:02 +0200 writes:
> >
> > PD> Martin Maechler <maechler at stat.math.ethz.ch> writes:
> >>> I'm about to commit the current proposal(s) to R-devel,
> >>> **INCLUDING** changing the default from 'which = 1:4' to
> >>> 'which = c(1:3,5)',
> >>>
> >>> and elicit feedback starting from there.
> >>>
> >>> One thing I think I would like is to use color for the Cook's
> >>> contours in the new 4th plot.
> >
> > PD> Hmm. First try running example(plot.lm) with the modified function and
> > PD> tell me which observation has the largest Cook's D. With the suggested
> > PD> new 4th plot it is very hard to tell whether obs #49 is potentially or
> > PD> actually influential. Plots #1 and #3 are very close to conveying the
> > PD> same information though...
> >
> > I shouldn't be teaching here, and I know that I'm getting into
> > contested territory (regression diagnostics; robustness; "The" Truth,
> > etc., etc.), but I believe there is no unique way to define "actually
> > influential" (hence I don't believe that it's extremely useful to know
> > exactly which Cook's D is largest).
> >
> > This is partly because there are many statistics that can be derived
> > from a multiple regression fit, all of which are influenced in some way.
> > AFAIK, all observation-influence measures g(i) are functions of
> > (r_i, h_{ii}), and the latter are the quantities that "regression users"
> > should really know {without consulting a text book} and that are
> > generalizable {e.g. to "linear smoothers" such as gam()s (for
> > "non-estimated" smoothing parameter)}.
> >
> > Martin
>
> I agree with Martin. I like the idea of using color (red?)
> for the new Cook's contours. People who want (fairly)
> precise comparisons of the Cook's statistics can still use
> the present plot #4, perhaps as a follow-up to the new plot #5.
> It would be possible to label the Cookwise most extreme
> points with the actual values (to perhaps 2 significant figures,
> i.e., labeling on both sides of such points), but this would add
> what I consider unnecessary clutter to the graph.
>
> John.
>
> John Maindonald email: john.maindonald at anu.edu.au
> phone : +61 2 (6125)3473 fax : +61 2(6125)5549
> Centre for Bioinformation Science, Room 1194, John Dedman
> Mathematical Sciences Building (Building 27) Australian
> National University, Canberra ACT 0200.