[Rd] Enhanced version of plot.lm()

Thu Apr 28 14:39:04 CEST 2005

Dear John et al.,

Curiously, Georges Monette (at York University in Toronto) and I were just
talking last week about influence-statistic contours, and I wrote a couple
of functions to show these for Cook's D and for covratio as functions of
hat-values and studentized residuals. These differ a bit from the ones
previously discussed here in that they show rule-of-thumb cut-offs for D and
covratio, along with Bonferroni critical values for studentized residuals. 
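
For reference, contours of this kind follow from the standard closed forms that express both statistics through the hat-value and the residual. The notation below (n, p, r_i, t_i) is introduced here for illustration and is not taken from the attached functions. The message also does not say which cut-offs those functions use, so the ones shown are only commonly quoted rules of thumb, and alpha is an assumed overall test level.

```latex
% n = number of observations, p = number of coefficients (incl. intercept)
% h_{ii} = hat-value, r_i = standardized (internally studentized) residual,
% t_i = (externally) studentized residual
D_i = \frac{r_i^2}{p}\,\frac{h_{ii}}{1 - h_{ii}},
\qquad
r_i^2 = \frac{(n - p)\,t_i^2}{n - p - 1 + t_i^2}

\mathrm{COVRATIO}_i =
  \frac{1}{(1 - h_{ii})\left(\dfrac{n - p - 1 + t_i^2}{n - p}\right)^{p}}

% Commonly quoted cut-offs (assumed here, not taken from the attachment):
D_i > \frac{4}{n - p}, \qquad
\left|\mathrm{COVRATIO}_i - 1\right| > \frac{3p}{n}, \qquad
|t_i| > t_{n-p-1}\!\left(1 - \frac{\alpha}{2n}\right)
```

A contour D = d in the (hat-value, residual) plane is then the curve r^2 = d p (1 - h)/h, converted to the studentized scale with the second identity if needed. The last cut-off is the Bonferroni critical value, i.e. the 1 - alpha/(2n) quantile of the t distribution on n - p - 1 degrees of freedom.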

I've attached a file with these functions, even though they're not that
polished.

More generally, I wonder whether it's not best to supply plots like these as
separate functions rather than as a do-it-all plot method for lm objects.

Regards,
 John

--------------------------------
John Fox
Department of Sociology
McMaster University
Hamilton, Ontario
Canada L8S 4M4
905-525-9140x23604
http://socserv.mcmaster.ca/jfox 
-------------------------------- 

> -----Original Message-----
> From: r-devel-bounces at stat.math.ethz.ch 
> [mailto:r-devel-bounces at stat.math.ethz.ch] On Behalf Of John 
> Maindonald
> Sent: Wednesday, April 27, 2005 7:54 PM
> To: Martin Maechler
> Cc: David Firth; Werner Stahel; r-devel at stat.math.ethz.ch; 
> Peter Dalgaard
> Subject: Re: [Rd] Enhanced version of plot.lm()
> 
> 
> On 28 Apr 2005, at 1:30 AM, Martin Maechler wrote:
> 
> >>>>>> "PD" == Peter Dalgaard <p.dalgaard at biostat.ku.dk>
> >>>>>>     on 27 Apr 2005 16:54:02 +0200 writes:
> >
> >     PD> Martin Maechler <maechler at stat.math.ethz.ch> writes:
> >>> I'm about to commit the current proposal(s) to R-devel,
> >>> **INCLUDING** changing the default from 'which = 1:4' to
> >>> 'which = c(1:3,5)',
> >>>
> >>> and elicit feedback starting from there.
> >>>
> >>> One thing I think I would like is to use color for the Cook's 
> >>> contours in the new 4th plot.
> >
> >     PD> Hmm. First try running example(plot.lm) with the modified
> >     PD> function and tell me which observation has the largest
> >     PD> Cook's D. With the suggested new 4th plot it is very hard to
> >     PD> tell whether obs #49 is potentially or actually influential.
> >     PD> Plots #1 and #3 are very close to conveying the same
> >     PD> information though...
> >
> > I shouldn't be teaching here, and I know that I'm getting into
> > contested territory (regression diagnostics; robustness; "The" Truth,
> > etc., etc.), but I believe there is no unique way to define "actually
> > influential" (hence I don't believe that it's extremely useful to know
> > exactly which Cook's D is largest).
> >
> > Partly because there are many statistics that can be derived from a
> > multiple regression fit, all of which are influenced in some way.
> > AFAIK, all observation-influence measures g(i) are functions of
> > (r_i, h_{ii}), and the latter are the quantities that "regression
> > users" should really know {without consulting a textbook} and that
> > are generalizable {e.g. to "linear smoothers" such as gam()s (for
> > "non-estimated" smoothing parameter)}.
> >
> > Martin
> 
> I agree with Martin.  I like the idea of using color (red?) 
> for the new Cook's contours.  People who want (fairly) 
> precise comparisons of the Cook's statistics can still use 
> the present plot #4, perhaps as a follow-up to the new plot #5.
> It would be possible to label the points with the most extreme
> Cook's statistics with the actual values (to perhaps 2
> significant figures, i.e., labeling on both sides of such
> points), but this would add what I consider unnecessary
> clutter to the graph.
> 
> John.
> 
> John Maindonald             email: john.maindonald at anu.edu.au
> phone : +61 2 (6125)3473    fax  : +61 2(6125)5549
> Centre for Bioinformation Science, Room 1194, John Dedman 
> Mathematical Sciences Building (Building 27) Australian 
> National University, Canberra ACT 0200.
> 
> ______________________________________________
> R-devel at stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel