[R] how exactly does 'identify' work?

Greg Snow Greg.Snow at imail.org
Thu Nov 18 21:20:16 CET 2010


One additional point on your original post.  You added row names to the test data frame, but you did not pass the data frame to lm when you did the regression; instead you attached it.  When you did this, lm found x and y on the search path but never saw the row names, so the diagnostic plot just used observation numbers to label the extreme points.

This is just one of the many pitfalls of using attach rather than the more direct methods.  Try your example again, but instead of attaching the data frame, pass it in the data argument to lm:

> test.lm <- lm( y~x, data=test )

Then when you do plot(test.lm, 2) the most extreme points (3 if you don't change the id.n value) will be labeled using the rownames.
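
A minimal sketch of the difference, for anyone who wants to check it at the console.  The data below are made up (the original test data frame is not reproduced in this message), and the names fit.attach and fit.data are only for illustration:

```r
# Hypothetical stand-in data; not the poster's actual data frame.
set.seed(1)
test <- data.frame(x = 1:26, y = 2 * (1:26) + rnorm(26), row.names = LETTERS)

# Fit 1: x and y are found on the search path via attach();
# lm never sees 'test', so the row names are not available to it.
attach(test)
fit.attach <- lm(y ~ x)
detach(test)

# Fit 2: the data frame is passed explicitly, so its row names are kept.
fit.data <- lm(y ~ x, data = test)

names(resid(fit.attach))[1:3]   # "1" "2" "3"
names(resid(fit.data))[1:3]     # "A" "B" "C"
```

The residual names are what plot.lm falls back on for point labels, which is why the second fit shows letters on the Q-Q plot and the first shows numbers.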

-- 
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.snow at imail.org
801.408.8111


> -----Original Message-----
> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-
> project.org] On Behalf Of Greg Snow
> Sent: Thursday, November 18, 2010 1:11 PM
> To: casperyc; r-help at r-project.org
> Subject: Re: [R] how exactly does 'identify' work?
> 
> Did you read the help page for qqnorm?  The return value has the x and
> y coordinates used, so you can just do something like:
> 
> > tmp <- qqnorm( resid(test.lm) )
> > identify(tmp, labels = names(resid(test.lm)) )
> 
> Or the plot.lm function has an argument id.n that automatically labels
> the n most extreme values:
> 
> > plot( test.lm, 2, id.n=10 )
> 
> Those both worked in my tests.  If they are not working for you, then
> send a reproducible example (include data, see ?dput) and maybe we can
> help further.
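
To make the point about the qqnorm return value concrete, here is a rough sketch on made-up data (the data frame and the choice of labelling the 3 largest residuals are assumptions for illustration).  Because identify() needs an interactive graphics device, the sketch uses text() as a non-interactive stand-in:

```r
# Hypothetical stand-in data; not the poster's actual data frame.
set.seed(1)
test <- data.frame(x = 1:26, y = 2 * (1:26) + rnorm(26), row.names = LETTERS)
test.lm <- lm(y ~ x, data = test)

# plot.it = FALSE returns the plotting coordinates without drawing anything.
qq <- qqnorm(resid(test.lm), plot.it = FALSE)
str(qq)  # a list: x = theoretical quantiles, y = residuals in original order

# Non-interactive alternative to identify(): label the 3 largest |residuals|.
big <- order(abs(qq$y), decreasing = TRUE)[1:3]
plot(qq, xlab = "Theoretical Quantiles", ylab = "Sample Quantiles")
text(qq$x[big], qq$y[big], labels = names(resid(test.lm))[big], pos = 4)
```

Since qq$y stays in the original observation order, the labels line up with the points without any sorting.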
> 
> 
> 
> > -----Original Message-----
> > From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-
> > project.org] On Behalf Of casperyc
> > Sent: Thursday, November 18, 2010 11:50 AM
> > To: r-help at r-project.org
> > Subject: Re: [R] how exactly does 'identify' work?
> >
> >
> > Hi,
> >
> > I think the problem is
> >
> > 1 - When a linear model is fitted and we plot qqnorm( test.lm$res ),
> > we don't 'know' what values are actually being used on the y-axis, and
> > how do we refer to the ‘Index’ on the x-axis?
> >      Therefore, I don't know how to refer to the x and y coordinates
> > in the identify function.
> >
> > 2 - I have tried using the stdres function in the MASS library to
> > extract the standardised residuals and plot them manually (using the
> > plot function).
> >      This way, the problem is that we have to SORT the residuals first
> > in increasing order to reproduce the same qqnorm plot.  In that case
> > the 'identify' function works; however, sorting CHANGES the order,
> > i.e. it won't return the original A:Z ( row.names ) labels.
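
On the sorting problem in point 2, one possible approach is to sort with an index from order() and reorder the labels with that same index.  This is only a sketch on made-up data; rstandard() from the stats package is used here in place of MASS's stdres so that no extra package is needed, and the variable names are hypothetical:

```r
# Hypothetical stand-in data; not the poster's actual data frame.
set.seed(1)
test <- data.frame(x = 1:26, y = 2 * (1:26) + rnorm(26), row.names = LETTERS)
test.lm <- lm(y ~ x, data = test)

res <- rstandard(test.lm)   # standardized residuals, named by row
ord <- order(res)           # permutation that sorts the residuals
n   <- length(res)

# Manual normal Q-Q coordinates; labels are reordered with the same index.
qx  <- qnorm(ppoints(n))
qy  <- res[ord]
lab <- names(res)[ord]

plot(qx, qy, xlab = "Theoretical Quantiles", ylab = "Standardized residuals")
# identify(qx, qy, labels = lab)   # interactive; run at the console
```

Each sorted residual keeps its own row name this way, so identify() reports the original letters rather than positions in the sorted vector.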
> > --
> > View this message in context: http://r.789695.n4.nabble.com/how-
> > exactly-does-identify-work-tp3045953p3049357.html
> > Sent from the R help mailing list archive at Nabble.com.
> >
> > ______________________________________________
> > R-help at r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide http://www.R-project.org/posting-
> > guide.html
> > and provide commented, minimal, self-contained, reproducible code.

