[R] Fitting data and removing outliers

Stephen Sefick ssefick at gmail.com
Fri Jul 13 22:15:25 CEST 2012


Are they due to measurement error, a sample from a different population, or 
... ?  What is the unusual event?  Does it explain something important 
about the system that you are working on?  I am not telling you not to 
do what you are doing; I am just listing the things that I consider when I am 
doing regression modelling.
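
One way to look at candidate points before deciding their fate is sketched
here. It is only an illustration on simulated data: the variable names, the
planted shifts, and the cutoffs (|studentized residual| > 3, Cook's distance
> 4/n) are assumptions chosen for the example, not fixed rules.

```r
# Simulated data (hypothetical): a linear trend plus three points that stay
# inside the range of y but sit far from the line.
set.seed(1)
x <- 1:50
y <- 2 + 0.5 * x + rnorm(50, sd = 1)
odd <- c(10, 25, 40)                  # indices of the planted unusual points
y[odd] <- y[odd] + c(8, -9, -10)

fit <- lm(y ~ x)

# Externally studentized residuals and Cook's distance (both in base stats).
rs <- rstudent(fit)
cd <- cooks.distance(fit)

# Rule-of-thumb cutoffs; treat these as assumptions, not as a test.
flagged <- which(abs(rs) > 3 | cd > 4 / length(y))

# Inspect the flagged rows and ask what happened at each one.
print(data.frame(index = flagged, x = x[flagged], y = y[flagged],
                 rstudent = round(rs[flagged], 2),
                 cooks = round(cd[flagged], 3)))
```

The flagged rows are a list of points to investigate, not a list of points
to delete; whether any of them go is a question about the data, not the fit.
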
FWIW,

Stephen
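
For reference, the robust fit suggested in the quoted message below can be
sketched roughly as follows. The data are simulated and the shifted points
are hypothetical; rlm() is called with its defaults (Huber M-estimation), and
the 0.5 weight threshold is just an assumption for display.

```r
library(MASS)   # provides rlm()

# Simulated data (hypothetical): three points pushed well off the line
# while staying inside the overall range of y.
set.seed(42)
x <- 1:40
y <- 1 + 2 * x + rnorm(40, sd = 2)
y[c(5, 8, 12)] <- y[c(5, 8, 12)] + 40

ols <- lm(y ~ x)    # ordinary least squares, pulled toward the odd points
rob <- rlm(y ~ x)   # robust fit, Huber M-estimator by default

# Compare the coefficients of the two fits side by side.
print(rbind(ols = coef(ols), rlm = coef(rob)))

# rlm() keeps every observation but downweights the ill-fitting ones;
# the final weights show which points were discounted.
w <- rob$w
print(which(w < 0.5))
```

Nothing is thrown away here: the unusual points stay in the data set and
simply count for less, so no refit-after-removal loop is needed.
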

On 07/13/2012 02:26 PM, Lauren Vogric wrote:
> Yes, they are unusual events that affected my data. They have no positive effect in shaping a strong model.
>
> -----Original Message-----
> From: stephen sefick [mailto:ssefick at gmail.com]
> Sent: Friday, July 13, 2012 3:24 PM
> To: David L Carlson
> Cc: Lauren Vogric; r-help at r-project.org
> Subject: Re: [R] Fitting data and removing outliers
>
> Do you have a good reason to throw these points out?
>
> On Fri, Jul 13, 2012 at 2:17 PM, David L Carlson <dcarlson at tamu.edu> wrote:
>> I didn't actually see any question in this posting, but instead of removing the outliers consider using a robust linear model.
>>
>> library(MASS)
>> ?rlm
>>
>> The TeachingDemos package has a data set called outliers to show what can happen when you iteratively remove "outliers" in the way you suggest.
>>
>> -------------------------------------
>> David L Carlson
>> Associate Professor of Anthropology
>> Texas A&M University
>> College Station, TX 77840-4352
>>
>>
>> ----- Original Message -----
>>
>> From: "Lauren Vogric" <lvogric at grahamcapital.com>
>> To: r-help at r-project.org
>> Sent: Friday, July 13, 2012 1:36:43 PM
>> Subject: [R] Fitting data and removing outliers
>>
>> What I'm trying to do is create a best-fit line in R for a set of data points and then remove all the outliers and refit. I can't use the IQR because the outliers I have in mind are easily within the range but way out of line with the best fit, which is ruining the fit. I'd rather throw out those points altogether.
>>
>> Thanks!
>>
>> ______________________________________________
>> R-help at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>
>
> --
> Stephen Sefick
> **************************************************
> Auburn University
> Biological Sciences
> 331 Funchess Hall
> Auburn, Alabama
> 36849
> **************************************************
> sas0025 at auburn.edu
> http://www.auburn.edu/~sas0025
> **************************************************
>
> Let's not spend our time and resources thinking about things that are so little or so large that all they really do for us is puff us up and make us feel like gods.  We are mammals, and have not exhausted the annoying little problems of being mammals.
>
>                                  -K. Mullis
>
> "A big computer, a complex algorithm and a long time does not equal science."
>
>                                -Robert Gentleman


