[R] removing outlier

David Winsemius dwinsemius at comcast.net
Mon Sep 14 07:29:30 CEST 2015



If this mailing list accepted formatted submissions, I would have used the
trèsModernSarcastic font for my first sentence. Failing the availability of
that mode of communication, I am (top) posting through Nabble (perhaps) in
"Comic Sans".

On Sat, Sep 12, 2015 at 9:52 AM, David Winsemius <dwinsemius@> wrote:
>
> On Sep 12, 2015, at 2:32 AM, Juli wrote:

>> And if I remove the outliers, my problem is that, as you said, they differ
>> in length. I need the data frame for a regression, so can I remove the
>> whole column or is there a call to exclude the data?
>
> *Most regression methods have a 'subset' parameter which would allow you
> to distort the data to your desired specification.*
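
For concreteness, here is a minimal sketch of what that 'subset' argument
looks like next to a robust fit. Everything in it is an assumption made for
illustration: the simulated data frame `df`, the planted outlier, the
arbitrary cutoff `y < 60`, and the choice of `MASS::rlm` as one possible
robust method (nobody in this thread named a specific one).

```r
# Simulated data (hypothetical): a clean linear trend plus one planted outlier.
set.seed(1)
df <- data.frame(x = 1:20)
df$y <- 2 * df$x + rnorm(20)
df$y[20] <- 100

# Ordinary least squares on every row.
fit_all <- lm(y ~ x, data = df)

# Same model, but 'subset' drops rows for this fit only.
# The data frame itself keeps all 20 rows, so nothing "differs in length".
fit_subset <- lm(y ~ x, data = df, subset = y < 60)

# A robust alternative that keeps all rows and downweights large residuals.
# rlm() defaults to Huber M-estimation; MASS ships with standard R installs.
library(MASS)
fit_robust <- rlm(y ~ x, data = df)

# Compare the slope estimates side by side.
print(rbind(all    = coef(fit_all),
            subset = coef(fit_subset),
            robust = coef(fit_robust)))
```

Whether excluding rows is defensible is a separate question from whether the
software permits it; the sketch only shows the mechanics.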


Bert Gunter-2 wrote
> 
> ... and this, of course, is a nice example of how statistics
> contributes to the "irreproducibility crisis" now roiling Science.
> 
> Cheers,
> Bert
> 
> (Quote from a long ago engineering colleague: "Whenever I see an
> outlier, I never know whether to throw it away or patent it.")
> 
> 
> Bert Gunter
> 
> "Data is not information. Information is not knowledge. And knowledge
> is certainly not wisdom."
>    -- Clifford Stoll
> 
> 
> On Sat, Sep 12, 2015 at 9:52 AM, David Winsemius <dwinsemius@> wrote:
>>
>> On Sep 12, 2015, at 2:32 AM, Juli wrote:
>>
>>> Hi Jim,
>>>
>>> thank you for your help. :)
>>>
>>> My point is that there are outliers and I don't really know how to deal
>>> with them.
>>>
>>> I need the data frame for a regression, and I have often read that only a
>>> few outliers can change your results very much. In addition, the
>>> regression diagnostics didn't indicate the best results.
>>> Yes, and I know it's not the core of statistics to work in a way that gets
>>> you the results you would like to have ;).
>>>
>>> So what is your suggestion?
>>>
>>> And if I remove the outliers, my problem is that, as you said, they differ
>>> in length. I need the data frame for a regression, so can I remove the
>>> whole column or is there a call to exclude the data?
>>
>> Most regression methods have a 'subset' parameter which would allow you
>> to distort the data to your desired specification. But why not think
>> about examining a different statistical model or using robust methods?
>> That way you can keep all your data. (Sounds like you don't really have a
>> lot.)
>>
>> --
>> David.
>>>
>>> JULI
>>>
>>>
>>>
>>> --
>>> View this message in context:
>>> http://r.789695.n4.nabble.com/removing-outlier-tp4712137p4712170.html
>>> Sent from the R help mailing list archive at Nabble.com.
>>>
>>> ______________________________________________
>>> R-help@ mailing list -- To UNSUBSCRIBE and more, see
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide
>>> http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>>
>> David Winsemius
>> Alameda, CA, USA





--
View this message in context: http://r.789695.n4.nabble.com/removing-outlier-tp4712137p4712208.html
Sent from the R help mailing list archive at Nabble.com.


