[R] Data cleaning & Data preparation, what do R users want?
bgunter.4567 at gmail.com
Wed Nov 29 17:49:12 CET 2017
Oh Crap! I mistakenly replied onlist. PLEASE IGNORE -- these are only my
"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip)
On Wed, Nov 29, 2017 at 8:48 AM, Bert Gunter <bgunter.4567 at gmail.com> wrote:
> I don't think my view is of interest to many, so offlist.
> I reject this:
> " I would consider data analysis work to be three stages: data preparation,
> statistical analysis, and producing the report."
> For example, there is no such thing as "outliers" -- data to be removed as
> part of cleaning/preparation -- without a statistical model to be an
> "outlier" **from**, which is part of the statistical analysis. And the
> structure of the data (data preparation) may need to change depending on
> the course of the analysis (including graphics, also part of the analysis).
> So I think your view reflects a naïve view of the nature of data analysis,
> which is an iterative and holistic process. I suspect your training is as a
> computer scientist and you have not done much 1-1 consulting with
> researchers, though you should certainly feel free to reject this canard.
> Building software for large-scale automated analysis of data requires a
> much different analytical paradigm than the statistical consulting model,
> which is largely my background.
> No reply necessary. Just my opinion, which you are of course free to trash.
> Bert Gunter
> On Wed, Nov 29, 2017 at 8:37 AM, Robert Wilkins <iwritecode2 at gmail.com> wrote:
>> R has a very wide audience, clinical research, astronomy, psychology, and
>> so on and so on.
>> I would consider data analysis work to be three stages: data preparation,
>> statistical analysis, and producing the report.
>> This regards the process of getting the data ready for analysis and
>> reporting, sometimes called "data cleaning" or "data munging" or "data
>> So as regards tools for data preparation, speaking to the highly diverse
>> audience mentioned, here is my question:
>> What do you want?
>> Or are you already quite happy with the range of tools that is currently
>> before you?
>> [BTW, I posed the same question last week to the r-devel list, and was
>> advised that r-help might be a more suitable audience by one of the
>> Robert Wilkins