[R] How to ignore data
sbsidney at mweb.co.za
Mon Dec 13 20:36:37 CET 2010
Oh dear, oh dear! Another arrogant statistician/scientist.
One asks for help and instead gets an earful!
So much for the much-vaunted helpful R community.
But thanks anyway, I guess you were trying...
On 2010/12/13 08:17 PM, Bert Gunter wrote:
> Inline below. -- Bert
> On Mon, Dec 13, 2010 at 9:42 AM, Steve Sidney <sbsidney at mweb.co.za> wrote:
>> Thanks for the questions.
>> 1) The data represents micro-organism counts and a count of zero in this
>> case is highly unlikely given the info we have; including the other
> ?? Censoring or an experimental failure? Big difference.
>> 2) The data is submitted in duplicate and then a standardised sum and
>> difference is established and is used to calculate a Z-score which is used
>> as a measure of performance.
> Z scores are usually inappropriate for count data, which are discrete
> and tend to be skew.
>> Given both 1) and 2) it is necessary to exclude a raw count of zero (since
>> the log of 0 is meaningless) and a count of one (since the log of 1 of
>> course is zero).
> False. Correct statement is: "Because I do not know the statistical
> methodology necessary to handle such discrete data with 0 counts, I
> exclude them." You are confusing your ignorance of statistical
> methodology with the need for spurious ad hoc treatments. 0 counts can
> and should be handled by appropriate statistical methods (e.g.
> possibly zero-inflated Poisson models via glm() or otherwise).
>> I guess one can think of these values as outliers and that is what I am
>> trying to exclude.
> This is a wholly unscientific statement, I'm afraid.
>> There is ample evidence that such an approach is acceptable.
> What evidence, pray tell? -- a prior culture of inappropriate
> analyses, perhaps? I do not wish to engage in a debate about this,
> but, again, all I can say is that the above statement is not
> scientific. If I were consulting with you, I would say "Please show me
> your 'evidence.' " But, of course, I am not, and won't.
> None of this is to say that you aren't correct in all respects. It is
> just that you have raised all my usual warning flags, so that I am
> somewhat skeptical. But that's MY problem. This is the last I will say
> on the matter, so feel free to get in the final word, as I will not.
> And I wish you success in your efforts.
> -- Bert
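
The suggestion above about modelling counts directly, rather than dropping zeros before taking logs, could look roughly like the sketch below. It uses a plain Poisson GLM from base R; the data frame `dat`, the laboratory labels and all count values are invented for illustration and are not the data discussed in this thread. Zero-inflated Poisson models are not fitted by glm() itself; add-on packages such as pscl (function zeroinfl()) exist for that, and whether any of these models suits this particular data set is an open question.

```r
# Hypothetical duplicate counts from four laboratories; values are invented.
dat <- data.frame(
  lab   = factor(rep(c("A", "B", "C", "D"), each = 2)),
  count = c(12, 15, 0, 3, 40, 38, 7, 0)
)

# Poisson regression with the default log link: the log is applied to the
# expected count, not to the observations, so zeros need no special handling.
fit <- glm(count ~ lab, family = poisson, data = dat)

# Fitted mean count per laboratory, back on the original count scale.
new_labs  <- data.frame(lab = factor(levels(dat$lab), levels = levels(dat$lab)))
lab_means <- predict(fit, newdata = new_labs, type = "response")

# Rough overdispersion check: values well above 1 suggest that a plain
# Poisson model is too simple (quasi-Poisson, negative binomial or
# zero-inflated models would then be candidates).
dispersion <- sum(residuals(fit, type = "pearson")^2) / df.residual(fit)
```

Because the log link acts on the mean, no observation has to be excluded merely because its logarithm is undefined or zero.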
>> Thanks for the interest
>> On 2010/12/13 06:47 PM, Stavros Macrakis wrote:
>>> If you need to take the log of the values for your calculation, then
>>> what does it mean that you have 0 values in the input?
>>> And why do you need to exclude the 1 values?
>>> Are you sure that a) you are doing the correct kind of analysis and b)
>>> the analysis is correct if you exclude 0 and 1?
>>> On Mon, Dec 13, 2010 at 10:38, Steve Sidney <sbsidney at mweb.co.za> wrote:
>>>> Dear list
>>>> I have quite a small data set in which I need to have the following
>>>> values ignored: they are not to be used when performing an analysis,
>>>> but they still need to appear later in the report that I write.
>>>> Can anyone help with a suggestion as to how this can be accomplished?
>>>> Values to be ignored:
>>>> 0 (zero) and 1, in addition to NA (null).
>>>> The reason is that I need to use the log10 of the values when performing
>>>> the analysis.
>>>> Currently I hand-massage the data set, about 100 values, of which fewer
>>>> than 5 to 10 are in this category.
>>>> The NA values are NOT the problem.
>>>> What I was hoping was that I would not have to use a series of if
>>>> statements. Perhaps there is a more elegant solution.
>>>> Any ideas would be welcomed.
>> R-help at r-project.org mailing list
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
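
For the original question (ignoring 0 and 1, in addition to NA, without a chain of if statements), one possibility is a logical mask. This is only a minimal sketch: the vector `counts` and its values are invented for illustration, and whether excluding these values is statistically appropriate is a separate matter, as the replies in this thread point out.

```r
# Hypothetical raw counts; the values are invented for illustration only.
counts <- c(120, 0, 35, 1, NA, 480, 1500, 0, 62)

# TRUE where a value should take part in the log10-based analysis:
# not NA, and neither 0 nor 1.
keep <- !is.na(counts) & !(counts %in% c(0, 1))

# The analysis uses only the retained values; the raw vector is untouched.
log_counts <- log10(counts[keep])

# Alternative that preserves positions: excluded values become NA,
# so functions called with na.rm = TRUE skip them.
log_counts_aligned <- ifelse(keep, log10(counts), NA_real_)
mean_log <- mean(log_counts_aligned, na.rm = TRUE)

# The excluded observations stay available for the report.
excluded <- data.frame(index = which(!keep), value = counts[!keep])
```

Keeping `counts` unchanged and carrying the mask alongside it means the report can list the excluded entries from `excluded` while the calculations work only on `log_counts`.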