[R] Help using Cast (Text) Version
Steve Sidney
sbsidney at mweb.co.za
Mon Jan 18 16:26:45 CET 2010
David
Excellent! It is exactly what I was looking for.
Two very small questions to conclude:
1) I don't understand the significance of the -1 in the square brackets.
2) I'm not sure I really understand how function(x) works in this context.
If you can point me towards a doc that explains this in simple terms I would
be obliged. Don't expect you to have to provide the answer.
Once again many thanks for your patience and help
Regards
Steve
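
For readers with the same two questions, here is a minimal sketch of what the two apply() calls quoted below are doing. The data frame is a made-up stand-in for "b" (the lab codes and values are hypothetical, not the thread's data), and count_present is a name introduced only for illustration. The help pages ?Extract and ?apply, and the "An Introduction to R" manual that ships with R, cover index vectors and writing functions in more detail.

```r
# Toy stand-in for the cast result "b" (hypothetical values, not the thread's data):
# the first column holds lab codes, the remaining columns hold 0/1/NA per round.
b <- data.frame(lab = c("labA", "labB", "labC", "labD"),
                `1` = c(1, 0, NA, 0),
                `2` = c(NA, 0, 1, NA),
                check.names = FALSE)

# A negative index drops that position, so b[, -1] is every column
# except the first (the lab codes), leaving only the round columns.
rounds <- b[, -1]

# function(x) defines an unnamed (anonymous) function on the spot.
# apply(rounds, 2, f) calls f once per column, with x bound to that column.
# !is.na(x) is TRUE for each non-missing entry, and sum() counts the TRUEs.
participants <- apply(rounds, 2, function(x) sum(!is.na(x)))

# Extra arguments after the function are passed through to it,
# so this calls sum(x, na.rm = TRUE) on each column.
fails <- apply(rounds, 2, sum, na.rm = TRUE)

# The anonymous function is equivalent to naming it first.
count_present <- function(x) sum(!is.na(x))
participants_named <- apply(rounds, 2, count_present)

print(participants)
print(fails)
```

The margin argument 2 means "over columns"; using 1 instead would run the same function over rows.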
----- Original Message -----
From: "David Winsemius" <dwinsemius at comcast.net>
To: "David Winsemius" <dwinsemius at comcast.net>
Cc: "Steve Sidney" <sbsidney at mweb.co.za>; <r-help at r-project.org>
Sent: Monday, January 18, 2010 3:58 PM
Subject: Re: [R] Help using Cast (Text) Version
>
> On Jan 18, 2010, at 8:53 AM, David Winsemius wrote:
>
>>
>> On Jan 18, 2010, at 7:58 AM, Steve Sidney wrote:
>>
>>> Hi David
>>>
>>> Thanks for your patience, as well as thanks to Dennis Murphy and James
>>> Rome for trying to help.
>>>
>>> I have tried all your suggestions but still no joy.
>>>
>>> In order to try and resolve the problem I am attaching the following
>>> files, hope the system allows this.
>>>
>>> 1) Test_data_res.txt (used dput and this is all the data to be
>>> evaluated )
>>> 2) Test_data_b.txt ( after performing the melt-cast. See the code)
>>> 3) Annual Results NLA WMS Ver1.r ( the code for one of the parameters
>>> to be evaluated. In this case SPC)
>>>
>>> Background: the data is from a laboratory Proficiency Testing Scheme
>>> and the z-scores outside the |3| range are identified as "fails". My
>>> code assigns a 1 or 0 depending on this evaluation, and because not
>>> every lab participates in every round, NAs are assigned where there are
>>> no results.
>>>
>>> What I am looking for is the following for each round (1-6):
>>> a) The total number of participants, which in this case are represented
>>> by 1's and 0's per round
>>
>> > apply(b[,-1], 2, function(x) sum(is.na(x) ) )
>> [1] 32 21 21 18 14 15
>
> Oops, I forgot the negation operator (!), which makes non-NA entries count as TRUE:
>
> > apply(b[,-1], 2, function(x) sum(!is.na(x) ) )
> [1] 40 51 51 54 58 57
>
>>
>>
>>
>>> b) The total number of 1's, i.e. fails, per round
>>
>> > apply(b[,-1], 2, sum, na.rm=TRUE )
>> [1] 5 2 4 3 5 7
>>
>>>
>>>
>>>
>>> Regards
>>> Steve
>>>
>>>
>>>
>>> ----- Original Message -----
>>> From: "David Winsemius" <dwinsemius at comcast.net>
>>> To: "Steve Sidney" <sbsidney at mweb.co.za>
>>> Cc: <r-help at r-project.org>
>>> Sent: Monday, January 18, 2010 12:38 AM
>>> Subject: Re: [R] Help using Cast (Text) Version
>>>
>>>
>>>>
>>>> On Jan 17, 2010, at 4:37 PM, Steve Sidney wrote:
>>>>
>>>>> Well, now I am totally baffled!
>>>>>
>>>>> Using
>>>>>
>>>>> sum( !is.na(b[,3]))
>>>>>
>>>>> I get the total number of entries in col 3 except those that are NA.
>>>>> Great, that solves the first problem.
>>>>>
>>>>> What I can't seem to do is use the same logic to count all the 1's
>>>>> in that col, which are there before I use the cast with margins.
>>>>>
>>>>> So it seems to me that something is wrong, and that this is the part
>>>>> of my understanding that's missing.
>>>>>
>>>>> My guess is that before using margins and sum in the cast
>>>>> statement the col is a character type, and in order for == 1 to work
>>>>> I need to convert this to an integer.
>>>>
>>>> You can test your theory with:
>>>>
>>>> sum(as.integer(b[,3]) == 1)
>>>>
>>>> Or you could post some reproducible data using dput ....
>>>>
>>>> --
>>>> David.
>>>>
>>>>
>>>>>
>>>>> Hope this helps you to understand the problem.
>>>>>
>>>>> Regards
>>>>> Steve
>>>>>
>>>>> Your help is much appreciated
>>>>> ----- Original Message -----
>>>>> From: "David Winsemius" <dwinsemius at comcast.net>
>>>>> To: "Steve Sidney" <sbsidney at mweb.co.za>
>>>>> Cc: <r-help at r-project.org>
>>>>> Sent: Sunday, January 17, 2010 7:36 PM
>>>>> Subject: Re: [R] Help using Cast (Text) Version
>>>>>
>>>>>
>>>>>>
>>>>>> On Jan 17, 2010, at 11:56 AM, Steve Sidney wrote:
>>>>>>
>>>>>>> David
>>>>>>>
>>>>>>> Thanks, I'll try that... but no, what I need is the total (1's) for
>>>>>>> each of the rows, labelled 1-6 at the top of each col in the table
>>>>>>> provided.
>>>>>>
>>>>>> Part of my confusion with your request (which remains unaddressed) is
>>>>>> what you mean by "valid". The melt-cast operation has turned a bunch
>>>>>> of NA's into 0's which are now indistinguishable from the original
>>>>>> 0's. So I don't see any way that operating on "b" could tell you the
>>>>>> numbers you are asking for. If you were working on the original data,
>>>>>> "res", you might have gotten the column-wise "valid" counts of
>>>>>> column 2 with something like:
>>>>>>
>>>>>> sum( !is.na(res[,2]) )
>>>>>>
>>>>>>>
>>>>>>> What I guess I am not sure of is how to identify the col after the
>>>>>>> melt and cast.
>>>>>>
>>>>>> The cast object represents columns as a list of vectors. The i-th
>>>>>> column is b[[i]], which could be further referenced as a vector. So
>>>>>> the j-th row entry for the i-th column would be b[[i]][j].
>>>>>>
>>>>>>
>>>>>>>
>>>>>>> Steve
>>>>>>>
>>>>>>> ----- Original Message -----
>>>>>>> From: "David Winsemius" <dwinsemius at comcast.net>
>>>>>>> To: "Steve Sidney" <sbsidney at mweb.co.za>
>>>>>>> Cc: <r-help at r-project.org>
>>>>>>> Sent: Sunday, January 17, 2010 4:39 PM
>>>>>>> Subject: Re: [R] Help using Cast (Text) Version
>>>>>>>
>>>>>>>
>>>>>>>>
>>>>>>>> On Jan 17, 2010, at 5:31 AM, Steve Sidney wrote:
>>>>>>>>
>>>>>>>>> Sorry to repeat the message, not sure if the HTML version has
>>>>>>>>> been received. Apologies for the duplication.
>>>>>>>>>
>>>>>>>>> Dear list
>>>>>>>>>
>>>>>>>>> I am trying to count the number of occurrences in a column of a
>>>>>>>>> data frame, and there is missing data identified by NA.
>>>>>>>>>
>>>>>>>>> I am able to melt and cast the data correctly as well as sum the
>>>>>>>>> occurrences using margins and sum.
>>>>>>>>>
>>>>>>>>> Here are the melt and cast commands
>>>>>>>>>
>>>>>>>>> bw = melt(res, id=c("lab","r"), "pf_zbw")
>>>>>>>>> b = cast(bw, lab ~ r, sum, margins = T)
>>>>>>>>>
>>>>>>>>> Sample Data (before using sum and margins)
>>>>>>>>>
>>>>>>>>>    lab    1  2  3  4  5  6
>>>>>>>>>  1 4er66  1 NA  1  0 NA  0
>>>>>>>>>  2 4gcyi  0  0  1  0  0  0
>>>>>>>>>  3 5d3hh  0  0  0 NA  0  0
>>>>>>>>>  4 5d3wt  0  0  0  0  0  0
>>>>>>>>> .
>>>>>>>>> . lines deleted to save space
>>>>>>>>> .
>>>>>>>>> 69 v3st5 NA NA  1 NA NA NA
>>>>>>>>> 70 a22g5 NA  0 NA NA NA NA
>>>>>>>>> 71 b5dd3 NA  0 NA NA NA NA
>>>>>>>>> 72 g44d2 NA  0 NA NA NA NA
>>>>>>>>>
>>>>>>>>> Data after using sum and margins
>>>>>>>>>
>>>>>>>>>    lab    1  2  3  4  5  6 (all)
>>>>>>>>>  1 4er66  1  0  1  0  0  0     2
>>>>>>>>>  2 4gcyi  0  0  1  0  0  0     1
>>>>>>>>>  3 5d3hh  0  0  0  0  0  0     0
>>>>>>>>>  4 5d3wt  0  0  0  0  0  0     0
>>>>>>>>>  5 6n44r  0  0  0  0  0  0     0
>>>>>>>>> .
>>>>>>>>> . lines deleted to save space
>>>>>>>>> .
>>>>>>>>> 70 a22g5  0  0  0  0  0  0     0
>>>>>>>>> 71 b5dd3  0  0  0  0  0  0     0
>>>>>>>>> 72 g44d2  0  0  0  0  0  0     0
>>>>>>>>> 73 (all)  5  2  4  3  5  7    26
>>>>>>>>>
>>>>>>>>> Using length just tells me how many total rows there are.
>>>>>>>>
>>>>>>>>
>>>>>>>>> What I need to do is count how many rows contain valid data, in
>>>>>>>>> this case either a one (1) or a zero (0) in b.
>>>>>>>>
>>>>>>>> I'm guessing that you mean to apply that test to the column in b
>>>>>>>> labeled "(all)". If that's the case, then something like
>>>>>>>> (obviously untested):
>>>>>>>>
>>>>>>>> sum( b$'(all)' == 1 | b$'(all)' == 0)
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>>
>>>>>>>>> I have a report to construct for tomorrow (Mon), so any help would
>>>>>>>>> be appreciated.
>>>>>>>>>
>>>>>>>>> Regards
>>>>>>>>> Steve
>>>>>>>>
>>>>>>>> David Winsemius, MD
>>>>>>>> Heritage Laboratories
>>>>>>>> West Hartford, CT
>>>>>>>>
>>> <Test_data_b.txt><Test_data_res.txt><Annual Results NLA WMS Ver1.r>