[R] Help using Cast (Text) Version

David Winsemius dwinsemius at comcast.net
Mon Jan 18 14:53:44 CET 2010


On Jan 18, 2010, at 7:58 AM, Steve Sidney wrote:

> Hi David
>
> Thanks for your patience, as well as thanks to Dennis Murphy and  
> James Rome for trying to help.
>
> I have tried all your suggestions but still no joy.
>
> In order to try and resolve the problem I am attaching the following  
> files, hope the system allows this.
>
> 1) Test_data_res.txt (used dput and this is all the data to be  
> evaluated )
> 2) Test_data_b.txt ( after performing the melt-cast. See the code)
> 3) Annual Results NLA WMS Ver1.r ( the code for one of the  
> parameters to be evaluated. In this case SPC)
>
> Background; the data is from a laboratory Proficiency Testing Scheme  
> and the z-scores outside the |3| range, are identified as "fails".  
> My code assigns a 1 or 0 depending on this evaluation and because  
> not every lab participates in every round NA are assigned where  
> there are no results.
>
> What I am looking for is the following for each round (1-6)
> a) The total number of participants, which in this case are
> represented by 1's and 0's per round

 > apply(b[,-1], 2, function(x) sum(!is.na(x) ) )
[1] 32 21 21 18 14 15
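
Counting participants per round amounts to counting the non-missing entries in each round column. A minimal sketch on made-up data follows; `b_toy` and its lab names are hypothetical stand-ins for the attached file, which is not reproduced here.

```r
# Hypothetical toy data in the same wide layout as the sample: one row per
# lab, one column per round, NA where the lab did not take part.
b_toy <- data.frame(
  lab = c("lab_a", "lab_b", "lab_c", "lab_d"),
  `1` = c(1, 0, 0, NA),
  `2` = c(NA, 0, 0, 0),
  `3` = c(1, 1, NA, NA),
  check.names = FALSE
)

# Participants per round = number of non-missing entries in each round column.
participants <- colSums(!is.na(b_toy[, -1]))

# The same count written with apply(), one column at a time.
participants_apply <- apply(b_toy[, -1], 2, function(x) sum(!is.na(x)))

print(participants)
```

Both forms drop the first column (`lab`) before counting, so only the round columns are tallied.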



> b) The total number of 1's, ie Fails per round

 > apply(b[,-1], 2, sum, na.rm=TRUE )
[1] 5 2 4 3 5 7
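
Because fails are coded 1 and passes 0, summing a round column with `na.rm = TRUE` gives the number of fails. The sketch below uses hypothetical toy data (`b_toy` is not the attached file) and also shows why a bare `sum(x == 1)` returns `NA` once a column contains missing values.

```r
# Hypothetical toy data: 1 = fail, 0 = pass, NA = lab did not take part.
b_toy <- data.frame(
  lab = c("lab_a", "lab_b", "lab_c", "lab_d"),
  `1` = c(1, 0, 0, NA),
  `2` = c(NA, 0, 0, 0),
  `3` = c(1, 1, NA, NA),
  check.names = FALSE
)
rounds <- b_toy[, -1]

# Fails per round: the column sum, ignoring labs that did not take part.
fails <- colSums(rounds, na.rm = TRUE)

# Equivalent explicit test; x == 1 is NA for missing entries, so na.rm is
# needed here as well.
fails_eq <- sapply(rounds, function(x) sum(x == 1, na.rm = TRUE))

# Without na.rm the sum is NA as soon as the column holds a single NA.
naive <- sum(rounds[["1"]] == 1)

print(fails)
```

If the round columns turned out to be character rather than numeric, they would need converting (for example with `as.integer`) before summing; that is an assumption to check with `str()` on the real data.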

>
>
>
> Regards
> Steve
>
>
>
> ----- Original Message ----- From: "David Winsemius" <dwinsemius at comcast.net>
> To: "Steve Sidney" <sbsidney at mweb.co.za>
> Cc: <r-help at r-project.org>
> Sent: Monday, January 18, 2010 12:38 AM
> Subject: Re: [R] Help using Cast (Text) Version
>
>
>>
>> On Jan 17, 2010, at 4:37 PM, Steve Sidney wrote:
>>
>>> Well now I am totally baffled !!!!!!!!!!
>>>
>>> Using
>>>
>>> sum( !is.na(b[,3])) I get the total of all col 3 except those that
>>> are NA. Great, that solves the first problem.
>>>
>>> What I can't seem to do is use the same logic to count all the 1's in
>>> that col, which are there before I use the cast with margins.
>>>
>>> So it seems to me that somehow   is wrong and is the part of my  
>>> understanding that's missing.
>>>
>>> My guess is that before using margins and sum in the cast statement
>>> the col is a character type, and in order for == 1 to work I need to
>>> convert this to an integer.
>>
>> You can test your theory with:
>>
>> sum(as.integer(b[,3]) == 1)
>>
>> Or you could post some reproducible data using dput ....
>>
>> -- 
>> David.
>>
>>
>>>
>>> Hope this helps you to understand the problem.
>>>
>>> Regards
>>> Steve
>>>
>>> Your help is much appreciated
>>> ----- Original Message ----- From: "David Winsemius" <dwinsemius at comcast.net>
>>> To: "Steve Sidney" <sbsidney at mweb.co.za>
>>> Cc: <r-help at r-project.org>
>>> Sent: Sunday, January 17, 2010 7:36 PM
>>> Subject: Re: [R] Help using Cast (Text) Version
>>>
>>>
>>>>
>>>> On Jan 17, 2010, at 11:56 AM, Steve Sidney wrote:
>>>>
>>>>> David
>>>>>
>>>>> Thanks, I'll try that... but no, what I need is the total (1's) for
>>>>> each of the rows, labelled 1-6 at the top of each col in the table
>>>>> provided.
>>>>
>>>> Part of my confusion with your request (which remains unaddressed) is
>>>> what you mean by "valid". The melt-cast operation has turned a bunch of
>>>> NA's into 0's which are now indistinguishable from the original 0's. So I
>>>> don't see any way that operating on "b" could tell you the numbers you
>>>> are asking for. If you were working on the original data, "res", you
>>>> might have gotten the column-wise "valid" counts of column 2 with
>>>> something like:
>>>>
>>>> sum( !is.na(res[,2]) )
>>>>
>>>>>
>>>>> What I guess I am not sure of is how to identify the col after the
>>>>> melt and cast.
>>>>
>>>> The cast object represents columns as a list of vectors. The i-th column
>>>> is b[[i]] which could be further referenced as a vector. So the j-th row
>>>> entry for the i-th column would be b[[i]][j].
>>>>
>>>>
>>>>>
>>>>> Steve
>>>>>
>>>>> ----- Original Message ----- From: "David Winsemius" <dwinsemius at comcast.net>
>>>>> To: "Steve Sidney" <sbsidney at mweb.co.za>
>>>>> Cc: <r-help at r-project.org>
>>>>> Sent: Sunday, January 17, 2010 4:39 PM
>>>>> Subject: Re: [R] Help using Cast (Text) Version
>>>>>
>>>>>
>>>>>>
>>>>>> On Jan 17, 2010, at 5:31 AM, Steve Sidney wrote:
>>>>>>
>>>>>>> Sorry to repeat the message, not sure if the HTML version has been
>>>>>>> received - Apologies for duplication
>>>>>>>
>>>>>>> Dear list
>>>>>>>
>>>>>>> I am trying to count the no. of occurrences in a column of a data
>>>>>>> frame and there is missing data identified by NA.
>>>>>>>
>>>>>>> I am able to melt and cast the data correctly as well as sum the
>>>>>>> occurrences using margins and sum.
>>>>>>>
>>>>>>> Here are the melt and cast commands
>>>>>>>
>>>>>>> bw = melt(res, id=c("lab","r"), "pf_zbw")
>>>>>>> b = cast(bw, lab ~ r, sum, margins = T)
>>>>>>>
>>>>>>> Sample Data (before using sum and margins)
>>>>>>>
>>>>>>> lab  1  2  3  4  5  6
>>>>>>> 1  4er66  1 NA  1  0 NA  0
>>>>>>> 2  4gcyi  0  0  1  0  0  0
>>>>>>> 3  5d3hh  0  0  0 NA  0  0
>>>>>>> 4  5d3wt  0  0  0  0  0  0
>>>>>>> .
>>>>>>> . lines deleted to save space
>>>>>>> .
>>>>>>> 69 v3st5 NA NA  1 NA NA NA
>>>>>>> 70 a22g5 NA  0 NA NA NA NA
>>>>>>> 71 b5dd3 NA  0 NA NA NA NA
>>>>>>> 72 g44d2 NA  0 NA NA NA NA
>>>>>>>
>>>>>>> Data after using sum and margins
>>>>>>>
>>>>>>> lab 1 2 3 4 5 6 (all)
>>>>>>> 1  4er66 1 0 1 0 0 0     2
>>>>>>> 2  4gcyi 0 0 1 0 0 0     1
>>>>>>> 3  5d3hh 0 0 0 0 0 0     0
>>>>>>> 4  5d3wt 0 0 0 0 0 0     0
>>>>>>> 5  6n44r 0 0 0 0 0 0     0
>>>>>>> .
>>>>>>> .lines deleted to save space
>>>>>>> .
>>>>>>> 70 a22g5 0 0 0 0 0 0     0
>>>>>>> 71 b5dd3 0 0 0 0 0 0     0
>>>>>>> 72 g44d2 0 0 0 0 0 0     0
>>>>>>> 73 (all) 5 2 4 3 5 7    26
>>>>>>>
>>>>>>> Using length just tells me how many total rows there are.
>>>>>>
>>>>>>
>>>>>>> What I need to do is count how many rows have valid data, in this
>>>>>>> case either a one (1) or a zero (0) in b
>>>>>>
>>>>>> I'm guessing that you mean to apply that test to the column in b
>>>>>> labeled "(all)". If that's the case, then something like (obviously
>>>>>> untested):
>>>>>>
>>>>>> sum( b$'(all)' == 1 | b$'(all)' == 0)
>>>>>>
>>>>>>
>>>>>>
>>>>>>>
>>>>>>> I have a report to construct for tomorrow Mon so any help would be
>>>>>>> appreciated
>>>>>>>
>>>>>>> Regards
>>>>>>> Steve
>>>>>>
>>>>>> David Winsemius, MD
>>>>>> Heritage Laboratories
>>>>>> West Hartford, CT
>>>>>>
>>>>>
>>>>
>>>> David Winsemius, MD
>>>> Heritage Laboratories
>>>> West Hartford, CT
>>>>
>>>>
>>>
>>
>> David Winsemius, MD
>> Heritage Laboratories
>> West Hartford, CT
>>
> <Test_data_b.txt><Test_data_res.txt><Annual Results NLA WMS Ver1.r>

David Winsemius, MD
Heritage Laboratories
West Hartford, CT
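
The point made earlier in the thread, that summing during the reshape turns absent results into 0's that cannot be told apart from genuine passes, can be illustrated with a base-R analogy. This is a sketch only, not the reshape package's own code: `res_toy` and its column `pf` are hypothetical, and it assumes that a lab which skipped a round simply has no row for that round. `tapply()` stands in for a cast that leaves empty cells as NA, and `xtabs()` for one that sums over them.

```r
# Hypothetical long-format results: one row per lab per round actually entered.
res_toy <- data.frame(
  lab = c("lab_a", "lab_a", "lab_b", "lab_b", "lab_b"),
  r   = c(1, 3, 1, 2, 3),
  pf  = c(1, 1, 0, 0, 1)
)

# tapply() leaves NA where a lab/round combination has no rows ...
wide_na <- with(res_toy, tapply(pf, list(lab, r), sum))

# ... whereas xtabs() sums over the empty cell and reports 0, because the
# sum of zero values is 0.
wide_zero <- xtabs(pf ~ lab + r, data = res_toy)

# Participation can only be recovered from the version that kept the NA's.
participants <- colSums(!is.na(wide_na))

# Column-then-row access as described in the thread: b[[i]][j].
wide_df <- as.data.frame(wide_na)
entry <- wide_df[[3]][2]  # third column (round 3), second row (lab_b)

print(wide_na)
print(wide_zero)
```

In `wide_zero` the cell for `lab_a` in round 2 (absent) and the cell for `lab_b` in round 1 (a genuine pass) both read 0, which is why the counts have to be taken before, or without, the summing step.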


