[R] selecting a subset of a matrix based on a value occurring in 5 records

Bob Green bgreen at dyson.brisnet.org.au
Wed Dec 24 13:16:55 CET 2008


Hello,

>I am hoping for some advice as to how I might create a subset of a 
>matrix. The matrix is 176 x 3530. The rows are individual records 
>and the columns are words. I want to create a new matrix that 
>consists only of words which occur in at least 5 records. For example, 
>if column 7 is "charges" and this appears in only 4 records/rows, 
>this variable would not be included, whereas if column 109 is the 
>word "monitor" and occurs in 95 records, it would be saved into the 
>new matrix. Values in the matrix are counts: if a word 
>does not occur in a record the cell contains a zero, whereas if it 
>occurs 7 times there is a value of 7 for that record. It is the 
>number of records, rather than the column total, that is the 
>criterion for determining inclusion in the new matrix.
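
To make the rule concrete, here is a minimal sketch in base R on a toy matrix. The object names (m, min_records, records_with_word, keep, m_sub) and the toy values are made up for illustration; the real matrix is 176 x 3530. One possible approach, assuming the data are held in an ordinary numeric matrix, is to count the non-zero cells in each column and keep the columns whose count reaches the threshold:

```r
# Toy stand-in for the record-by-word count matrix (8 records, 3 words).
m <- cbind(charges = c(1, 0, 3, 0, 2, 1, 0, 0),  # occurs in 4 records
           monitor = c(7, 1, 0, 2, 1, 1, 3, 0),  # occurs in 6 records
           rare    = c(0, 0, 0, 0, 0, 0, 0, 9))  # 1 record, column total 9

min_records <- 5  # a word must occur in at least this many records

# m > 0 is TRUE wherever a word occurs at all, so colSums() of that
# logical matrix counts records per word rather than total occurrences.
records_with_word <- colSums(m > 0)

# Logical index of the columns to keep.
keep <- records_with_word >= min_records

# drop = FALSE keeps the result a matrix even if only one column survives.
m_sub <- m[, keep, drop = FALSE]
```

In this toy case "rare" has a column total of 9 but occurs in only one record, so it is dropped, which is the distinction between record count and column total described above.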


Any suggestions on how I might reduce the size of this matrix so as 
to include only those columns in which a word occurs in at least 5 
records are much appreciated,

regards

Bob


