[R] cleanse columns and unwanted rows

David Winsemius dwinsemius at comcast.net
Sat Nov 14 03:20:07 CET 2009


On Nov 13, 2009, at 2:32 PM, frenchcr wrote:

> hello folks,
>
> I'm trying to clean out a large file with data I don't need.
> The column I'm manipulating in the file is called "legal_status".
> There are three kinds of rows I want to remove:
> those that have "Private", "Private (Op", or "Unknown" in the
> legal_status column.
>
>
> I wrote this code but it says I'm missing a TRUE/FALSE thingy... I'm
> lost... here's the code...
>

Come on, "frenchcr". Just copy and post the damned error message.
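Without the actual message this is only a guess, but a "TRUE/FALSE" complaint from if() usually means the condition evaluated to NA. A minimal sketch, assuming a hypothetical toy vector (not the poster's data) with a missing value in it:

```r
# Hypothetical toy column with a missing value; not the poster's data.
status <- c("Private", NA, "Public")

# Comparing NA with a string yields NA, and if() cannot branch on NA,
# so R signals an error. tryCatch() captures the message text.
result <- tryCatch(
  {
    if (status[2] == "Private") "drop" else "keep"
  },
  error = function(e) conditionMessage(e)
)
print(result)
```

If that is the cause here, an element-by-element if() on the column will fail on the first NA it meets.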

>
> cleanse <- function(a){
> data1<-a
>
>  for (i in 1:dim(data1)[1])

>  {
>    if (data1[i,"
>    {
>    data1[i,"legal_status"]<-data1[-i,"legal_status"]

That will return everything but one particular row, i.e. all the other values in the column, not a data frame with row i removed.
>    }
>    if (data1[i,""){
>    data1[i,"legal_status"]<-data1[-i,"legal_status"]

ditto
>    }
>    if (data1[i,""){
>    data1[i,"legal_status"]<-data1[-i,"legal_status"]
>    }
> }

That makes for a lot of data.frame copying, even if you hadn't sabotaged
the alignment of the loop index with the shrinking data frame.
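To illustrate both points, here is a small sketch on a made-up data frame (the name toy and its values are assumptions; only the column name comes from the thread). It shows what negative indexing returns, and how deleting rows inside a loop over the original row count lets an unwanted row slip through:

```r
# Hypothetical toy data; only the column name comes from the thread.
toy <- data.frame(id = 1:4,
                  legal_status = c("Private", "Unknown", "Public", "Public"),
                  stringsAsFactors = FALSE)

# Negative indexing returns the other values; toy itself is unchanged.
others <- toy[-2, "legal_status"]
print(others)

# Deleting rows while looping over the ORIGINAL row numbers: after row 1
# is dropped, the old row 2 ("Unknown") becomes row 1 and is never tested.
shrunk <- toy
for (i in 1:nrow(toy)) {
  # Guard against indexing past the end of the shrunken data frame.
  if (i <= nrow(shrunk) &&
      shrunk[i, "legal_status"] %in% c("Private", "Unknown")) {
    shrunk <- shrunk[-i, ]   # each deletion also copies the data frame
  }
}
print(shrunk)
```

In this toy case the "Unknown" row survives the loop, which is the misalignment described above.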

> return(data1)
> }
> new_data<-cleanse(data)

new_data <- subset(data, legal_status != "Private" &
                         legal_status != "Private (Op" &
                         legal_status != "Unknown")
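A quick way to check the subset() approach is to try it on a few made-up rows first. The sketch below assumes a hypothetical data frame toy, and it spells the middle value "Private (Op" as in the question; the exact spelling should be verified against the real column (for example with unique(data$legal_status)).

```r
# Hypothetical toy data, including one missing value.
toy <- data.frame(id = 1:5,
                  legal_status = c("Private", "Public", "Private (Op",
                                   "Unknown", NA),
                  stringsAsFactors = FALSE)

# Keep rows whose legal_status is none of the three unwanted values.
kept <- subset(toy, legal_status != "Private" &
                    legal_status != "Private (Op" &
                    legal_status != "Unknown")
print(kept)
```

Note that subset() treats an NA condition as FALSE, so the row with a missing legal_status is dropped along with the unwanted ones.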

Or maybe:
"%not-in%" <- function(x, table) match(x, table, nomatch = 0) == 0
new_data <- subset(data, legal_status %not-in% c("Private", "Private (Op", "Unknown"))
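The operator version can be tried the same way. This sketch reuses the definition above on a hypothetical data frame toy (made-up values, with "Private (Op" spelled as in the question) and compares it with the built-in spelling !(x %in% table), which should give the same rows:

```r
# Same definition as above: TRUE where x has no match in table.
"%not-in%" <- function(x, table) match(x, table, nomatch = 0) == 0

# Hypothetical toy data, including one missing value.
toy <- data.frame(id = 1:5,
                  legal_status = c("Private", "Public", "Private (Op",
                                   "Unknown", NA),
                  stringsAsFactors = FALSE)
unwanted <- c("Private", "Private (Op", "Unknown")

kept <- subset(toy, legal_status %not-in% unwanted)
same <- subset(toy, !(legal_status %in% unwanted))
print(kept)
```

Unlike the chain of != tests, match() does not find NA in the table, so a row with a missing legal_status counts as "not in" the list and is kept. Which behaviour is wanted depends on the data.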
>
-- 


David Winsemius, MD
Heritage Laboratories
West Hartford, CT



