[R] cleanse columns and unwanted rows
David Winsemius
dwinsemius at comcast.net
Sat Nov 14 03:20:07 CET 2009
On Nov 13, 2009, at 2:32 PM, frenchcr wrote:
> hello folks,
>
> I'm trying to clean out a large file with data I don't need.
> The column I'm manipulating in the file is called "legal_status".
> There are three kinds of rows I want to remove:
> those that have "Private", "Private (Op", or "Unknown" in the
> legal_status column.
>
>
> I wrote this code but it says I'm missing a TRUE/FALSE thingy... I'm
> lost... here's the code...
>
Come on, "frenchcr". Just copy and post the damned error message.
>
> cleanse <- function(a){
> data1<-a
>
> for (i in 1:dim(data1)[1])
> {
> if (data1[i,"
> {
> data1[i,"legal_status"]<-data1[-i,"legal_status"]
That returns the column with row i left out (everything but one
particular row); it does not delete row i.
> }
> if (data1[i,""){
> data1[i,"legal_status"]<-data1[-i,"legal_status"]
ditto
> }
> if (data1[i,""){
> data1[i,"legal_status"]<-data1[-i,"legal_status"]
> }
> }
Makes for a lot of data.frame copying, even if you hadn't sabotaged
the registration of the indexing with the shrinking dataframe.
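A small sketch of the two points above, on a made-up data frame (the name `toy` and its values are assumptions for illustration, not the poster's data). It shows that a negative index only returns a shortened copy, and that rows are dropped by assigning a subsetted frame back. The last part is a guess at the unposted error: an `if()` condition that evaluates to NA fails with a TRUE/FALSE message.

```r
# Hypothetical toy data; values are invented for illustration.
toy <- data.frame(id = 1:4,
                  legal_status = c("Public", "Private", "Unknown", "Public"),
                  stringsAsFactors = FALSE)

# A negative index returns a copy without that element; nothing is deleted.
dropped <- toy[-2, "legal_status"]
length(dropped)                      # one fewer than nrow(toy)

# Rows go away when the subsetted frame is assigned, in one vectorised step.
keep <- toy$legal_status != "Private"
trimmed <- toy[keep, ]

# Possibly the error in question: if() cannot work with an NA condition.
msg <- tryCatch(if (NA) "yes", error = function(e) conditionMessage(e))
msg
```

No row-by-row loop is needed, so the row numbers never fall out of step with a shrinking data frame.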
> return(data1)
> }
> new_data<-cleanse(data)
new_data <- subset(data, legal_status != "Private" & legal_status !=
"Private (Op" & legal_status != "Unknown")
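For anyone wanting to try the subset() call before running it on the real file, here is a minimal sketch on invented data (`toy` is a hypothetical stand-in for the poster's data frame, and "Private (Op" is spelled with the space used in the original question). It also shows one behaviour worth knowing: subset() treats a missing condition as FALSE, so rows with NA in legal_status are dropped along with the unwanted ones.

```r
# Hypothetical stand-in for the real data; values are invented.
toy <- data.frame(id = 1:5,
                  legal_status = c("Public", "Private", "Private (Op",
                                   "Unknown", NA),
                  stringsAsFactors = FALSE)

kept <- subset(toy, legal_status != "Private" &
                    legal_status != "Private (Op" &
                    legal_status != "Unknown")

# Only the "Public" row survives; the NA row is gone as well,
# because NA != "Private" is NA and subset() counts that as FALSE.
kept
```

If rows with a missing legal_status should be kept, the condition needs an explicit is.na(legal_status) alternative.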
Or maybe:
"%not-in%" <- function(x, table) match(x, table, nomatch = 0) == 0
new_data <- subset(data, legal_status %not-in% c( "Private" ,
"Private (Op" , "Unknown") )
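A short sketch of how the %not-in% helper behaves, using an invented vector (`status` and `unwanted` are hypothetical names). It gives the same answer as negating the built-in %in%, and unlike the chain of != tests it returns TRUE for NA, so rows with a missing legal_status would be kept rather than dropped.

```r
# Same helper as above, repeated so the sketch runs on its own.
"%not-in%" <- function(x, table) match(x, table, nomatch = 0) == 0

unwanted <- c("Private", "Private (Op", "Unknown")
status   <- c("Public", "Private", NA, "Unknown")   # invented values

a <- status %not-in% unwanted     # helper from the post
b <- !(status %in% unwanted)      # built-in equivalent

# NA matches nothing in 'unwanted', so it counts as "not in".
a
identical(a, b)
```

Which of the two behaviours is right depends on whether rows with a missing legal_status belong in the cleaned file.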
>
--
David Winsemius, MD
Heritage Laboratories
West Hartford, CT