[R] applying strsplit to a whole column

David Winsemius dwinsemius at comcast.net
Wed Aug 4 20:10:53 CEST 2010


On Aug 4, 2010, at 1:42 PM, Dimitri Liakhovitski wrote:

> I am sorry, I'd like to split my column ("names") such that all the
> beginning of a string ("X..") is gone and only the rest of the text is
> left.

I could not tell whether it was the string "X.." or the pattern "X.."  
that was your goal for matching and removal.
>
> x<-data.frame(names=c("X..aba","X..abb","X..abc","X..abd"))
> x$names<-as.character(x$names)

a) Instead of "names", which is a heavily used function name, use  
something more specific. Otherwise you get:
 > names(x)
[1] "names"  # and thereby avoid list comments about canines.

b) Instead of letting data.frame() turn the strings into a factor and  
then coercing it back to a character vector, use stringsAsFactors = FALSE.

 > x<-data.frame(nam1=c("X..aba","X..abb","X..abc","X..abd"),  
stringsAsFactors=FALSE)
This is the pattern version (an unescaped "." matches any character):

 > x$nam1 <- gsub("X..",'', x$nam1)
 > x
   nam1
1   aba
2   abb
3   abc
4   abd
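
A small sketch of what the pattern version can do on other input, since
the unescaped dots act as wildcards and gsub() matches anywhere in the
string. The vector v is made up for illustration and is not from the
original data:

```r
# Hypothetical test vector (an assumption, not from the original post)
v <- c("X..aba", "Xylophone", "maX..ab")

# Unanchored, unescaped: "X" plus ANY two characters, anywhere in the string
wild <- gsub("X..", "", v)
# "aba" "ophone" "maab"

# Anchored and escaped: only a literal "X.." at the start of the string
strict <- sub("^X\\.\\.", "", v)
# "aba" "Xylophone" "maX..ab"
```

If only a leading prefix should go, the anchored form is probably the
safer choice.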

This is the string version (the dots are escaped as "\\." so only  
literal dots match, and "+" takes one or more of them):
 > x<-data.frame(nam1=c("X......aba","X.y.abb","X..abc","X..abd"),  
stringsAsFactors=FALSE)
 >  x$nam1 <- gsub("X\\.+",'', x$nam1)
 > x
    nam1
1   aba
2 y.abb
3   abc
4   abd
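
If the goal is to remove exactly the three characters "X.." and nothing
else, one possible alternative is fixed = TRUE, which switches off
regular-expression matching altogether. A sketch on the same example
values (note that a fixed match cannot be anchored, so sub() is used
here to remove only the first occurrence):

```r
nam1 <- c("X......aba", "X.y.abb", "X..abc", "X..abd")

# Literal match of the three characters "X..", first occurrence only
literal <- sub("X..", "", nam1, fixed = TRUE)
# "....aba" "X.y.abb" "abc" "abd"

# For comparison, the escaped regex with "+" removes the whole run of dots
run_of_dots <- gsub("X\\.+", "", nam1)
# "aba" "y.abb" "abc" "abd"
```

Which of the two is wanted depends on whether "X......aba" should lose
all of its dots or only the first two.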


> (x)
> str(x)
>
> Can't figure out how to apply strsplit in this situation - without
> using a loop. I hope it's possible to do it without a loop - is it?
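
For completeness, strsplit() itself is already vectorized over its first
argument, so no explicit loop is needed; it returns a list with one
character vector per input string. A minimal sketch, assuming every
entry really starts with the literal prefix "X..":

```r
x <- data.frame(nam1 = c("X..aba", "X..abb", "X..abc", "X..abd"),
                stringsAsFactors = FALSE)

# One list element per row; a match at the start yields "" as first piece
parts <- strsplit(x$nam1, "X..", fixed = TRUE)

# Pick the second piece of each element (assumes the prefix is present)
x$nam1 <- vapply(parts, function(p) p[2], character(1))
# x$nam1 is now "aba" "abb" "abc" "abd"
```

For a plain prefix removal, sub() or gsub() as above is likely simpler
than splitting and re-extracting.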
-- 

David Winsemius, MD
West Hartford, CT


