[R] how to separate char and num within a variable
Bill Hyman
billhyman1 at yahoo.com
Fri Feb 6 01:46:47 CET 2009
Thx a lot!
----- Original Message ----
From: Marc Schwartz <marc_schwartz at comcast.net>
To: Bill Hyman <billhyman1 at yahoo.com>
Cc: r-help at r-project.org
Sent: Thursday, February 5, 2009 3:39:53 PM
Subject: Re: [R] how to separate char and num within a variable
on 02/05/2009 05:20 PM Bill Hyman wrote:
> Hi all,
>
> I read in a column which looks like "chr1:000889594-000889638", and
> need to break it into three columns like "chr1:", "000889594" and
> "000889638". How can I do this in R? Thanks a lot for your suggestions!
See ?strsplit
Vec <- "chr1:000889594-000889638"
> Vec
[1] "chr1:000889594-000889638"
# Use a regular expression, defining the 'split' character
# as either ":" or "-", where the vertical bar means 'or':
> strsplit(Vec, split = ":|-")
[[1]]
[1] "chr1" "000889594" "000889638"
Note that the split characters are not retained in the result.
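If the ":" should stay attached to the first piece, as in the "chr1:" of the original question, one possible approach is to capture the pieces with regexec() and regmatches() instead of splitting. This is only a sketch: the pattern below assumes every value has exactly the form shown (some text ending in ":", digits, "-", digits), and the names 'm' and 'parts' are purely illustrative.

```r
Vec <- "chr1:000889594-000889638"

# Capture three groups: everything up to and including ":",
# the digits before "-", and the digits after "-".
m <- regmatches(Vec, regexec("^(.*:)([0-9]+)-([0-9]+)$", Vec))

# The first element of each match is the whole string;
# dropping it leaves only the three captured groups.
parts <- m[[1]][-1]
parts
```

Here 'parts' should hold "chr1:", "000889594" and "000889638". A value that does not fit the pattern yields an empty match, so it is worth checking the lengths before relying on the result.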
Let's presume that you have a column in a data frame of the original
data and wish to split it into 3 columns:
DF <- data.frame(Col = rep(Vec, 10))
> DF
                        Col
1  chr1:000889594-000889638
2  chr1:000889594-000889638
3  chr1:000889594-000889638
4  chr1:000889594-000889638
5  chr1:000889594-000889638
6  chr1:000889594-000889638
7  chr1:000889594-000889638
8  chr1:000889594-000889638
9  chr1:000889594-000889638
10 chr1:000889594-000889638
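Note that in R versions before 4.0.0, 'Col' will be a factor by default
(from 4.0.0 on it is a character vector). strsplit() expects a character
vector, so we coerce with as.character(), which is harmless either way,
and use do.call() to create a character matrix, via rbind(), from the
result: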
> do.call(rbind, strsplit(as.character(DF$Col), split = ":|-"))
      [,1]   [,2]        [,3]
 [1,] "chr1" "000889594" "000889638"
 [2,] "chr1" "000889594" "000889638"
 [3,] "chr1" "000889594" "000889638"
 [4,] "chr1" "000889594" "000889638"
 [5,] "chr1" "000889594" "000889638"
 [6,] "chr1" "000889594" "000889638"
 [7,] "chr1" "000889594" "000889638"
 [8,] "chr1" "000889594" "000889638"
 [9,] "chr1" "000889594" "000889638"
[10,] "chr1" "000889594" "000889638"
See ?regex, ?do.call and ?rbind for more information.
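To end up with three actual columns rather than a character matrix, the matrix can be turned into a data frame. The following is a minimal sketch, not the only way to do it: the column names 'Chr', 'Start' and 'End' and the object names 'Mat' and 'Res' are assumptions chosen for illustration, and it presumes every row splits into exactly three pieces.

```r
Vec <- "chr1:000889594-000889638"
DF <- data.frame(Col = rep(Vec, 10))

# Split each value on ":" or "-" and stack the pieces row by row.
Mat <- do.call(rbind, strsplit(as.character(DF$Col), split = ":|-"))

# Build a data frame with one column per piece.
# as.numeric() converts the coordinates, which drops the leading zeros.
Res <- data.frame(Chr   = Mat[, 1],
                  Start = as.numeric(Mat[, 2]),
                  End   = as.numeric(Mat[, 3]),
                  stringsAsFactors = FALSE)
str(Res)
```

If the leading zeros matter, the second and third columns can be left as character by omitting the as.numeric() calls.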
HTH,
Marc Schwartz