[R] how to extract word before /// in a data frame contain many thousands rows.
arun
smartpink111 at yahoo.com
Fri Aug 1 07:28:38 CEST 2014
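Try this, where dat is the dataset:
library(stringr)
# Match a run of alphanumerics that is followed by a space and a slash;
# the lookahead (?= \\/) keeps the separator itself out of the match.
# Note: stringr 1.0.0 and later deprecate perl() in favour of regex().
res <- str_extract(dat$Gene.Symbol, perl('[[:alnum:]]+(?= \\/)'))
# Rows without /// yield NA, so drop them.
res[!is.na(res)]
#[1] "CDH23"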
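A base R sketch of the same idea may be useful where loading stringr is not wanted. It assumes the separator is always the literal ///; the names has_sep and first_symbol are illustrative and not from the original post, and only the first three rows of the example data are rebuilt here.

```r
# Rebuild a small part of the example data from the question.
dat <- data.frame(
  Probe.Set.ID = c("1552301_a_at", "1552436_a_at", "1552477_a_at"),
  Gene.Symbol  = c("CORO6", "CDH23 /// LOC100653137", "IRF6"),
  stringsAsFactors = FALSE
)

# Flag the rows whose Gene.Symbol contains the literal "///" separator.
has_sep <- grepl("///", dat$Gene.Symbol, fixed = TRUE)

# For those rows, delete everything from the first separator onward
# (including any whitespace before it), leaving the leading word.
first_symbol <- sub("[[:space:]]*///.*$", "", dat$Gene.Symbol[has_sep])
first_symbol
```

Because this strips the separator and what follows instead of matching the symbol itself, a leading symbol that contains a hyphen or a dot would be kept whole.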
A.K.
On Thursday, July 31, 2014 9:54 PM, Stephen HK Wong <honkit at stanford.edu> wrote:
Dear All,
I would appreciate your help with this. I have a data frame containing many thousands of rows, some of which have a /// symbol, as shown in row 2. I want to extract the word before ///, which in this case is CDH23. Many thanks.
   Probe.Set.ID            Gene.Symbol
1  1552301_a_at                  CORO6
2  1552436_a_at CDH23 /// LOC100653137
3  1552477_a_at                   IRF6
4  1552685_a_at                  GRHL1
5    1552742_at                  KCNH8
6  1552752_a_at                  CADM2
7    1552799_at                TSNARE1
8  1552897_a_at                  KCNG3
9  1552902_a_at                  FOXP2
10   1552903_at               B4GALNT2
structure(list(Probe.Set.ID = c("1552301_a_at", "1552436_a_at",
"1552477_a_at", "1552685_a_at", "1552742_at", "1552752_a_at",
"1552799_at", "1552897_a_at", "1552902_a_at", "1552903_at"),
Gene.Symbol = c("CORO6", "CDH23 /// LOC100653137", "IRF6",
"GRHL1", "KCNH8", "CADM2", "TSNARE1", "KCNG3", "FOXP2", "B4GALNT2"
)), .Names = c("Probe.Set.ID", "Gene.Symbol"), row.names = c(NA,
10L), class = "data.frame")
Stephen HK Wong
______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.