[BioC] remove NA from named character vector
axel.klenk at actelion.com
Fri Jul 22 13:11:19 CEST 2011
Hi Iain,
you cannot test for NA using the == operator: any comparison with NA
itself returns NA, so which() never sees a TRUE. You'll have to use
is.na(), e.g.
which(is.na(egs))
or, if you just want to get rid of them:
na.omit(egs)
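To illustrate the difference, here is a minimal sketch in base R. It does not need org.Bt.eg.db: the vector is rebuilt by hand from the values shown in the original post, and the names na_idx, egs_clean and egs_omit are just made up for this example.

```r
# Hand-built stand-in for 'egs' (values copied from the post below).
egs <- c(ENSBTAG00000004218 = "617660",
         ENSBTAG00000004270 = "407106",
         ENSBTAG00000004578 = NA,
         ENSBTAG00000004608 = "100138951")

# Comparing anything with NA gives NA, never TRUE, so which() finds nothing.
cmp <- egs == NA
which(cmp)

# is.na() returns a proper logical vector, so which() can locate the NA.
na_idx <- which(is.na(egs))

# Logical indexing drops the NA entries and keeps the names.
egs_clean <- egs[!is.na(egs)]

# na.omit() drops the same entries but also attaches an "na.action"
# attribute that records what was removed.
egs_omit <- na.omit(egs)
```

Either egs_clean or egs_omit should then be usable as the first argument of the second mget() call from the original post. The logical-indexing form is handy if you would rather not carry the extra attribute around.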
HTH,
- axel
Axel Klenk
Research Informatician
Actelion Pharmaceuticals Ltd / Gewerbestrasse 16 / CH-4123 Allschwil /
Switzerland
From: Iain Gallagher <iaingallagher at btopenworld.com>
To: bioconductor <bioconductor at stat.math.ethz.ch>
Date: 22.07.2011 13:03
Subject: [BioC] remove NA from named character vector
Sent by: bioconductor-bounces at r-project.org
Hi List
This is likely a trivial problem but it's annoying me. I am mapping from
Bos taurus Ensembl IDs to gene symbols. I can do this in biomaRt but use of
the org.Bt.eg.db package means I'm not tied to an internet connection.
A toy example:
library(org.Bt.eg.db)
ens <- c('ENSBTAG00000004218', 'ENSBTAG00000004270', 'ENSBTAG00000004578',
'ENSBTAG00000004608')
egs <- unlist(mget(ens, revmap(org.Bt.egENSEMBL), ifnotfound=NA))
egs
ENSBTAG00000004218 ENSBTAG00000004270 ENSBTAG00000004578 ENSBTAG00000004608
          "617660"           "407106"                 NA        "100138951"
# a named character vector with one NA
#now get symbols
syms <- unlist(mget(egs, org.Bt.egSYMBOL, ifnotfound=NA))
#throws an error - fair enough - need to drop the NA
which(egs == NA)
#gives named integer(0) - hmm
class(egs)
#gives [1] "character" - so I'm quite confused now.
NA %in% egs
#gives [1] TRUE
How do I identify which entries in 'egs' are NA so I can remove them? It's
trivial here but the dataset I'm working with is in the thousands.
Thanks
iain
> sessionInfo()
R version 2.13.1 (2011-07-08)
Platform: x86_64-pc-linux-gnu (64-bit)
locale:
[1] LC_CTYPE=en_GB.utf8 LC_NUMERIC=C
[3] LC_TIME=en_GB.utf8 LC_COLLATE=en_GB.utf8
[5] LC_MONETARY=C LC_MESSAGES=en_GB.utf8
[7] LC_PAPER=en_GB.utf8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_GB.utf8 LC_IDENTIFICATION=C
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] org.Bt.eg.db_2.5.0 RSQLite_0.9-4 DBI_0.2-5
[4] AnnotationDbi_1.14.1 Biobase_2.10.0