[BioC] remove NA from named character vector
Iain Gallagher
iaingallagher at btopenworld.com
Fri Jul 22 13:28:39 CEST 2011
Hi Axel
I'm sure I knew that! Leaky brain!
Thanks
i
--- On Fri, 22/7/11, axel.klenk at actelion.com <axel.klenk at actelion.com> wrote:
> From: axel.klenk at actelion.com <axel.klenk at actelion.com>
> Subject: Re: [BioC] remove NA from named character vector
> To: "Iain Gallagher" <iaingallagher at btopenworld.com>
> Cc: "bioconductor" <bioconductor at stat.math.ethz.ch>, bioconductor-bounces at r-project.org
> Date: Friday, 22 July, 2011, 12:11
> Hi Iain,
>
> you cannot test for NA using the == operator; you'll have to use is.na(),
> e.g.
>
> which(is.na(egs))
>
> or, if you just want to get rid of them:
>
> na.omit(egs)
>
> HTH,
>
> - axel
>
>
> Axel Klenk
> Research Informatician
> Actelion Pharmaceuticals Ltd / Gewerbestrasse 16 / CH-4123 Allschwil /
> Switzerland
>
>
>
>
> From: Iain Gallagher <iaingallagher at btopenworld.com>
> To: bioconductor <bioconductor at stat.math.ethz.ch>
> Date: 22.07.2011 13:03
> Subject: [BioC] remove NA from named character vector
> Sent by: bioconductor-bounces at r-project.org
>
>
>
> Hi List
>
> This is likely a trivial problem but it's annoying me. I am mapping from
> Bos taurus Ensembl IDs to symbols. I can do this in biomaRt, but use of the
> org.Bt.eg.db package means I'm not tied to an internet connection.
>
> A toy example:
>
> library(org.Bt.eg.db)
> ens <- c('ENSBTAG00000004218', 'ENSBTAG00000004270',
>          'ENSBTAG00000004578', 'ENSBTAG00000004608')
> egs <- unlist(mget(ens, revmap(org.Bt.egENSEMBL), ifnotfound=NA))
>
> egs
>
> ENSBTAG00000004218 ENSBTAG00000004270 ENSBTAG00000004578 ENSBTAG00000004608
>           "617660"           "407106"                 NA        "100138951"
>
>
> # a named character vector with one NA
>
> # now get symbols
> syms <- unlist(mget(egs, org.Bt.egSYMBOL, ifnotfound=NA))
>
> # throws an error - fair enough - need to drop the NA
>
> which(egs == NA)
>
> #gives named integer(0) - hmm
> class(egs)
> #gives [1] "character" - so I'm quite confused now.
>
> NA %in% egs
> #gives [1] TRUE
>
>
> How do I identify which entries in 'egs' are NA so I can remove them? It's
> trivial here but the dataset I'm working with is in the thousands.
>
> Thanks
>
> iain
>
> > sessionInfo()
> R version 2.13.1 (2011-07-08)
> Platform: x86_64-pc-linux-gnu (64-bit)
>
> locale:
>  [1] LC_CTYPE=en_GB.utf8       LC_NUMERIC=C
>  [3] LC_TIME=en_GB.utf8        LC_COLLATE=en_GB.utf8
>  [5] LC_MONETARY=C             LC_MESSAGES=en_GB.utf8
>  [7] LC_PAPER=en_GB.utf8       LC_NAME=C
>  [9] LC_ADDRESS=C              LC_TELEPHONE=C
> [11] LC_MEASUREMENT=en_GB.utf8 LC_IDENTIFICATION=C
>
> attached base packages:
> [1] stats     graphics  grDevices utils     datasets  methods   base
>
> other attached packages:
> [1] org.Bt.eg.db_2.5.0   RSQLite_0.9-4        DBI_0.2-5
> [4] AnnotationDbi_1.14.1 Biobase_2.10.0
>
>
> _______________________________________________
> Bioconductor mailing list
> Bioconductor at r-project.org
> https://stat.ethz.ch/mailman/listinfo/bioconductor
> Search the archives:
> http://news.gmane.org/gmane.science.biology.informatics.conductor
>
>
>
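
For reference, the behaviour discussed in this thread can be reproduced in base R without the annotation package. What follows is only a minimal sketch: it assumes a hand-built 'egs' vector carrying the values shown in the original post, and the names eq_na, hits_eq, na_idx, egs_clean and egs_omit are illustrative, not part of any package.

```r
# Hand-built stand-in for 'egs' (values copied from the original post;
# in the thread this vector came from mget() on org.Bt.egENSEMBL).
egs <- c(ENSBTAG00000004218 = "617660",
         ENSBTAG00000004270 = "407106",
         ENSBTAG00000004578 = NA,
         ENSBTAG00000004608 = "100138951")

# Any comparison with NA yields NA, never TRUE, so '==' cannot find NAs.
eq_na   <- egs == NA         # a logical vector that is NA everywhere
hits_eq <- which(egs == NA)  # which() skips NA, so this is empty

# is.na() is the test for missing values.
na_idx <- which(is.na(egs))  # position (and name) of the missing entry

# Two ways to drop the NAs while keeping the names:
egs_clean <- egs[!is.na(egs)]  # plain subsetting
egs_omit  <- na.omit(egs)      # same values, plus an "na.action" attribute
```

Either egs_clean or egs_omit could then be passed on to the symbol lookup; plain subsetting with !is.na() avoids the extra "na.action" attribute that na.omit() attaches to its result.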