[BioC] NA geneSymbol with lumi
Sebastien Gerega
seb at gerega.net
Fri Nov 16 01:11:12 CET 2007
Sebastien Gerega <seb at ...> writes:
>
> Hi,
> I am using the lumi package to analyse Illumina microarray data.
> When it finally comes to getting the top 10 DE genes with topTable, I get
> many hits with the geneSymbol <NA>. However, if I look up the ProbeIDs
> corresponding to the nuIDs that give <NA>, I find that they do correspond
> to genes. Why aren't they being displayed in the topTable?
> thanks,
> Sebastien
>
>       ID                 geneSymbol logFC     t         P.Value      adj.P.Val    B
> 1917  fwfUovXT3rjAjqbpJU S100A8     -5.307223 -50.43759 9.854174e-09 0.0001383625 8.724832
> 12632 Qd_S7V4OkLjsX3jkt4 KRT6B      -5.281406 -39.54237 3.896317e-08 0.0002735409 8.229157
> 12149 BjSTT6BOqGLhpKKFGI <NA>       -3.118669 -30.01505 1.844180e-07 0.0008631377 7.451766
> 7474  6ipCUUDxcp4ryIj6Uk <NA>       -3.155916 -24.45685 5.835502e-07 0.0013366890 6.716048
> 3831  3nivfFfvk55Rd18lLk <NA>       -2.690362 -24.10891 6.324511e-07 0.0013366890 6.659617
>
I have looked into this problem a little more...
I downloaded the Human6_v2_sequence spreadsheet from the Illumina
website and found that many of the targets that provide NA as
gene symbol have no symbol in the Illumina database either.
For example:
      ID          geneSymbol
5903  ILMN_21212  FAM43A
3103  ILMN_1425   FOXO4
11993 ILMN_6504   PPL
5153  ILMN_19390  ST3GAL4
1723  ILMN_12716  CREB3L2
4484  ILMN_17676  TNS3
2700  ILMN_138461 <NA>
1358  ILMN_12133  FSCN1
3507  ILMN_15271  CITED4
12401 ILMN_73087  <NA>
ILMN_73087 has NA as its gene symbol and has no gene symbol in the
Illumina DB either.
However, ILMN_138461 has NA as its gene symbol but does have a
gene symbol in the Illumina DB: APM-1.
In addition, although ILMN_73087 has no entry in either the
Illumina or BioC DB, a search for ILMN_73087 in Ensembl returns
a hit with multiple EntrezGene listings.
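
To make the comparison concrete, here is a minimal sketch of the cross-check
described above. It is written in plain Python only so that it has no package
dependencies; in R the same logic would be a merge() or match() on the probe
ID. The function name classify_missing and the two small dictionaries are
hypothetical, and they contain only the probes and symbols mentioned in this
post, not the real annotation tables.

```python
# Symbols as reported through the BioC annotation (None stands for <NA>).
# Hypothetical container; only probes listed in this post are included.
bioc_symbols = {
    "ILMN_21212": "FAM43A",
    "ILMN_1425": "FOXO4",
    "ILMN_138461": None,
    "ILMN_73087": None,
}

# Symbols found in the Illumina Human6_v2_sequence spreadsheet for the
# probes that were NA above. A missing key means the spreadsheet has no symbol.
illumina_symbols = {
    "ILMN_138461": "APM-1",
}


def classify_missing(bioc, vendor):
    """Split probes with an NA symbol into two groups:
    recoverable (the vendor table has a symbol) and unannotated (it does not)."""
    recoverable = {}
    unannotated = []
    for probe, symbol in bioc.items():
        if symbol is not None:
            continue  # already annotated, nothing to check
        vendor_symbol = vendor.get(probe)
        if vendor_symbol:
            recoverable[probe] = vendor_symbol
        else:
            unannotated.append(probe)
    return recoverable, unannotated


recoverable, unannotated = classify_missing(bioc_symbols, illumina_symbols)
print("symbol available from Illumina:", recoverable)
print("no symbol in either source:", unannotated)
```

Probes in the first group could in principle be filled in from the Illumina
spreadsheet; probes in the second group would need another source such as
Ensembl.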
Is there any fix for the NA entries? Is this problem being addressed?
thanks,
Sebastien