[BioC] Missing ProbeSets in Affymetrix MoGene 1.0 ST chips
Mark Cowley
m.cowley at garvan.org.au
Thu Sep 4 11:55:28 CEST 2008
Dear list,
There are 93 transcript_cluster_ids on the MoGene 1.0 ST chip that
are listed in the CSV annotation file, and searchable in the MoGene
chip at NetAffx, but that are not present in the [unsupported] CDF
file from NetAffx.
45 of these IDs are present in the MoGene PGF file, and correspond to
the antigenomic probesets, but the remaining 48 are not in the PGF
file either.
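A minimal sketch of how such a comparison could be done in R, assuming
the IDs from the CSV annotation, the CDF and the PGF have already been
read into character vectors. The vector names and the toy IDs below are
placeholders for illustration, not the real file contents ("99999999"
is a made-up ID standing in for a probeset present everywhere):

```r
# Placeholder ID vectors; in practice these would be extracted from the
# annotation CSV, the CDF file and the PGF file respectively.
annot_ids <- c("10361826", "10362430", "10338002", "10338005", "99999999")
cdf_ids   <- c("99999999")
pgf_ids   <- c("99999999", "10338002", "10338005")

# IDs listed in the annotation but absent from the CDF
missing_from_cdf <- setdiff(annot_ids, cdf_ids)

# Split those by whether the PGF file contains them
in_pgf     <- intersect(missing_from_cdf, pgf_ids)
not_in_pgf <- setdiff(missing_from_cdf, pgf_ids)

print(in_pgf)
print(not_in_pgf)
```

With the real ID vectors, length(missing_from_cdf), length(in_pgf) and
length(not_in_pgf) would be the three counts being compared here.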
From NetAffx, the 48 non-control probesets are: 11 snRNAs, a RefSeq
gene (Lphn2) and 2 other novel transcripts, with the remaining 44
having no annotation other than their genomic location. This isn't a
problem, unless Lphn2 is your gene of interest, which it isn't in my
case, but it would be nice to know what's going on here!
If you RMA normalise using the CDF file (as GeneSpring does), you end
up with 93 rows of missing data; if you normalise using the PGF/CLF
files instead, you still miss out on the remaining 48 probesets.
Has anyone else come across this, and does anyone know what is going on?
These transcript_cluster_ids are:
c("10361826", "10362430", "10362444", "10362452", "10502768",
"10532622", "10349381", "10350469", "10354866", "10362438",
"10362872", "10369759", "10374030", "10391748", "10395778",
"10411504", "10422960", "10436496", "10436660", "10446349",
"10453719", "10457089", "10458079", "10460144", "10461932",
"10481652", "10482786", "10487009", "10498317", "10501216",
"10502040", "10503414", "10513713", "10521665", "10535929",
"10546555", "10552810", "10553535", "10560364", "10582560",
"10582566", "10582570", "10582576", "10585872", "10586931",
"10592453", "10601614", "10602194", "10338002", "10338005",
"10338006", "10338007", "10338008", "10338009", "10338010",
"10338011", "10338012", "10338013", "10338014", "10338015",
"10338016", "10338018", "10338019", "10338020", "10338021",
"10338022", "10338023", "10338024", "10338027", "10338028",
"10338030", "10338031", "10338032", "10338033", "10338034",
"10338038", "10338039", "10338040", "10338043", "10338045",
"10338046", "10338048", "10338049", "10338050", "10338051",
"10338052", "10338053", "10338054", "10338055", "10338057",
"10338058", "10338061", "10338062")
cheers,
Mark
-----------------------------------------------------
Mark Cowley, BSc (Bioinformatics)(Hons)
Peter Wills Bioinformatics Centre
Garvan Institute of Medical Research, Sydney, Australia