[BioC] Inconsistent annotation of affy probeset on Affymetrix chip for rat: 230.2
Marc Carlson
mcarlson at fhcrc.org
Wed Jul 2 19:47:27 CEST 2008
Christoph Preuss wrote:
> Hi everyone,
>
> We analyzed a global expression microarray data set using gcrma for the
> normalization step and limma for finding differentially expressed
> genes. One of the most significant probesets (ProbeSetID annotation
> "1375535_at") in terms of differential expression is annotated as:
> Probeset "1375535_at"
> - Gene Symbol: Lpin1
> - Location: Chr 6
>
> in the bioconductor package "rat2302" / "rat2302.db".
>
> We also looked at the Affymetrix web site, where the same probeset was
> annotated as "Transcribed sequence" on chromosome X.
>
> Affymetrix Annotation RG 230 2.0 Chip:
> - ProbeSetID: 1375535_at
> - Target Sequence:
>
> >RAT230_2:1375535_AT
> gaagttagagagctgtttccccactttacattttaaaatatgtatgccaggatntaatca
> ttcctttaagtgtacacttcaaggagagatgtgccgaataagaaaatagctttctctagc
> gtgaagggttttgcgtccgccgagttcttaaggtcttttttaagagctactgtgtatgag
> tgtgtgtatgtgtgcgcatgcatgttcctgcgactagtcattcattcacatggtgatcag
> acaacaatgggagctggttcgtctaccttatcttgtgggtcctggagttcaatctcagat
> catcaggctgggcagcaagtgccttcaccctccgagccatcttgccatcccacagctgag
> cgtctaatatgacattgccgatga
>
> Interestingly, the given target sequence for the probeset matches only
> a mouse sequence and not even a rat mRNA (blastn search).
>
> The question is which annotation should we trust?
> Is there any chance to validate the probeset annotation?
> Many thanks in advance for any help.
>
> cheers,
>
> Christoph Preuss
>
> (Leibniz-Institute for Arteriosclerosis Research, University of
> Muenster, Germany)
>
> _______________________________________________
> Bioconductor mailing list
> Bioconductor at stat.math.ethz.ch
> https://stat.ethz.ch/mailman/listinfo/bioconductor
> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor
>
>
Hi Christoph,

I can only really speak for the Bioconductor annotations, which are
generated from public sources along with an initial mapping of each probe
or probeset to a public accession (usually a GenBank ID, an Entrez Gene
ID, or a related type of ID). In the case of "1375535_at", the probeset is
an Affymetrix probeset, so we are ultimately at the mercy of
Affymetrix to tell us accurately what this probeset is in that initial
mapping. After that, we do the rest ourselves: we join the
probeset-to-ID mapping to additional information gathered from public
sources (primarily NCBI) to get the rest of the information in the
package. The file that you get from Affymetrix may contain a lot of the
same data as our packages, but unless they describe it somewhere, I don't
think we actually know for certain where they collected all of their
information from. The only information that we ever take from them is
the initial mapping of their probeset onto a public accession.
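
As a rough illustration of the two-step process described above (this is
not the actual Bioconductor build code), the lookup can be pictured as a
vendor-supplied probeset-to-accession table followed by a join against
publicly sourced annotation. The table names, the `annotate` function, and
the accession string below are hypothetical; only the probeset ID, the
symbol, and the chromosome come from this thread.

```python
# Step 1: vendor-supplied mapping from probeset to a public accession.
# The accession here is a placeholder; the real one is not given in this thread.
probeset_to_accession = {
    "1375535_at": "ACCESSION_PLACEHOLDER",
}

# Step 2: annotation gathered from public sources, keyed by that accession.
# Symbol and chromosome are the values reported in this thread.
public_annotation = {
    "ACCESSION_PLACEHOLDER": {"symbol": "Lpin1", "chromosome": "6"},
}


def annotate(probeset_id):
    """Return the public annotation for a probeset, or None if either step fails."""
    accession = probeset_to_accession.get(probeset_id)
    if accession is None:
        return None  # the vendor mapping has no entry for this probeset
    record = public_annotation.get(accession)
    if record is None:
        return None  # the accession is unknown to the public source
    return {"probeset": probeset_id, "accession": accession, **record}


result = annotate("1375535_at")
```

The point of the sketch is that everything downstream depends on step 1:
if the vendor changes the accession assigned to a probeset, every field
derived in step 2 changes with it.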

I dug up the latest Affymetrix mapping files that we used to generate
this package and investigated. In the file that I have (which was
collected in late March), the probeset you listed is indicated to be
Lpin1 and to be located on chromosome 6, which agrees completely
with the information that we gathered from NCBI and GoldenPath at
that time. As of this morning, NCBI still lists this gene as
Lipin1, located on chromosome 6. However, right next to that field in
the Affymetrix file there is another field called "Alignments", which
lists the X chromosome. When I pull up an even more recent file from
Affymetrix, I see that they no longer list the location of this gene and
have replaced that value with "---"; they also no longer list the gene's
name or symbol. But they still list chromosome "X" in the alignment field
and have even assigned different accessions to this probeset.
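
The comparison described above can be repeated for any probeset by diffing
two releases of the vendor annotation file. The following is a minimal
sketch, assuming the files are CSV with a "Probe Set ID" column and that
comment lines start with "#"; the column names, the helper names
(`load_rows`, `changed_fields`), and the two in-memory "files" are
assumptions for illustration. The rows are simplified stand-ins that only
restate the values mentioned in this thread.

```python
import csv
import io


def load_rows(handle, id_column="Probe Set ID"):
    """Index annotation rows by probeset ID, skipping '#' comment lines."""
    lines = (line for line in handle if not line.startswith("#"))
    return {row[id_column]: row for row in csv.DictReader(lines)}


def changed_fields(old_row, new_row):
    """Return {column: (old, new)} for shared columns whose values differ."""
    shared = old_row.keys() & new_row.keys()
    return {
        column: (old_row[column], new_row[column])
        for column in sorted(shared)
        if old_row[column] != new_row[column]
    }


# Simplified stand-ins for the older and newer annotation releases.
old_file = io.StringIO(
    "Probe Set ID,Gene Symbol,Chromosomal Location,Alignments\n"
    "1375535_at,Lpin1,6,X\n"
)
new_file = io.StringIO(
    "Probe Set ID,Gene Symbol,Chromosomal Location,Alignments\n"
    "1375535_at,---,---,X\n"
)

old = load_rows(old_file)
new = load_rows(new_file)
diff = changed_fields(old["1375535_at"], new["1375535_at"])
for column, (before, after) in diff.items():
    print(f"{column}: {before!r} -> {after!r}")
```

With real files, the `io.StringIO` objects would be replaced by
`open(path, newline="")` handles for the two downloaded releases.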

So the short answer is that Affymetrix has changed their mind about what
they claim this probeset is measuring.

I hope this helps you,
Marc