[BioC] Re-mapped Affy CDF files
Jenny Drnevich
drnevich at uiuc.edu
Wed Jan 11 18:30:56 CET 2006
Hi all,
I looked at the alternative mappings a few months ago after attending a
seminar given by Stanley Watson, Director of Mental Health Research
Institute at University of Michigan. He recommended that the alternative
mappings always be used because of the large discrepancies they found
between Affymetrix's mapping and their mappings of the probes. I don't know
whether they have any documentation on whether their mappings yield results
that are more often validated through alternative methodologies or not, but
they do have quite a lot of documentation on what they did and why they did
it - see the description of custom CDF files and their new paper from links
on the page Jim put in his first post. Even if Ensembl or Affymetrix
updates their annotation based on remapping, the CDFs aren't changed, so
the summarization and statistical analysis are done using probes that may
not all map to the same "gene" uniquely. What these alternative mappings do
is remap each probe, then redefine probe sets based on all the probes
that map to a "gene"; it's these re-groupings that are most
important. Many of the alternative mappings are subsets of other ones,
like taking only the first 11 probes from the 3' end in cases where there
are more than 11 probes, so there are not quite as many alternative
mappings as it first appears.
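
To make the remap-and-regroup idea concrete, here is a minimal Python sketch. It is not MBNI's actual pipeline. The function name, the input layout (one tuple per alignment hit), and the rule of dropping any probe that hits more than one gene are all assumptions made for illustration. Only the general steps (remap each probe, regroup by "gene", optionally keep the first 11 probes from the 3' end) come from the description above.

```python
from collections import defaultdict

def regroup_probes(probe_hits, max_probes=11):
    """Rebuild probe sets from per-probe alignment hits (hypothetical layout).

    probe_hits: iterable of (probe_id, gene_id, distance_from_3prime_end)
    Returns {gene_id: [probe_id, ...]} ordered from the 3' end.
    """
    # Step 1: record every gene each probe maps to.
    genes_per_probe = defaultdict(set)
    for probe_id, gene_id, _ in probe_hits:
        genes_per_probe[probe_id].add(gene_id)

    # Step 2: regroup probes by gene, keeping only probes that hit
    # exactly one gene (assumed uniqueness rule, see lead-in).
    members = defaultdict(list)
    seen = set()
    for probe_id, gene_id, dist in probe_hits:
        if len(genes_per_probe[probe_id]) != 1 or probe_id in seen:
            continue
        seen.add(probe_id)
        members[gene_id].append((dist, probe_id))

    # Step 3: sort by distance from the 3' end and truncate, mimicking
    # the "first 11 probes from the 3' end" subsets.
    return {
        gene: [probe for _, probe in sorted(hits)[:max_probes]]
        for gene, hits in members.items()
    }
```

With max_probes=None the slice keeps every uniquely mapping probe, which would correspond to the untruncated version of a given mapping.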
I do agree with Jim that coming up with a defensible rationale is
important, as I was having trouble deciding which mapping might be the best
to use. Stan Watson would argue that any of them are better than the
outdated Affymetrix groupings. If Affy did theirs based on Unigene
clustering, then the new mapping & grouping based on Unigene might be a
defensible choice. In the end, I succumbed to historical inertia and went
with Affymetrix's CDF, in part because I do analyses for many organisms,
and MBNI only has alternative CDFs for human, mouse, and rat. However, I
was able to get the alternative CDFs to work in Bioconductor with little
trouble.
As far as validating the genes on the magical "significant list", I did get
some advice at a recent conference to ALWAYS first check the current probe
mappings for those significant genes, then only concentrate on those that
have most or all of their probes where they should be. Does anyone do this
routinely? Or should we, but don't because it is too time-consuming?
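
The check itself could be fairly cheap if a current probe-level mapping is available. Below is a minimal Python sketch, assuming three hypothetical inputs: the significant probe sets with their intended genes, the probe membership from the original CDF, and a current probe-to-gene mapping. The 0.8 cutoff is only a placeholder for "most or all" and not a recommended value.

```python
def well_mapped_probesets(significant, original_cdf, current_mapping,
                          min_fraction=0.8):
    """Keep significant probe sets whose probes still map to the intended gene.

    significant:     {probeset_id: intended_gene_id}
    original_cdf:    {probeset_id: [probe_id, ...]}
    current_mapping: {probe_id: gene_id} from an up-to-date remapping
    Returns {probeset_id: fraction_of_probes_on_target}.
    """
    kept = {}
    for probeset_id, intended_gene in significant.items():
        probes = original_cdf.get(probeset_id, [])
        if not probes:
            continue  # no probe information, so nothing to check
        on_target = sum(
            1 for probe in probes
            if current_mapping.get(probe) == intended_gene
        )
        fraction = on_target / len(probes)
        if fraction >= min_fraction:
            kept[probeset_id] = fraction
    return kept
```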
Cheers,
Jenny
At 08:51 AM 1/11/2006, James W. MacDonald wrote:
>Sean Davis wrote:
> > I'm not sure what their build process is, but doesn't Ensembl do some
> > probe-based mappings?
>
>Maybe. I couldn't find anything obvious in a cursory glance at their
>website.
>
>Anyway, the main question for me is not the number or type of
>alternative mappings that exist for Affy arrays (there are 19 different
>CDFs that the MBNI folks produce, including several based on Ensembl
>mappings). I am more concerned with being able to establish a defensible
>rationale for using a particular mapping.
>
>I guess what we do right now with the Affy CDFs isn't defensible except
>on a historical basis, but the weight of history is pretty strong. For
>instance, attributing significance at an alpha of < 0.05 has no
>rationale AFAIK, but is pretty much written in stone due to precedent.
>
>OTOH, most if not all microarray data are caveat emptor - it is
>incumbent on the end user to take the magical list of differentially
>expressed genes and validate them with an alternative methodology.
>
>Given that state of affairs, is it not reasonable to choose the probe
>mappings that one uses with the same logic that one uses for choosing
>the preferred way of computing expression values?
>
>Jim
>
>
>
>
> >
> > Sean
> >
> >
>
>
>--
>James W. MacDonald
>Affymetrix and cDNA Microarray Core
>University of Michigan Cancer Center
>1500 E. Medical Center Drive
>7410 CCGC
>Ann Arbor MI 48109
>734-647-5623
>
>_______________________________________________
>Bioconductor mailing list
>Bioconductor at stat.math.ethz.ch
>https://stat.ethz.ch/mailman/listinfo/bioconductor
Jenny Drnevich, Ph.D.
Functional Genomics Bioinformatics Specialist
W.M. Keck Center for Comparative and Functional Genomics
Roy J. Carver Biotechnology Center
University of Illinois, Urbana-Champaign
330 ERML
1201 W. Gregory Dr.
Urbana, IL 61801
USA
ph: 217-244-7355
fax: 217-265-5066
e-mail: drnevich at uiuc.edu