[BioC] Affymetrix Human Exon Array MPS and PS files contain different probeset groups

Daniel Brewer daniel.brewer at icr.ac.uk
Tue Jun 26 18:46:11 CEST 2007


This is not strictly a Bioconductor question, but I use Bioconductor in
the downstream processing and someone here might have had a similar
experience.

I use "apt-probeset-summarize" to produce exon-level and gene-level
signals.  Probesets are assigned to a gene or exon according to the
evidence supporting that association.  I use the "core" grouping.
This grouping is defined by two files: a probeset file (PS), which is
simply a list of probeset identifiers, and a meta-probeset file (MPS),
which has four columns:
1) probeset_id
2) transcript_cluster_id (always the same as 1)
3) probeset_list (list of probesets associated with the transcript
cluster)
4) probe_count (the total number of probes)

I might be confused about the true meaning of the meta-probeset file, but
from what I can see, the probesets in a particular grouping should appear
in both the mps and the ps files if they are associated with a gene. This
does not appear to be the case. Take the PTEN gene (transcript cluster
3256689) as an example.

The mps file (HuEx-1_0-st-v2.r2.dt1.hg18.core.mps) has the following line:
3256689 3256689 3256702 3256703 3256704 3256705 3256740 3256780 24
i.e. there are 6 probesets associated (3256702, 3256703, 3256704,
3256705, 3256740 & 3256780).
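
To make the reading of that line explicit, here is a minimal Python sketch
that splits it into the four columns described above. The function name
parse_mps_line is hypothetical, and the sketch assumes the line is
whitespace-separated exactly as quoted in this message (the real file may
use a different delimiter between columns), so the first two tokens are
taken as the ids, the last token as probe_count, and everything in between
as probeset_list.

```python
def parse_mps_line(line):
    """Split one mps line, as quoted in this message, into its four columns.

    Assumption: tokens are whitespace-separated; the first two are
    probeset_id and transcript_cluster_id, the last is probe_count,
    and everything in between is the probeset_list.
    """
    tokens = line.split()
    if len(tokens) < 4:
        raise ValueError("expected at least four whitespace-separated fields")
    return {
        "probeset_id": tokens[0],
        "transcript_cluster_id": tokens[1],
        "probeset_list": tokens[2:-1],
        "probe_count": int(tokens[-1]),
    }


# The PTEN line quoted from HuEx-1_0-st-v2.r2.dt1.hg18.core.mps
pten_line = "3256689 3256689 3256702 3256703 3256704 3256705 3256740 3256780 24"
record = parse_mps_line(pten_line)

print(record["transcript_cluster_id"], len(record["probeset_list"]))
```

Read this way, the line gives 6 probesets and a probe_count of 24.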

Using NetAffx, or combining HuEx-1_0-st-v2.r2.dt1.hg18.core.ps with
HuEx-1_0-st-v2.na21.hg18.probeset.csv, suggests that there are 23 core
probesets associated with this gene
("3256702", "3256703", "3256704", "3256705", "3256706",
"3256707", "3256708", "3256709", "3256710",
"3256711", "3256725", "3256736", "3256738",
"3256739", "3256740", "3256764", "3256767",
"3256772", "3256773", "3256777", "3256778", "3256779" & "3256780").
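
The size of the mismatch can be checked with a simple set comparison. The
sketch below is only illustrative: the two sets are typed in by hand from
the ids listed in this message rather than read from the actual files, and
the variable names are my own.

```python
# Probesets listed for PTEN (3256689) in the core mps file, as quoted above
mps_probesets = {
    "3256702", "3256703", "3256704", "3256705", "3256740", "3256780",
}

# Core probesets reported for the same gene via the ps file + probeset.csv
ps_probesets = {
    "3256702", "3256703", "3256704", "3256705", "3256706",
    "3256707", "3256708", "3256709", "3256710",
    "3256711", "3256725", "3256736", "3256738",
    "3256739", "3256740", "3256764", "3256767",
    "3256772", "3256773", "3256777", "3256778", "3256779", "3256780",
}

# Core probesets that the mps line does not include
missing_from_mps = sorted(ps_probesets - mps_probesets)

# Probesets in the mps line that are not in the core ps list
missing_from_ps = sorted(mps_probesets - ps_probesets)

print(len(missing_from_mps), missing_from_mps)
print(len(missing_from_ps), missing_from_ps)
```

All 6 probesets from the mps line are in the core list, but 17 of the 23
core probesets are absent from the mps line.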

This difference could significantly affect the gene summary results.
Does anyone know whether this discrepancy is intentional and, if so, why?
Am I using the correct mps file?

Thanks
-- 
**************************************************************
Daniel Brewer, Ph.D.

Institute of Cancer Research
Email: daniel.brewer at icr.ac.uk
**************************************************************



