[BioC] hs133phsentrezg metadata
James W. MacDonald
jmacdon at med.umich.edu
Tue Oct 17 22:31:50 CEST 2006
Hi Manhong,
OK, I understand that part. However, for most of the annotation data
(including the chromosomal location), what is normally supplied is the
information at the gene level, rather than the probe level. I guess one
could argue that knowing where exactly the probesets are supposed to
bind might be of interest, but the annotation packages are intended to
annotate probesets to genes.
While it is true that some of the probes might bind to different parts
of the genome, this can be handled by supplying multiple locations. For
instance, in the hgu133plus2 package we have:
> get("1007_s_at", hgu133plus2CHRLOC)
6_qbl_hap2          6 6_cox_hap1 6_qbl_hap2 6_cox_hap1
   2098794   30959839    2300465    2099260    2300931
         6 6_cox_hap1          6 6_qbl_hap2
  30960305    2305069   30964443    2103398
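As a rough sketch of how a set of multiple locations like this might be handled downstream, the following uses plain base R. The values are typed in by hand from the output above rather than pulled from the annotation package, and the variable names (loc, by_chr, primary) are just placeholders for illustration:

```r
# Start positions copied from the 1007_s_at output above; the names are the
# chromosome (or alternate haplotype) labels. Nothing is loaded from
# hgu133plus2 here, so this runs in a bare R session.
loc <- c("6_qbl_hap2" = 2098794, "6" = 30959839, "6_cox_hap1" = 2300465,
         "6_qbl_hap2" = 2099260, "6_cox_hap1" = 2300931, "6" = 30960305,
         "6_cox_hap1" = 2305069, "6" = 30964443, "6_qbl_hap2" = 2103398)

# Group the positions by chromosome / haplotype label
by_chr <- split(unname(loc), names(loc))

# Keep only the locations reported on the primary chromosome 6 assembly
primary <- by_chr[["6"]]
```

Splitting on the names keeps every reported location while making it easy to pick out, say, only the primary-assembly hits.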
Best,
Jim
Manhong Dai wrote:
> Hi Jim,
>
> In our custom CDFs, some probes with hits<1 are used. For example,
> when a probe matches one allele of a SNP, and the SNP's other allele
> has a hits=1 match to the genome, we use this probe and its genome
> location as a candidate for all custom CDFs, even though the probe
> itself has no hit on the genome at all. The proportion of such probes
> is small.
>
>
> Our UG and ENTREZG custom CDFs do have a rule that each probe must
> hit only one genome location and one UG cluster.
>
>
> But in the REFSEQ custom CDF, when a probe has a match to a REFSEQ
> sequence but no match to the genome at all, the probe would still be
> used because REFSEQ is more reliable than the genome.
>
> For example, probe 4 of
> http://arrayanalysis.mbni.med.umich.edu/ps/ps_pb.jsp?p=NM_000019_at&c=Hs133P_Hs_REFSEQ_8 has no match to genome.
>
>
> Best,
> Manhong Dai
>
>
>
> On Tue, 2006-10-17 at 14:46 -0400, James W. MacDonald wrote:
>
>>Hi Manhong,
>>
>>Manhong Dai wrote:
>>
>>>Hi An,
>>>
>>> Our custom CDF annotation package has only the gene name for each
>>>probeset because we designed it this way.
>>>
>>> A probeset's probes could match different locations or chromosomes,
>>>and some probes have no match on the genome at all, but they belong
>>>to this probeset because they all have a perfect match to the gene's
>>>sequence.
>>
>>This doesn't make sense to me. How can a probe not match to the genome,
>>yet have a perfect match to a gene's sequence?
>>
>>I was also under the impression that the matching for the probes that
>>remain in an MBNI cdf was first done to the genome, and those probes
>>that didn't blast to the genome were discarded. From
>>
>>http://brainarray.mhri.med.umich.edu/Brainarray/Database/CustomCDF/cdfreadme.htm
>>
>>I get:
>>
>>A probe must only hit one UniGene cluster and one genomic location
>>
>>A probe must hit only one genomic location
>>
>>Does this mean a probe that hits < 1 genomic location will be included?
>>I assumed this meant a probe had to hit exactly one location.
>>
>>Best,
>>
>>Jim
>>
>>
>>
--
James W. MacDonald, M.S.
Biostatistician
Affymetrix and cDNA Microarray Core
University of Michigan Cancer Center
1500 E. Medical Center Drive
7410 CCGC
Ann Arbor MI 48109
734-647-5623
**********************************************************
Electronic Mail is not secure, may not be read every day, and should not be used for urgent or sensitive issues.