Hi Tim,

Thanks so much for such a quick response!

Here is a probeset ID that maps to a gene but not to any exon or transcript.

library(xmapcore)
xmap.connect("mouse")
probeset.to.transcript("4448480", as.vector=FALSE)
#NULL
probeset.to.exon("4448480", as.vector=FALSE)
#NULL
probeset.to.gene("4448480", as.vector=TRUE)
#[1] "ENSMUSG00000021708"

I looked up the details for this probeset as follows.
probeset.details("4448480")
  stable_id array_name probe_count hit_score gene_score transcript_score
1   4448480   MoEx-1_0           4         1          1                0
  exon_score est_gene_score est_transcript_score est_exon_score
1          0              0                    0              0
  prediction_transcript_score prediction_exon_score protein_score
1                           0                     0             0
  domain_score
1            0

It looks like one or more probes in this probeset miss the transcript/exon targets, while the probeset still aligns uniquely to a gene. Is it correct that this probeset maps to an untranscribed region of the gene?

Here is an example of a probeset that maps to both a gene and a transcript but not to any exon.

probeset.details("4305509")
  stable_id array_name probe_count hit_score gene_score transcript_score
1   4305509   MoEx-1_0           4         1          1                2
  exon_score est_gene_score est_transcript_score est_exon_score
1          0              1                    2              0
  prediction_transcript_score prediction_exon_score protein_score
1                           1                     0             0
  domain_score
1            0

Is it correct that this probeset aligns to an intronic region of the transcript?
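
To make the comparison concrete, here is a minimal sketch in base R that puts the relevant scores for the two probesets side by side. The values are copied by hand from the probeset.details() output above. The data frame name `scores` and the `hits_*` column names are my own (hypothetical), and the flags only record which scores are non-zero, not what a non-zero score means biologically.

```r
# Scores copied by hand from the probeset.details() output above.
scores <- data.frame(
  stable_id        = c("4448480", "4305509"),
  gene_score       = c(1, 1),
  transcript_score = c(0, 2),
  exon_score       = c(0, 0),
  stringsAsFactors = FALSE
)

# Flag which annotation levels have a non-zero score.
# The hits_* column names are hypothetical, chosen for illustration only.
scores$hits_gene       <- scores$gene_score > 0
scores$hits_transcript <- scores$transcript_score > 0
scores$hits_exon       <- scores$exon_score > 0

# Show one row per probeset with the three flags.
print(scores[, c("stable_id", "hits_gene", "hits_transcript", "hits_exon")])
```

Both probesets have a gene-level score and no exon-level score; they differ only at the transcript level, which is the difference I am asking about.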

Thanks so much for your help!

Best regards,

Julie



On 12/13/10 12:29 PM, "Tim Yates" <TYates@picr.man.ac.uk> wrote:

>
> Hi there!
>
> How are you doing the mapping from probeset to gene, exon, transcript, etc?
>
> Do you have an example where you believe something is wrong?
>
> Cheers :-)
>
> Tim
>
>
>
> ----- Reply message -----
> From: "Zhu, Lihua (Julie)" <Julie.Zhu@umassmed.edu>
> Date: Mon, Dec 13, 2010 17:20
> Subject: Xmapcore package
> To: "bioconductor@r-project.org" <bioconductor@r-project.org>
> Cc: "Tim Yates" <TYates@picr.man.ac.uk>
>
> Tim,
>
> While annotating a list of probesets to exons, transcripts and genes, I
> noticed that more probesets (e.g., 4448480) map to genes than to
> transcripts, and the fewest map to exons. Is this expected? I suppose that
> if one probe aligns to multiple exons in a gene, the exon mapping is removed
> while the gene mapping is kept. Could you please elaborate? Thanks so much
> for your help!
>
> Best regards,
>
> Julie
>
> library(xmapcore)
> xmap.connect("mouse")
>> probeset.to.transcript("4448480", as.vector=FALSE)
> NULL
>> probeset.to.exon("4448480", as.vector=FALSE)
> NULL
>> probeset.to.gene("4448480", as.vector=FALSE)
> RangedData with 1 row and 9 value columns across 1 space
>         space               ranges |         IN1          stable_id    strand
>   <character>            <IRanges> | <character>        <character> <integer>
> 1          13 [92020005, 92901611] |     4448480 ENSMUSG00000021708        -1
>          biotype      status
>      <character> <character>
> 1 protein_coding       KNOWN
>
>                                                                                             description
>                                                                                             <character>
> 1 RAS protein-specific guanine nucleotide-releasing factor 2 Gene [Source:MGI (curated);Acc:MGI:109137]
>   db_display_name      symbol
>       <character> <character>
> 1   MGI (curated)     Rasgrf2
>                                                symbol_description
>                                                       <character>
> 1 RAS protein-specific guanine nucleotide-releasing factor 2 Gene
>> temp= transcript.to.probeset(gene.to.transcript(probeset.to.gene("4448480",
> as.vector=TRUE), as.vector=TRUE), as.vector=FALSE)
>
>> temp[temp$stable_id == "4448480",]
>  [1] IN1                         stable_id
>  [3] array_name                  probe_count
>  [5] hit_score                   gene_score
>  [7] transcript_score            exon_score
>  [9] est_gene_score              est_transcript_score
> [11] est_exon_score              prediction_transcript_score
> [13] prediction_exon_score       protein_score
> [15] domain_score
> <0 rows> (or 0-length row.names)
>
>  sessionInfo()
> R version 2.11.1 (2010-05-31)
> x86_64-apple-darwin9.8.0
>
> locale:
> [1] en_US.UTF-8/en_US.UTF-8/C/C/en_US.UTF-8/en_US.UTF-8
>
> attached base packages:
> [1] stats     graphics  grDevices utils     datasets  methods   base
>
> other attached packages:
> [1] mouseexonpmcdf_1.1 xmapcore_1.2.8     digest_0.4.2
> [4] IRanges_1.6.11     RMySQL_0.7-5       DBI_0.2-5
>
> loaded via a namespace (and not attached):
> [1] tools_2.11.1

