[BioC] Nested Design (Again) & Subset WithinArray Correlation
Jenny Drnevich
drnevich at illinois.edu
Thu Jul 29 00:12:45 CEST 2010
Hi Everyone,
I've been helping Osee with the second question he posted today. I'll
explain it a bit further, as I'd like some help on how to interpret
his results. He has an array where some of the probes (ENSGACT) were
designed from known transcript sequences and other probes (GENSCAN)
were designed from predicted sequences from a sequencing project.
Further annotation of the predicted sequences has revealed that many
of them actually overlap with the known transcript sequences. He
would like to estimate how correlated the expression values from the
GENSCAN probes are to their matching ENSGACT probes. I thought this
could be done by treating the probe pairs as technical replicates and
running duplicateCorrelation() on them. On an array with true
technical replication of probes, you'd hope the consensus correlation
would be strongly positive, close to 1. Well, the consensus
correlation for the GENSCAN:ENSGACT pairs is strongly _negative_:
between -0.8 and -0.92 depending on the subset of pairs we use. I
can't quite figure out what the strong negative correlation means -
it's probably something simple that I'm overlooking. We have no idea
right now how much overlap there may be between ENSGACT probe oligo
sequences and their corresponding GENSCAN probe oligo sequences.
Anyone have an explanation for the strong negative correlation?
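
One thing that may be worth checking, offered only as a sketch and not as a diagnosis: an intraclass-style correlation treats the two probes in a pair as interchangeable, so a consistent intensity offset between them counts as within-pair disagreement. The example below uses made-up log2 values for one hypothetical probe pair across six arrays. pair_icc() is a simplified one-way ANOVA intraclass correlation, not limma's REML-based estimate, and both function names are illustrative assumptions.

```python
from statistics import mean

def pearson(x, y):
    """Pearson correlation of two equal-length lists (one value per array)."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def pair_icc(x, y):
    """One-way ANOVA intraclass correlation with each array as a group of
    two values (k = 2). Simplified stand-in, NOT limma's estimator."""
    n = len(x)
    grand = mean(x + y)
    group_means = [(a + b) / 2 for a, b in zip(x, y)]
    # Between-array mean square (n - 1 degrees of freedom).
    msb = 2 * sum((m - grand) ** 2 for m in group_means) / (n - 1)
    # Within-array mean square (n degrees of freedom when k = 2).
    msw = sum((a - m) ** 2 + (b - m) ** 2
              for a, b, m in zip(x, y, group_means)) / n
    return (msb - msw) / (msb + msw)

# Hypothetical data: the two probes track each other across arrays,
# but the second probe reads about 4 log2 units lower throughout.
ensgact = [8.0, 8.5, 9.0, 9.5, 10.0, 10.5]
genscan = [4.1, 4.4, 5.0, 5.6, 5.9, 6.5]

r = pearson(ensgact, genscan)
icc = pair_icc(ensgact, genscan)
print(f"Pearson r = {r:.3f}, ANOVA ICC = {icc:.3f}")
```

With these invented numbers the Pearson correlation comes out close to 1 while the ANOVA-style intraclass correlation is negative, because the constant offset inflates the within-pair mean square. Whether anything like this applies to the real GENSCAN:ENSGACT pairs would need to be checked on the actual data, for example by comparing the average intensities of the two probes in each pair.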
Thanks,
Jenny
>Question 2: Testing a subset of within-array replicates with different gene
>names. I have a list of "overlapping" genes [below] and I would like to see
>how well the paired probes correlate, to assess the hybridization efficiency
>on the chip. The sequences and the probes are not identical, but overlap
>significantly. From reading the postings, I know I can't use
>duplicateCorrelation, because the probes are randomly scattered on the array,
>and I was not sure how to use "avedups" on a subset of genes with different
>names.
>
>GENSCAN_ID Matched transcript ID
>GENSCAN00000010293 ENSGACT00000002218
>GENSCAN00000003508 ENSGACT00000001310
>GENSCAN00000021873 ENSGACT00000000225
>GENSCAN00000007931 ENSGACT00000000496
>GENSCAN00000022171 ENSGACT00000002296
>GENSCAN00000026278 ENSGACT00000000071
>GENSCAN00000000631 ENSGACT00000002139
>GENSCAN00000008636 ENSGACT00000002427
>GENSCAN00000008635 ENSGACT00000002432
>GENSCAN00000022111 ENSGACT00000007564
>
>Thank you so much, and my apologies if this has been addressed before (you
>can point me to the discussion).
>
>Cheers,
>
>Osee
Jenny Drnevich, Ph.D.
Functional Genomics Bioinformatics Specialist
W.M. Keck Center for Comparative and Functional Genomics
Roy J. Carver Biotechnology Center
University of Illinois, Urbana-Champaign
330 ERML
1201 W. Gregory Dr.
Urbana, IL 61801
USA
ph: 217-244-7355
fax: 217-265-5066
e-mail: drnevich at illinois.edu