[BioC] shan

Tue Oct 2 19:35:38 CEST 2012

Hi Shan,

On Mon, Oct 1, 2012 at 4:08 PM, wang peter <wng.peter at gmail.com> wrote:
> thx steve
> i trid the correlation of 51 library with 100 library
> it is more than 0.9 on average
>
> so very good technical replicate

Well, unless you tell us how that number compares to the correlation
between the two 100bp libs, we don't really know.

If the correlation between a 51bp and a 100bp sample is really no
different from the correlation between two 100bp samples, then I guess
the original 51bp wasn't too short after all, right? ;-)
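
To make that comparison concrete, here is a rough Python sketch of
what I mean. It assumes you already have per-gene read counts for each
library, in the same gene order; the function names are mine and
purely illustrative. It computes the Pearson correlation of
log2(count + 1) for every pair of libraries, so you can put the
51bp-vs-100bp numbers next to the 100bp-vs-100bp ones.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two equal-length sequences with >= 2 values")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant vector")
    return sxy / math.sqrt(sxx * syy)


def log_counts(counts):
    """log2(count + 1), so a few huge genes don't dominate the correlation."""
    return [math.log2(c + 1) for c in counts]


def pairwise_correlations(libs):
    """libs maps a library name to its per-gene counts (same gene order).

    Returns {(name_a, name_b): correlation} for every unordered pair.
    """
    names = sorted(libs)
    result = {}
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            result[(a, b)] = pearson(log_counts(libs[a]), log_counts(libs[b]))
    return result
```

If the (51bp, 100bp) pairs land in the same range as the
(100bp, 100bp) pairs, read length is probably not adding much on top
of ordinary library-to-library noise.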

Honestly curious, though, what was the ultimate reason that the powers
that be decided 50bp was too short? Was there a particular gene (or
set of genes) that is highly unmappable at 50bp? Better
splice-junction mapping? Are you trying to do some gene fusion
detection, or something? Maybe transcript assembly?

Anyway, I'd still dig deeper to see if you find systematic bias
between your samples (do they cluster together, or similar).
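
One cheap way to look for that kind of bias, sketched in Python under
the assumption that you have per-gene counts for every library (all
names here are made up for illustration): scale each library to log2
counts-per-million, find each library's nearest neighbour by Euclidean
distance, and see whether libraries pair up by read length instead of
by the sample they came from. This is a crude stand-in for a proper
clustering or MDS plot, not a replacement for one.

```python
import math


def log_cpm(counts):
    """log2(counts-per-million + 1), which removes library-size differences."""
    total = sum(counts)
    if total == 0:
        raise ValueError("library has no reads")
    return [math.log2(c * 1e6 / total + 1) for c in counts]


def nearest_neighbours(libs):
    """libs maps a library name to its per-gene counts (same gene order).

    Returns {name: name of the closest other library}.
    """
    if len(libs) < 2:
        raise ValueError("need at least two libraries")
    profiles = {name: log_cpm(counts) for name, counts in libs.items()}
    nearest = {}
    for a, profile_a in profiles.items():
        distances = [
            (math.dist(profile_a, profile_b), b)
            for b, profile_b in profiles.items()
            if b != a
        ]
        nearest[a] = min(distances)[1]
    return nearest


def pairs_by_read_length(nearest, read_length):
    """True where a library's nearest neighbour has the same read length.

    read_length maps a library name to its read length in bp.
    """
    return {a: read_length[a] == read_length[b] for a, b in nearest.items()}
```

If most libraries come back True even though their true technical
replicate has the other read length, that would hint at a
read-length effect worth worrying about.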

Without having done any of that, I'll take a shot in the dark as to
what I'm *guessing* might be "just fine" if I were trying to use this
data for differential expression analysis:

I bet that if you trim your 100bp reads back to 50bp and run the
"analysis" treating the appropriate libraries as technical replicates
of each other, you're probably going to be "playing" with the same
sets of genes as with any other method (more or less).
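
For the trimming step, a minimal Python sketch, assuming plain
four-line FASTQ records (no wrapped sequence lines); the function name
and the default of 50 are my own choices. It keeps the first `length`
bases of each read and the matching quality characters. An existing
read trimmer, or your aligner's own trimming option, would do the same
job.

```python
def trim_fastq(lines, length=50):
    """Yield the lines of a FASTQ stream with each read cut to `length` bases.

    Assumes four lines per record: @header, sequence, +, qualities.
    Reads already shorter than `length` pass through unchanged.
    """
    if length < 1:
        raise ValueError("length must be a positive integer")
    record = []
    for line in lines:
        stripped = line.rstrip("\r\n")
        if not record and not stripped:
            continue  # tolerate blank lines between records
        record.append(stripped)
        if len(record) == 4:
            header, seq, plus, qual = record
            if not header.startswith("@") or not plus.startswith("+"):
                raise ValueError("malformed FASTQ record: " + header)
            if len(seq) != len(qual):
                raise ValueError("sequence/quality length mismatch: " + header)
            yield header
            yield seq[:length]
            yield plus
            yield qual[:length]
            record = []
    if record:
        raise ValueError("truncated FASTQ record at end of input")
```

To use it on a file, open the input and output, then write each
yielded line followed by a newline; the file paths are up to you.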

I say that because I'm guessing that treating the 100bp and 50bp reads
as biological replicates (when they're not) isn't exactly correct
either, so I'm erring on the side of being more conservative.

Hopefully you might get some better ideas from others, too ...

-steve

-- 
Steve Lianoglou
Graduate Student: Computational Systems Biology
 | Memorial Sloan-Kettering Cancer Center
 | Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact