[BioC] Differential expression in more than 2 samples using NGS?
Xiaohui Wu
wux3 at muohio.edu
Tue Aug 24 22:27:50 CEST 2010
Hi Martin,
Thank you very much for your response.
I'm reading the chipseq manual now; it introduces the peak detection process you suggested, using functions like slice().
What I mean by multiple samples is this: for example, I have 8 libraries for 4 tissues, with two replicates per tissue, and I want to know which genes are DE among these 4 tissues. If I have to compare two tissues at a time, then for 4 tissues I need C(4,2)=6 comparisons to cover every pair of tissues. So I would like to know whether there is any tool that can compare many samples at once.
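To make the count of pairwise comparisons concrete, here is a minimal sketch in plain Python (not tied to any Bioconductor package). The tissue names are placeholders I made up for illustration.

```python
from itertools import combinations

# Hypothetical tissue labels; each tissue has two replicate libraries.
tissues = ["tissueA", "tissueB", "tissueC", "tissueD"]

# Every unordered pair of tissues corresponds to one two-group comparison.
pairs = list(combinations(tissues, 2))

for first, second in pairs:
    print(f"{first} vs {second}")

print(len(pairs), "pairwise comparisons")
```

The number of pairs grows as n(n-1)/2, which is why a single test across all tissues is more attractive than enumerating every pair.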
Xiaohui
-------------------------------------------------------------
On 08/24/2010 09:49 AM, Xiaohui Wu wrote:
> Hi all,
>
>
> I have about 30 libraries of SBS data (millions of 20nt tags) to
> analyze the differences between or among different libraries, and
> lots of these tags are in intergenic regions.
>
> For gene regions, I think I can use DESeq or edgeR to analyze the DE
> genes. But it seems that DESeq and edgeR can only deal with two
> samples. Is there any package that can compare multiple samples at
> once, for example, to find genes expressed highly in one or some
> libraries but not in the others?
>
> For intergenic tags, I think I should first use some peak
> detection package to find peaks in intergenic regions, then treat
> these peaks as genes to find DE regions.
>
> Is there any peak detection package for NGS, and a package for DE
> analysis among multiple libs?
If your starting point is BAM files of ungapped alignments and you're
looking for flexibility in peak calling, you might start with
Rsamtools::scanBam() to extract the position and width of each
alignment, manipulate that into a GRanges object, use
IRanges::coverage() and IRanges::slice() and friends to identify and
summarize peaks.
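The coverage-then-slice idea can be sketched without any Bioconductor
dependency. The following is a plain-Python illustration, not the
IRanges implementation: the function names coverage() and
slice_peaks(), the (start, width) read representation, and the example
reads are all assumptions made for this sketch. Reads are assumed to
start inside the sequence.

```python
def coverage(reads, length):
    """Per-base depth for reads given as (start, width), 1-based starts."""
    diff = [0] * (length + 2)
    for start, width in reads:
        diff[start] += 1                              # depth rises at the read start
        diff[min(start + width, length + 1)] -= 1     # and falls just past its end
    depth, running = [], 0
    for pos in range(1, length + 1):
        running += diff[pos]
        depth.append(running)
    return depth


def slice_peaks(depth, lower):
    """Return (start, end, max_depth) for each run with depth >= lower.

    Coordinates are 1-based and inclusive.
    """
    peaks, start = [], None
    for i, d in enumerate(depth, start=1):
        if d >= lower and start is None:
            start = i                                 # a run begins
        elif d < lower and start is not None:
            peaks.append((start, i - 1, max(depth[start - 1:i - 1])))
            start = None                              # the run ended at i - 1
    if start is not None:                             # run extends to the sequence end
        peaks.append((start, len(depth), max(depth[start - 1:])))
    return peaks


# Hypothetical 20 nt tags on a 100 bp sequence.
reads = [(3, 20), (5, 20), (10, 20), (60, 20)]
depth = coverage(reads, 100)
peaks = slice_peaks(depth, lower=2)
print(peaks)
```

With real data the same two steps would run per chromosome, and the
threshold passed as lower would need to be chosen with the sequencing
depth of each library in mind.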
It's unclear whether you mean more than two samples (handled by edgeR
and DESeq, I think) or more than one factor with two levels. In the
latter case, one approach is to use the normalization and
transformation methods offered by either package (e.g.,
getVarianceStabilizedData from DESeq, I think), and to analyze the
transformed values with standard R methods in the hope that the data
are normal and homoscedastic enough.
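As a rough illustration of "transform, then use a standard method",
the sketch below applies a square-root transform (a classical
variance-stabilizing transform for Poisson-like counts) and computes a
one-way ANOVA F statistic across four groups. This is a stand-in, not
what DESeq does: its variance-stabilizing transformation is derived
from the fitted mean-dispersion relation. The counts, the tissue
labels, and the one_way_f() helper are invented for the example, and a
one-factor design is used only to keep the sketch short.

```python
from math import sqrt


def one_way_f(groups):
    """F statistic of a one-way ANOVA for a list of groups of values."""
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    # Variation of group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation of replicates around their own group mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_between = len(groups) - 1
    df_within = n_total - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)


# Hypothetical counts for one gene: 4 tissues x 2 replicates.
counts = {
    "tissueA": [100, 121],
    "tissueB": [16, 25],
    "tissueC": [9, 16],
    "tissueD": [4, 9],
}

# Square-root transform as a simple variance-stabilizing step.
transformed = [[sqrt(c) for c in reps] for reps in counts.values()]
f_stat = one_way_f(transformed)
print(f"F = {f_stat:.3f} on 3 and 4 degrees of freedom")
```

With only two replicates per group the within-group variance estimate
is very noisy, which is one reason the count-based packages share
information across genes instead of testing each gene in isolation.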
Hopefully others will answer with better advice.
Martin
>
> Thank you!
>
> Regards, Xiaohui
>
> _______________________________________________ Bioconductor mailing
> list Bioconductor at stat.math.ethz.ch
> https://stat.ethz.ch/mailman/listinfo/bioconductor Search the
> archives:
> http://news.gmane.org/gmane.science.biology.informatics.conductor
--
Martin Morgan
Computational Biology / Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N.
PO Box 19024 Seattle, WA 98109
Location: Arnold Building M1 B861
Phone: (206) 667-2793