[Bioc-sig-seq] Reducing Solexa's export.txt in preparation for a ChIP-seq analysis.

ig2ar-saf2 at yahoo.co.uk ig2ar-saf2 at yahoo.co.uk
Thu Mar 19 16:36:29 CET 2009


Hi João, Robert and everyone,

First, thank you for your responses.

It was easy to predict that the question about discarding duplicates would get responses faster, as it is the one that can be addressed quickly and accurately with common sense.

Briefly, I fully agree: we need to remove the PCR bias whenever possible. I was actually wondering about the false negative error introduced by this filter. Now that I have thought about it a little more, the benefit of the decrease in false positives must comfortably outweigh the increase in false negatives unless the ChIP peaks are very sharp. Good.

Question one still remains:

I read in the data with ShortRead. Now, how do I filter it and export it to a format that will allow me to follow along with the example workflow? Class matters, doesn't it?

> load("alignedLocs.rda")
> ls()
[1] "alignedLocs"
> class(alignedLocs)
[1] "AlignedList"
attr(,"package")
[1] "chipseq"
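
To make the question concrete, here is a minimal sketch of what I understand the reduction to be (quality cutoff, duplicate removal, then alignment start positions with orientation). It uses only base R on a toy data frame. The column names, the cutoff value minQual, and the convention that the reported position is the leftmost aligned base on both strands are assumptions for illustration, not taken from the workflow. The resulting nested list is not the chipseq "AlignedList" class, so it does not answer the class question.

```r
# Toy stand-in for aligned reads (hypothetical data). With ShortRead, the
# equivalent columns would presumably come from accessors on the object
# returned by readAligned(), such as chromosome(), position() and strand().
reads <- data.frame(
  chromosome = c("chr1", "chr1", "chr1", "chr1", "chr2", "chr2"),
  position   = c(100L, 100L, 250L, 400L, 50L, 50L),  # assumed: leftmost aligned base
  strand     = c("+", "+", "-", "+", "-", "+"),
  width      = 36L,                                  # read length
  alignQual  = c(60, 60, 55, 3, 40, 40),
  stringsAsFactors = FALSE
)

minQual <- 10  # hypothetical cutoff, chosen only for this example

# 1. Apply the quality score cutoff.
kept <- reads[reads$alignQual >= minQual, ]

# 2. Remove duplicates: reads sharing chromosome, position and strand.
kept <- kept[!duplicated(kept[, c("chromosome", "position", "strand")]), ]

# 3. Alignment start (5' end of the read). For "-" reads this is the
#    rightmost aligned base under the leftmost-position assumption above.
kept$start <- ifelse(kept$strand == "+",
                     kept$position,
                     kept$position + kept$width - 1L)

# 4. Reduce to start positions per chromosome, split by orientation.
reduced <- lapply(split(kept, kept$chromosome),
                  function(x) split(x$start, x$strand))
```

Here reduced$chr1[["+"]] holds the forward-strand start positions on chr1, and reduced$chr1[["-"]] the reverse-strand ones.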

Thank you,

Ivan






----- Original Message ----
From: João Fadista <Joao.Fadista at agrsci.dk>
To: ig2ar-saf2 at yahoo.co.uk; bioc-sig-sequencing at r-project.org
Sent: Thursday, 19 March, 2009 10:31:24
Subject: RE: [Bioc-sig-seq] Reducing Solexa's export.txt in preparation for a ChIP-seq analysis.


Hi,

Removing duplicates is a step that you can do in order to minimize the possible bias due to the amplification in sample preparation. 

Best,
João

-----Original Message-----
From: bioc-sig-sequencing-bounces at r-project.org [mailto:bioc-sig-sequencing-bounces at r-project.org] On Behalf Of ig2ar-saf2 at yahoo.co.uk
Sent: Thursday, March 19, 2009 3:24 PM
To: bioc-sig-sequencing at r-project.org
Subject: [Bioc-sig-seq] Reducing Solexa's export.txt in preparation for a ChIP-seq analysis.


Hello,

In preparation for analysing my own ChIP-seq data, I am trying to follow the steps described in this sample workflow:

http://www.bioconductor.org/workshops/2008/SeattleNov08/ChIP-seq/workflow.pdf

The document starts by loading data that has been "reduced to a set of alignment start positions (including orientation)".

Can somebody elaborate on that a little bit or, ideally, show it with one example?

Also, as part of the reduction, the procedure "removed all duplicate reads and applied a quality score cutoff". The score cutoff is fine but how is removing duplicates justified?

Thank you,

Ivan




_______________________________________________
Bioc-sig-sequencing mailing list
Bioc-sig-sequencing at r-project.org
https://stat.ethz.ch/mailman/listinfo/bioc-sig-sequencing






