[Bioc-sig-seq] Input from multiple Solexa runs

Deepayan Sarkar deepayan.sarkar at gmail.com
Fri Apr 24 00:33:44 CEST 2009


On Thu, Apr 23, 2009 at 3:22 PM,  <ig2ar-saf2 at yahoo.co.uk> wrote:
>
> Hi Deepayan,
>
> When I do
>
> control1 <- combineLaneReads(c(expt1_analysis1[c("1", "2")],
> expt1_analysis2[c("3", "4")]))
>
> is there a way to filter reads so that I only get one read per genomic position?

combineLaneReads is a very simple function:

combineLaneReads <- function(laneList, chromList = names(laneList[[1]])) {
    names(chromList) = chromList ##to get the return value named
    GenomeData(lapply(chromList,
                      function(chr) {
                          list("+" = unlist(lapply(laneList, function(x) x[[chr]][["+"]]),
                                            use.names = FALSE),
                               "-" = unlist(lapply(laneList, function(x) x[[chr]][["-"]]),
                                            use.names = FALSE))
                      }))
}

and you can just wrap a unique() around each unlist() call to make the
start positions unique. But why would you want that? Within a lane,
duplicates are likely to be PCR artifacts, but for data from different
lanes, aren't duplicates more likely to be real? We could easily add
an argument to support this if you have a valid use case.
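
For illustration, the unique() wrapping described above might look roughly
like the sketch below. It is only a sketch under assumptions: plain nested
lists stand in for the GenomeData object so that it runs in base R, and
combineUniqueReads, collect, lane1 and lane2 are hypothetical names with
made-up start positions, not part of any package.

```r
## Sketch only: plain nested lists stand in for the GenomeData object, and
## combineUniqueReads is a hypothetical name, not an existing function.
combineUniqueReads <- function(laneList, chromList = names(laneList[[1]])) {
    names(chromList) <- chromList  # so that lapply() returns a named list
    collect <- function(chr, strand) {
        ## pool start positions across lanes, then drop repeated positions
        unique(unlist(lapply(laneList, function(x) x[[chr]][[strand]]),
                      use.names = FALSE))
    }
    lapply(chromList, function(chr) {
        list("+" = collect(chr, "+"), "-" = collect(chr, "-"))
    })
}

## Two toy lanes with made-up start positions on a single chromosome
lane1 <- list(chr1 = list("+" = c(100L, 100L, 250L), "-" = c(400L)))
lane2 <- list(chr1 = list("+" = c(100L, 300L), "-" = c(400L, 500L)))
combined <- combineUniqueReads(list(lane1, lane2))
```

Because unique() is applied after pooling, this removes duplicates both
within and across lanes. To drop only within-lane duplicates, unique() could
instead be applied to x[[chr]][[strand]] inside the inner function, before
the lanes are combined.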

-Deepayan


