[Bioc-sig-seq] A myriad of classes

Deepayan Sarkar deepayan.sarkar at gmail.com
Mon Apr 20 22:50:46 CEST 2009


On Mon, Apr 20, 2009 at 1:27 PM,  <ig2ar-saf2 at yahoo.co.uk> wrote:
> Hello Michael,
>
> A high level vignette with the infrastructure of the BioC would be great.
>
> Also, I can be more specific about a class problem I am facing. It concerns a developmental package that I am privileged to be allowed to test. It's chipseq.
>
> I am trying to follow a typical workflow guide as shown here:
>
> http://www.bioconductor.org/workshops/2009/SeattleJan09/ChIP-seq/ChipSeqWorkflow.pdf
>
> As you can see, the data that the package uses is not raw data but data that has been read in and labelled somehow beforehand. The document shows
>
> load("../data/alignedLocs.rda")
>
> That is not the scenario a user will find. A user will have one or several s_X_export.txt files.

Right. The document refers to data that was provided to those who
attended the course (which unfortunately we cannot yet make public).

> So, my attempt to read in my data in the simplest case is this:
>
>> library(chipseq)
>> library(lattice)
>> setwd('/scratch1/igregore/ChIPseq/runs/09-04-10/GERALD_14-04-2009_niddk/')
>> pattern <- "s_1_export.txt"
>> alignedLocs <- as(readAligned(".",
> +                               pattern,
> +                               "SolexaExport",
> +                               filter = alignDataFilter(expression(filtering == "Y"))),
> +                   "GenomeData")
>> class(alignedLocs)
> [1] "GenomeData"
> attr(,"package")
> [1] "BSgenome"

Perfect.

> The guide says that alignedLocs should be a GenomeDataList class object but it shows up as class GenomeData. The guide also shows

That's because the 'alignedLocs' object in the document contained
several such objects, representing the data obtained from multiple
lanes (possibly across multiple runs). To create such an object, you
can do

alignedLocs <- GenomeDataList(list(a = alignedLocs1, b = alignedLocs2))

etc. where alignedLocs1, alignedLocs2, etc. are "GenomeData" objects
(from individual calls to readAligned).
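
As a rough illustration of the nesting described above (lanes at the
top level, chromosomes inside each lane), here is a minimal sketch in
base R. It uses plain named lists as stand-ins for the "GenomeData"
and "GenomeDataList" classes, so it runs without any packages. The
per-chromosome layout ("+" and "-" strand positions) and all the
values are assumptions made up for the example, not the actual
internals of those classes.

```r
# Plain-list stand-ins for two "GenomeData" objects, one per lane.
# Assumption: one element per chromosome, each holding "+" and "-"
# strand positions. All values are made up for illustration.
alignedLocs1 <- list(chr1 = list("+" = c(100L, 250L), "-" = 400L),
                     chr2 = list("+" = 10L,           "-" = c(75L, 90L)))
alignedLocs2 <- list(chr1 = list("+" = 120L,          "-" = c(300L, 310L)),
                     chr2 = list("+" = c(15L, 60L),   "-" = integer(0)))

# Stand-in for GenomeDataList(list(a = ..., b = ...)):
# one named element per lane.
alignedLocs <- list(a = alignedLocs1, b = alignedLocs2)

length(alignedLocs)          # number of lanes
names(alignedLocs)           # lane labels
sapply(alignedLocs, length)  # number of chromosomes in each lane

# Reads per strand (rows) and chromosome (columns), lane by lane
readCounts <- lapply(alignedLocs, function(lane) sapply(lane, lengths))
readCounts
```

With the real classes, the length of the combined object would be the
number of lanes, which is why the workshop document reports a length
of 3.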

>> alignedLocs
>  A GenomeDataList instance of length 3
>
> but when I try it as is I get:
>
>>  alignedLocs
>   A GenomeData instance of length 51154

That is a bit odd though. Individual lanes should consist of
chromosome-level data, and this suggests that you have 51154
chromosomes. Perhaps you can give us the output of

str(head(as.list(alignedLocs)))
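
To show the kind of check that this output supports, here is a small
base R sketch on a plain named list standing in for a single lane.
The element names and values are hypothetical, and the third name is
invented only to show what an unexpected entry might look like. It
makes no claim about what your object actually contains.

```r
# Hypothetical stand-in for one lane: a named list that should have
# one element per chromosome. The third name is made up to represent
# an entry that is not a chromosome.
lane <- list(chr1         = list("+" = c(100L, 250L), "-" = 400L),
             chr2         = list("+" = 10L,           "-" = c(75L, 90L)),
             contig_00017 = list("+" = 5L,            "-" = integer(0)))

nElements <- length(lane)   # should be roughly the number of chromosomes
head(names(lane))           # are these really chromosome names?
str(head(lane))             # structure of the first few elements

# Number of reads stored under each name
readsPerElement <- sapply(lane, function(x) sum(lengths(x)))
readsPerElement
```

If most names are not chromosome names, or most elements hold only a
handful of reads, that would point to how the reads were labelled in
the input rather than to the coercion itself.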

-Deepayan
