[Bioc-sig-seq] WIG from coverage

Ivan Gregoretti ivangreg at gmail.com
Wed Sep 30 17:49:30 CEST 2009


Hello Michael,

Would you mind showing us a mini-workflow so that we see how to use
this new functionality?

I would like to start using it as soon as the updated changes get
pushed to the package repositories.

Thank you,

Ivan


Ivan Gregoretti, PhD
National Institute of Diabetes and Digestive and Kidney Diseases
National Institutes of Health
5 Memorial Dr, Building 5, Room 205.
Bethesda, MD 20892. USA.
Phone: 1-301-496-1592
Fax: 1-301-496-9878



On Wed, Sep 30, 2009 at 2:22 AM, Michael Lawrence <mflawren at fhcrc.org> wrote:
> On Tue, Sep 29, 2009 at 7:23 PM, Michael Lawrence <mflawren at fhcrc.org>wrote:
>
>>
>>
>> On Tue, Sep 29, 2009 at 4:28 PM, Seidel, Christopher <CWS at stowers.org>wrote:
>>
>>> Hello,
>>>
>>> In an earlier thread: BED to WIG format conversion (Thu 9/17/2009 4:50 PM)
>>> Michael Lawrence mentioned:
>>>
>>> Ignoring efficiency, it's pretty straight-forward. In the development
>>> version of IRanges, one can convert an Rle (or RleList for multiple
>>> chromosomes) as output to coverage() to a RangedData. In the development
>>> version of rtracklayer, this is done implicitly:
>>>
>>> > export(cov, "coverage.wig")
>>>
>>> as a way to create a wig file from a coverage() result. When I try this,
>>> my wig file looks like the following:
>>>
>>> track name="R Track" type=wiggle_0
>>> .       0       2       168
>>> .       2       4       175
>>> .       4       5       177
>>> .       5       7       178
>>> .       7       8       182
>>> .       8       12      185
>>> .       12      14      188
>>> .       14      15      191
>>> etc.
>>>
>>> This isn't compatible with any of the UCSC wig formats that I'm aware of
>>> (is it?).
>>
>>
>> No, it should be outputting the chromosome names instead of ".", following
>> the BED-line WIG variant.
>>
>> The coercion from Rle to RangedData yields a single-chromosome RangedData
>> with NULL names. This becomes <NA> upon coercion to a UCSCData (which tries
>> to generate conforming names but fails). <NA> shows up as "." in BED.
>>
>> New policy: RangedData cannot have NULL names. Made a validity check for
>> this. Fixed Rle->RangedData coercion to yield default of "1". This mimics
>> the behavior of RangedData().
>>
>> Shouldn't rtracklayer be putting out a variableStep format for this?
>>
>>
>> Well, come to think of it, that should be possible. Right now, rtracklayer
>> only uses variableStep when the span is the same across an entire
>> chromosome. Should be possible to split the chromosome into multiple blocks
>> though.
>>
>
> I just made this work. It generates much smaller WIG files for coverage-like
> tracks. Thanks for the suggestion.
>
> Michael
>
>
>>
>>
>>> Am I doing something wrong?
>>>
>>>
>>
>> Note that in general your code will not work, because you do not tell
>> rtracklayer the name of the chromosome. This makes me worry that generating
>> a default chromosome name will often hide user errors.
>>
>>
>>
>>> By the way, here is my work flow for going from solexa export to coverage:
>>>
>>> # read in some IP data
>>> aln <- readAligned(spath, pattern="s_1_export.txt", type="SolexaExport",
>>> filter=filt)
>>> # convert to GenomeData object
>>> s1 <- as(aln,"GenomeData")
>>>
>>> # read in some WCE data
>>> aln <- readAligned(spath, pattern="s_3_export.txt", type="SolexaExport",
>>> filter=filt)
>>> # convert to GenomeData object
>>> s3 <- as(aln,"GenomeData")
>>>
>>> # create GenomeDataList from ip and wce
>>> ypd <- GenomeDataList(list(ip=s1, wce=s3))
>>>
>>> # extend reads
>>> ip1 <- extendReads(ypd$ip$chrI, seqLen=150)
>>> wce1 <- extendReads(ypd$wce$chrI, seqLen=150)
>>>
>>> # find coverage for chromosome 1
>>> ip1.cov <- coverage(ip1, width=230208)
>>> wce1.cov <- coverage(wce1, width=230208)
>>>
>>> # make a track
>>> export(ip1.cov, "ip1_coverage.wig")
>>>
>>> > sessionInfo()
>>> R version 2.10.0 Under development (unstable) (2009-09-03 r49555)
>>> x86_64-unknown-linux-gnu
>>>
>>> locale:
>>> [1] C
>>>
>>> attached base packages:
>>> [1] stats     graphics  grDevices utils     datasets  methods   base
>>>
>>> other attached packages:
>>> [1] rtracklayer_1.5.13 RCurl_1.2-0        bitops_1.0-4.1
>>> chipseq_0.1.25
>>> [5] ShortRead_1.3.34   lattice_0.17-25    BSgenome_1.13.11
>>> Biostrings_2.13.39
>>> [9] IRanges_1.3.75
>>>
>>> loaded via a namespace (and not attached):
>>> [1] Biobase_2.5.6 XML_2.6-0     grid_2.10.0   hwriter_1.1
>>>
>>> -Chris
>>>
>>> Chris Seidel, Ph.D.
>>> http://research.stowers-institute.org/cws
>>>
>>> _______________________________________________
>>> Bioc-sig-sequencing mailing list
>>> Bioc-sig-sequencing at r-project.org
>>> https://stat.ethz.ch/mailman/listinfo/bioc-sig-sequencing
>>>
>>
>>
>
>        [[alternative HTML version deleted]]
>
> _______________________________________________
> Bioc-sig-sequencing mailing list
> Bioc-sig-sequencing at r-project.org
> https://stat.ethz.ch/mailman/listinfo/bioc-sig-sequencing
>



More information about the Bioc-sig-sequencing mailing list