[Bioc-sig-seq] GenomicFeatures, error in type conversion RangedData to GRanges

Martin Morgan mtmorgan at fhcrc.org
Thu Apr 1 16:22:54 CEST 2010


On 04/01/2010 07:12 AM, Michael Lawrence wrote:
> On Thu, Apr 1, 2010 at 7:09 AM, Martin Morgan <mtmorgan at fhcrc.org> wrote:
> 
>> On 03/31/2010 07:11 PM, pterry at huskers.unl.edu wrote:
>>>  Dear bioc-sig-sequencing,
>>>
>>> I would like to annotate chip-seq peaks for the arabidopsis genome.  In
>>> trying to work thru the GenomicFeatures vignette dated 03/27/10, I need to
>>> convert my ChIPSeq peaks from a RangedData object to a GRanges object.  In a
>>> recent, but previous Bioconductor development version, the conversion with
>>> this particular RangedData object worked fine.
>>>
>>> In this more recent Bioconductor development version, I get the following
>>> error message:
>>>
>>>> gr_ChSeqPks <- as(rd0_chr1_s_8_trt_vs_INPctl, "GRanges")
>>> Error in validObject(.Object) :
>>>   invalid class "GRanges" object: slot 'strand' contains missing values
>>>> rd0_chr1_s_8_trt_vs_INPctl
>>> RangedData with 57 rows and 2 value columns across 1 space
>>>           space               ranges   |     ARAB8 ARAB7INPCTL
>>>     <character>            <IRanges>   | <integer>   <integer>
>>> 1          chr1   [ 617092,  617094]   |        24           0
>>> 2          chr1   [1808262, 1808262]   |         8           0
>>> 3          chr1   [3889445, 3889452]   |        64           0
>>> 4          chr1   [4404410, 4404410]   |         8           0
>>> 5          chr1   [7081127, 7081127]   |         8           0
>>> 6          chr1   [7128574, 7128581]   |        64           0
>>> 7          chr1   [7128592, 7128649]   |       464           0
>>> 8          chr1   [7530777, 7530781]   |        40           0
>>> 9          chr1   [7530784, 7530786]   |        24           0
>>> ...         ...                  ... ...       ...         ...
>>
>> Hi,
>>
>>> rd = RangedData(IRanges(1, 10))
>>> as(rd, "GRanges")
>> Error in validObject(.Object) :
>>  invalid class "GRanges" object: slot 'strand' contains missing values
>>> rd[["strand"]] = "*"
>>> as(rd, "GRanges")
>> GRanges with 1 range and 0 elementMetadata values
>>    seqnames    ranges strand |
>>       <Rle> <IRanges>  <Rle> |
>> [1]        1   [1, 10]      * |
>>
>> seqlengths
>>  1
>> NA
>>
>> Martin
>>
>>
> Shouldn't the coerce function just do this automatically?

Currently GRanges restricts strand to '+', '-', or '*', whereas IRanges
(RangedData) allows NA as well; hence the error. Coercing NA to '*'
automatically would take a decision that belongs to the investigator:
that '*' (strand irrelevant) is synonymous with NA (no information about
strand available). Part of the motivation for the current state of
affairs is that the use cases for both NA and '*' were unclear, but
course corrections are welcome.
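
As a rough sketch of making that decision explicit in user code, the
following uses base R only, so it does not depend on a particular
IRanges or GenomicRanges version. The helper name na_strand_to_star is
hypothetical and not part of either package; it assumes the strand
values are available as a character (or factor) vector.

```r
# Hypothetical helper (not part of IRanges or GenomicRanges): make the
# NA -> "*" decision explicit before handing strand values to GRanges.
na_strand_to_star <- function(strand) {
    strand <- as.character(strand)
    # Treat "no strand information" (NA) as "strand irrelevant" ("*").
    strand[is.na(strand)] <- "*"
    # Reject anything outside the three strand values GRanges recognizes.
    bad <- setdiff(unique(strand), c("+", "-", "*"))
    if (length(bad) > 0L)
        stop("unexpected strand value(s): ", paste(bad, collapse = ", "))
    factor(strand, levels = c("+", "-", "*"))
}

fixed <- na_strand_to_star(c("+", NA, "-", NA))
```

The result is meant to be assigned as the strand column before
coercion, in the same way as the rd[["strand"]] = "*" step quoted
above; that last step is not exercised in this sketch.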

Martin
> 
>>>
>>>> sessionInfo()
>>> R version 2.12.0 Under development (unstable) (2010-03-30 r51506)
>>> x86_64-unknown-linux-gnu
>>>
>>> locale:
>>>  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
>>>  [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
>>>  [5] LC_MONETARY=C              LC_MESSAGES=en_US.UTF-8
>>>  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
>>>  [9] LC_ADDRESS=C               LC_TELEPHONE=C
>>> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
>>>
>>> attached base packages:
>>> [1] stats     graphics  grDevices utils     datasets  methods   base
>>>
>>> other attached packages:
>>> [1] biomaRt_2.3.5         GenomicFeatures_0.5.0 GenomicRanges_0.1.0
>>> [4] IRanges_1.5.73
>>>
>>> loaded via a namespace (and not attached):
>>> [1] Biobase_2.7.5      Biostrings_2.15.26 BSgenome_1.15.20   DBI_0.2-5
>>> [5] RCurl_1.3-1        RSQLite_0.8-4      rtracklayer_1.7.11 tools_2.12.0
>>> [9] XML_2.8-1
>>>>
>>>
>>>
>>> Thanks,
>>> P. Terry
>>> pterry at huskers.unl.edu
>>>
>>> _______________________________________________
>>> Bioc-sig-sequencing mailing list
>>> Bioc-sig-sequencing at r-project.org
>>> https://stat.ethz.ch/mailman/listinfo/bioc-sig-sequencing
>>
>>
>> --
>> Martin Morgan
>> Computational Biology / Fred Hutchinson Cancer Research Center
>> 1100 Fairview Ave. N.
>> PO Box 19024 Seattle, WA 98109
>>
>> Location: Arnold Building M1 B861
>> Phone: (206) 667-2793
>>
> 




