[BioC] rtracklayer proposal for ISSUE: import.gff3 asRangedData=FALSE fails when strand is '.'
Cook, Malcolm
MEC at stowers.org
Wed Apr 18 18:04:58 CEST 2012
Hi, rtracklayerers,
import.gff3 with asRangedData=TRUE passes a period through to the strand of the imported RangedData; however, calling it with asRangedData=FALSE raises an error:
> gff.str<-"2L\tFlyBase\tgene\t7529\t9484\t0\t.\t0\tID=FBgn0031208;Name=CG11023"
> import.gff3(textConnection(gff.str),asRangedData=TRUE)
RangedData with 1 row and 7 value columns across 1 space
space ranges | type source phase strand ID Name score
<factor> <IRanges> | <factor> <factor> <factor> <factor> <character> <character> <numeric>
1 2L [7529, 9484] | gene FlyBase 0 NA FBgn0031208 CG11023 0
> import.gff3(textConnection(gff.str),asRangedData=FALSE)
Error in strand(runValue(strand)) : strand values must be in '+' '-' '*'
The GFF3 spec allows '.' (and '?') to appear as values in the strand column:
Column 7: "strand"
The strand of the feature. + for positive strand (relative to the
landmark), - for minus strand, and . for features that are not
stranded. In addition, ? can be used for features whose strandedness
is relevant, but unknown.
Arguably, import.gff{,2,3} should provide some control over the interpretation of '.' and '?' appearing in the strand column, allowing the result to comport with strand and GRanges.
I propose the following as a fix that is intended to be backwards compatible.
New argument to import.gff{,2,3}
strandMap: control for mapping out-of-band values (FALSE,TRUE,a string, a list), understood as follows
FALSE: the default - do not map out of band values to '*'
TRUE: map all out of band values to '*'
any length-1 character vector: map out of band values to it (presumably it will be one of '*', '-', '+')
a list: lookup how to map out of band values in the list by name.
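For concreteness, the four cases above could be sketched in base R roughly as follows. This is only an illustration of the intended semantics: map_strand is a hypothetical helper name, not an existing rtracklayer function, and leaving values that are absent from the list untouched is an assumption the proposal does not settle.

```r
# Hypothetical helper (not part of rtracklayer): applies the proposed
# strandMap semantics to a character vector of raw GFF strand values.
map_strand <- function(strand, strandMap = FALSE) {
  in_band <- c("+", "-", "*")
  oob <- !(strand %in% in_band)   # out-of-band positions, e.g. "." or "?"
  if (is.list(strandMap)) {
    # Look up each out-of-band value by name.
    # Assumption: values with no entry in the list are left unchanged.
    hit <- oob & strand %in% names(strandMap)
    strand[hit] <- as.character(unlist(strandMap[strand[hit]], use.names = FALSE))
  } else if (is.character(strandMap) && length(strandMap) == 1L) {
    # A single string: every out-of-band value is mapped to it.
    strand[oob] <- strandMap
  } else if (isTRUE(strandMap)) {
    # TRUE: every out-of-band value becomes "*".
    strand[oob] <- "*"
  }
  # strandMap = FALSE falls through: values are returned unchanged.
  strand
}

raw <- c("+", ".", "?", "-")
map_strand(raw)                              # default, nothing is mapped
map_strand(raw, TRUE)                        # "." and "?" both become "*"
map_strand(raw, "+")                         # "." and "?" both become "+"
map_strand(raw, list("." = "*", "?" = "+"))  # per-value lookup by name
```

With the default FALSE the out-of-band values survive, so the asRangedData=FALSE path would still fail as it does today; the other settings yield values that strand() accepts, provided the mapping targets are themselves '+', '-' or '*'.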
If it is agreed that this is the best resolution, and the rtracklayer gods wish it, I will take this as my first opportunity to contribute and will follow up accordingly.
Else?
Cheers,
Malcolm