[Bioc-sig-seq] as.data.frame on GRanges object with DNAStringSet in values

Janet Young jayoung at fhcrc.org
Wed Jun 15 21:37:53 CEST 2011


Hi there,

I'm trying to use as.data.frame on a GRanges object. It works fine on regular GRanges objects, but I have some objects that contain a DNAStringSet in the values column, which the as.data.frame method doesn't handle. Would it be possible to add the ability to coerce the DNAStringSet too, please?

Here's some code that demonstrates the issue:

################
library(GenomicRanges)
library(Biostrings)

gr1 <- GRanges(seqnames=rep("chr1",3),
               ranges=IRanges(start=c(1,101,201), width=50),
               strand=c("+","-","+"),
               genenames=c("seq1","seq2","seq3"))

as.data.frame(gr1)
# works

gr2 <- gr1
values(gr2)[,"myseqs"] <- DNAStringSet(c("AACGTG", "ACGGTGGTGTT", "GAGGCTG"))

as.data.frame(gr2)
# Error in as.data.frame.default(y, optional = TRUE, ...) : 
#   cannot coerce class 'structure("DNAStringSet", package = "Biostrings")' into a data.frame
################

and here's the sessionInfo() output:

R version 2.13.0 (2011-04-13)
Platform: i386-apple-darwin9.8.0/i386 (32-bit)

locale:
[1] en_US.UTF-8/en_US.UTF-8/C/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] Biostrings_2.20.1   GenomicRanges_1.4.6 IRanges_1.10.4     

################
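
In the meantime, a possible stopgap for the example above is to swap the DNAStringSet column for a plain character vector before coercing. This is only a minimal sketch on the toy data: the names gr3 and df are made up for illustration, and it assumes the sequences are only needed as text in the resulting data.frame (they are no longer a DNAStringSet after the conversion).

```r
library(GenomicRanges)
library(Biostrings)

# Rebuild the toy object from the example above
gr2 <- GRanges(seqnames=rep("chr1",3),
               ranges=IRanges(start=c(1,101,201), width=50),
               strand=c("+","-","+"),
               genenames=c("seq1","seq2","seq3"))
values(gr2)[,"myseqs"] <- DNAStringSet(c("AACGTG", "ACGGTGGTGTT", "GAGGCTG"))

# Work on a copy (gr3 is a hypothetical name) so gr2 keeps its DNAStringSet
gr3 <- gr2

# Replace the DNAStringSet column with its character representation
values(gr3)[,"myseqs"] <- as.character(values(gr3)[,"myseqs"])

# Every values column is now an ordinary vector, so the coercion can proceed
df <- as.data.frame(gr3)
```

If the sequences are needed as a DNAStringSet again later, DNAStringSet(as.character(df$myseqs)) should rebuild them from the data.frame column.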


You might wonder why I'm storing sequences in the GRanges values. In my real data they're sequencing reads that have mapped back to that region, but I'd still like to keep the read sequence itself (for the moment) because it's not always identical to the underlying genomic sequence of that region (I'm investigating mapping issues).

(and my desire to use as.data.frame relates to a suggestion from Herve to let me work around some issues with the identical function)

thanks,

Janet


