[Bioc-devel] Getting pointers to data inside XStringSet object

Ulrich Bodenhofer bodenhofer at bioinf.jku.at
Fri Dec 14 10:04:17 CET 2012


Some colleagues and I are currently developing R packages that
integrate existing C/C++ libraries into R. Not too surprisingly, the
algorithms we are trying to integrate expect sequences as "char *"
objects. Of course, we would like our packages to interoperate nicely
with Biostrings. In particular, since the algorithms do not change
their input data, it would be nice to avoid copying. That is why I
tried to find out how to get pointers to the data hidden inside
XStringSet objects. As far as I understand it, an XStringSet object
consists of one large data container of class "SharedRaw_Pool" and a
"GroupedIRanges" object that defines the views on this container. I had
no problem dissecting the GroupedIRanges object in my C++ code, but I
have not yet found a way to get the pointer to the container. I
searched the web and could not find any information. Biostrings and
IRanges are indeed very well documented, but only at the user level;
the internals are not covered. I also looked at the C code included in
these two packages, but, to be frank, I got lost. So, let me ask the
following questions:

- Is there a way to get a plain "char *" pointer that points to the
first element of the data container?
- Are sequences actually stored as plain text, one character per byte,
or not (it would make sense to me to pack DNA/RNA sequences as four
letters per byte)? If they are not, my approach is not reasonable and I
will have to resort to converting to character vectors anyway.
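
To make the two questions concrete, here is a minimal sketch in plain C.
All names in it (seq_set, seq_view, seq_set_get, count_char, pack_2bit,
unpack_2bit) are hypothetical and are not the Biostrings or IRanges
internals. It only assumes the layout described above: one shared
container plus per-sequence start/width information. The first part
shows zero-copy access as a pointer plus a length. The second part shows
what a "four letters per byte" encoding would look like, which a
"char *"-based algorithm could not read without conversion.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a set of sequences kept in one shared
   container; NOT the actual Biostrings/IRanges structures. */
typedef struct {
    const char   *pool;   /* one contiguous container for all sequences */
    const size_t *start;  /* 0-based offset of each sequence in pool */
    const size_t *width;  /* length of each sequence */
    size_t        n;      /* number of sequences */
} seq_set;

/* Zero-copy reference to one sequence: pointer plus length.
   The bytes are not NUL-terminated, so the length must travel along. */
typedef struct {
    const char *ptr;
    size_t      length;
} seq_view;

/* Return a view on the i-th sequence without copying any data. */
seq_view seq_set_get(const seq_set *set, size_t i)
{
    assert(i < set->n);
    seq_view v = { set->pool + set->start[i], set->width[i] };
    return v;
}

/* Example of a read-only algorithm that works on (pointer, length). */
size_t count_char(seq_view v, char c)
{
    size_t k = 0;
    for (size_t j = 0; j < v.length; j++)
        if (v.ptr[j] == c)
            k++;
    return k;
}

/* Map a DNA letter to a 2-bit code, or -1 for anything else. */
int base_code(char b)
{
    switch (b) {
    case 'A': return 0;
    case 'C': return 1;
    case 'G': return 2;
    case 'T': return 3;
    default:  return -1;
    }
}

/* Hypothetical packed encoding: four letters per byte, 2 bits each.
   out must hold (len + 3) / 4 bytes; returns the number of bytes used. */
size_t pack_2bit(const char *seq, size_t len, unsigned char *out)
{
    size_t nbytes = (len + 3) / 4;
    memset(out, 0, nbytes);
    for (size_t i = 0; i < len; i++) {
        int code = base_code(seq[i]);
        assert(code >= 0);  /* only A, C, G, T fit into 2 bits */
        out[i / 4] |= (unsigned char)(code << (2 * (i % 4)));
    }
    return nbytes;
}

/* Recover the i-th letter from the packed representation. */
char unpack_2bit(const unsigned char *packed, size_t i)
{
    return "ACGT"[(packed[i / 4] >> (2 * (i % 4))) & 3];
}
```

With the first layout, an algorithm that accepts a pointer and a length
can run directly on the shared container. An algorithm that expects a
NUL-terminated "char *" would still need a copy, and with a packed layout
every letter would have to be decoded first.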

Thanks a lot in advance for your inputs!

Best regards,

*Dr. Ulrich Bodenhofer*
Associate Professor
Institute of Bioinformatics

*Johannes Kepler University*
Altenberger Str. 69
4040 Linz, Austria

Tel. +43 732 2468 4526
Fax +43 732 2468 4539
bodenhofer at bioinf.jku.at
http://www.bioinf.jku.at/
