[Bioc-devel] Developing a package that handles BAM files

Hao Feng hxf155 at case.edu
Sun Dec 18 22:07:21 CET 2022

Hi Bioconductor team,

We’re developing a package to analyze a new type of high-throughput sequencing data. We have some BAM files ready, and we’d like the package to accept BAM files as input. In addition, to maintain our method’s statistical rigor, we’d like to repeat the sampling step a large number of times.

This raises two concerns about package development, on which I’d like to collect some feedback before proceeding:
1. What is the best practice for using BAM files as example data? BAM files are generally large. Although we could use a subset of a BAM file, rather than the whole-genome BAM, as example data, I still don’t think it’s feasible to include those external files in the software package submission. And because this sequencing experiment is novel, I don’t think any current experiment data package contains such data. Would you suggest that we submit a new experiment data package (of several dozen MB) along with our software package?
2. How should we deal with long example run times? Because of the repeated sampling and the handling of BAM files, we suspect our code examples may take a long time to run. Do you have any suggestions so that the package can pass the software checks, which are time-bound?

Thank you very much in advance. 


Hao Feng, Ph.D.
Assistant Professor in Biostatistics
Department of Population and Quantitative Health Sciences
Case Western Reserve University School of Medicine
Harland Wood Building - Room WG-82T 
10900 Euclid Avenue
SOM / Room G82T 
Cleveland, OH 44106-4945
Phone: 216-368-5510
Lab website: https://hfenglab.org/
Official website: http://epbiwww.case.edu/haofeng-phd/
Email: hxf155 at case.edu
