[Bioc-devel] I would like to publish a bioconductor package.

Davide Rambaldi davide.rambaldi at ieo.eu
Wed Feb 27 12:03:44 CET 2013

Hi all, 

I am working on a library called flowFit. Its purpose is to analyze FACS data from studies that use proliferation tracking dyes.

The library depends on the flowCore and flowViz Bioconductor libraries and uses minpack.lm (Levenberg-Marquardt algorithm) to fit a set of peaks to the FACS data.
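To illustrate the idea of fitting a set of peaks, here is a minimal sketch on simulated data. It is not the flowFit implementation. It assumes that each cell division halves the dye, so that on a log-scaled axis the generation peaks are equally spaced; the channel count, number of decades, number of generations and all parameter names are made up for the example. Base R's nls() (Gauss-Newton) stands in for Levenberg-Marquardt so that the sketch runs without extra packages; nlsLM() from minpack.lm is designed to take the same formula and start arguments.

```r
# Sketch only: base R nls() (Gauss-Newton) stands in for minpack.lm here.
# All numbers (1024 channels, 4 log decades, 3 generations) are assumptions.
set.seed(1)

channels <- 1024                      # assumed number of channels on the log-scaled axis
decades  <- 4                         # assumed number of log decades covered
D <- channels / decades * log10(2)    # channel distance between generations (dye halves per division)

# Sum of three Gaussians with a shared width s; generation g is centred at M - g * D.
peaks <- function(x, M, s, a0, a1, a2) {
  a0 * exp(-(x - M)^2 / (2 * s^2)) +
    a1 * exp(-(x - (M - D))^2 / (2 * s^2)) +
    a2 * exp(-(x - (M - 2 * D))^2 / (2 * s^2))
}

# Simulated histogram: parent peak at channel 800, width 20, plus noise.
x <- seq(0, channels - 1, by = 4)
y <- peaks(x, M = 800, s = 20, a0 = 100, a1 = 250, a2 = 400) +
  rnorm(length(x), sd = 2)

# Fit peak position, width and heights, starting from rough guesses.
fit <- nls(y ~ peaks(x, M, s, a0, a1, a2),
           start = list(M = 790, s = 25, a0 = 80, a1 = 200, a2 = 300))

est <- coef(fit)
heights <- est[c("a0", "a1", "a2")]
pct <- 100 * heights / sum(heights)   # shared width, so height share equals area share
round(pct, 1)
```

Because the peaks share one width in this sketch, the relative heights can be read directly as the percentage of cells per generation. With peak-specific widths the areas would have to be compared instead.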

A typical experimental pipeline:

1) Acquire a sample of unlabeled cells with FACS
2) Acquire a sample of labeled, unstimulated cells with FACS (the Parent Population)
3) Acquire a sample of labeled, stimulated cells with FACS (the Proliferative Population)

In R we can use the flowCore functions to transform the raw data and to gate the population of interest. Once the correct population is gated, two flowFit commands perform the fitting:

> library(flowFit)
> parent.fitting.cfse <- parentFitting(QuahAndParish[[1]], "<FITC-A>")
> fitting.cfse <- proliferationFitting(QuahAndParish[[2]], "<FITC-A>", parent.fitting.cfse@parentPeakPosition, parent.fitting.cfse@parentPeakSize)

The fitting result can also be plotted:

> plot(fitting.cfse)

To check the correctness of the fitting, I ran some in silico simulations and a retrospective analysis of the data from the paper:

"New and improved methods for measuring lymphocyte proliferation in vitro and in vivo using CFSE-like fluorescent dyes", Benjamin J.C. Quah, Christopher R. Parish, Journal of Immunological Methods (2012)

In this paper, the same population of lymphocytes (proliferating under the same growth conditions) was stained with 3 different proliferation tracking dyes. If the fitting algorithm works as expected, it should estimate the same % of cells per generation in all 3 samples.

Comparing the 3 samples, we did not see any significant difference in the estimated % of cells per generation, which suggests that the algorithm estimates the % of cells per generation correctly.

I have posted a graphical output example with the Quah and Parish data (pdf) here:


The dataset will be included in the library (in the data subdir).

I am currently writing the vignette (following the guidelines at http://www.bioconductor.org/developers/package-guidelines/) and fixing some graphical bugs (such as an oversized legend).

The package passes R CMD build and R CMD check (time: 86 seconds) with no errors on OS X and Linux (I have to find a Windows machine somewhere ...). I still have to test it with the R-devel version of R.

The library is bigger than expected (4.2 MB) because the example datasets (FCS files converted to .Rdata) are big (3.7 MB), and I don't know how to solve this issue...
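One possible direction is to re-save the data files with stronger compression. The sketch below uses only functions from base R and the tools package; the file name and the simulated events matrix are placeholders rather than the real flowFit datasets, and how much space is saved (if any) depends on the data.

```r
# Sketch only: the file name and the simulated data are placeholders,
# not the flowFit example datasets.
set.seed(1)
events <- matrix(sample(0:1023, 4e5, replace = TRUE), ncol = 4,
                 dimnames = list(NULL, c("FSC-A", "SSC-A", "FITC-A", "PE-A")))

data_dir <- file.path(tempdir(), "data")
dir.create(data_dir, showWarnings = FALSE)
rda <- file.path(data_dir, "example.RData")

save(events, file = rda)                      # default compression (gzip)
before <- tools::checkRdaFiles(rda)

tools::resaveRdaFiles(rda, compress = "xz")   # rewrite the file in place with xz
after <- tools::checkRdaFiles(rda)

# Compare file size and compression type before and after re-saving.
rbind(before, after)[, c("size", "compress")]
```

tools::resaveRdaFiles() also accepts a directory, in which case it processes all .rda and .RData files found there.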

My question is: how do I proceed from here?

I would like to publish the library and its methods in a paper (in the journal Bioinformatics, maybe?) and submit the library to Bioconductor. What is the correct way to proceed?


P.S. If I have missed (again!) some FAQ, I apologize.

