[Bioc-devel] suggestions for minfi

Kasper Daniel Hansen kasperdanielhansen at gmail.com
Sun Mar 20 21:18:10 CET 2016


Shaping up for the next Bioconductor release, I realize I only replied to
this email in my head.

These are all great suggestions.  In the future, feel free to add
suggestions like these to the minfi GitHub page under issues; the URL is
  https://github.com/kasperdanielhansen/minfi/issues
(as can be seen in the package DESCRIPTION file).  Please start a new issue
for each suggestion (i.e. these 3 comments should result in 3 different
issues).  If you add or respond to the issues I have created (#54-#56), you
will be notified of updates.

Here are some immediate answers.
1) Very sensible suggestion; this has to be balanced with making the
function easy to use and with limiting my maintenance responsibility.  I
don't think we have done a formal analysis of percent variance explained
vs. funnorm's ability to correct the data along the lines laid out in our
manuscript, so I'm not sure it is a good thing for the casual user to look
at.  But I can understand the desire for experts to tease
the algorithm apart.  Conclusion: I will have a closer look at this, but no
promises.

2) There have been multiple requests for an ability to remove probes prior
to various normalization routines, for example based on detection P
values.  Whether this should be done by completely removing rows in the
object or by allowing NAs in the object, is unclear to me at present.  One
argument against NAs in the object is that it adds (IMO) some frailty: now
everything has to be able to deal with NAs, which implies a different number
of observations for each CpG.  Conclusion: I think I'll make it easier to
remove rows, and to remove rows based on detectionP.
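
A minimal sketch of what row removal based on detection P-values could look
like.  It uses small simulated matrices, not real minfi objects; the 0.01
cutoff and the rule "drop a probe if it fails in any sample" are assumptions
for illustration, not minfi defaults.  With real data the P-value matrix
would come from detectionP().

```r
## Simulated stand-ins: 6 probes x 3 samples (not real minfi objects)
set.seed(1)
beta <- matrix(runif(18), nrow = 6,
               dimnames = list(paste0("cg", 1:6), paste0("s", 1:3)))
detP <- matrix(0.001, nrow = 6, ncol = 3, dimnames = dimnames(beta))
detP["cg2", "s1"] <- 0.5   # cg2 fails in one sample
detP["cg5", ]     <- 0.2   # cg5 fails in all samples

cutoff <- 0.01                         # assumed threshold
keep <- rowSums(detP >= cutoff) == 0   # TRUE if the probe passes in every sample
betaFiltered <- beta[keep, , drop = FALSE]
rownames(betaFiltered)
```

Because whole rows are dropped, no NAs are introduced and every remaining
CpG keeps the same number of observations.  The same logical index could be
used to subset any object whose rows match those of the P-value matrix.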

3) No.  But it is easy to add support across multiple cores using
mclapply.  And if you wish to speed it up by running on different computers,
you can always combine different RGChannelSets using combine().  So your
request will be partly addressed.
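
A rough sketch of the chunk-and-combine pattern with mclapply.  The reader
here (readChunk) is a hypothetical stand-in that returns a small matrix per
chunk, the sample names are made up, and cbind() takes the place of
combine() on RGChannelSets; none of this is minfi code.

```r
library(parallel)

## Hypothetical stand-in for a reader such as read.450k.exp(); it returns a
## small probes x samples matrix per chunk instead of an RGChannelSet.
readChunk <- function(samples) {
  m <- matrix(seq_along(samples), nrow = 4, ncol = length(samples),
              byrow = TRUE)
  colnames(m) <- samples
  m
}

samples <- sprintf("sample%02d", 1:8)          # made-up sample names
chunks <- split(samples, rep(1:4, each = 2))   # 4 chunks of 2 samples

## mclapply relies on forking, which is unavailable on Windows;
## fall back to a single core there.
cores <- if (.Platform$OS.type == "windows") 1L else 2L
parts <- mclapply(chunks, readChunk, mc.cores = cores)

## cbind() stands in for combine() on RGChannelSets
merged <- Reduce(cbind, parts)
dim(merged)
```

The same split/read/Reduce structure applies when the chunks are read on
different computers and only the final combining step is done in one place.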

Best,
Kasper

On Fri, Feb 26, 2016 at 3:15 AM, Maarten van Iterson <mviterson at gmail.com>
wrote:

> Dear developers of minfi,
>
> I have a few suggestions for minfi:
>
> 1) In a few analyses with >500 samples we noticed that often the number of
> PCs should be larger than the default of 2 for functional normalization.
> Currently we call some unexported minfi functions, via the triple colon
> operator, to plot the variance explained. It would be nice if there was a
> function to plot the variance explained by the first few PCs of the
> control matrix. For example, something along the lines of:
>
> ## unexported minfi helpers, hence the triple colon
> extracted <- minfi:::.extractFromRGSet450k(RGset)
> controlMatrix <- minfi:::.buildControlMatrix450k(extracted)
> pc <- prcomp(controlMatrix)
> ## set nmax e.g. to 10 or so, capped at the number of PCs available
> nmax <- min(10, ncol(pc$x))
> barplot(summary(pc)$importance[2, 1:nmax], ylab="Proportion of Variance")
>
> and optionally return the pc-object e.g. to correlate with known covariates
> for further inspection.
>
> 2) Add a function argument na.rm=FALSE/TRUE to detectionP, which should be
> passed to colMedians and colMads, such that detectionP can handle NAs in
> the Red and Green intensity matrices of an rgSet. If na.rm=TRUE some
> detection P-values will be NA, if these were NA on the probe level, but
> this is what we want. For example, we use this for some probe-level
> filtering steps, e.g. on the minimally required number of beads.
>
> 3) Are there any plans to support reading IDAT files in parallel using the
> BiocParallel functionality? For example, read.450k.exp could easily be
> parallelized and for the reduce step 'combine' from Biobase can be used.
>
> If you wish I can share some code on these suggestions.
>
> Kind regards,
> Maarten