[BioC] DESeq estimateDispersion options for lower depth miRNA-seq

Wolfgang Huber whuber at embl.de
Mon Mar 26 19:23:55 CEST 2012


Dear Praful,

thanks for your message.

1. You can try with sharingMode="fit-only" in estimateDispersions, but
then please visualise the data for the miRNAs that you identify that
way and check whether the calls are plausible. For example, make the 6
MA-plots for all pairs of samples and see where the hits fall in them.
The big drawback of the "fit-only" option is that "significant" calls
might be driven by outlier measurements or other sources of high
variability in the data.
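A minimal sketch of the pairwise MA-plots, in base R only. The count matrix, the sample names and the vector 'hits' are made up for illustration; in practice you would use your own counts (ideally divided by the size factors) and the miRNAs that were called significant.

```r
# Toy count matrix standing in for the real miRNA counts (assumption):
# 200 miRNAs x 4 samples.
set.seed(1)
counts <- matrix(rnbinom(200 * 4, mu = 50, size = 2), ncol = 4,
                 dimnames = list(paste0("miR", 1:200),
                                 c("untreated1", "untreated2",
                                   "treated1", "treated2")))
hits <- c("miR3", "miR17", "miR42")  # hypothetical list of significant miRNAs

# M (log ratio) and A (mean log abundance) for one pair of samples.
# The pseudocount of 1 avoids log2(0) for miRNAs with zero counts.
ma_values <- function(x, y) {
  lx <- log2(x + 1)
  ly <- log2(y + 1)
  data.frame(A = (lx + ly) / 2, M = ly - lx)
}

# All pairs of the 4 samples: a 2 x 6 matrix of sample names.
pairs <- combn(colnames(counts), 2)
is_hit <- rownames(counts) %in% hits

op <- par(mfrow = c(2, 3))
for (k in seq_len(ncol(pairs))) {
  s1 <- pairs[1, k]
  s2 <- pairs[2, k]
  ma <- ma_values(counts[, s1], counts[, s2])
  plot(ma$A, ma$M, pch = 20, col = "grey60",
       xlab = "A", ylab = "M", main = paste(s2, "vs", s1))
  # Overlay the hits so that outlier-driven calls become visible.
  points(ma$A[is_hit], ma$M[is_hit], pch = 20, col = "red")
  abline(h = 0, lty = 2)
}
par(op)
```

Hits that stand out only in a single pair, or that are far from zero in the replicate-vs-replicate panels, are the ones to be suspicious of.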

2. Using genefilter::shorth as the argument for locfunc in
estimateSizeFactors: in principle, this can be useful when the counts
per gene are low, but it requires that there are many genes (the shorth
is a less efficient estimator than, say, the median). Since you are
working with miRNAs, you have few genes and low counts, and I am not
sure that it will improve much. I think it is fair to try both and see
with which one you find more concordance between replicates and larger
differences between conditions.
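To make the role of locfunc concrete, here is a rough base-R sketch of median-of-ratios size factors with a pluggable location function. 'simple_shorth' is a simplified stand-in written for this example, not the genefilter implementation (which handles ties and missing values differently), and the toy counts and sample names are assumptions.

```r
# Simplified stand-in for genefilter::shorth: mean of the values in the
# shortest interval that contains half of the data. If several intervals
# are equally short, the first one is used.
simple_shorth <- function(x) {
  x <- sort(x)
  n <- length(x)
  h <- ceiling(n / 2)                  # number of points per window
  starts <- seq_len(n - h + 1)
  widths <- x[starts + h - 1] - x[starts]
  i <- which.min(widths)
  mean(x[i:(i + h - 1)])
}

# Size factors: for each sample, a location estimate (locfunc) of the
# log ratios between its counts and the per-gene geometric mean.
size_factors <- function(counts, locfunc = median) {
  log_geo_means <- rowMeans(log(counts))  # -Inf if any sample has a zero
  keep <- is.finite(log_geo_means)        # drop genes with zero counts
  apply(counts, 2, function(k) {
    exp(locfunc((log(k) - log_geo_means)[keep]))
  })
}

# Toy data (assumption): 150 miRNAs, samples 2 and 4 sequenced about
# twice as deeply as samples 1 and 3.
set.seed(1)
depth <- c(1, 2, 1, 2)
mu <- rexp(150, rate = 1 / 40)
counts <- sapply(depth, function(d) rpois(150, mu * d))
colnames(counts) <- c("untreated1", "untreated2", "treated1", "treated2")

sf_median <- size_factors(counts, median)
sf_shorth <- size_factors(counts, simple_shorth)
print(round(rbind(median = sf_median, shorth = sf_shorth), 2))
```

With DESeq and genefilter installed, the corresponding call would be estimateSizeFactors(cds, locfunc = genefilter::shorth), where 'cds' is your CountDataSet.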

You could call 'arrayQualityMetrics' on the variance-stabilised version 
of the data with both normalisation options, and see which of the 
cluster dendrograms and PCA plots you prefer.
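If you want a quick look before running the full report, the comparison can be roughed out in base R. This is only a sketch: log2(x + 1) of the normalised counts is used as a crude stand-in for a proper variance-stabilising transformation, plain hclust and prcomp replace the arrayQualityMetrics report, and the counts and both sets of size factors are invented for illustration.

```r
# Toy counts (assumption): 150 miRNAs x 4 samples.
set.seed(2)
counts <- matrix(rpois(150 * 4, lambda = 30), ncol = 4,
                 dimnames = list(NULL, c("untreated1", "untreated2",
                                         "treated1", "treated2")))

# Hypothetical size factors from the two normalisation options.
size_factor_sets <- list(
  median = c(0.9, 1.1, 1.0, 1.0),
  shorth = c(0.8, 1.2, 1.0, 1.0)
)

# Divide each sample (column) by its size factor, then log-transform.
# This is a crude substitute for variance stabilisation.
transform_counts <- function(counts, sf) {
  log2(sweep(counts, 2, sf, "/") + 1)
}

op <- par(mfrow = c(2, 2))
for (nm in names(size_factor_sets)) {
  x <- transform_counts(counts, size_factor_sets[[nm]])

  # Dendrogram of the samples: replicates should cluster together.
  plot(hclust(dist(t(x))), main = paste("Clustering,", nm),
       xlab = "", sub = "")

  # PCA of the samples: conditions should separate on a leading component.
  pc <- prcomp(t(x))
  plot(pc$x[, 1], pc$x[, 2], pch = 19,
       col = rep(c("blue", "red"), each = 2),
       xlab = "PC1", ylab = "PC2", main = paste("PCA,", nm))
  text(pc$x[, 1], pc$x[, 2], labels = colnames(x), pos = 3, cex = 0.7)
}
par(op)
```

The normalisation under which replicates sit closer together and the two conditions separate more clearly is the one to prefer.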

	Best wishes
	Wolfgang




Mar/26/12 5:49 PM, Aggarwal, Praful wrote:
> Hello,
>
>
>
> I am trying to use DESeq for miRNA sequencing data. We have 2
> replicates per condition (treated and untreated), i.e. a total of 4
> samples. However, we have only around 200K-300K reads mapping to
> known miRNAs. I know this number is probably small, but we would
> still like to check for differential expression. The default options
> seem to be too conservative in our case (maybe due to the low number
> of reads), so I think using the "fit-only" option might be better.
> Since we have fewer reads, I am thinking of using genefilter's
> "shorth" to estimate the size factors.
>
>
>
> I am trying these different options but am wondering which options
> you would recommend in our case. I am aware of the potential noise in
> our data, but we still expect to see something significant that is
> being lost in all this noise. I hope my question makes sense. I would
> appreciate any help on this.
>
>
>
> Kindly let me know if you have any questions.
>
>
>
> Thanks, Praful
>

-- 
Wolfgang Huber
EMBL
http://www.embl.de/research/units/genome_biology/huber


