Hi Mali,
You would certainly want to normalize each sample based on library size,
which happens in all of these methods, although the DESeq and edgeR methods
for library size normalization use robust estimators, while the 'sum of
mapped reads' is strongly influenced by the features with the highest counts.
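To see why the robust estimator matters, here is a sketch of the median-of-ratios idea that DESeq's size-factor estimation is based on, in plain base R on a made-up count matrix (this is an illustration of the idea, not the packaged implementation):

```r
# Median-of-ratios size factors, sketched on simulated counts.
set.seed(1)
counts <- matrix(rpois(4 * 100, lambda = 50), nrow = 100,
                 dimnames = list(paste0("gene", 1:100), paste0("sample", 1:4)))
counts[, 2] <- counts[, 2] * 2L  # pretend sample2 was sequenced twice as deeply

# Geometric mean per gene across samples (the pseudo-reference)
geo_mean <- exp(rowMeans(log(counts)))
keep <- geo_mean > 0  # drop genes with a zero count in any sample

# Size factor = median ratio of each sample's counts to the reference;
# the median makes it insensitive to a handful of very high-count features
size_factors <- apply(counts[keep, ] / geo_mean[keep], 2, median)

# Normalized counts: divide each column by its size factor
norm_counts <- sweep(counts, 2, size_factors, "/")
round(size_factors, 2)
```

The size factor for sample2 comes out close to twice the others, as expected, without being dragged around by the most highly expressed genes.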
I can't tell from the documentation whether the standardise() function in
the cycle package creates mean 0, variance 1 along the rows or along the
columns, but this is relevant to your question.
If you were calculating simple gene-gene distances and looking for genes
that cluster together, it would be important to center the log-scale counts
along the genes, since you would probably like to find genes that rise and
fall together regardless of their base level. (See the sweep() function in
base R.)
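The centering step with sweep() looks like this (the small log-count matrix is made up for illustration):

```r
# Row-center log-scale counts so each gene has mean zero across time points;
# genes that rise and fall together then have similar profiles regardless of
# their baseline expression level.
set.seed(2)
log_counts <- matrix(rnorm(3 * 6, mean = 8), nrow = 3,
                     dimnames = list(paste0("gene", 1:3), paste0("t", 1:6)))

row_means <- rowMeans(log_counts)
centered  <- sweep(log_counts, 1, row_means, "-")  # subtract along rows (MARGIN = 1)

rowMeans(centered)  # all numerically zero
```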
Another factor is the dependence of the variance of log-transformed counts
on the mean. From a quick read of the vignette, the model in the cycle
package has normally distributed error with a variance that depends on time,
but not explicitly on the mean. You might try first applying one of the
transformations described in the DESeq2 vignette and then working with the
variance-stabilized values.
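As a sketch of that workflow (assuming DESeq2 is installed; the count matrix and time points below are simulated stand-ins for your data), the variance-stabilizing transformation might look like:

```r
# Sketch: variance-stabilizing transformation with DESeq2 before looking for
# oscillations. Counts and time points are simulated stand-ins.
set.seed(3)
counts <- matrix(rpois(6 * 200, lambda = 40), nrow = 200,
                 dimnames = list(paste0("gene", 1:200), paste0("t", 1:6)))
col_data <- data.frame(time = factor(1:6), row.names = colnames(counts))

if (requireNamespace("DESeq2", quietly = TRUE)) {
  dds <- DESeq2::DESeqDataSetFromMatrix(countData = counts,
                                        colData   = col_data,
                                        design    = ~ 1)  # no design needed for a blind VST
  vsd <- DESeq2::varianceStabilizingTransformation(dds, blind = TRUE)
  vst_mat <- SummarizedExperiment::assay(vsd)  # values to pass on to cycle
} else {
  # Fallback so the sketch still runs without Bioconductor: a plain log2
  # transform after library-size scaling (NOT variance stabilized).
  sf <- colSums(counts) / mean(colSums(counts))
  vst_mat <- log2(sweep(counts, 2, sf, "/") + 1)
}
dim(vst_mat)
```

The resulting matrix has one stabilized value per gene and time point, which is the shape cycle's functions expect.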
Mike
On Tue, Apr 1, 2014 at 8:33 AM, mali salmon wrote:
> Hello List
> I have RNA-seq data from different time points and I would like to find
> oscillating genes.
> I thought of using the "cycle" package (which is based on Fourier score),
> but I'm not sure what values to use: FPKM or DESeq/edgeR normalized values.
> Any suggestion what would be more appropriate?
> Thanks
> Mali
>
>
> _______________________________________________
> Bioconductor mailing list
> Bioconductor@r-project.org
> https://stat.ethz.ch/mailman/listinfo/bioconductor
> Search the archives:
> http://news.gmane.org/gmane.science.biology.informatics.conductor
>