[BioC] phyloseq/DESeq gives negative transformed values

Sophie Josephine Weiss Sophie.Weiss at colorado.edu
Sat Apr 19 00:32:52 CEST 2014


Hi Mike,
Could you please check whether I am running this correctly?  I have double
checked all the parameters, but for some reason, I am getting negatives
using the R script on the attached .biom dataset.  There are no replicates
in this microbial dataset.
Thanks for your advice,
Sophie


On Wed, Apr 16, 2014 at 4:02 PM, Sophie Josephine Weiss <
Sophie.Weiss at colorado.edu> wrote:

> Thanks Mike, that is what I thought.  What if we wanted to perform a
> Kruskal-Wallis test, or is it possible to perform ANOVA on the
> variance-stabilized matrix?
>
>
> On Wed, Apr 16, 2014 at 2:29 PM, Michael Love <michaelisaiahlove at gmail.com
> > wrote:
>
>> hi Sophie,
>>
>> We recommend using the standard DESeq() function for differential
>> expression.
>>
>> This is mentioned in the first line of the vignette section on
>> transformations:
>>
>> "In order to test for differential expression, we operate on raw
>> counts and use discrete distributions as
>> described in the previous section"
>>
>> Also, in the McMurdie and Holmes paper, they are using the DESeq() function,
>> as shown in their supplemental material:
>>
>>
>> http://joey711.github.io/waste-not-supplemental/simulation-differential-abundance/simulation-differential-abundance-server.html
>>
>> On Wed, Apr 16, 2014 at 3:22 PM, Sophie Josephine Weiss
>> <Sophie.Weiss at colorado.edu> wrote:
>> > Please help with this?  Thanks again.
>> >
>> >
>> > On Mon, Apr 14, 2014 at 6:02 PM, Sophie Josephine Weiss
>> > <Sophie.Weiss at colorado.edu> wrote:
>> >>
>> >> Thanks again Mike - would it be ok to do chi-2 and other significance
>> >> tests on the DESeq transformed datasets using independent code, or is it
>> >> necessary to do the differential expression tests strictly within DESeq2?
>> >>
>> >> Sophie
>> >>
>> >>
>> >> On Mon, Apr 14, 2014 at 5:41 PM, Michael Love
>> >> <michaelisaiahlove at gmail.com> wrote:
>> >>>
>> >>> hi Sophie,
>> >>>
>> >>> The VST code is the same in DESeq and DESeq2. The estimation of
>> >>> dispersion is slightly different (details are in the vignette "Changes
>> >>> from DESeq to DESeq2"), but the fitted line (which is used by the VST)
>> >>> should be very similar.
>> >>>
>> >>> Mike
>> >>>
>> >>> On Mon, Apr 14, 2014 at 6:27 PM, Sophie Josephine Weiss
>> >>> <Sophie.Weiss at colorado.edu> wrote:
>> >>> > Hi Mike,
>> >>> > The McMurdie and Holmes paper uses DESeq for matrix normalization -
>> >>> > do you think that is ok, or would it be better to use DESeq2?
>> >>> > Thanks again,
>> >>> > Sophie
>> >>> >
>> >>> >
>> >>> > On Mon, Apr 14, 2014 at 3:40 PM, Michael Love
>> >>> > <michaelisaiahlove at gmail.com>
>> >>> > wrote:
>> >>> >>
>> >>> >> hi Sophie,
>> >>> >>
>> >>> >>
>> >>> >> On Mon, Apr 14, 2014 at 1:15 PM, Sophie Josephine Weiss
>> >>> >> <Sophie.Weiss at colorado.edu> wrote:
>> >>> >> >
>> >>> >> > Hi Mike,
>> >>> >> > Thanks for the references.  By "threshold at 0" do you mean set any
>> >>> >> > negative values equal to 0?
>> >>> >>
>> >>> >>
>> >>> >> yes.
>> >>> >>
>> >>> >>
>> >>> >> >
>> >>> >> > Do you think this is the best approach?
>> >>> >>
>> >>> >>
>> >>> >> I haven't explored this area, and would defer to the McMurdie and
>> >>> >> Holmes paper for the best combinations of distance and transformation.
>> >>> >>
>> >>> >>
>> >>> >> >
>> >>> >> > Thanks again,
>> >>> >> > Sophie
>> >>> >> >
>> >>> >> >
>> >>> >> > On Mon, Apr 14, 2014 at 11:01 AM, Michael Love
>> >>> >> > <michaelisaiahlove at gmail.com> wrote:
>> >>> >> >>
>> >>> >> >> I tried poking around here
>> >>> >> >> http://joey711.github.io/phyloseq/distance
>> >>> >> >> but couldn't see if the authors did anything for distances requiring
>> >>> >> >> non-negative data. It appears
>> >>> >> >>
>> >>> >> >> http://www.ploscompbiol.org/article/info%3Adoi%2F10.1371%2Fjournal.pcbi.1003531
>> >>> >> >> that VST was tested with Bray-Curtis distance. I think the distance is
>> >>> >> >> designed for counts, but you could always threshold at 0 to insist that the
>> >>> >> >> log2-like quantity act more like a count.
>> >>> >> >>
>> >>> >> >>
>> >>> >> >>
>> >>> >> >> On Mon, Apr 14, 2014 at 12:23 PM, Sophie Josephine Weiss
>> >>> >> >> <Sophie.Weiss at colorado.edu> wrote:
>> >>> >> >>>
>> >>> >> >>> Hi Mike,
>> >>> >> >>> Thanks for explaining more.  I am used to working with rarefied
>> >>> >> >>> microbial datasets, that is why.  Instead of rarefying I would like to use
>> >>> >> >>> the DESeq method.
>> >>> >> >>>
>> >>> >> >>> How would you then suggest going about calculating Bray-Curtis
>> >>> >> >>> distance, or summarized taxa diagrams with these new transformed matrices
>> >>> >> >>> with negative values?
>> >>> >> >>> Thanks again,
>> >>> >> >>> Sophie
>> >>> >> >>>
>> >>> >> >>>
>> >>> >> >>> On Mon, Apr 14, 2014 at 7:17 AM, Michael Love
>> >>> >> >>> <michaelisaiahlove at gmail.com> wrote:
>> >>> >> >>>>
>> >>> >> >>>> hi Sophie,
>> >>> >> >>>>
>> >>> >> >>>> Can you explain why you don't want negative values in the transformed
>> >>> >> >>>> values?  Adding one to the raw counts is not sufficient. I should have said
>> >>> >> >>>> in my previous email, "the expected counts on the common scale". If the
>> >>> >> >>>> size factor for a sample is 2, then an expected count of 1 leads to an
>> >>> >> >>>> expected count of 1/2 on the common scale (after accounting for size
>> >>> >> >>>> factors).
>> >>> >> >>>>
>> >>> >> >>>>
>> >>> >> >>>> On Sun, Apr 13, 2014 at 11:50 PM, Sophie Josephine Weiss
>> >>> >> >>>> <Sophie.Weiss at colorado.edu> wrote:
>> >>> >> >>>>>
>> >>> >> >>>>> Hi Mike,
>> >>> >> >>>>> Thanks for your reply!  Ok, makes sense, but I added 1 to all my
>> >>> >> >>>>> matrix values, so the lowest value in the matrix is 1 - there are still
>> >>> >> >>>>> negatives?
>> >>> >> >>>>> Thanks again,
>> >>> >> >>>>> Sophie
>> >>> >> >>>>>
>> >>> >> >>>>>
>> >>> >> >>>>> On Sun, Apr 13, 2014 at 9:01 PM, Michael Love
>> >>> >> >>>>> <michaelisaiahlove at gmail.com> wrote:
>> >>> >> >>>>>>
>> >>> >> >>>>>> hi Sophie,
>> >>> >> >>>>>>
>> >>> >> >>>>>> The transformations in DESeq and DESeq2 are log2-like
>> >>> >> >>>>>> transformations. If the expected count is between 0 and 1, the
>> >>> >> >>>>>> values can be negative; this does not indicate a problem.
>> >>> >> >>>>>>
>> >>> >> >>>>>> Mike
>> >>> >> >>>>>>
>> >>> >> >>>>>>
>> >>> >> >>>>>> On Sun, Apr 13, 2014 at 5:17 PM, Sophie Josephine Weiss
>> >>> >> >>>>>> <Sophie.Weiss at colorado.edu> wrote:
>> >>> >> >>>>>>>
>> >>> >> >>>>>>> Hello,
>> >>> >> >>>>>>> I have microbiome data with no replicates, from different conditions.  I am
>> >>> >> >>>>>>> trying to transform the data using the DESeq method, as described in
>> >>> >> >>>>>>> McMurdie and Holmes 2014.
>> >>> >> >>>>>>>
>> >>> >> >>>>>>> The attached file is the definition I am using, as per the supplemental
>> >>> >> >>>>>>> info in McMurdie and Holmes 2014, and the .biom file I am using.
>> >>> >> >>>>>>>
>> >>> >> >>>>>>> Thank you for your help,
>> >>> >> >>>>>>> Sophie
>> >>> >> >>>>>>>
>> >>> >> >>>>>>> _______________________________________________
>> >>> >> >>>>>>> Bioconductor mailing list
>> >>> >> >>>>>>> Bioconductor at r-project.org
>> >>> >> >>>>>>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>> >>> >> >>>>>>> Search the archives:
>> >>> >> >>>>>>> http://news.gmane.org/gmane.science.biology.informatics.conductor
>> >>> >> >>>>>>
>> >>> >> >>>>>>
>> >>> >> >>>>>
>> >>> >> >>>>
>> >>> >> >>>
>> >>> >> >>
>> >>> >> >
>> >>> >
>> >>> >
>> >>
>> >>
>> >
>>
>
>
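
A small numeric illustration of two points from the quoted thread: why a
log2-like value can be negative even when every raw count is at least 1, and
what "threshold at 0" would do to such values. This is only a sketch in plain
Python, not R and not the DESeq code. It assumes a plain log2 of the count
divided by the size factor, which is simpler than the actual
variance-stabilizing transformation, and the names log2_common_scale and
threshold_at_zero are hypothetical helpers, not functions from DESeq, DESeq2,
or phyloseq.

```python
import math


def log2_common_scale(count, size_factor):
    """log2 of a count after dividing by the sample's size factor.

    Simplifying assumption: plain log2, standing in for the log2-like
    transformation discussed in the thread.
    """
    return math.log2(count / size_factor)


def threshold_at_zero(values):
    """Set any negative value to 0 and leave the rest unchanged."""
    return [max(0.0, v) for v in values]


# A count of 1 in a sample with size factor 2 is 1/2 on the common scale,
# so its log2 is negative even though the raw count is not below 1.
value = log2_common_scale(1, 2.0)

# Counts 1, 2, 8 in the same sample become 0.5, 1, 4 on the common scale.
transformed = [log2_common_scale(c, 2.0) for c in (1, 2, 8)]
clipped = threshold_at_zero(transformed)

print(value)        # -1.0
print(transformed)  # [-1.0, 0.0, 2.0]
print(clipped)      # [0.0, 0.0, 2.0]
```

Thresholding discards the distinction between values at or below 0, so whether
it suits a given distance such as Bray-Curtis is the open question raised in
the thread, not something this sketch settles.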

