[BioC] DEXSeq for 1 gene

Mallon, Eamonn B. (Dr.) ebm3 at leicester.ac.uk
Wed May 29 10:11:20 CEST 2013


Hi Alejandro
Thanks for your reply.I haven't checked other genes. An edgeR analysis
worked out fine. It definitely could be a low expression gene. Thanks for
all your help

Eamonn


Dr Eamonn Mallon
Lecturer in Evolutionary Biology
Adrian 220
Biology Department
University of Leicester

http://www2.le.ac.uk/departments/biology/people/mallon





On 28/05/2013 16:56, "Alejandro Reyes" <alejandro.reyes at embl.de> wrote:

>Dear Eamonn,
>
>With such low counts for a each exon, you can not infer any isoform
>regulation using either DEXSeq or any tool, since the signal to noise
>ratio is very low.
>Do all your genes have similar low numbers? If so, I would check the
>data quality or the quality of the transcript assemblies that you are
>using.  If other genes have higher counts it could also mean that this
>gene is simply not so expressed in your samples.
>
>Best regards,
>Alejandro Reyes
>
>
>
>> Hi Wolfgang
>> Although I have 20 samples, only 11 are under the conditions I'm
>> interested in this analysis. Not sure why I mentioned 20 in the first
>> place sorry. Of those 11, 2 are giving odd results from top hat so I
>> haven't included them yet until I figure out what is wrong. So I have 9
>>to
>> work with
>>
>>> counts(ecs)
>> K61 K62 K63 K83 Q61 Q62 Q63 Q81 Q82
>> XLOC_000001:E001 1 0 1 0 1 0 0 0 0
>> XLOC_000001:E002 1 0 0 0 2 1 0 1 1
>> XLOC_000001:E003 0 0 0 0 3 0 0 0 0
>> XLOC_000001:E004 1 0 1 0 1 0 0 1 0
>> XLOC_000001:E005 1 3 3 2 3 0 0 0 0
>> XLOC_000001:E006 0 0 0 0 0 0 0 0 0
>> XLOC_000001:E007 0 1 2 2 2 0 3 2 0
>> XLOC_000001:E008 2 1 2 0 0 2 0 1 0
>> XLOC_000001:E009 1 2 0 0 1 0 0 2 0
>> XLOC_000001:E010 4 0 1 1 0 0 0 1 0
>> XLOC_000001:E011 0 0 0 0 0 1 0 0 0
>> XLOC_000001:E012 0 0 1 1 0 2 0 2 0
>> XLOC_000001:E013 1 1 3 2 2 2 1 3 0
>> XLOC_000001:E014 3 3 1 2 2 1 3 0 1
>> XLOC_000001:E015 1 2 1 2 4 2 0 0 3
>> XLOC_000001:E016 3 5 2 2 7 1 0 5 0
>> XLOC_000001:E017 1 2 0 2 2 2 1 3 1
>> XLOC_000001:E018 0 4 3 1 6 3 2 2 1
>> XLOC_000001:E019 2 2 3 2 2 2 0 5 3
>> XLOC_000001:E020 3 3 2 0 3 1 2 4 2
>> XLOC_000001:E021 2 1 2 4 3 1 0 3 1
>> XLOC_000001:E022 2 0 1 4 3 1 0 2 1
>> XLOC_000001:E023 0 0 3 1 0 2 0 2 1
>> XLOC_000001:E024 3 1 0 1 0 0 0 1 1
>> XLOC_000001:E025 0 1 0 2 2 2 2 0 0
>> XLOC_000001:E026 0 0 2 2 1 2 0 1 0
>> XLOC_000001:E027 6 1 2 3 0 0 0 4 1
>> XLOC_000001:E028 7 2 3 3 0 4 1 5 0
>> XLOC_000001:E029 5 4 1 5 0 5 5 3 1
>> XLOC_000001:E030 0 1 1 0 3 0 0 0 0
>> XLOC_000001:E031 1 1 0 0 1 0 0 1 0
>>
>> The low read count I put down to only looking at a single gene. Is my
>>data
>> a loss cost for the dexseq pipeline?
>>
>> Thanks for your help
>>
>> Eamonn
>>
>>
>>
>> Dr Eamonn Mallon
>> Lecturer in Evolutionary Biology
>> Adrian 220
>> Biology Department
>> University of Leicester
>>
>> http://www2.le.ac.uk/departments/biology/people/mallon
>>
>>
>>
>>
>>
>> On 26/05/2013 13:12, "Wolfgang Huber" <whuber at embl.de> wrote:
>>
>>> Dear Eamonn
>>>
>> >from your initial post -i.e. the output of head(counts(ecs))- it looks
>>> like your counts are very low? This would make estimateSizeFactors
>>> produce lots of NA values, but even if it did not, it would mean that
>>> you'd have essentially no statistical power to detect anything useful
>> >from such low counts.
>>> You mentioned "20 bumblebee samples" but the table whose head you show
>>> only has 9?
>>>
>>> Can you post the full output of 'counts(ecs)' and make sure its content
>>> is plausible based on your understanding of the data?
>>>
>>> 	Best wishes
>>> 	Wolfgang
>>>
>>>
>>> On May 23, 2013, at 2:03 pm, Eamonn Mallon <ebm3 at leicester.ac.uk>
>>>wrote:
>>>
>>>> Hi Alejandro,
>>>> Thanks very much for getting back to me.
>>>>
>>>> I think I got the second parameter idea from DESeq. I see I am wrong.
>>>> so I ran
>>>>
>>>>> ecs<-estimateSizeFactors(ecs)
>>>>> sizeFactors(ecs)
>>>> K61 K62 K63 K83 Q61 Q62 Q63 Q81 Q82
>>>> NA  NA  NA  NA  NA  NA  NA  NA  NA
>>>>
>>>> Is there any way to get DEXSeq to estimate size factors with my small
>>>> dataset?
>>>>
>>>> If you are only interested in a single gene maybe you could visualize
>>>> directly the counts per exon using the function plotDEXSeq.
>>>>
>>>> I'm not sure how to do this I tried
>>>>
>>>>> plotDEXSeq(ecs,"XLOC_000001", fitExpToVar="condition",
>>>> norCounts=FALSE, expression=FALSE, splicing=FALSE,
>>>> displayTranscripts=FALSE, legend=TRUE)
>>>>
>>>> Error in plotDEXSeq(ecs, "XLOC_000001", fitExpToVar = "condition",
>>>> norCounts = FALSE,  :
>>>>   Please estimate sizeFactors first
>>>>
>>>> I guess this is because modelFrameForGene requires sizefactors.
>>>>
>>>> Any ideas?
>>>>
>>>> Eamonn
>>>>
>>>>
>>>> On 22/05/13 18:07, Alejandro Reyes wrote:
>>>>> Dear Eamonn,
>>>>>
>>>>> Thanks for your interest in DEXSeq.
>>>>>
>>>>> In your size factor estimation, where did you get your second
>>>>>parameter
>>>>> from? (check ?estimateSizeFactors) There is no need to specify
>>>>>anything
>>>>> else than the ExonCountSet object and that is the reason you are
>>>>> getting
>>>>> the error message.
>>>>>
>>>>> If you are only interested in a single gene maybe you could visualize
>>>>> directly the counts per exon using the function plotDEXSeq.
>>>>>
>>>>> Best regards,
>>>>> Alejandro Reyes
>>>>>
>>>>>
>>>>>> Dear All,
>>>>>> I have RNA-seq data for 20 bumblebee samples divided into treatment
>>>>>> and control. I am interested in differential exon usage.
>>>>>>Unfortunately
>>>>>> the bumblebee genome is not at the stage where I can do a complete
>>>>>> DEXSeq analysis genome wide. I decided to look at a single gene
>>>>>>where
>>>>>> I have a decent gene model. Using TopHat2 and the python scripts
>>>>>> included in DEXSeq I was able to produce an ExonCountSet object.
>>>>>>
>>>>>>> head(counts(ecs))
>>>>>>                    K61 K62 K63 K83 Q61 Q62 Q63 Q81 Q82
>>>>>> XLOC_000001:E001   1   0   1   0   1   0   0   0   0
>>>>>> XLOC_000001:E002   1   0   0   0   2   1   0   1   1
>>>>>> XLOC_000001:E003   0   0   0   0   3   0   0   0   0
>>>>>> XLOC_000001:E004   1   0   1   0   1   0   0   1   0
>>>>>> XLOC_000001:E005   1   3   3   2   3   0   0   0   0
>>>>>> XLOC_000001:E006   0   0   0   0   0   0   0   0   0
>>>>>>
>>>>>> As I expected, its all come crashing down round my ears at the
>>>>>> analysis stage
>>>>>>
>>>>>>> sizeFactors(ecs)
>>>>>> K61 K62 K63 K83 Q61 Q62 Q63 Q81 Q82
>>>>>>    NA  NA  NA  NA  NA  NA  NA  NA  NA
>>>>>>
>>>>>> I tried using the shorth value
>>>>>>
>>>>>>> 
>>>>>>>ecs<-estimateSizeFactors(ecs,(locfunc=genefilter::shorth((counts(ecs
>>>>>>>))
>>>>>>> ,tie.action="min")))
>>>>>> Error in .local(object, ...) : unused argument (0.2)
>>>>>>
>>>>>> I understand that my main problem is that the DEXSeq analysis should
>>>>>> be genome wide (my data has too few counts).
>>>>>>
>>>>>> Is there a way to use DEXSeq to ONLY look at a single gene? If not
>>>>>> can anyone think of a statistical way to analyse my data (cross
>>>>>>tabs?
>>>>>> How should I normalise?)
>>>>>>
>>>>>> Thanks for your time
>>>>>>
>>>>>> Eamonn
>>>>>>
>>>>>>
>>>>>>> sessionInfo()
>>>>>> R version 3.0.0 (2013-04-03)
>>>>>> Platform: x86_64-apple-darwin10.8.0 (64-bit)
>>>>>>
>>>>>> locale:
>>>>>> [1] en_GB.UTF-8/en_GB.UTF-8/en_GB.UTF-8/C/en_GB.UTF-8/en_GB.UTF-8
>>>>>>
>>>>>> attached base packages:
>>>>>> [1] parallel  stats     graphics  grDevices utils     datasets
>>>>>> methods   base
>>>>>>
>>>>>> other attached packages:
>>>>>> [1] genefilter_1.42.0  DEXSeq_1.6.0       Biobase_2.20.0
>>>>>> BiocGenerics_0.6.0
>>>>>>
>>>>>> loaded via a namespace (and not attached):
>>>>>>    [1] annotate_1.38.0      AnnotationDbi_1.22.5 biomaRt_2.16.0
>>>>>> Biostrings_2.28.0    bitops_1.0-5         DBI_0.2-7
>>>>>>    [7] GenomicRanges_1.12.3 hwriter_1.3          IRanges_1.18.1
>>>>>> RCurl_1.95-4.1       Rsamtools_1.12.3     RSQLite_0.11.3
>>>>>> [13] splines_3.0.0        statmod_1.4.17       stats4_3.0.0
>>>>>> stringr_0.6.2        survival_2.37-4      tools_3.0.0
>>>>>> [19] XML_3.95-0.2         xtable_1.7-1         zlibbioc_1.6.0
>>>>>> Dr Eamonn Mallon
>>>>>> Lecturer in Evolutionary Biology
>>>>>> Adrian 220
>>>>>> Biology Department
>>>>>> University of Leicester
>>>>>>
>>>>>> http://www2.le.ac.uk/departments/biology/people/mallon
>>>>>>
>>>>>>
>>>>>> 	[[alternative HTML version deleted]]
>>>>>>
>>>>>> _______________________________________________
>>>>>> Bioconductor mailing list
>>>>>> Bioconductor at r-project.org
>>>>>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>>>>>> Search the archives:
>>>>>> http://news.gmane.org/gmane.science.biology.informatics.conductor
>>>> _______________________________________________
>>>> Bioconductor mailing list
>>>> Bioconductor at r-project.org
>>>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>>>> Search the archives:
>>>> http://news.gmane.org/gmane.science.biology.informatics.conductor
>



More information about the Bioconductor mailing list