[BioC] DEXSeq for 1 gene
Mallon, Eamonn B. (Dr.)
ebm3 at leicester.ac.uk
Tue May 28 09:20:22 CEST 2013
Hi Wolfgang
Although I have 20 samples, only 11 are under the conditions I'm
interested in this analysis. Not sure why I mentioned 20 in the first
place sorry. Of those 11, 2 are giving odd results from top hat so I
haven't included them yet until I figure out what is wrong. So I have 9 to
work with
> counts(ecs)
K61 K62 K63 K83 Q61 Q62 Q63 Q81 Q82
XLOC_000001:E001 1 0 1 0 1 0 0 0 0
XLOC_000001:E002 1 0 0 0 2 1 0 1 1
XLOC_000001:E003 0 0 0 0 3 0 0 0 0
XLOC_000001:E004 1 0 1 0 1 0 0 1 0
XLOC_000001:E005 1 3 3 2 3 0 0 0 0
XLOC_000001:E006 0 0 0 0 0 0 0 0 0
XLOC_000001:E007 0 1 2 2 2 0 3 2 0
XLOC_000001:E008 2 1 2 0 0 2 0 1 0
XLOC_000001:E009 1 2 0 0 1 0 0 2 0
XLOC_000001:E010 4 0 1 1 0 0 0 1 0
XLOC_000001:E011 0 0 0 0 0 1 0 0 0
XLOC_000001:E012 0 0 1 1 0 2 0 2 0
XLOC_000001:E013 1 1 3 2 2 2 1 3 0
XLOC_000001:E014 3 3 1 2 2 1 3 0 1
XLOC_000001:E015 1 2 1 2 4 2 0 0 3
XLOC_000001:E016 3 5 2 2 7 1 0 5 0
XLOC_000001:E017 1 2 0 2 2 2 1 3 1
XLOC_000001:E018 0 4 3 1 6 3 2 2 1
XLOC_000001:E019 2 2 3 2 2 2 0 5 3
XLOC_000001:E020 3 3 2 0 3 1 2 4 2
XLOC_000001:E021 2 1 2 4 3 1 0 3 1
XLOC_000001:E022 2 0 1 4 3 1 0 2 1
XLOC_000001:E023 0 0 3 1 0 2 0 2 1
XLOC_000001:E024 3 1 0 1 0 0 0 1 1
XLOC_000001:E025 0 1 0 2 2 2 2 0 0
XLOC_000001:E026 0 0 2 2 1 2 0 1 0
XLOC_000001:E027 6 1 2 3 0 0 0 4 1
XLOC_000001:E028 7 2 3 3 0 4 1 5 0
XLOC_000001:E029 5 4 1 5 0 5 5 3 1
XLOC_000001:E030 0 1 1 0 3 0 0 0 0
XLOC_000001:E031 1 1 0 0 1 0 0 1 0
The low read count I put down to only looking at a single gene. Is my data
a loss cost for the dexseq pipeline?
Thanks for your help
Eamonn
Dr Eamonn Mallon
Lecturer in Evolutionary Biology
Adrian 220
Biology Department
University of Leicester
http://www2.le.ac.uk/departments/biology/people/mallon
On 26/05/2013 13:12, "Wolfgang Huber" <whuber at embl.de> wrote:
>Dear Eamonn
>
>from your initial post -i.e. the output of head(counts(ecs))- it looks
>like your counts are very low? This would make estimateSizeFactors
>produce lots of NA values, but even if it did not, it would mean that
>you'd have essentially no statistical power to detect anything useful
>from such low counts.
>
>You mentioned "20 bumblebee samples" but the table whose head you show
>only has 9?
>
>Can you post the full output of 'counts(ecs)' and make sure its content
>is plausible based on your understanding of the data?
>
> Best wishes
> Wolfgang
>
>
>On May 23, 2013, at 2:03 pm, Eamonn Mallon <ebm3 at leicester.ac.uk> wrote:
>
>> Hi Alejandro,
>> Thanks very much for getting back to me.
>>
>> I think I got the second parameter idea from DESeq. I see I am wrong.
>>so I ran
>>
>> > ecs<-estimateSizeFactors(ecs)
>> > sizeFactors(ecs)
>> K61 K62 K63 K83 Q61 Q62 Q63 Q81 Q82
>> NA NA NA NA NA NA NA NA NA
>>
>> Is there any way to get DEXSeq to estimate size factors with my small
>>dataset?
>>
>> If you are only interested in a single gene maybe you could visualize
>> directly the counts per exon using the function plotDEXSeq.
>>
>> I'm not sure how to do this I tried
>>
>> >plotDEXSeq(ecs,"XLOC_000001", fitExpToVar="condition",
>>norCounts=FALSE, expression=FALSE, splicing=FALSE,
>>displayTranscripts=FALSE, legend=TRUE)
>>
>> Error in plotDEXSeq(ecs, "XLOC_000001", fitExpToVar = "condition",
>>norCounts = FALSE, :
>> Please estimate sizeFactors first
>>
>> I guess this is because modelFrameForGene requires sizefactors.
>>
>> Any ideas?
>>
>> Eamonn
>>
>>
>> On 22/05/13 18:07, Alejandro Reyes wrote:
>>> Dear Eamonn,
>>>
>>> Thanks for your interest in DEXSeq.
>>>
>>> In your size factor estimation, where did you get your second parameter
>>> from? (check ?estimateSizeFactors) There is no need to specify anything
>>> else than the ExonCountSet object and that is the reason you are
>>>getting
>>> the error message.
>>>
>>> If you are only interested in a single gene maybe you could visualize
>>> directly the counts per exon using the function plotDEXSeq.
>>>
>>> Best regards,
>>> Alejandro Reyes
>>>
>>>
>>>> Dear All,
>>>> I have RNA-seq data for 20 bumblebee samples divided into treatment
>>>>and control. I am interested in differential exon usage. Unfortunately
>>>>the bumblebee genome is not at the stage where I can do a complete
>>>>DEXSeq analysis genome wide. I decided to look at a single gene where
>>>>I have a decent gene model. Using TopHat2 and the python scripts
>>>>included in DEXSeq I was able to produce an ExonCountSet object.
>>>>
>>>>> head(counts(ecs))
>>>> K61 K62 K63 K83 Q61 Q62 Q63 Q81 Q82
>>>> XLOC_000001:E001 1 0 1 0 1 0 0 0 0
>>>> XLOC_000001:E002 1 0 0 0 2 1 0 1 1
>>>> XLOC_000001:E003 0 0 0 0 3 0 0 0 0
>>>> XLOC_000001:E004 1 0 1 0 1 0 0 1 0
>>>> XLOC_000001:E005 1 3 3 2 3 0 0 0 0
>>>> XLOC_000001:E006 0 0 0 0 0 0 0 0 0
>>>>
>>>> As I expected, its all come crashing down round my ears at the
>>>>analysis stage
>>>>
>>>>> sizeFactors(ecs)
>>>> K61 K62 K63 K83 Q61 Q62 Q63 Q81 Q82
>>>> NA NA NA NA NA NA NA NA NA
>>>>
>>>> I tried using the shorth value
>>>>
>>>>>
>>>>>ecs<-estimateSizeFactors(ecs,(locfunc=genefilter::shorth((counts(ecs))
>>>>>,tie.action="min")))
>>>> Error in .local(object, ...) : unused argument (0.2)
>>>>
>>>> I understand that my main problem is that the DEXSeq analysis should
>>>>be genome wide (my data has too few counts).
>>>>
>>>> Is there a way to use DEXSeq to ONLY look at a single gene? If not
>>>>can anyone think of a statistical way to analyse my data (cross tabs?
>>>>How should I normalise?)
>>>>
>>>> Thanks for your time
>>>>
>>>> Eamonn
>>>>
>>>>
>>>>> sessionInfo()
>>>> R version 3.0.0 (2013-04-03)
>>>> Platform: x86_64-apple-darwin10.8.0 (64-bit)
>>>>
>>>> locale:
>>>> [1] en_GB.UTF-8/en_GB.UTF-8/en_GB.UTF-8/C/en_GB.UTF-8/en_GB.UTF-8
>>>>
>>>> attached base packages:
>>>> [1] parallel stats graphics grDevices utils datasets
>>>>methods base
>>>>
>>>> other attached packages:
>>>> [1] genefilter_1.42.0 DEXSeq_1.6.0 Biobase_2.20.0
>>>>BiocGenerics_0.6.0
>>>>
>>>> loaded via a namespace (and not attached):
>>>> [1] annotate_1.38.0 AnnotationDbi_1.22.5 biomaRt_2.16.0
>>>>Biostrings_2.28.0 bitops_1.0-5 DBI_0.2-7
>>>> [7] GenomicRanges_1.12.3 hwriter_1.3 IRanges_1.18.1
>>>>RCurl_1.95-4.1 Rsamtools_1.12.3 RSQLite_0.11.3
>>>> [13] splines_3.0.0 statmod_1.4.17 stats4_3.0.0
>>>>stringr_0.6.2 survival_2.37-4 tools_3.0.0
>>>> [19] XML_3.95-0.2 xtable_1.7-1 zlibbioc_1.6.0
>>>> Dr Eamonn Mallon
>>>> Lecturer in Evolutionary Biology
>>>> Adrian 220
>>>> Biology Department
>>>> University of Leicester
>>>>
>>>> http://www2.le.ac.uk/departments/biology/people/mallon
>>>>
>>>>
>>>> [[alternative HTML version deleted]]
>>>>
>>>> _______________________________________________
>>>> Bioconductor mailing list
>>>> Bioconductor at r-project.org
>>>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>>>> Search the archives:
>>>>http://news.gmane.org/gmane.science.biology.informatics.conductor
>>
>> _______________________________________________
>> Bioconductor mailing list
>> Bioconductor at r-project.org
>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>> Search the archives:
>>http://news.gmane.org/gmane.science.biology.informatics.conductor
>
More information about the Bioconductor
mailing list