[BioC] Error DEXSeq

Mallon, Eamonn B. (Dr.) ebm3 at leicester.ac.uk
Tue May 27 14:03:32 CEST 2014


Hi Alejandro,
That worked great. Thanks for your help

Eamonn
Dr Eamonn Mallon
Lecturer in Evolutionary Biology
Adrian 220
Biology Department
University of Leicester

http://www2.le.ac.uk/departments/biology/people/mallon





On 22/05/2014 15:28, "Alejandro Reyes" <alejandro.reyes at embl.de> wrote:

>Hi Mallon,
>
>I think what you want is this:
>
>formulaFullModel = ˜ sample + exon + strain:exon + colony:exon +
>strain:colony:exon
>formulaReducedModel = ˜ sample + exon + strain:exon + colony:exon
>
>Don't forget to specify "formulaFullModel" when you create your
>DEXSeqDataSet object.
>Or if you already created your object you have to modify your design by
>doing before calling
>estimateDispersions
>
>design( dxd ) <- formulaFullModel
>
>Best regards,
>Alejandro
>
>> Hi Alejandro,
>> Thanks for the quick reply. That appears to be working, but its going to
>> take me a while to figure out my interaction model for the new syntax.
>> Simon Anders helped me originally and I include it below as an FYI
>>
>> Thanks for the help
>>
>> Eamonn
>>
>> On 06/06/13 16:22, Mallon, Eamonn B. (Dr.) wrote:
>>> Dear all
>>> I have 11 samples for a cross infection experiment where there are two
>>> colonies (the hosts K or Q) and 2 strains (6 or 8).
>>>
>>> sample colony strain type
>>> K61 k six single-read
>>> K62 k six single-read
>>> K63 k six single-read
>>> K81 k eight single-read
>>> K82 k eight single-read
>>> K83 k eight single-read
>>> Q61 q six single-read
>>> Q62 q six single-read
>>> Q63 q six single-read
>>> Q81 q eight single-read
>>> Q82 q eight single-read
>>>
>>> I am most interested in finding exon usage differences related to the
>>> interaction of the colony and strain factors. Following the vignette I
>>> put the following code together
>>>
>>>
>>> formuladispersion<-count~sample+(colony:strain)+exon
>>> ecs<-estimateDispersions(ecs, formula=formuladispersion)
>>> ecs<-fitDispersionFunction(ecs)
>>>
>>> formula0<-count~sample+exon+(colony:strain)
>>> formula1<-count~sample+exon+(colony:strain)*I(exon==exonID)
>>> ecs<-testForDEU(ecs, formula0=formula0, formula1=formula1)
>>>
>>> Does this make sense?
>> Not quite. This tests for main effects and interaction in one go, so it
>> returns as hits all exons whose usage differs between colonies _and/or_
>> between strains. You are looking for interactions, i.e., for exons whose
>> usage is different between strains _and_ where this difference itself
>> differs between colonies. So, you need
>>
>>     formuladispersion<-count~sample+(colony*strain)*exon
>>
>>     formula0<-count~sample+exon+(colony+strain)*exon
>> 
>>formula1<-count~sample+exon+(colony+strain)*exon+(colony:strain)*I(exon==
>>ex
>> onID)
>>
>> Things look a bit simpler if you use the new "TRT" functions:
>>
>>     formuladispersion<-count~sample+(colony*strain)*exon
>>
>>     formula0<-count~sample+exon+(colony+strain)*exon
>>     formula1<-count~sample+exon+(colony*strain)*exon
>>
>>     Simon
>>
>>
>> Dr Eamonn Mallon
>> Lecturer in Evolutionary Biology
>> Adrian 220
>> Biology Department
>> University of Leicester
>>
>> http://www2.le.ac.uk/departments/biology/people/mallon
>>
>>
>>
>>
>>
>> On 22/05/2014 12:13, "Alejandro Reyes" <alejandro.reyes at embl.de> wrote:
>>
>>> Dear Eamonn Mallon,
>>>
>>> Thanks for reporting this!
>>>
>>> There was an error in the DEXSeq code, that was not considering the
>>> different ways to specify when the strand is unknown.  The gtf format
>>> uses a ".", but the GRanges object uses "*" and does not accept a "."
>>> instead.  I have added a fix to convert the "." to "*" when reading the
>>> data from a gtf file. So this should work with the newests versions in
>>> the svn. (Let me know if it doesn't!)
>>>
>>> Best regards,
>>> Alejandro
>>>
>>>> Hi Alejandro,
>>>> I was having a similar problem. I finished the analysis earlier in the
>>>> year (on an earlier version of DEXSeq) but now would like to draw some
>>>> of
>>>> the pictures separately. I tried to run the analysis again. I¹ve
>>>>changed
>>>> to DEXSeqDataSeqFromHTSeq and now have the error
>>>>
>>>> Error in .local(x, ...) : strand values must be in '+' '-' Œ*'
>>>>
>>>>
>>>> I guess this means something is now wrong with my GFF file.
>>>>
>>>> Its first line looks like
>>>>
>>>>
>>>> 
>>>>gi|313870964|gb|AELG01010669.1|	dexseq_prepare_annotation.py	aggregate_
>>>>ge
>>>> ne
>>>> 	2	1088	.	.	.	gene_id "XLOC_000001"
>>>>
>>>>
>>>>
>>>>
>>>> Any help would be appreciated.
>>>>
>>>> Eamonn
>>>>
>>>>
>>>> Dr Eamonn Mallon
>>>> Lecturer in Evolutionary Biology
>>>> Adrian 220
>>>> Biology Department
>>>> University of Leicester
>>>>
>>>> http://www2.le.ac.uk/departments/biology/people/mallon
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> On 22/05/2014 10:53, "Alejandro Reyes" <alejandro.reyes at embl.de>
>>>>wrote:
>>>>
>>>>> Hi Roberta,
>>>>>
>>>>> Yes, the newer versions of DEXSeq have lots of updates, in the new
>>>>> vignette
>>>>> you will the description of the new functions.  As the error message
>>>>> indicates,
>>>>> some functions were deprecated and substituted for new ones,
>>>>>including
>>>>> the
>>>>> functions that created the objects.  In your case, you need to use
>>>>> DEXSeqDataSeqFromHTSeq instead of read.HTSeqCounts.
>>>>>
>>>>> Let me know if it works!
>>>>> Best regards,
>>>>> Alejandro
>>>>>
>>>>> ps.  I think we might have more efficient/effective communication if
>>>>>we
>>>>> keep the e-mails concerning
>>>>> the same issue in the same thread, rather than double posting and
>>>>> having
>>>>> many parallel
>>>>> conversations: I have cc-ed Simon in this e-mail :)
>>>>>
>>>>>
>>>>>
>>>>>> Hi Alejandro,
>>>>>> I updated to the current release versione of DEXSeq as you can see.
>>>>>>
>>>>>> sessionInfo()
>>>>>> R version 3.1.0 (2014-04-10)
>>>>>> Platform: x86_64-apple-darwin10.8.0 (64-bit)
>>>>>>
>>>>>> locale:
>>>>>> [1] it_IT.UTF-8/it_IT.UTF-8/it_IT.UTF-8/C/it_IT.UTF-8/it_IT.UTF-8
>>>>>>
>>>>>> attached base packages:
>>>>>> [1] parallel  stats     graphics  grDevices utils     datasets
>>>>>> methods   base
>>>>>>
>>>>>> other attached packages:
>>>>>>    [1] DEXSeq_1.10.3           BiocParallel_0.6.0 DESeq2_1.4.5
>>>>>>          RcppArmadillo_0.4.300.0 Rcpp_0.11.1
>>>>>> GenomicRanges_1.16.3 GenomeInfoDb_1.0.2      IRanges_1.22.6
>>>>>>    Biobase_2.24.0
>>>>>> [10] BiocGenerics_0.10.0     BiocInstaller_1.14.2
>>>>>>
>>>>>> loaded via a namespace (and not attached):
>>>>>>    [1] annotate_1.42.0      AnnotationDbi_1.26.0 BatchJobs_1.2
>>>>>> BBmisc_1.6           biomaRt_2.20.0 Biostrings_2.32.0
>>>>>> bitops_1.0-6         brew_1.0-6 codetools_0.2-8
>>>>>> DBI_0.2-7            digest_0.6.4
>>>>>> [12] fail_1.2             foreach_1.4.2 genefilter_1.46.1
>>>>>> geneplotter_1.42.0   grid_3.1.0 hwriter_1.3
>>>>>> iterators_1.0.7      lattice_0.20-29 locfit_1.5-9.1
>>>>>> plyr_1.8.1           RColorBrewer_1.0-5
>>>>>> [23] RCurl_1.95-4.1       Rsamtools_1.16.0 RSQLite_0.11.4
>>>>>> sendmailR_1.1-2      splines_3.1.0 statmod_1.4.19
>>>>>> stats4_3.1.0         stringr_0.6.2 survival_2.37-7
>>>>>> tools_3.1.0          XML_3.98-1.1
>>>>>> [34] xtable_1.7-3         XVector_0.4.0        zlibbioc_1.10.0
>>>>>>
>>>>>> Now I have another error message in the previous steps:
>>>>>>
>>>>>> ecs <- read.HTSeqCounts (sampleTable$countFile, sampleTable ,
>>>>>> "genes.gff")
>>>>>> Errore in checkAtAssignment("character", "annotationFile",
>>>>>> "character")
>>>>>> :
>>>>>>     ŒannotationFile¹ is not a slot in class ³character²
>>>>>> Inoltre: Warning message:
>>>>>> 'newExonCountSet' is deprecated.
>>>>>> Use 'DEXSeqDataSet' instead.
>>>>>> See help("Deprecated")
>>>>>>
>>>>>> How can I solve it?
>>>>>>
>>>>>> Thank you
>>>>>>
>>>>>>
>>>>>> Alejandro Reyes <alejandro.reyes at embl.de> ha scritto:
>>>>>>
>>>>>>> Dear Carriero,
>>>>>>>
>>>>>>> I think you forgot to copy the output of your sessionInfo()?
>>>>>>>
>>>>>>> I think you might be using a very old version of DEXSeq. If so,
>>>>>>>could
>>>>>>> you please update at least to the current release version of DEXSeq
>>>>>>> (1.10.3), try again and write back if it keeps giving you error
>>>>>>> messages?
>>>>>>>
>>>>>>> Best regards,
>>>>>>> Alejandro
>>>>>>>
>>>>>>>> I have a warning message when I estimate the dipersion parameter.
>>>>>>>>I
>>>>>>>> can't carry on because of some other error messages in the
>>>>>>>>following
>>>>>>>> steps, as I attached below.
>>>>>>>> Thanks in advance
>>>>>>>>
>>>>>>>>    -- output of sessionInfo():
>>>>>>>>
>>>>>>>> sizeFactors (ecs)
>>>>>>>>    7A31.counts  7A32.counts  7A33.counts 46BR1.counts 46BR2.counts
>>>>>>>> 46BR3.counts
>>>>>>>>      1.0177858    1.0753635    1.1738656    0.9085247 0.9827212
>>>>>>>> 0.9198726
>>>>>>>>> ecs<- estimateDispersions ( ecs )
>>>>>>>> Dispersion estimation. (Progress report: one dot per 100 genes)
>>>>>>>>
>>>>>>>>
>>>>>>>> 
>>>>>>>>...................................................................
>>>>>>>>..
>>>>>>>> ..
>>>>>>>> ............................................................
>>>>>>>>
>>>>>>>> Warning messages:
>>>>>>>> 1: In .local(object, ...) :
>>>>>>>>     Exons with less than 11 counts will be discarded. For more
>>>>>>>> details
>>>>>>>> read the documentation, parameter minCount
>>>>>>>> 2: In .local(object, ...) :
>>>>>>>>     Genes with more than 70 testable exons will be kicked out of
>>>>>>>>the
>>>>>>>> analysis. For more details read the documentation, parameter
>>>>>>>>maxExon
>>>>>>>>> ecs <- fitDispersionFunction(ecs)
>>>>>>>> Error in if (sum(log(coefs/oldcoefs)^2) < 0.005) break :
>>>>>>>>     missing value where TRUE/FALSE needed
>>>>>>>> In addition: Warning messages:
>>>>>>>> 1: In glmgam.fit(mm, disps[good], start = coefs) :
>>>>>>>>     Too much damping - convergence tolerance not achievable
>>>>>>>> 2: In log(coefs/oldcoefs) : NaNs produced
>>>>>>>>> head( fData (ecs)$dispBeforeSharing)
>>>>>>>> [1] 0.000000e+00 6.003931e-03 4.852941e-03 0.000000e+00
>>>>>>>>2.771527e-10
>>>>>>>> [6] 0.000000e+00
>>>>>>>>> ecs at dispFitCoefs
>>>>>>>> [1] NA NA
>>>>>>>>> head(fData(ecs)$dispFitted)
>>>>>>>> [1] NA NA NA NA NA NA
>>>>>>>>> plotDispEsts(ecs)
>>>>>>>> Error: could not find function "plotDispEsts"
>>>>>>>>> ecs<-testForDEU(ecs)
>>>>>>>> Error in testForDEU(ecs) :
>>>>>>>>     No dispersion values found, call function
>>>>>>>>fitDispersionFunction
>>>>>>>> first.
>>>>>>>>
>>>>>>>> -- 
>>>>>>>> Sent via the guest posting facility at bioconductor.org.
>>>>>>
>>>>> _______________________________________________
>>>>> Bioconductor mailing list
>>>>> Bioconductor at r-project.org
>>>>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>>>>> Search the archives:
>>>>> http://news.gmane.org/gmane.science.biology.informatics.conductor
>



More information about the Bioconductor mailing list