[BioC] Error DEXSeq

Alejandro Reyes alejandro.reyes at embl.de
Thu May 22 16:28:16 CEST 2014


Hi Mallon,

I think what you want is this:

formulaFullModel = ˜ sample + exon + strain:exon + colony:exon + 
strain:colony:exon
formulaReducedModel = ˜ sample + exon + strain:exon + colony:exon

Don't forget to specify "formulaFullModel" when you create your 
DEXSeqDataSet object.
Or if you already created your object you have to modify your design by 
doing before calling
estimateDispersions

design( dxd ) <- formulaFullModel

Best regards,
Alejandro

> Hi Alejandro,
> Thanks for the quick reply. That appears to be working, but its going to
> take me a while to figure out my interaction model for the new syntax.
> Simon Anders helped me originally and I include it below as an FYI
>
> Thanks for the help
>
> Eamonn
>
> On 06/06/13 16:22, Mallon, Eamonn B. (Dr.) wrote:
>> Dear all
>> I have 11 samples for a cross infection experiment where there are two
>> colonies (the hosts K or Q) and 2 strains (6 or 8).
>>
>> sample colony strain type
>> K61 k six single-read
>> K62 k six single-read
>> K63 k six single-read
>> K81 k eight single-read
>> K82 k eight single-read
>> K83 k eight single-read
>> Q61 q six single-read
>> Q62 q six single-read
>> Q63 q six single-read
>> Q81 q eight single-read
>> Q82 q eight single-read
>>
>> I am most interested in finding exon usage differences related to the
>> interaction of the colony and strain factors. Following the vignette I
>> put the following code together
>>
>>
>> formuladispersion<-count~sample+(colony:strain)+exon
>> ecs<-estimateDispersions(ecs, formula=formuladispersion)
>> ecs<-fitDispersionFunction(ecs)
>>
>> formula0<-count~sample+exon+(colony:strain)
>> formula1<-count~sample+exon+(colony:strain)*I(exon==exonID)
>> ecs<-testForDEU(ecs, formula0=formula0, formula1=formula1)
>>
>> Does this make sense?
> Not quite. This tests for main effects and interaction in one go, so it
> returns as hits all exons whose usage differs between colonies _and/or_
> between strains. You are looking for interactions, i.e., for exons whose
> usage is different between strains _and_ where this difference itself
> differs between colonies. So, you need
>
>     formuladispersion<-count~sample+(colony*strain)*exon
>
>     formula0<-count~sample+exon+(colony+strain)*exon
> formula1<-count~sample+exon+(colony+strain)*exon+(colony:strain)*I(exon==ex
> onID)
>
> Things look a bit simpler if you use the new "TRT" functions:
>
>     formuladispersion<-count~sample+(colony*strain)*exon
>
>     formula0<-count~sample+exon+(colony+strain)*exon
>     formula1<-count~sample+exon+(colony*strain)*exon
>
>     Simon
>
>
> Dr Eamonn Mallon
> Lecturer in Evolutionary Biology
> Adrian 220
> Biology Department
> University of Leicester
>
> http://www2.le.ac.uk/departments/biology/people/mallon
>
>
>
>
>
> On 22/05/2014 12:13, "Alejandro Reyes" <alejandro.reyes at embl.de> wrote:
>
>> Dear Eamonn Mallon,
>>
>> Thanks for reporting this!
>>
>> There was an error in the DEXSeq code, that was not considering the
>> different ways to specify when the strand is unknown.  The gtf format
>> uses a ".", but the GRanges object uses "*" and does not accept a "."
>> instead.  I have added a fix to convert the "." to "*" when reading the
>> data from a gtf file. So this should work with the newests versions in
>> the svn. (Let me know if it doesn't!)
>>
>> Best regards,
>> Alejandro
>>
>>> Hi Alejandro,
>>> I was having a similar problem. I finished the analysis earlier in the
>>> year (on an earlier version of DEXSeq) but now would like to draw some
>>> of
>>> the pictures separately. I tried to run the analysis again. I¹ve changed
>>> to DEXSeqDataSeqFromHTSeq and now have the error
>>>
>>> Error in .local(x, ...) : strand values must be in '+' '-' Œ*'
>>>
>>>
>>> I guess this means something is now wrong with my GFF file.
>>>
>>> Its first line looks like
>>>
>>>
>>> gi|313870964|gb|AELG01010669.1|	dexseq_prepare_annotation.py	aggregate_ge
>>> ne
>>> 	2	1088	.	.	.	gene_id "XLOC_000001"
>>>
>>>
>>>
>>>
>>> Any help would be appreciated.
>>>
>>> Eamonn
>>>
>>>
>>> Dr Eamonn Mallon
>>> Lecturer in Evolutionary Biology
>>> Adrian 220
>>> Biology Department
>>> University of Leicester
>>>
>>> http://www2.le.ac.uk/departments/biology/people/mallon
>>>
>>>
>>>
>>>
>>>
>>> On 22/05/2014 10:53, "Alejandro Reyes" <alejandro.reyes at embl.de> wrote:
>>>
>>>> Hi Roberta,
>>>>
>>>> Yes, the newer versions of DEXSeq have lots of updates, in the new
>>>> vignette
>>>> you will the description of the new functions.  As the error message
>>>> indicates,
>>>> some functions were deprecated and substituted for new ones, including
>>>> the
>>>> functions that created the objects.  In your case, you need to use
>>>> DEXSeqDataSeqFromHTSeq instead of read.HTSeqCounts.
>>>>
>>>> Let me know if it works!
>>>> Best regards,
>>>> Alejandro
>>>>
>>>> ps.  I think we might have more efficient/effective communication if we
>>>> keep the e-mails concerning
>>>> the same issue in the same thread, rather than double posting and
>>>> having
>>>> many parallel
>>>> conversations: I have cc-ed Simon in this e-mail :)
>>>>
>>>>
>>>>
>>>>> Hi Alejandro,
>>>>> I updated to the current release versione of DEXSeq as you can see.
>>>>>
>>>>> sessionInfo()
>>>>> R version 3.1.0 (2014-04-10)
>>>>> Platform: x86_64-apple-darwin10.8.0 (64-bit)
>>>>>
>>>>> locale:
>>>>> [1] it_IT.UTF-8/it_IT.UTF-8/it_IT.UTF-8/C/it_IT.UTF-8/it_IT.UTF-8
>>>>>
>>>>> attached base packages:
>>>>> [1] parallel  stats     graphics  grDevices utils     datasets
>>>>> methods   base
>>>>>
>>>>> other attached packages:
>>>>>    [1] DEXSeq_1.10.3           BiocParallel_0.6.0 DESeq2_1.4.5
>>>>>          RcppArmadillo_0.4.300.0 Rcpp_0.11.1
>>>>> GenomicRanges_1.16.3 GenomeInfoDb_1.0.2      IRanges_1.22.6
>>>>>    Biobase_2.24.0
>>>>> [10] BiocGenerics_0.10.0     BiocInstaller_1.14.2
>>>>>
>>>>> loaded via a namespace (and not attached):
>>>>>    [1] annotate_1.42.0      AnnotationDbi_1.26.0 BatchJobs_1.2
>>>>> BBmisc_1.6           biomaRt_2.20.0 Biostrings_2.32.0
>>>>> bitops_1.0-6         brew_1.0-6 codetools_0.2-8
>>>>> DBI_0.2-7            digest_0.6.4
>>>>> [12] fail_1.2             foreach_1.4.2 genefilter_1.46.1
>>>>> geneplotter_1.42.0   grid_3.1.0 hwriter_1.3
>>>>> iterators_1.0.7      lattice_0.20-29 locfit_1.5-9.1
>>>>> plyr_1.8.1           RColorBrewer_1.0-5
>>>>> [23] RCurl_1.95-4.1       Rsamtools_1.16.0 RSQLite_0.11.4
>>>>> sendmailR_1.1-2      splines_3.1.0 statmod_1.4.19
>>>>> stats4_3.1.0         stringr_0.6.2 survival_2.37-7
>>>>> tools_3.1.0          XML_3.98-1.1
>>>>> [34] xtable_1.7-3         XVector_0.4.0        zlibbioc_1.10.0
>>>>>
>>>>> Now I have another error message in the previous steps:
>>>>>
>>>>> ecs <- read.HTSeqCounts (sampleTable$countFile, sampleTable ,
>>>>> "genes.gff")
>>>>> Errore in checkAtAssignment("character", "annotationFile",
>>>>> "character")
>>>>> :
>>>>>     ŒannotationFile¹ is not a slot in class ³character²
>>>>> Inoltre: Warning message:
>>>>> 'newExonCountSet' is deprecated.
>>>>> Use 'DEXSeqDataSet' instead.
>>>>> See help("Deprecated")
>>>>>
>>>>> How can I solve it?
>>>>>
>>>>> Thank you
>>>>>
>>>>>
>>>>> Alejandro Reyes <alejandro.reyes at embl.de> ha scritto:
>>>>>
>>>>>> Dear Carriero,
>>>>>>
>>>>>> I think you forgot to copy the output of your sessionInfo()?
>>>>>>
>>>>>> I think you might be using a very old version of DEXSeq. If so, could
>>>>>> you please update at least to the current release version of DEXSeq
>>>>>> (1.10.3), try again and write back if it keeps giving you error
>>>>>> messages?
>>>>>>
>>>>>> Best regards,
>>>>>> Alejandro
>>>>>>
>>>>>>> I have a warning message when I estimate the dipersion parameter. I
>>>>>>> can't carry on because of some other error messages in the following
>>>>>>> steps, as I attached below.
>>>>>>> Thanks in advance
>>>>>>>
>>>>>>>    -- output of sessionInfo():
>>>>>>>
>>>>>>> sizeFactors (ecs)
>>>>>>>    7A31.counts  7A32.counts  7A33.counts 46BR1.counts 46BR2.counts
>>>>>>> 46BR3.counts
>>>>>>>      1.0177858    1.0753635    1.1738656    0.9085247 0.9827212
>>>>>>> 0.9198726
>>>>>>>> ecs<- estimateDispersions ( ecs )
>>>>>>> Dispersion estimation. (Progress report: one dot per 100 genes)
>>>>>>>
>>>>>>>
>>>>>>> .....................................................................
>>>>>>> ..
>>>>>>> ............................................................
>>>>>>>
>>>>>>> Warning messages:
>>>>>>> 1: In .local(object, ...) :
>>>>>>>     Exons with less than 11 counts will be discarded. For more
>>>>>>> details
>>>>>>> read the documentation, parameter minCount
>>>>>>> 2: In .local(object, ...) :
>>>>>>>     Genes with more than 70 testable exons will be kicked out of the
>>>>>>> analysis. For more details read the documentation, parameter maxExon
>>>>>>>> ecs <- fitDispersionFunction(ecs)
>>>>>>> Error in if (sum(log(coefs/oldcoefs)^2) < 0.005) break :
>>>>>>>     missing value where TRUE/FALSE needed
>>>>>>> In addition: Warning messages:
>>>>>>> 1: In glmgam.fit(mm, disps[good], start = coefs) :
>>>>>>>     Too much damping - convergence tolerance not achievable
>>>>>>> 2: In log(coefs/oldcoefs) : NaNs produced
>>>>>>>> head( fData (ecs)$dispBeforeSharing)
>>>>>>> [1] 0.000000e+00 6.003931e-03 4.852941e-03 0.000000e+00 2.771527e-10
>>>>>>> [6] 0.000000e+00
>>>>>>>> ecs at dispFitCoefs
>>>>>>> [1] NA NA
>>>>>>>> head(fData(ecs)$dispFitted)
>>>>>>> [1] NA NA NA NA NA NA
>>>>>>>> plotDispEsts(ecs)
>>>>>>> Error: could not find function "plotDispEsts"
>>>>>>>> ecs<-testForDEU(ecs)
>>>>>>> Error in testForDEU(ecs) :
>>>>>>>     No dispersion values found, call function fitDispersionFunction
>>>>>>> first.
>>>>>>>
>>>>>>> -- 
>>>>>>> Sent via the guest posting facility at bioconductor.org.
>>>>>
>>>> _______________________________________________
>>>> Bioconductor mailing list
>>>> Bioconductor at r-project.org
>>>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>>>> Search the archives:
>>>> http://news.gmane.org/gmane.science.biology.informatics.conductor



More information about the Bioconductor mailing list