[Bioc-devel] the character to collapse the geneNames when using the disjointExons function with aggregateGenes=TRUE
Nicolas Delhomme
delhomme at embl.de
Thu Aug 1 17:45:48 CEST 2013
Fantastic!
Cheers,
Nico
---------------------------------------------------------------
Nicolas Delhomme
Genome Biology Computational Support
European Molecular Biology Laboratory
Tel: +49 6221 387 8310
Email: nicolas.delhomme at embl.de
Meyerhofstrasse 1 - Postfach 10.2209
69102 Heidelberg, Germany
---------------------------------------------------------------
On Jul 31, 2013, at 10:41 PM, Alejandro Reyes wrote:
> Dear all,
>
> No problem from my side, I can adapt DEXSeq to those changes.
>
> Best regards,
> Alejandro Reyes
>
>> Mike, Alejandro,
>>
>> I also wonder about getting rid of the 'exonID' metadata column. This is redundant with 'exonic_part_number'. Do you have a preference for keeping one or the other?
>>
>> Valerie
>>
>>
>> On 07/31/2013 10:04 AM, Valerie Obenchain wrote:
>>> Hi Nico,
>>>
>>> (Adding Mike and Alejandro.)
>>>
>>> Because disjointExons() came from DEXSeq I wanted to preserve the
>>> behavior for backwards compatibility and familiarity to DEXSeq users.
>>> There are a couple of changes I'd like to make so disjointExons() is
>>> consistent with the other extractors in GenomicFeatures.
>>>
>>> (1) Change metadata column names from 'geneNames' and 'transcripts' to
>>> 'gene_id' and 'tx_name'.
>>>
>>> (2) Instead of '+' or ';' to separate gene id's or transcript names,
>>> these columns would each be a CharacterList.
>>>
>>> If Mike and Alejandro are ok with these I'll go ahead and implement them.
>>>
>>> Valerie
>>>
>>>
>>>
>>> On 07/31/2013 06:29 AM, Nicolas Delhomme wrote:
>>>> Hej Val, I believe that one is for you :-)
>>>>
>>>> When using the aggregateGenes=TRUE parameter of the disjointExons
>>>> function, the gene names are separated by a "+" character. Is there a
>>>> particular reason for that? The reason I'm asking is that in the
>>>> "transcripts" column the transcript IDs are separated by a semicolon,
>>>> and I was wondering whether the separator could be unified, i.e.
>>>> using a semicolon for both the geneNames and transcripts columns. Here is a
>>>> visual example of what I mean:
>>>>
>>>> GRanges with 1 range and 4 metadata columns:
>>>> seqnames ranges strand |
>>>> <Rle> <IRanges> <Rle> |
>>>> [1] Chr03 [4541747, 4541782] - |
>>>> geneNames
>>>> <character>
>>>> [1] Potri.003G035500+Potri.003G035600+Potri.003G035700
>>>> transcripts
>>>> <character>
>>>> [1] PAC:26999771;PAC:26999331;PAC:26999330;PAC:26999332;PAC:26999333
>>>> exonic_part_number exonID
>>>> <integer> <character>
>>>> [1] 1 E001
>>>> ---
>>>> seqlengths:
>>>> Chr01 Chr02 Chr03 ... scaffold_99 scaffold_991
>>>> NA NA NA ... NA NA
>>>>
>>>>
>>>> What do you say?
>>>>
>>>> Cheers,
>>>>
>>>> Nico
>>>>
>>>> My sessionInfo():
>>>> R version 3.0.1 (2013-05-16)
>>>> Platform: x86_64-unknown-linux-gnu (64-bit)
>>>>
>>>> locale:
>>>> [1] LC_CTYPE=en_GB.UTF-8 LC_NUMERIC=C
>>>> [3] LC_TIME=en_GB.UTF-8 LC_COLLATE=en_GB.UTF-8
>>>> [5] LC_MONETARY=en_GB.UTF-8 LC_MESSAGES=en_GB.UTF-8
>>>> [7] LC_PAPER=C LC_NAME=C
>>>> [9] LC_ADDRESS=C LC_TELEPHONE=C
>>>> [11] LC_MEASUREMENT=en_GB.UTF-8 LC_IDENTIFICATION=C
>>>>
>>>> attached base packages:
>>>> [1] parallel stats graphics grDevices utils datasets methods
>>>> [8] base
>>>>
>>>> other attached packages:
>>>> [1] Rsamtools_1.13.26 Biostrings_2.29.14 DEXSeq_1.7.6
>>>> [4] GenomicFeatures_1.13.21 AnnotationDbi_1.23.18 Biobase_2.21.6
>>>> [7] GenomicRanges_1.13.35 XVector_0.1.0 IRanges_1.19.19
>>>> [10] BiocGenerics_0.7.3 BiocInstaller_1.11.4
>>>>
>>>> loaded via a namespace (and not attached):
>>>> [1] biomaRt_2.17.2 bitops_1.0-5 BSgenome_1.29.1 DBI_0.2-7
>>>> [5] hwriter_1.3 RCurl_1.95-4.1 RSQLite_0.11.4 rtracklayer_1.21.9
>>>> [9] statmod_1.4.17 stats4_3.0.1 stringr_0.6.2 tools_3.0.1
>>>> [13] XML_3.98-1.1 zlibbioc_1.7.0
>>>>
>>>> _______________________________________________
>>>> Bioc-devel at r-project.org mailing list
>>>> https://stat.ethz.ch/mailman/listinfo/bioc-devel
>>>>
>>>
>>
>
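
For reference, a rough sketch of what proposal (2) in the quoted thread would mean for the example record: the "+"- and ";"-separated strings become one character vector of ids per range. This uses base R only and is not the GenomicFeatures implementation; the names gene_names, transcripts and split_ids are hypothetical and chosen purely for illustration.

```r
# Values copied from the example GRanges record in the thread.
gene_names  <- "Potri.003G035500+Potri.003G035600+Potri.003G035700"
transcripts <- "PAC:26999771;PAC:26999331;PAC:26999330;PAC:26999332;PAC:26999333"

# Hypothetical helper: split each element on a literal separator.
# fixed = TRUE matters because "+" is a regular-expression metacharacter.
split_ids <- function(x, sep) strsplit(x, sep, fixed = TRUE)

gene_id <- split_ids(gene_names, "+")   # list with one character vector per range
tx_name <- split_ids(transcripts, ";")

# Number of ids attached to the (single) range in each column.
sapply(gene_id, length)
sapply(tx_name, length)

# Assumption: with IRanges attached, such a list could be wrapped as
# IRanges::CharacterList(gene_id) to obtain the list-like column type
# mentioned in the thread.
```

With list-like columns of this kind, the choice of separator character no longer matters, which is the point of the proposal.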