[Bioc-devel] GSEABase::getOBOCollection() missing children

Robert Castelo robert.castelo at upf.edu
Mon Jun 8 11:12:37 CEST 2015


hi Martin,

thanks for the quick fix!!!

best regards,

robert.

On 06/05/2015 08:05 PM, Martin Morgan wrote:
> On 06/05/2015 08:51 AM, Robert Castelo wrote:
>> hi,
>>
>> importing an OBO file with GSEABase::getOBOCollection() I have observed
>> missing children in the imported ontology. Here is an example with the
>> Sequence Ontology:
>
> Thanks Robert, the import went ok, but the coercion to graphNEL was
> flawed. This is fixed in 1.31.2 in devel, and will be ported to release
> / available via biocLite tomorrow afternoon (all being well...)
>
>
> Martin
>
>>
>> library(GSEABase)
>>
>> oboSOXP <-
>> getOBOCollection("http://sourceforge.net/p/song/svn/HEAD/tree/trunk/so-xp.obo")
>>
>> Warning message:
>> In readLines(src) :
>> incomplete final line found on
>> 'http://sourceforge.net/p/song/svn/HEAD/tree/trunk/so-xp.obo'
>> gSOXP <- as(oboSOXP, "graphNEL")
>> edges(gSOXP)[["SO:0001622"]]
>> [1] "SO:0001968"
>>
>> so the term SO:0001622 in principle has only one child term, SO:0001968.
>> However, a free text search for this entry in the OBO file shows the
>> following:
>>
>> [Term]
>> id: SO:0001622
>> name: UTR_variant
>> def: "A transcript variant that is located within the UTR." [SO:ke]
>> synonym: "UTR variant" EXACT []
>> synonym: "UTR_" EXACT ebi_variants [http://ensembl.org/info/docs/variation/index.html]
>> is_a: SO:0001791 ! exon_variant
>> is_a: SO:0001968 ! coding_transcript_variant
>> created_by: kareneilbeck
>> creation_date: 2010-03-23T11:22:58Z
>>
>> that is, it has two children, not just one. The child SO:0001791 is missing.
>> Actually, looking at the distribution of the number of children per term,
>> they all have at most one child:
>>
>> nchild <- sapply(edges(gSOXP), length)
>> table(nchild)
>> nchild
>>    0    1
>>  206 2072
>>
>> I have not found anything in the manual page of getOBOCollection() saying
>> that this function cannot import more than one child per term, so I guess
>> this is either a bug or an oversight.
>>
>> cheers,
>>
>> robert.
>>
>
>
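
For anyone who wants to cross-check the graph against the raw OBO text, here is a minimal sketch in base R that counts the is_a lines of each [Term] stanza. count_is_a() is a hypothetical helper written for this illustration, not part of GSEABase, and the input is a shortened copy of the SO:0001622 stanza quoted above. It assumes one tag per line and ignores every tag other than id: and is_a:.

```r
# Hypothetical helper (not part of GSEABase): count the is_a lines of
# every [Term] stanza in a character vector of OBO lines.
count_is_a <- function(lines) {
  lines <- trimws(lines)
  # Stanza headers look like "[Term]" or "[Typedef]"; a bracketed URL
  # does not match because it contains ':' and '/'.
  is_header <- grepl("^\\[[A-Za-z]+\\]$", lines)
  stanza <- cumsum(is_header)  # 0 = file header before the first stanza
  counts <- integer(0)
  for (s in stanza[is_header & lines == "[Term]"]) {
    body <- lines[stanza == s]
    id <- sub("^id:[[:space:]]*", "", body[grepl("^id:", body)][1])
    if (is.na(id)) next        # skip stanzas without an id: line
    counts[id] <- sum(grepl("^is_a:", body))
  }
  counts
}

# Shortened copy of the stanza from the original report
stanza <- c(
  "[Term]",
  "id: SO:0001622",
  "name: UTR_variant",
  "is_a: SO:0001791 ! exon_variant",
  "is_a: SO:0001968 ! coding_transcript_variant"
)
n_is_a <- count_is_a(stanza)
print(n_is_a)
```

On a local copy of the ontology, something like table(count_is_a(readLines("so-xp.obo"))) (the file name is an assumption) can be set beside table(sapply(edges(gSOXP), length)). Assuming the graphNEL edges run from each term to its is_a targets, as the edges() output above suggests, the two tables should broadly agree once GSEABase >= 1.31.2 is installed.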

-- 
Robert Castelo, PhD
Associate Professor
Dept. of Experimental and Health Sciences
Universitat Pompeu Fabra (UPF)
Barcelona Biomedical Research Park (PRBB)
Dr Aiguader 88
E-08003 Barcelona, Spain
telf: +34.933.160.514
fax: +34.933.160.550


