[BioC] Howto annotate blast subject.id with AnnotationDbi
Arnaud Mounier
arnaud.mounier at dijon.inra.fr
Thu May 30 08:28:21 CEST 2013
On 29/05/2013 19:54, Marc Carlson wrote:
> Unfortunately no. Those IDs are not present in the org.At.eg.db package
> as this is a gene-level annotation package. These kinds of IDs have
> never been included in this package, although I guess that we could
> consider adding them at some point in the future.
Indeed, it could be a good idea, because there are some cases where
gene model IDs cannot be avoided (I think).
Here is an example from a blastp run:
> df.blast.report[df.blast.report$"query.id" == "medtr7g099680.1",]
    query.id        subject.id  identity alignment.length mismatches
99  medtr7g099680.1 AT1G79930.2    35.62              438        266
100 medtr7g099680.1 AT1G79930.1    35.62              438        266
101 medtr7g099680.1 AT2G32120.2    33.98              512        310
102 medtr7g099680.1 AT2G32120.1    33.98              512        310
103 medtr7g099680.1 AT1G79920.1    35.62              438        266
104 medtr7g099680.1 AT1G79920.2    35.62              438        266
    gap.opens q.start q.end s.start s.end evalue bit.score
99          4      10   434       4   438  2e-85       289
100         4      10   434       4   438  2e-85       290
101         9       9   508      30   525  7e-85       282
102         9       9   508      30   525  7e-85       282
103         4      10   434       4   438  1e-84       288
104         4      10   434       4   438  2e-84       287
Each pair of rows (99-100, 101-102, 103-104) has the same query.id, and
within each of these 3 pairs the subject.ids differ only at the gene
model level. That information would be lost once the IDs are reduced to
the gene level.
You can see that, despite the different gene models, the other values
are the same or nearly the same within each pair. But here is another
example, with 3 different gene models for the same locus and 3 different
hits:
> df.blast.report[df.blast.report$"query.id" == "medtr8g081490.1",]
    query.id        subject.id  identity alignment.length mismatches
188 medtr8g081490.1 AT4G13940.3    80.35              453         43
189 medtr8g081490.1 AT4G13940.2    89.43              331         34
190 medtr8g081490.1 AT4G13940.4    89.29              308         32
    gap.opens q.start q.end s.start s.end evalue bit.score
188         2       1   452       1   408      0       734
189         1     123   452       2   332      0       622
190         1       1   307       1   308      0       559
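
To make the locus / gene model distinction concrete, here is a minimal
sketch in base R. It assumes that the gene model is always the trailing
".<digits>" part of subject.id; the toy data frame is re-typed from a
few columns of the two examples above, and the column names locus and
gene.model are introduced only for illustration.

```r
# Toy data frame re-typed from the two blastp examples (subset of columns).
df.blast.report <- data.frame(
  query.id   = c(rep("medtr7g099680.1", 6), rep("medtr8g081490.1", 3)),
  subject.id = c("AT1G79930.2", "AT1G79930.1", "AT2G32120.2", "AT2G32120.1",
                 "AT1G79920.1", "AT1G79920.2",
                 "AT4G13940.3", "AT4G13940.2", "AT4G13940.4"),
  identity   = c(35.62, 35.62, 33.98, 33.98, 35.62, 35.62, 80.35, 89.43, 89.29),
  bit.score  = c(289, 290, 282, 282, 288, 287, 734, 622, 559),
  stringsAsFactors = FALSE
)

# Split subject.id into a locus ID and a gene model number.
# Assumption: the gene model is always a trailing ".<digits>" suffix.
df.blast.report$locus      <- sub("\\.[0-9]+$", "", df.blast.report$subject.id)
df.blast.report$gene.model <- as.integer(sub("^.*\\.", "", df.blast.report$subject.id))

# Count the distinct gene models hit for each query/locus pair.
n.models <- aggregate(gene.model ~ query.id + locus, data = df.blast.report,
                      FUN = function(x) length(unique(x)))
print(n.models)
```

Keeping gene.model as its own column means the gene model is still
available after the locus has been used as a gene-level ID. Whether a
given annotation package accepts the locus as a key should be checked
against that package's documentation.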
Thanks for your reply,
Ar.
--
"The sun filters through the branches of the trees in flashes, like
meaning through language."
Nancy Huston
Arnaud Mounier
INRA - UMR Agroécologie 1347
CNRS - ERL IPM 6300 (Plant-Microorganism Interaction)
17, rue Sully - BP 86510 - F-21065 Dijon Cedex - France
Work phone : +33 380 693 167 - Fax : +33 380 693 753
https://www6.dijon.inra.fr/umragroecologie/Personnel/IPM/ITA/MOUNIER-Arnaud