[BioC] Using UniProt.ws to retrieve protein features

David_Gomez [guest] guest at bioconductor.org
Thu Jan 2 12:49:11 CET 2014

I have just installed Bioconductor (v2.13) and the UniProt.ws package (2.0.1). I want to retrieve all the feature information for a protein, but when I query the “FEATURES” column I only get the number of features of each type, not the full lists. Is there any way to get them?

Code example:
library(UniProt.ws)
taxId(UniProt.ws) <- 9606               # Homo sapiens
keys <- c("Q9UNQ0")                     # Human BCRP
columns <- c("UNIPROTKB", "FEATURES")
kt <- "UNIPROTKB"                       # key type of the accession in 'keys'
res <- select(UniProt.ws, keys, columns, kt)
>1    Q9UNQ0
>1 Alternative sequence (2); Chain (1); Disulfide bond (2); Domain (2); Frameshift (2); Glycosylation (1); Mutagenesis (11); Natural variant (18); Nucleotide binding (1); Sequence conflict (9); Site (2); Topological domain (7); Transmembrane (6)

I would like to have the full information for each of those features, as it appears in UniProt,
e.g. from http://www.uniprot.org/uniprot/Q9UNQ0.txt:
FT   CHAIN         1    655       ATP-binding cassette sub-family G member
FT                                2.
FT                                /FTId=PRO_0000093386.
FT   TOPO_DOM      1    395       Cytoplasmic (Potential).
FT   TRANSMEM    396    416       Helical; (Potential).
FT   TOPO_DOM    417    428       Extracellular (Potential).
FT   TRANSMEM    429    449       Helical; (Potential).
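
To make the target concrete, here is a minimal sketch in base R of the table I am after, built from the FT lines quoted above. The helper parse_ft() is hypothetical and not part of UniProt.ws; it assumes the flat-file layout shown in the excerpt (key, start, end, description, with continuation lines that leave the key column empty) and may well break on entries that use a different layout.

```r
# FT lines copied from the excerpt above (whitespace need not be exact,
# because the parser below splits on runs of spaces, not fixed columns).
ft_lines <- c(
  "FT   CHAIN         1    655       ATP-binding cassette sub-family G member",
  "FT                                2.",
  "FT                                /FTId=PRO_0000093386.",
  "FT   TOPO_DOM      1    395       Cytoplasmic (Potential).",
  "FT   TRANSMEM    396    416       Helical; (Potential).",
  "FT   TOPO_DOM    417    428       Extracellular (Potential).",
  "FT   TRANSMEM    429    449       Helical; (Potential)."
)

# Hypothetical helper: turn FT lines into one row per feature.
parse_ft <- function(lines) {
  # A feature starts on a line with a key, a start and an end position.
  start_re <- "^FT   (\\S+)\\s+(\\d+)\\s+(\\d+)\\s*(.*)$"
  is_start <- grepl(start_re, lines, perl = TRUE)

  # Continuation lines belong to the most recent feature start.
  group <- cumsum(is_start)
  keep  <- group > 0  # drop any lines that precede the first feature

  # Description text carried by each line.
  text <- ifelse(is_start,
                 sub(start_re, "\\4", lines, perl = TRUE),
                 sub("^FT\\s+", "", lines, perl = TRUE))

  starts <- lines[is_start]
  data.frame(
    key         = sub(start_re, "\\1", starts, perl = TRUE),
    start       = as.integer(sub(start_re, "\\2", starts, perl = TRUE)),
    end         = as.integer(sub(start_re, "\\3", starts, perl = TRUE)),
    description = as.vector(tapply(text[keep], group[keep],
                                   paste, collapse = " ")),
    stringsAsFactors = FALSE
  )
}

ft <- parse_ft(ft_lines)
print(ft)
```

For a whole entry, the same function could presumably be fed the lines of the .txt file that begin with "FT", but I would much prefer to get this directly through UniProt.ws if that is possible.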

I have already read the UniProt.ws documentation and searched older threads, but I did not find any useful information.

Thank you in advance for any help.
David Gomez

 -- output of sessionInfo(): 

R version 3.0.2 (2013-09-25)
Platform: x86_64-w64-mingw32/x64 (64-bit)

[1] LC_COLLATE=Spanish_Mexico.1252  LC_CTYPE=Spanish_Mexico.1252   
[3] LC_MONETARY=Spanish_Mexico.1252 LC_NUMERIC=C                   
[5] LC_TIME=Spanish_Mexico.1252    

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] UniProt.ws_2.2.0 RCurl_1.95-4.1   bitops_1.0-6     RSQLite_0.11.4  
[5] DBI_0.2-7       

loaded via a namespace (and not attached):
[1] AnnotationDbi_1.24.0 Biobase_2.22.0       BiocGenerics_0.8.0  
[4] IRanges_1.20.6       parallel_3.0.2       stats4_3.0.2        

Sent via the guest posting facility at bioconductor.org.
