[BioC] trouble reading DNA stringset from keggGet function
Elliot [guest]
guest at bioconductor.org
Tue Sep 10 19:55:20 CEST 2013
I am having some difficulty making FASTA files from the objects returned by the keggGet function in the KEGGREST package. The returned object is apparently a DNAStringSet, but readDNAStringSet will not process it. I've tried it with other data and with different kinds of sequences (amino acid) and received the same error message, so I'm sure I must be missing something. My R output is below. Thanks so much for any help!
-- output of sessionInfo():
> genes<-keggLink("ath00906")
> head(genes)
     [,1]            [,2]            [,3]
[1,] "path:ath00906" "ath:AT1G06820" "reverse"
[2,] "path:ath00906" "ath:AT1G08550" "reverse"
[3,] "path:ath00906" "ath:AT1G10830" "reverse"
[4,] "path:ath00906" "ath:AT1G30100" "reverse"
[5,] "path:ath00906" "ath:AT1G31800" "reverse"
[6,] "path:ath00906" "ath:AT1G52340" "reverse"
> sequences<-keggGet(genes[1:10,2],"ntseq")
> head(sequences)
  A DNAStringSet instance of length 6
    width seq                                names
[1]  1788 ATGGATTTGTGTTTTC...AGGACACTCGCATAG ath:AT1G06820 CRT...
[2]  1389 ATGGCAGTAGCTACAC...AGGAAGGTCAGGTAG ath:AT1G08550 NPQ...
[3]   858 ATGGCGGTTTATCATC...ATTGGATTTTTATGA ath:AT1G10830 Z-I...
[4]  1770 ATGGCTTGTTCTTACA...TTAAACCAGGCTTAA ath:AT1G30100 NCE...
[5]  1788 ATGGCTATGGCCTTTC...TCTGCTCTTTCTTAA ath:AT1G31800 CYP...
[6]   858 ATGTCAACGAACACTG...AAAGTCTTCAGATGA ath:AT1G52340 ABA...
> readDNAStringSet(sequences,"fasta")
Error in .normargInputFilepath(filepath) :
'filepath' must be a character vector with no NAs
> class(sequences) #confirm that the input is a DNA string set
[1] "DNAStringSet"
attr(,"package")
[1] "Biostrings"
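
The error message above suggests that readDNAStringSet expects a file path (a character vector), not an object that is already a DNAStringSet in memory. A minimal sketch of what may be intended instead, assuming the goal is to write the sequences out to a FASTA file: writeXStringSet from Biostrings does the writing, and readDNAStringSet can then read that file back. The toy sequences, their names, and the temporary file below are made up for illustration and stand in for the object returned by keggGet.

```r
library(Biostrings)

# Toy stand-in for the DNAStringSet returned by keggGet(..., "ntseq");
# these sequences and names are hypothetical, not real KEGG entries.
sequences <- DNAStringSet(c(geneA = "ATGGATTTGTGTTTTCTAG",
                            geneB = "ATGGCAGTAGCTACACTAG"))

# Write the in-memory DNAStringSet to a FASTA file.
fasta_file <- tempfile(fileext = ".fasta")
writeXStringSet(sequences, filepath = fasta_file, format = "fasta")

# readDNAStringSet() takes a file path, so it can read the file back in.
roundtrip <- readDNAStringSet(fasta_file, format = "fasta")
```

With the real data, the same writeXStringSet call would presumably be applied directly to the object returned by keggGet, with a file name of your choice in place of the temporary file.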
--
Sent via the guest posting facility at bioconductor.org.