[BioC] read sequences from fasta file starting with > sign and until next > sign
Jack [guest]
guest at bioconductor.org
Fri Sep 14 15:11:08 CEST 2012
Hi:
I am trying to read sequences from a fasta file starting with > till the next > sign:
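library(ShortRead)
setwd("fastafolder")
# Open the FASTA file as a text-mode connection for reading
con <- file("somefastafile.fa", open = "r")
pattern <- "TACC"
# Read one line per iteration; readLines() returns character(0) at end of file
while (length(res <- readLines(con, n = 1)) > 0)
{
    # do something
}
close(con)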
With this while statement I can read a single line from the fasta file on each iteration. But I want to read a chunk of lines each time, starting at a line that begins with the > sign and ending just before the next > sign.
Example:
>AAATTT
TAGGCT
ATTTGC
>CGATTT
I want to read the following in the first iteration of the while loop:
>AAATTT
TAGGCT
ATTTGC
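A minimal sketch of one way to get this behavior in base R, not a definitive solution. It assumes the file is opened as a text-mode connection, so that pushBack() can return the next header line to the connection. The helper name read_fasta_record and the temporary demo file are hypothetical and exist only for illustration.

```r
# Hypothetical helper: read one FASTA record (a ">" header line plus the
# lines that follow it) from an open text-mode connection.
# Returns character(0) once the end of the file is reached.
read_fasta_record <- function(con) {
  record <- character(0)
  while (length(line <- readLines(con, n = 1)) > 0) {
    if (substr(line, 1, 1) == ">" && length(record) > 0) {
      # This header starts the next record: push it back and stop here.
      pushBack(line, con)
      break
    }
    record <- c(record, line)
  }
  record
}

# Demo on a temporary file holding the example from this message.
fasta_path <- tempfile(fileext = ".fa")
writeLines(c(">AAATTT", "TAGGCT", "ATTTGC", ">CGATTT"), fasta_path)

con <- file(fasta_path, open = "r")
records <- list()
while (length(rec <- read_fasta_record(con)) > 0) {
  # do something with rec; here each record is simply collected
  records[[length(records) + 1]] <- rec
}
close(con)
```

In this sketch the first iteration yields the three lines ">AAATTT", "TAGGCT" and "ATTTGC", and the second yields ">CGATTT". If the whole file fits in memory, readDNAStringSet() from Biostrings reads all records in one call instead.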
Thanks for your help.
Regards:
Jack
-- output of sessionInfo():
> sessionInfo()
R version 2.15.1 (2012-06-22)
Platform: i386-pc-mingw32/i386 (32-bit)
locale:
[1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United States.1252 LC_MONETARY=English_United States.1252 LC_NUMERIC=C
[5] LC_TIME=English_United States.1252
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] CHNOSZ_0.9-7 ShortRead_1.14.4 latticeExtra_0.6-24 RColorBrewer_1.0-5 Rsamtools_1.8.6 lattice_0.20-10 Biostrings_2.24.1
[8] GenomicRanges_1.8.13 IRanges_1.14.4 BiocGenerics_0.2.0
loaded via a namespace (and not attached):
[1] Biobase_2.16.0 bitops_1.0-4.1 grid_2.15.1 hwriter_1.3 stats4_2.15.1 tools_2.15.1 zlibbioc_1.2.0
--
Sent via the guest posting facility at bioconductor.org.