[BioC] retrieve genomic coordinates of FASTA file

James W. MacDonald jmacdon at med.umich.edu
Tue Jan 27 17:29:10 CET 2009


Hi João,

João Fadista wrote:
> Hi,
>  
> I would like to know whether there is a Bioconductor (or other) tool that can retrieve a batch of sequences from a FASTA file, given the desired genomic coordinates.
>  
> Desired genomic coordinates: 
> chr1 1 5
> chr2 8 13
>  
> Example of a fasta input file:
> >pig_chr1
> acacactagagata
> >pig_chr2
> gagagagcgcgcatgtgt
>  
> Example of a fasta output file:
> >pig_chr1_start1_end5
> acaca
> >pig_chr2_start8_end13
> cgcgca

 > library(Biostrings)
 > tmp <- read.DNAStringSet("pig.fa","fasta")
Read 4 items
 > tmp
   A DNAStringSet instance of length 2
     width seq                                               names
[1]    14 ACACACTAGAGATA                                    pig_chr1
[2]    18 GAGAGAGCGCGCATGTGT                                pig_chr2
 > tmp[[1]][1:5]
   5-letter "DNAString" instance
seq: ACACA
 > tmp[[2]][8:13]
   6-letter "DNAString" instance
seq: CGCGCA
 > as.character(tmp[[1]][1:5])
[1] "ACACA"
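
For the batch case in the original question, the indexing above could be generalized roughly as follows. This is only a sketch and not part of the original session: it assumes a current Biostrings release, where read.DNAStringSet has been replaced by readDNAStringSet, and it assumes that the sequence names are the chromosome names from the coordinate list with a "pig_" prefix, as in the example. The coords data frame and both file names are made up for illustration.

```r
library(Biostrings)

# Recreate the example FASTA file from the question in a temporary location
# (hypothetical file; in practice this would be the real "pig.fa").
fa <- tempfile(fileext = ".fa")
writeLines(c(">pig_chr1", "acacactagagata",
             ">pig_chr2", "gagagagcgcgcatgtgt"), fa)

seqs <- readDNAStringSet(fa)

# Desired coordinates, one row per region (illustrative data frame).
coords <- data.frame(chr   = c("chr1", "chr2"),
                     start = c(1L, 8L),
                     end   = c(5L, 13L),
                     stringsAsFactors = FALSE)

# Look up each region's sequence by name; the "pig_" prefix is an assumption
# taken from the example file.
idx <- match(paste0("pig_", coords$chr), names(seqs))
stopifnot(!anyNA(idx))

# subseq() is vectorized over the set: one start/end pair per sequence.
out <- subseq(seqs[idx], start = coords$start, end = coords$end)
names(out) <- paste0(names(seqs)[idx],
                     "_start", coords$start, "_end", coords$end)

# Write the extracted regions to a new FASTA file (hypothetical output path).
out_file <- tempfile(fileext = ".fa")
writeXStringSet(out, out_file)
```

Note that the sequences come back in upper case, as in the session above, so the output file will contain ACACA and CGCGCA rather than the lower-case letters of the input.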

Best,

Jim



> Regards,
> 
> João Fadista
> Ph.D. student
> 
> University of Aarhus
> Faculty of Agricultural Sciences
> Dept. of Genetics and Biotechnology
> Blichers Allé 20, P.O. Box 50
> DK-8830 Tjele
> 
> Tel: +45 8999 1900
> Direct: +45 8999 1342
> E-mail: Joao.Fadista at agrsci.dk
> Web: www.agrsci.dk

-- 
James W. MacDonald, M.S.
Biostatistician
Hildebrandt Lab
8220D MSRB III
1150 W. Medical Center Drive
Ann Arbor MI 48109-0646
734-936-8662
