[BioC] Automated blasting of short nucleotide sequences against
each other
Sean Davis
sdavis2 at mail.nih.gov
Fri Feb 18 21:03:16 CET 2005
Ken,
Actually, if you think about how blast (or blat or other alignment
programs) works, you just need to blast the fasta file against a blast
database built from the same sequences. The blast output will then
cover each sequence compared against all the others, with the obvious
caveat that not all sequences are going to align, so some comparisons
will be missing; there is no way around that. Then you just need to
put the results into some useful form; consider using bioperl if you
have access to it. Better yet, if these are sequences from the same
organism, just use blat, whose output is tab-delimited text that you
can load directly into R. If you use blat, you can just do:
blat db.fasta db.fasta outfile.psl
This should take just a few seconds on a modern machine, depending on
the length of the sequences.
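
For anyone who would rather post-process the result outside R, here is
a minimal Python sketch of turning the PSL output into an all-vs-all
table. It assumes the standard 21-column PSL layout and keeps, for each
ordered (query, target) pair, the largest "matches" count, dropping the
hit of each sequence against itself. The function names read_psl and
best_pairwise_matches are made up for illustration; they are not part
of blat or any library.

```python
# Column names of the 21 tab-separated PSL fields, in file order.
PSL_COLUMNS = [
    "matches", "misMatches", "repMatches", "nCount",
    "qNumInsert", "qBaseInsert", "tNumInsert", "tBaseInsert",
    "strand", "qName", "qSize", "qStart", "qEnd",
    "tName", "tSize", "tStart", "tEnd",
    "blockCount", "blockSizes", "qStarts", "tStarts",
]


def read_psl(lines):
    """Yield one dict per alignment row (hypothetical helper).

    Header lines are skipped: a data row has exactly 21 fields and
    starts with an integer match count.
    """
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) != len(PSL_COLUMNS) or not fields[0].isdigit():
            continue
        yield dict(zip(PSL_COLUMNS, fields))


def best_pairwise_matches(rows):
    """Map (query, target) to the largest 'matches' value seen.

    Self-hits (query == target) are ignored. Pairs that blat did not
    align are simply absent from the result.
    """
    best = {}
    for row in rows:
        pair = (row["qName"], row["tName"])
        if pair[0] == pair[1]:
            continue
        matches = int(row["matches"])
        if matches > best.get(pair, -1):
            best[pair] = matches
    return best
```

Typical use would be something like
best_pairwise_matches(read_psl(open("outfile.psl"))), where outfile.psl
is the file written by the blat command above. Whether "matches" is the
right score to keep (rather than, say, matches relative to qSize) is a
choice you would want to make for your own probes.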
Sean
On Feb 18, 2005, at 2:41 PM, Ken Termiso wrote:
> Hi all,
>
> This may be slightly off-topic, but I'd like to be able to BLAST a
> large set of about 500 nucleotide sequences against itself (i.e.
> sequence #1 gets blasted against the other 499 sequences and so on,
> for a total of 500x500 or 250,000 blasts), and one thing I
> unbelievably cannot google on the net is a script to do it...rather
> than writing one I was hoping that someone could point me to a link
> for this...I found tons of scripts for doing it against a database,
> but nothing with a matrix like I need to BLAST...
>
> My sequences are in plain text. I've got the standalone blast, but
> just need a script...
>
> Presumably this would be very useful for analyzing pseudo-homologous
> probe sequences..?..so maybe it isn't completely off-topic...
>
> Thanks in advance,
> Ken
>
> _______________________________________________
> Bioconductor mailing list
> Bioconductor at stat.math.ethz.ch
> https://stat.ethz.ch/mailman/listinfo/bioconductor