[BioC] DNA sequence analysis question

Paul Shannon pshannon at fhcrc.org
Wed Mar 14 19:12:47 CET 2012


Your questions may need a longer and more detailed reply, but let me start with these comments.

On Mar 14, 2012, at 8:25 AM, andigoni wrote:

> 
> Is there any package for motif finding in DNA sequences that uses consensus matrices as input?

The BiocViews 'SequenceMatching' category (http://www.bioconductor.org/packages/release/BiocViews.html#___SequenceMatching) lists 'cosmo', which its author (Oliver Bembom) describes as follows:

Cosmo searches a set of unaligned DNA sequences for a shared motif that may, for example, represent a common transcription factor binding site. The algorithm is similar to MEME, but also allows the user to specify a set of constraints that the position weight matrix of the unknown motif must satisfy. Such constraints may include bounds on the information content across certain regions of the unknown motif, for example, and can often be formulated on the basis of prior knowledge about the structure of the transcription factor in question. The unknown motif width, the distribution of motif occurrences (OOPS, ZOOPS, or TCM), as well as the appropriate constraint set can be selected data-adaptively.
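
If what you have is an already known matrix and you simply want to find its matches in your sequences, the underlying idea can be sketched in a few lines. This is a minimal, plain-Python illustration and not cosmo's algorithm or API: the function names, the pseudocount, the uniform background frequency of 0.25 and the score threshold are all assumptions made for the example.

```python
import math

BASES = "ACGT"

def pwm_from_counts(counts, pseudocount=1.0, background=0.25):
    """Turn a count matrix (dict: base -> list of per-position counts)
    into a list of per-position log2-odds scores.
    The pseudocount and uniform background are illustrative assumptions."""
    width = len(counts["A"])
    pwm = []
    for i in range(width):
        total = sum(counts[b][i] + pseudocount for b in BASES)
        column = {}
        for b in BASES:
            prob = (counts[b][i] + pseudocount) / total
            column[b] = math.log2(prob / background)
        pwm.append(column)
    return pwm

def scan(sequence, pwm, min_score):
    """Slide the matrix along one strand of the sequence and return
    (0-based start, matched window, score) for windows scoring >= min_score."""
    sequence = sequence.upper()
    width = len(pwm)
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        if any(b not in BASES for b in window):
            continue  # skip windows containing N or other ambiguity codes
        score = sum(column[b] for column, b in zip(pwm, window))
        if score >= min_score:
            hits.append((start, window, score))
    return hits

# Toy count matrix that strongly prefers the word TATA.
counts = {
    "A": [0, 10, 0, 10],
    "C": [0, 0, 0, 0],
    "G": [0, 0, 0, 0],
    "T": [10, 0, 10, 0],
}
pwm = pwm_from_counts(counts)
hits = scan("GGTATACC", pwm, min_score=5.0)
```

The sketch scans only the given strand; to cover both strands you would also scan the reverse complement of each sequence.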
 
> 
> In addition, I need to extract genomic features e.g. exons, intron, utr, splice sites etc. given the genomic coordinates for multiple sequences. Is there any package for such analyses?

Just this week I used GenomicFeatures and TxDb.Hsapiens.UCSC.hg19.knownGene to extract promoter sequences. Is this the sort of thing you want to do?
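
To make "extracting promoter sequence from coordinates" concrete, here is a minimal plain-Python sketch of the strand-aware arithmetic involved. It is not the GenomicFeatures API. The function names, the 1-based inclusive coordinates, and the definition of a promoter as a window of `upstream` bases before the transcription start site plus `downstream` bases starting at it are assumptions chosen for illustration.

```python
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")

def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]

def promoter_interval(tx_start, tx_end, strand, upstream=2000, downstream=200):
    """Return (start, end), 1-based inclusive, of the promoter window.
    On the '+' strand the TSS is tx_start; on the '-' strand it is tx_end.
    The start is clipped to 1 so the window never runs off the chromosome."""
    if strand == "+":
        tss = tx_start
        start, end = tss - upstream, tss + downstream - 1
    elif strand == "-":
        tss = tx_end
        start, end = tss - downstream + 1, tss + upstream
    else:
        raise ValueError("strand must be '+' or '-'")
    return max(start, 1), end

def promoter_sequence(chrom_seq, tx_start, tx_end, strand,
                      upstream=2000, downstream=200):
    """Extract the promoter from a chromosome string, reported 5'->3'
    on the strand of the transcript."""
    start, end = promoter_interval(tx_start, tx_end, strand, upstream, downstream)
    end = min(end, len(chrom_seq))      # clip to the chromosome length
    seq = chrom_seq[start - 1:end]      # convert 1-based inclusive to a slice
    return reverse_complement(seq) if strand == "-" else seq

# Toy 16-base "chromosome" with one transcript on each strand.
chrom = "AAAACCCCGGGGTTTT"
plus_promoter = promoter_sequence(chrom, 9, 12, "+", upstream=4, downstream=2)
minus_promoter = promoter_sequence(chrom, 5, 8, "-", upstream=4, downstream=2)
```

Exons, introns and UTRs follow the same pattern: once you have start, end and strand for each feature, you slice the chromosome and reverse-complement the minus-strand features.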

 - Paul

> 
> 
> Thanks in advance!!
> 
> _______________________________________________
> Bioconductor mailing list
> Bioconductor at r-project.org
> https://stat.ethz.ch/mailman/listinfo/bioconductor
> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor
