[Bioc-sig-seq] ChIPseq, repeated regions and multiple matched tags

Romain FENOUIL fenouil at ciml.univ-mrs.fr
Tue Nov 4 17:31:39 CET 2008


Hi all,

I'm quite new to this mailing list, so I would first like to introduce myself.

My name is Romain Fenouil and I work as a bioinformatician at the Immunology Center of Marseille-Luminy.
Until now, we have mainly been assessing transcription factor binding sites with ChIP-on-chip experiments, and we recently tried some ChIP-seq runs.
At first we wanted to assess the differences between these two techniques, and we are now interested in going further with ChIP-seq.

We are working in collaboration with another lab that gives us the ELAND-aligned files and the raw data (tag sequence) files.
I managed (with some trouble) to load the aligned data and to have a look at the enrichment profiles for the TF of interest. It's amazing!
I'm also working with the MAQ alignment software in order to compare the ELAND and MAQ alignments.

I'm using the ShortRead R package to load the aligned data and play with it. After taking some time to understand how it works, I have to say that it's really convenient.
(Thank you to Martin Morgan and Simon Anders, who helped me with loading the ELAND and MAQ data.)


So, after this long introduction, here is my question:

We are now able to obtain enrichment profiles for the whole genome except repeated regions.
Since we are never satisfied, we have become interested in some repeated regions (snRNAs, for instance).
We already have some ideas on how to assess binding events in repeated regions, but we need some information about these regions.

In particular, I would like to know whether there is a way to get the list of tags that matched multiple locations in the genome.
What we would like to have is, for each tag, the number of matches in the genome and the locations of these matches.
I have heard that the information available on these tags depends on which alignment software one uses.

For instance, I have been told that ELAND doesn't give much information on tags that have multiple matches. What about MAQ?
Maybe SOAP? Or will I have to use Biostrings and try to remap the tags manually?
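
To make the request concrete, here is a minimal sketch of the kind of table I am after: for each tag, the number of matches and their locations. It is written in Python on toy data only; the function names, the toy genome and the toy tags are all hypothetical, and it handles only exact matches on the forward strand. It is not how ELAND, MAQ or SOAP work, and a naive scan like this would probably be far too slow on a real genome.

```python
def find_exact_matches(tag, genome):
    """Return (chromosome, 0-based start) for every exact occurrence of tag.

    Assumption: forward strand only, no mismatches allowed.
    """
    hits = []
    for chrom, seq in genome.items():
        start = seq.find(tag)
        while start != -1:
            hits.append((chrom, start))
            # Advance by one base so overlapping occurrences are also found.
            start = seq.find(tag, start + 1)
    return hits


def multi_match_table(tags, genome):
    """Keep only the tags that match more than one location."""
    table = {}
    for tag in tags:
        hits = find_exact_matches(tag, genome)
        if len(hits) > 1:
            table[tag] = hits
    return table


# Toy data, purely illustrative.
toy_genome = {"chr1": "ACGTACGTTTGACGT", "chr2": "GGGACGTACC"}
toy_tags = ["ACGT", "TTGA", "CCCC"]

multi = multi_match_table(toy_tags, toy_genome)
for tag, hits in multi.items():
    print(tag, len(hits), hits)
```

With this toy input only "ACGT" is reported, since "TTGA" matches a single location and "CCCC" matches none.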

So if you have any general information or ideas about how to deal with this, I would be interested.
Of course, I can give you much more information if needed, and I hope that this message will be a good starting point for a long discussion.

Thank you very much.

Best regards,

Romain.


PS: I hope I didn't make a mistake in posting :-) If you have a basic tutorial on how to use this mailing list correctly (especially how to reply), I would be interested.


Romain Fenouil / Laboratoire Pierre Ferrier
Centre d'Immunologie de Marseille-Luminy - CIML
Parc Scientifique et technologique de Luminy
Case 906
13288 MARSEILLE CEDEX 9
tél : 04.91.26.94.46 - fax : 04.91.26.94.30
http://www.ciml.univ-mrs.fr