[BioC] find overlap of bed files of different length

Christoph Bartenhagen Christoph.Bartenhagen at ukmuenster.de
Sun Jan 30 03:08:00 CET 2011


Hi,

Although this is a Bioconductor mailing list, I'd suggest taking a look
at an independent, non-R program in this case: BEDTools.
It has several tools for processing BED files, including one that finds
overlaps between two BED files (I think it's called intersectBed; you
might need to convert your gene reference file into BED format, where
columns for chromosome, start and end are sufficient). There you can
also specify the minimum overlap as a fraction of each feature. BEDTools
is not very difficult to get into and has quite a good manual, in my
opinion.
Sorry, I don't know a suitable solution in R, but this should do exactly
what you want.
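
To make the "100% overlap" condition concrete, here is a minimal
pure-Python sketch of the containment test itself. It is not how BEDTools
or GenomicRanges implement it; the function names (parse_bed,
features_within_genes) and the example coordinates are made up for
illustration, and the pairwise scan is only meant for small files.

```python
from collections import defaultdict


def parse_bed(lines):
    """Parse BED lines into (chrom, start, end) tuples; extra columns are ignored."""
    records = []
    for line in lines:
        line = line.strip()
        # Skip blank lines and the usual BED header/comment lines.
        if not line or line.startswith(("#", "track", "browser")):
            continue
        chrom, start, end = line.split()[:3]
        records.append((chrom, int(start), int(end)))
    return records


def features_within_genes(features, genes):
    """Return the features that lie entirely inside at least one gene."""
    # Group gene intervals by chromosome so only same-chromosome pairs are compared.
    genes_by_chrom = defaultdict(list)
    for chrom, start, end in genes:
        genes_by_chrom[chrom].append((start, end))

    contained = []
    for chrom, start, end in features:
        # 100% overlap means the gene starts at or before the feature
        # and ends at or after it.
        if any(g_start <= start and end <= g_end
               for g_start, g_end in genes_by_chrom.get(chrom, ())):
            contained.append((chrom, start, end))
    return contained


# Hypothetical example data, not taken from any real annotation.
features = parse_bed(["chr1\t100\t200", "chr1\t150\t400", "chr2\t50\t80"])
genes = parse_bed(["chr1\t50\t300", "chr2\t60\t90"])

print(features_within_genes(features, genes))
```

Only the first feature is reported here: the second one runs past the
end of the chr1 gene, and the third one starts before the chr2 gene.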

Cheers,
Christoph


On 30.01.2011 01:33, Duke wrote:
> Hi all,
>
> I need to find the overlap between a text file (BED format) and a gene
> reference. The BED file contains sequences of different lengths, and I
> need to find all the sequences that lie inside a gene (meaning the
> overlapping percentage is 100%). I found the findOverlaps function in
> GenomicRanges, but the parameter that controls overlap (minoverlap) does
> not let me specify a percentage.
>
> Does anybody have any suggestions for me?
>
> Thanks so much,
>
> D.


