[BioC] find overlap of bed files of different length
Christoph Bartenhagen
Christoph.Bartenhagen at ukmuenster.de
Sun Jan 30 03:08:00 CET 2011
Hi,
although this is a Bioconductor mailing list, I'd suggest taking a look
at an independent, non-R program in this case: BEDTools.
It has several functions for processing BED files, including one that
finds overlaps between two BED files (I think it's called intersectBed;
you might need to convert your gene reference file into BED format, but
columns for chromosome, start and end are sufficient). There you can also
specify the minimum required overlap as a fraction of a feature's length.
BEDTools is not very difficult to get into and, in my opinion, has quite
a good manual.
Sorry, I don't know a suitable solution in R, but this should do exactly
what you want.
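To make the 100% case concrete: a sequence lies inside a gene when it is on
the same chromosome, starts at or after the gene start, and ends at or before
the gene end. Below is a minimal sketch of that check in plain Python. It is
not BEDTools or GenomicRanges code; the function names parse_bed and
fully_contained are made up for illustration, and it assumes both files have
chromosome, start and end in the first three columns.

```python
from collections import defaultdict


def parse_bed(lines):
    """Parse BED lines into (chrom, start, end) tuples; extra columns are ignored."""
    intervals = []
    for line in lines:
        line = line.strip()
        # Skip blank lines and the usual BED header/comment lines.
        if not line or line.startswith(("#", "track", "browser")):
            continue
        fields = line.split()
        intervals.append((fields[0], int(fields[1]), int(fields[2])))
    return intervals


def fully_contained(reads, genes):
    """Return the reads that lie entirely inside at least one gene (100% overlap)."""
    genes_by_chrom = defaultdict(list)
    for chrom, start, end in genes:
        genes_by_chrom[chrom].append((start, end))

    hits = []
    for chrom, start, end in reads:
        # A read is contained if some gene on the same chromosome spans it completely.
        candidates = genes_by_chrom.get(chrom, ())
        if any(g_start <= start and end <= g_end for g_start, g_end in candidates):
            hits.append((chrom, start, end))
    return hits


if __name__ == "__main__":
    # Hypothetical toy data, not taken from any real annotation.
    reads = parse_bed(["chr1\t100\t150", "chr1\t180\t260", "chr2\t10\t20"])
    genes = parse_bed(["chr1\t90\t200", "chr2\t15\t50"])
    print(fully_contained(reads, genes))
```

This compares every read against every gene on its chromosome, which is fine
for small files; for genome-scale data a dedicated tool such as BEDTools will
be much faster.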
Cheers,
Christoph
On 30.01.2011 01:33, Duke wrote:
> Hi all,
>
> I need to find overlaps between a text file (BED format) and a gene
> reference. The BED file contains sequences of different lengths, and I
> need to find all the sequences that lie inside a gene (meaning the
> overlap percentage is 100%). I found the findOverlaps function in
> GenomicRanges, but the parameter that controls overlap (minoverlap) does
> not let me specify a percentage.
>
> Does anybody have any suggestions for me?
>
> Thanks so much,
>
> D.