[BioC] Determining an overlapping annotation data subset (overlap/overlaps)
Herve Pages
hpages at fhcrc.org
Tue Aug 7 02:51:45 CEST 2007
Hi Stephen,
> A <- data.frame(start=(1:5)*10L, end=(4:8)*10L)
> A
  start end
1    10  40
2    20  50
3    30  60
4    40  70
5    50  80
> B <- data.frame(start=c(31L, 39L, 80L), end=c(60L, 40L, 84L))
> B
  start end
1    31  60
2    39  40
3    80  84
You can create a logical vector with one element per row of A: for each
A-row, it says whether at least one B-row lies entirely inside it:
contains_a_Brow <- mapply(function(Astart, Aend)
                              any(Astart <= B$start & B$end <= Aend),
                          A$start, A$end)
Then use this logical vector to subset A:
A[contains_a_Brow, ]
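The same test can also be written without an explicit per-row function. What
follows is only a sketch and was not part of the original thread: it assumes the
A and B data frames defined above, and the names contains, counts, overlaps,
A_containing and A_overlapping are made up for illustration. outer() builds an
nrow(A) x nrow(B) logical matrix, so memory grows with the product of the two
row counts and this may not suit very large tables.

```r
A <- data.frame(start = (1:5) * 10L, end = (4:8) * 10L)
B <- data.frame(start = c(31L, 39L, 80L), end = c(60L, 40L, 84L))

# contains[i, j] is TRUE when B-row j lies entirely inside A-row i
contains <- outer(A$start, B$start, "<=") & outer(A$end, B$end, ">=")

# Number of B-rows contained in each A-row
counts <- rowSums(contains)

# Same subset as A[contains_a_Brow, ]
A_containing <- A[counts > 0, ]

# Variant (assumption: closed intervals): overlaps[i, j] is TRUE when
# A-row i and B-row j share at least one position
overlaps <- outer(A$start, B$end, "<=") & outer(A$end, B$start, ">=")
A_overlapping <- A[rowSums(overlaps) > 0, ]
```

With the example data, counts is 1 1 2 0 0, so rows 1 to 3 of A are kept. The
overlap variant keeps all five rows, which shows that "contains" and "overlaps"
are different questions.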
Cheers,
H.
Stephen Montgomery wrote:
> Hello Bioconductor -
>
> Apologies as this is a fairly rookie bioinformatics based R question, but I
> am trying to determine if there is an R one-liner to extract a subset of
> a data frame which possesses annotation contained within it that has
> been stored in another data frame? (For example extracting genomic
> intervals which contain certain features/annotation)
>
> Such that:
> If I have dataframe "A" possessing an "id", "start", and "end"; And
> dataframe "B" also possessing an "id", "start", and "end"; The output is
> all the rows of A which contain an entry of B (B$start, B$end) within
> A$start and A$end.
>
> I have tried my own fairly uninformed variants like this to no avail:
> A[length(B[B$start <= A$end & B$end >= A$start]) > 0,]
> I fear the solution will be trivial but as yet it has eluded me. :/
>
> Thanks for any help! (Theoretically, I can also see doing this in its
> own function by creating a vector of counts for each member of "A" and
> then reporting those that are non-zero but I was wondering if there was
> a more succinct and likely efficient way)
>
> Thanks again,
> Stephen
>
>
>
> Stephen Montgomery, B.A.Sc., Ph.D.
> Postdoctoral Researcher, Team 16
> Wellcome Trust Sanger Institute
> Hinxton, Cambridge CB10 1SA
> Phone: 44-1223-834244 (ext 7297)
> Skype: stephen.b.montgomery
>
>
>
>