[BioC] comparing two tables
Martin Morgan
mtmorgan at fhcrc.org
Tue Oct 25 15:48:26 CEST 2011
On 10/25/2011 03:42 AM, Assa Yeroslaviz wrote:
> Hi everybody,
>
> I would like to know whether it is possible to compare two tables for certain
> parameters.
> I have these two tables:
> gene table:
> name   chr  start    end      str  accession  Length
> gen1   4    646752   646838   +    MI0005806  86
> gen12  2L   243035   243141   -    MI0005821  106
> gen3   2L   159838   159928   +    MI0005813  90
> gen7   2L   1831685  1831799  -    MI0011290  114
> gen4   2L   2737568  2737661  +    MI0017696  93
> ...
>
> localization table:
> Chr  Start   End     length
> 4    136532  138654  2122
> 3    139870  141970  2100
> 2L   157838  158440  602
> X    160834  162966  2132
> 4    204040  208536  4496
> ...
>
> I would like to check whether a specific gene lies within a certain region.
> For example I want to see if gene 3 on chromosome 2L lies within the region
> given in the second table.
Hi Assa --
In Bioconductor, use the GenomicRanges package. Create two GRanges objects

    library(GenomicRanges)
    genes = with(genetable, GRanges(chr, IRanges(start, end), str,
        accession=accession, Length=Length))
    locations = with(locationtable, GRanges(Chr, IRanges(Start, End)))

then

    olaps = findOverlaps(genes, locations)

queryHits(olaps) and subjectHits(olaps) pair each gene with every
location it overlaps. The definition of 'overlap' is flexible, see
?findOverlaps.
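
As a rough, self-contained sketch of the approach above, the rows shown in the question can be typed in by hand. The data frame names genetable and locationtable, the extra name column on the GRanges object, and the choice of type="within" are assumptions made for illustration. None of the rows quoted in the question actually overlap, so one hypothetical location row (2L, 159000 to 160000) is added to produce a hit.

```r
library(GenomicRanges)

# Rows copied from the question; the data frame names are assumptions.
genetable <- data.frame(
    name = c("gen1", "gen12", "gen3", "gen7", "gen4"),
    chr = c("4", "2L", "2L", "2L", "2L"),
    start = c(646752, 243035, 159838, 1831685, 2737568),
    end = c(646838, 243141, 159928, 1831799, 2737661),
    str = c("+", "-", "+", "-", "+"),
    accession = c("MI0005806", "MI0005821", "MI0005813",
                  "MI0011290", "MI0017696"),
    Length = c(86, 106, 90, 114, 93),
    stringsAsFactors = FALSE
)
locationtable <- data.frame(
    # The last row is HYPOTHETICAL, added only so the example has a hit.
    Chr = c("4", "3", "2L", "X", "4", "2L"),
    Start = c(136532, 139870, 157838, 160834, 204040, 159000),
    End = c(138654, 141970, 158440, 162966, 208536, 160000),
    stringsAsFactors = FALSE
)

genes <- with(genetable, GRanges(chr, IRanges(start, end), str,
    name = name, accession = accession, Length = Length))
locations <- with(locationtable, GRanges(Chr, IRanges(Start, End)))

# type = "within" keeps only genes lying entirely inside a location.
# The two objects have different chromosome sets, so a warning about
# sequence levels not in common may be printed; the result is still valid.
olaps <- findOverlaps(genes, locations, type = "within")

# One row per (gene, location) pair.
hits <- data.frame(
    gene = genes$name[queryHits(olaps)],
    location_row = subjectHits(olaps)
)
print(hits)
```

Chromosome matching is handled by GRanges itself, so the explicit "same chromosome, then start, then end" checks from the question are not needed. Dropping type="within" falls back to the default, which counts any partial overlap.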
Martin
>
> What I would like to do is:
> 1. check if the gene lies on a specific chromosome
> 1.a if no - go to the next line
> 1.b if yes - go to 2
> 2. check if the start position of the gene is bigger than the start position
> in the localization table AND if it is smaller than the end position (i.e. if
> it lies between the start and end positions in the localization table)
> 2.a if no - go to the next gene
> 2.b if yes - give it to me.
>
> I was having difficulties doing it without running into three nested
> conditionals (if).
>
> I would appreciate any help.
>
> Thanks
>
> Assa
--
Computational Biology
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109
Location: M1-B861
Telephone: 206 667-2793